Mojo function

quantize_dynamic_scaled_fp4fp8

def quantize_dynamic_scaled_fp4fp8[
    out_dtype: DType,
    scales_dtype: DType,
    in_dtype: DType,
    //,
    *,
    SF_VECTOR_SIZE: Int = Int(16),
    num_max_threads: Int = Int(512)
](
    ctx: DeviceContext,
    output_tile: TileTensor[out_dtype, Storage=output_tile.Storage, linear_idx_type=output_tile.linear_idx_type, element_size=output_tile.element_size],
    scales_tile: TileTensor[scales_dtype, Storage=scales_tile.Storage, linear_idx_type=scales_tile.linear_idx_type, element_size=scales_tile.element_size],
    input_tile: TileTensor[in_dtype, Storage=input_tile.Storage, linear_idx_type=input_tile.linear_idx_type, element_size=input_tile.element_size],
    num_cols: Int,
    num_cols_padded: Int,
    tensor_sf: Float32 = 1
)