
Mojo function

quantize_dynamic_block_scaled

quantize_dynamic_block_scaled[out_dtype: DType, scales_dtype: DType, in_dtype: DType, //, *, SF_VECTOR_SIZE: Int, target: StringSlice[StaticConstantOrigin] = "cpu"](output_device: NDBuffer[out_dtype, MutAnyOrigin, output_device.shape, DimList.create_unknown[2]()], scales_device: NDBuffer[scales_dtype, MutAnyOrigin, scales_device.shape, DimList.create_unknown[5]()], input_device: NDBuffer[in_dtype, ImmutAnyOrigin, input_device.shape, DimList.create_unknown[2]()], tensor_sf: Float32, ctx: DeviceContext)

NDBuffer overload of quantize_dynamic_block_scaled. Converts its NDBuffer arguments to TileTensor and delegates to the TileTensor overload.

quantize_dynamic_block_scaled[out_dtype: DType, scales_dtype: DType, in_dtype: DType, //, *, SF_VECTOR_SIZE: Int, target: StringSlice[StaticConstantOrigin] = "cpu"](output_device: TileTensor[out_dtype, output_device.LayoutType, output_device.origin, linear_idx_type=output_device.linear_idx_type, element_size=output_device.element_size], scales_device: TileTensor[scales_dtype, scales_device.LayoutType, scales_device.origin, linear_idx_type=scales_device.linear_idx_type, element_size=scales_device.element_size], input_device: TileTensor[in_dtype, input_device.LayoutType, input_device.origin, linear_idx_type=input_device.linear_idx_type, element_size=input_device.element_size], tensor_sf: Float32, ctx: DeviceContext)
