Mojo function

quantize_static_scaled_fp8

def quantize_static_scaled_fp8[out_dtype: DType, in_dtype: DType, scale_is_inverted: Bool = True](out_tensor: TileTensor[out_dtype, Storage=out_tensor.Storage, address_space=out_tensor.address_space, linear_idx_type=out_tensor.linear_idx_type, element_size=out_tensor.element_size], in_tensor: TileTensor[in_dtype, Storage=in_tensor.Storage, address_space=in_tensor.address_space, linear_idx_type=in_tensor.linear_idx_type, element_size=in_tensor.element_size], scale: Float32, context: DeviceContext)

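TileTensor implementation of static scaled FP8 quantization: the values of in_tensor are quantized into out_tensor using a single, caller-supplied scale rather than one computed from the data.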
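To illustrate the general idea of static scaled FP8 quantization, here is a minimal conceptual sketch in plain Python. It is not the Mojo kernel and does not use TileTensor or DeviceContext. It makes several assumptions that this page does not confirm: the target format is FP8 E4M3 (maximum finite value 448), out-of-range values saturate, and rounding is to nearest-even. The helper names and the `multiply_by_scale` flag are hypothetical; how that flag corresponds to `scale_is_inverted` is not specified here.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value in FP8 E4M3 (assumed target format)


def round_to_e4m3(x: float) -> float:
    """Saturate x to the E4M3 range and round to the nearest representable value."""
    if x == 0.0 or math.isnan(x):
        return x
    # Saturate instead of overflowing (assumption about the desired behaviour).
    x = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))
    # frexp gives abs(x) = m * 2**e with 0.5 <= m < 1, so floor(log2|x|) = e - 1.
    _, e = math.frexp(x)
    # Exponents below -6 fall in the subnormal range, which has a fixed spacing.
    exponent = max(e - 1, -6)
    # Three mantissa bits: representable values are 2**(exponent - 3) apart.
    step = 2.0 ** (exponent - 3)
    # Python's round() rounds halves to even, matching round-to-nearest-even.
    return round(x / step) * step


def quantize_static_scaled(values, scale, multiply_by_scale=False):
    """Apply one fixed scale to every element, then round to the E4M3 grid.

    multiply_by_scale is a hypothetical flag: False divides by scale,
    True multiplies by it (useful when the caller passes a reciprocal).
    """
    factor = scale if multiply_by_scale else 1.0 / scale
    return [round_to_e4m3(v * factor) for v in values]


if __name__ == "__main__":
    print(quantize_static_scaled([1.0, 0.3, 1000.0, -1000.0], scale=2.0))
```

Because the scale is fixed ahead of time, the per-element work is a multiply, a clamp, and a rounding step, with no reduction over the input to find its maximum.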