Python class

QuantFormat

class max.nn.QuantFormat(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Identifies the quantization format of a model checkpoint.

BLOCKSCALED_FP8

BLOCKSCALED_FP8 = 'blockscaled-fp8'

FP8 quantization with block-level scaling.

COMPRESSED_TENSORS_FP8

COMPRESSED_TENSORS_FP8 = 'compressed-tensors-fp8'

FP8 quantization using the compressed-tensors format.

FBGEMM_FP8

FBGEMM_FP8 = 'fbgemm-fp8'

FP8 quantization using the FBGEMM format.

MXFP4

MXFP4 = 'mxfp4'

Microscaling (MX) FP4 quantization format.

NVFP4

NVFP4 = 'nvfp4'

NVIDIA FP4 quantization format.
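
Because each member's value is a plain string, a format name read from a config file or command line can be mapped to a member by value lookup. The sketch below illustrates that pattern. It is not the library's code: to stay runnable without MAX installed, it defines a local stand-in enum that mirrors the members documented above, and `parse_quant_format` and `is_fp4` are hypothetical helpers, not part of `max.nn`. In real code you would presumably import `QuantFormat` from `max.nn` instead of redefining it.

```python
from enum import Enum


class QuantFormat(Enum):
    """Local stand-in mirroring the documented members (not the real class)."""

    BLOCKSCALED_FP8 = "blockscaled-fp8"
    COMPRESSED_TENSORS_FP8 = "compressed-tensors-fp8"
    FBGEMM_FP8 = "fbgemm-fp8"
    MXFP4 = "mxfp4"
    NVFP4 = "nvfp4"


def parse_quant_format(name: str) -> QuantFormat:
    """Hypothetical helper: map a config string to a member by its value."""
    try:
        # Enum lookup by value: QuantFormat("nvfp4") is QuantFormat.NVFP4.
        return QuantFormat(name.strip().lower())
    except ValueError:
        valid = ", ".join(fmt.value for fmt in QuantFormat)
        raise ValueError(
            f"unknown quantization format {name!r}; expected one of: {valid}"
        ) from None


def is_fp4(fmt: QuantFormat) -> bool:
    """Hypothetical helper: True for the two FP4 members."""
    return fmt in (QuantFormat.MXFP4, QuantFormat.NVFP4)


if __name__ == "__main__":
    fmt = parse_quant_format("nvfp4")
    print(fmt.name, fmt.value, is_fp4(fmt))
```

Comparing members with `is` or `in`, as `is_fp4` does, avoids string comparisons scattered through calling code; only the parsing step deals with raw strings.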