Python class

QuantFormat

class max.nn.QuantFormat(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Identifies the quantization format of a model checkpoint.

BLOCKSCALED_FP8

BLOCKSCALED_FP8 = 'blockscaled-fp8'

FP8 quantization with block-level scaling.

COMPRESSED_TENSORS_FP8

COMPRESSED_TENSORS_FP8 = 'compressed-tensors-fp8'

FP8 quantization using the compressed-tensors format.

FBGEMM_FP8

FBGEMM_FP8 = 'fbgemm-fp8'

FP8 quantization using the FBGEMM format.

MXFP4

MXFP4 = 'mxfp4'

Microscaling (MX) FP4 quantization format.

NVFP4

NVFP4 = 'nvfp4'

NVIDIA FP4 quantization format.
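
Because QuantFormat is a standard Enum with string values, a member can be looked up from its string value. The sketch below illustrates this with a local stand-in class that mirrors the members documented above, so it runs without MAX installed; with MAX available, the stand-in would presumably be replaced by `from max.nn import QuantFormat`. The `parse_quant_format` helper is hypothetical and not part of the library.

```python
from enum import Enum


class QuantFormat(Enum):
    """Local stand-in mirroring the documented members (an assumption for
    illustration; the real class is max.nn.QuantFormat)."""

    BLOCKSCALED_FP8 = "blockscaled-fp8"
    COMPRESSED_TENSORS_FP8 = "compressed-tensors-fp8"
    FBGEMM_FP8 = "fbgemm-fp8"
    MXFP4 = "mxfp4"
    NVFP4 = "nvfp4"


def parse_quant_format(name: str) -> QuantFormat:
    """Hypothetical helper: map a config string to a QuantFormat member."""
    try:
        # Enum lookup by value; normalize case and surrounding whitespace first.
        return QuantFormat(name.strip().lower())
    except ValueError:
        valid = ", ".join(fmt.value for fmt in QuantFormat)
        raise ValueError(
            f"unknown quantization format {name!r}; expected one of: {valid}"
        ) from None


fmt = parse_quant_format("MXFP4")
print(fmt)        # QuantFormat.MXFP4
print(fmt.value)  # mxfp4
```

Looking up by value rather than by attribute name keeps the hyphenated strings (for example `'fbgemm-fp8'`) as the single source of truth, which is convenient when the format string comes from a configuration file.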