Python class
QuantFormat
class max.nn.QuantFormat(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
Identifies the quantization format of a model checkpoint.
BLOCKSCALED_FP8
BLOCKSCALED_FP8 = 'blockscaled-fp8'
FP8 quantization with block-level scaling.
COMPRESSED_TENSORS_FP8
COMPRESSED_TENSORS_FP8 = 'compressed-tensors-fp8'
FP8 quantization using the compressed-tensors format.
FBGEMM_FP8
FBGEMM_FP8 = 'fbgemm-fp8'
FP8 quantization using the FBGEMM format.
MXFP4
MXFP4 = 'mxfp4'
Microscaling (MX) FP4 quantization format.
NVFP4
NVFP4 = 'nvfp4'
NVIDIA FP4 quantization format.
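Because QuantFormat derives from Enum, its members should support the standard lookups by value and by name. The sketch below illustrates this with a local stand-in class that mirrors the values documented above instead of importing max.nn, so it runs without MAX installed. The helper parse_quant_format is hypothetical and not part of the library.

```python
from enum import Enum


class QuantFormat(Enum):
    """Local stand-in mirroring the values documented for max.nn.QuantFormat."""

    BLOCKSCALED_FP8 = "blockscaled-fp8"
    COMPRESSED_TENSORS_FP8 = "compressed-tensors-fp8"
    FBGEMM_FP8 = "fbgemm-fp8"
    MXFP4 = "mxfp4"
    NVFP4 = "nvfp4"


def parse_quant_format(name: str) -> QuantFormat:
    """Hypothetical helper: map a config string to a member, with a clear error."""
    try:
        # Enum lookup by value; normalize case and surrounding whitespace first.
        return QuantFormat(name.strip().lower())
    except ValueError:
        valid = ", ".join(member.value for member in QuantFormat)
        raise ValueError(
            f"unknown quantization format {name!r}; expected one of: {valid}"
        ) from None


# Lookup by value (via the helper) and a simple membership check.
fmt = parse_quant_format("NVFP4")
is_fp4 = fmt in (QuantFormat.MXFP4, QuantFormat.NVFP4)
```

With the real class, the same lookups would presumably be written against max.nn.QuantFormat directly, for example QuantFormat("fbgemm-fp8") by value or QuantFormat["MXFP4"] by name.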