Mojo module
mxfp4_dequant
MXFP4 dequantization kernel for H100 (SM90).
Converts packed MXFP4 weights (uint8, 2 FP4 values per byte) with E8M0 block scales into float8_e4m3fn or bfloat16.
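As a rough illustration of the number formats involved, the sketch below decodes one packed byte and one scale byte in plain Python. It is not the kernel's code. The FP4 value table (E2M1) and the E8M0 rule (2^(e - 127), with 0xFF reserved for NaN) follow the OCP Microscaling format definitions. The helper names and the nibble order (low nibble holds the first element) are assumptions made for this example and are not stated on this page.

```python
import math

# E2M1 magnitudes, indexed by the low 3 bits of a 4-bit code.
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def decode_fp4(code):
    """Decode one 4-bit E2M1 code (bit 3 is the sign) to a Python float."""
    sign = -1.0 if code & 0x8 else 1.0
    return sign * FP4_E2M1_MAGNITUDES[code & 0x7]


def decode_e8m0(scale_byte):
    """Decode an E8M0 scale: a pure power of two with bias 127; 0xFF is NaN."""
    if scale_byte == 0xFF:
        return math.nan
    return 2.0 ** (scale_byte - 127)


def unpack_byte(packed, low_nibble_first=True):
    """Split one uint8 into its two FP4 values.

    low_nibble_first is an ASSUMPTION for illustration; check the kernel
    source for the actual packing order.
    """
    lo, hi = packed & 0xF, (packed >> 4) & 0xF
    first, second = (lo, hi) if low_nibble_first else (hi, lo)
    return decode_fp4(first), decode_fp4(second)
```

For example, under the assumed nibble order, `unpack_byte(0x72)` yields the codes 0x2 and 0x7, and a scale byte of 128 doubles each decoded value.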
Scales are stored in a 2D layout [N, K/SF_VECTOR_SIZE], where each scale covers SF_VECTOR_SIZE (32) consecutive elements along K.
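The scale lookup implied by this layout can be sketched as a slow, pure-Python reference: element (n, k) uses the scale at [n][k // 32]. This is a minimal sketch for checking results, not the GPU kernel. The function name, the nested-list inputs, and the low-nibble-first packing are assumptions for this example. The output is ordinary Python floats, so the final rounding to float8_e4m3fn or bfloat16 is not modeled.

```python
SF_VECTOR_SIZE = 32  # elements covered by one E8M0 scale

# E2M1 magnitudes, indexed by the low 3 bits of a 4-bit code.
_E2M1 = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def dequant_reference(packed, scales, n_rows, k_cols):
    """Hypothetical reference dequantizer.

    packed: n_rows lists of k_cols // 2 uint8 values (two FP4 codes each).
    scales: n_rows lists of k_cols // SF_VECTOR_SIZE E8M0 bytes.
    Returns n_rows lists of k_cols floats.
    """
    assert k_cols % SF_VECTOR_SIZE == 0, "K must be a multiple of 32"
    out = []
    for n in range(n_rows):
        row = []
        for k in range(k_cols):
            byte = packed[n][k // 2]
            # ASSUMPTION: even k sits in the low nibble, odd k in the high one.
            code = (byte >> 4) & 0xF if k % 2 else byte & 0xF
            value = (-1.0 if code & 0x8 else 1.0) * _E2M1[code & 0x7]
            # One scale per block of 32 consecutive elements along K.
            s = scales[n][k // SF_VECTOR_SIZE]
            scale = float("nan") if s == 0xFF else 2.0 ** (s - 127)
            row.append(value * scale)
        out.append(row)
    return out
```

With K = 64, each row carries two scales: elements 0 to 31 use the first and elements 32 to 63 use the second.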
Functions
- dequant_mxfp4: Dequantize MXFP4 packed weights to FP8 or BF16.