Mojo module
mxfp4_matmul_sm90
MXFP4 matmul on H100 (SM90) via dequant-to-FP8 + FP8 GEMM.
Dequantizes the MXFP4 weights to FP8, then runs the SM90 warp-specialized FP8 GEMM. The BF16 activations are cast to FP8 on the fly.
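To illustrate what dequantizing MXFP4 involves, here is a minimal reference sketch in plain Python. It is not this module's kernel. It follows the OCP Microscaling format, in which 32 FP4 (E2M1) elements share one E8M0 power-of-two scale. The function names, the low-nibble-first packing order, and the use of Python floats as the output type (rather than FP8) are assumptions made for illustration only.

```python
import math

# Magnitudes of the 8 non-negative FP4 E2M1 codes (OCP Microscaling format).
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 32  # MXFP4 elements that share one E8M0 scale


def decode_fp4(code):
    """Decode a 4-bit E2M1 code: bit 3 is the sign, bits 0-2 index the magnitude."""
    sign = -1.0 if code & 0x8 else 1.0
    return sign * FP4_E2M1_MAGNITUDES[code & 0x7]


def decode_e8m0(scale_byte):
    """Decode an E8M0 scale: 2**(byte - 127); 0xFF encodes NaN."""
    if scale_byte == 0xFF:
        return math.nan
    return 2.0 ** (scale_byte - 127)


def dequantize_block(packed, scale_byte):
    """Dequantize one block: 16 bytes holding 32 FP4 codes, plus one scale byte.

    Assumption: the low nibble of each byte holds the earlier element.
    The real kernel's memory layout may differ.
    """
    if len(packed) * 2 != BLOCK_SIZE:
        raise ValueError("expected 16 packed bytes per block")
    scale = decode_e8m0(scale_byte)
    values = []
    for byte in packed:
        values.append(decode_fp4(byte & 0xF) * scale)
        values.append(decode_fp4(byte >> 4) * scale)
    return values
```

For example, `dequantize_block(bytes([0x72] + [0] * 15), 128)` decodes the first two elements as 1.0 and 6.0 and multiplies each by the scale 2, leaving the remaining 30 elements at zero. A GPU implementation would write FP8 values instead and would need to decide how the block scale is applied given FP8's limited range; the source does not describe that detail.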
Functions
- mxfp4_matmul_sm90: MXFP4 matmul that dequantizes the B weights to FP8, casts A to FP8, and runs the SM90 FP8 GEMM.