Mojo module

mxfp4_grouped_matmul_amd

Native MXFP4 grouped matmul on AMD CDNA4 via block-scaled MFMA.

Grouped matmul for Mixture of Experts (MoE):

    for i in range(num_active_experts):
        C[offsets[i]:offsets[i+1], :] = A[offsets[i]:offsets[i+1], :] @ B[expert_ids[i], :, :].T
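The loop above can be read as a plain reference computation. The following is a minimal pure-Python sketch of those semantics only, not the GPU kernel: it ignores MXFP4 quantization and block scales entirely and works on ordinary floats. The function name `grouped_matmul_reference` is hypothetical, and the shapes (A as `(total_tokens, K)`, B as `(num_experts, N, K)`, `offsets` with one more entry than `expert_ids`) are assumptions inferred from the indexing in the formula.

```python
from typing import List

Matrix = List[List[float]]


def grouped_matmul_reference(
    a: Matrix,              # assumed shape (total_tokens, K), rows grouped by expert
    b: List[Matrix],        # assumed shape (num_experts, N, K), one weight matrix per expert
    offsets: List[int],     # assumed length num_active_experts + 1, row boundaries in a
    expert_ids: List[int],  # assumed length num_active_experts, indices into b
) -> Matrix:
    """Hypothetical float reference for the grouped matmul formula (no MXFP4)."""
    n = len(b[0])
    c = [[0.0] * n for _ in range(len(a))]
    for i in range(len(expert_ids)):
        weights = b[expert_ids[i]]  # (N, K) matrix for this group's expert
        for row in range(offsets[i], offsets[i + 1]):
            for col in range(n):
                # A[row, :] @ B[expert, :, :].T  ==  dot(A[row, :], B[expert, col, :])
                c[row][col] = sum(x * w for x, w in zip(a[row], weights[col]))
    return c


# Example: three token rows; the first two go to expert 1, the last to expert 0.
a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
b = [
    [[1.0, 0.0], [0.0, 1.0]],  # expert 0: identity
    [[2.0, 0.0], [0.0, 2.0]],  # expert 1: 2 * identity
]
c = grouped_matmul_reference(a, b, offsets=[0, 2, 3], expert_ids=[1, 0])
```

Because B is transposed in the formula, each output column is a dot product between a row of A and a row of the expert's weight matrix, which is why the sketch never builds an explicit transpose.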

Uses block_idx.z to select the expert and runs MXFP4MatmulAMD.run for each expert.

Entry point: mxfp4_grouped_matmul_amd()

Functions