Mojo package

sm100

NVIDIA SM100 (Blackwell) attention kernels.

Covers MHA (multi-head attention, implemented as flash-attention v4) and MLA (multi-head latent attention), for both prefill and decode, including FP8 and block-scaled quantization variants.
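For reference, all of these kernels compute scaled dot-product attention. Flash-attention-style implementations evaluate the softmax online, one key/value block at a time, so the full score matrix is never materialized. The recurrence below is the standard flash-attention formulation, given here as a general sketch rather than this package's exact implementation:

$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
$$

Processing key/value blocks $j = 1, \dots, T$ for one query block, with $m_0 = -\infty$, $\ell_0 = 0$, $O_0 = 0$:

$$
\begin{aligned}
S_j &= \frac{Q K_j^{\top}}{\sqrt{d_k}}, \\
m_j &= \max\!\left(m_{j-1},\ \mathrm{rowmax}(S_j)\right), \\
\ell_j &= e^{\,m_{j-1}-m_j}\,\ell_{j-1} + \mathrm{rowsum}\!\left(e^{\,S_j - m_j}\right), \\
O_j &= e^{\,m_{j-1}-m_j}\,O_{j-1} + e^{\,S_j - m_j}\,V_j,
\end{aligned}
$$

with final output $O_T / \ell_T$. MLA follows the same pattern but attends through a shared low-rank latent projection of the keys and values, shrinking the KV cache.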

Modules