Mojo package
msa
SM100 (Blackwell) sparse multi-head attention (MSA) kernels.
Unified per-token prefill + decode fork (msa_1q). Kept as a sibling of the
nn package rather than nested inside it so the block-sparse MSA dispatch
surface stays self-contained.
Modules
- k2q_csr: Host reverse-CSR builder for KV-block-major sparse MHA, SM100.
- k2q_csr_device: Device (GPU) reverse-CSR builder for KV-block-major sparse MHA.
- msa_1q: Unified per-token BLOCK-sparse MHA (MSA) prefill + decode kernel for SM100 (B200), BF16 Q/K/V, D=128.
- msa_2q: KV-block-major sparse MHA (MSA) forward kernel for SM100 (B200), BF16, D=128.
- msa_combine: MSA combine (LSE-merge) kernel for SM100 (B200).
- msa_sm100_accum: MSA-private copy of the SM100 tcgen05 accumulator / operand machinery.
- sparse_indexer_decode: Decode-path MiniMax-M3 sparse-attention (MSA) indexer (selection only).
- sparse_indexer_prefill: Prefill-path MiniMax-M3 sparse-attention (MSA) indexer.
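
The msa_combine entry above refers to an LSE-merge. As a rough illustration of what such a merge generally computes, the sketch below combines partial attention outputs, each produced over a disjoint subset of keys, using their log-sum-exp (LSE) values. It is plain Python for readability and is not the Mojo kernel or its API: the names `attend` and `lse_merge`, the list-based data layout, and the single-query setting are all assumptions made for this example.

```python
import math

def attend(scores, values):
    """Reference softmax attention for one query over one subset of keys.

    Returns (out, lse): the normalized output vector and the log-sum-exp
    of the scores. Hypothetical helper, used only for illustration.
    """
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    dim = len(values[0])
    out = [sum(e * v[d] for e, v in zip(exps, values)) / total
           for d in range(dim)]
    return out, m + math.log(total)

def lse_merge(partials):
    """Merge (out, lse) pairs computed over disjoint key subsets.

    Each partial output is weighted by exp(lse_i - max_lse), and the
    weighted sum is renormalized by the sum of those weights.
    """
    m = max(lse for _, lse in partials)
    weights = [math.exp(lse - m) for _, lse in partials]
    total = sum(weights)
    dim = len(partials[0][0])
    merged = [sum(w * out[d] for w, (out, _) in zip(weights, partials)) / total
              for d in range(dim)]
    return merged, m + math.log(total)

# Example: splitting the keys in two and merging matches full attention
# up to floating-point rounding.
scores = [0.5, -1.0, 2.0, 0.1]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 3.0]]
full_out, full_lse = attend(scores, values)
merged_out, merged_lse = lse_merge([
    attend(scores[:2], values[:2]),
    attend(scores[2:], values[2:]),
])
```

Subtracting the maximum LSE before exponentiating keeps the weights in a numerically safe range, which is the usual reason to carry LSE values alongside partial outputs rather than raw sums.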