Mojo module
msa_sm100_accum
MSA-private copy of the SM100 tcgen05 accumulator / operand machinery.
These types duplicate the dense SM100 MHA accumulator types from
nn.attention.gpu.nvidia.sm100.mha_1q. They are lifted here so that the block-major forward
(msa_2q.mojo) can carry the num_m_mmas == 2 ping-pong without dragging the
dense MHA / MLA decode kernels through their full test matrix.
Structs
- MMAOperandOffsetFn
- MSASM100TensorAccumulatorSS
- MSASM100TensorAccumulatorTS
- RegisterAccumulatorDescription
- RegisterAccumulatorLayout
- TMemAccumulator
- TMemOperand
- UMMADescriptorSS
- UMMADescriptorTS
Traits
Functions