
Mojo function

mma_block_interleave

mma_block_interleave[N: Int](logical: Pipe[N], config: PipelineConfig) -> Pipe[N]

Interleave ops across MMA blocks for latency hiding.

Takes the logical iteration in causal order, i.e. what one ping-pong half computes:

global_loads → fragment_loads → MMAs

Distributes them across MMA blocks so fragment loads and global loads execute during MMA stalls. Fragment loads are placed just before their first consumer MMA. Global loads fill remaining slots.

Block sizing is taken from config: heavy blocks (new M-tile row) versus light blocks (continuation). Fragment ordering is also taken from config: B-before-A or A-before-B.
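The placement rules above can be illustrated with a small toy model. This is a sketch in plain Python, not the library's implementation: `Op`, `ToyConfig`, `interleave`, and the fields `heavy_slots`, `light_slots`, and `b_before_a` are hypothetical names invented for illustration, and the real `Pipe` and `PipelineConfig` types may be structured quite differently. It assumes each MMA opens one block, that a block's size is the number of load slots it offers, and that global loads are issued after the MMA in the same block.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str               # "global_load", "frag_load", or "mma"
    name: str
    uses: tuple = ()        # fragment names an MMA consumes
    new_row: bool = False   # True if this MMA starts a new M-tile row


@dataclass(frozen=True)
class ToyConfig:            # hypothetical stand-in for PipelineConfig
    heavy_slots: int = 3    # load slots in a block that starts a new M-tile row
    light_slots: int = 2    # load slots in a continuation block
    b_before_a: bool = True


def interleave(logical, cfg):
    """Reorder ops so loads sit inside MMA blocks; the op count is unchanged."""
    pending_globals = deque(op for op in logical if op.kind == "global_load")
    frags = {op.name: op for op in logical if op.kind == "frag_load"}
    issued = set()
    out = []
    for mma in (op for op in logical if op.kind == "mma"):
        slots = cfg.heavy_slots if mma.new_row else cfg.light_slots
        # Fragment loads go just before their first consumer MMA.
        needed = [frags[n] for n in mma.uses if n not in issued]
        # Stable sort: B fragments first when b_before_a, otherwise A first.
        needed.sort(key=lambda op: op.name.startswith("A") == cfg.b_before_a)
        issued.update(op.name for op in needed)
        out.extend(needed)
        out.append(mma)
        # Global loads fill whatever load slots the block has left.
        for _ in range(max(0, slots - len(needed))):
            if pending_globals:
                out.append(pending_globals.popleft())
    out.extend(pending_globals)  # anything that did not fit in a block
    return out


logical = [
    Op("global_load", "G0"), Op("global_load", "G1"), Op("global_load", "G2"),
    Op("frag_load", "A0"), Op("frag_load", "B0"),
    Op("frag_load", "B1"), Op("frag_load", "A1"),
    Op("mma", "M00", uses=("A0", "B0"), new_row=True),
    Op("mma", "M01", uses=("A0", "B1")),
    Op("mma", "M10", uses=("A1", "B0"), new_row=True),
    Op("mma", "M11", uses=("A1", "B1")),
]
names = [op.name for op in interleave(logical, ToyConfig())]
print(names)
# ['B0', 'A0', 'M00', 'G0', 'B1', 'M01', 'G1', 'A1', 'M10', 'G2', 'M11']
```

In this toy run the heavy block for `M00` spends two of its three slots on fragment loads and one on a global load, while the continuation block for `M01` only needs the new `B1` fragment. Setting `b_before_a=False` would swap the first two entries to `A0`, `B0`.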

Returns:

Pipe[N]