Mojo function
mma_block_interleave
mma_block_interleave[N: Int](logical: Pipe[N], config: PipelineConfig) -> Pipe[N]
Interleave ops across MMA blocks for latency hiding.
Takes the logical iteration in causal order (what one ping-pong half computes):
global_loads → fragment_loads → MMAs
Distributes them across MMA blocks so fragment loads and global loads execute during MMA stalls. Fragment loads are placed just before their first consumer MMA. Global loads fill remaining slots.
Block sizing comes from config: heavy blocks start a new M-tile row, light blocks continue the current row. Fragment ordering also comes from config: B-before-A or A-before-B.
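The placement rules described above can be illustrated with a small toy model. This is a sketch in Python, not the Mojo implementation: `Op`, `interleave`, `heavy_slots`, `light_slots`, `b_before_a`, and `new_row` are hypothetical names invented for this example, and the real fields of `Pipe` and `PipelineConfig` are not shown on this page. The model assumes one block per MMA, that every fragment load is consumed by at least one MMA, and that a block's slot budget covers its fragment loads plus any global loads used as filler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical stand-in for one pipeline op (not the real Pipe entry)."""
    kind: str              # "global_load", "fragment_load", or "mma"
    name: str
    operand: str = ""      # fragment loads only: "A" or "B"
    uses: tuple = ()       # MMAs only: names of the fragments they read
    new_row: bool = False  # MMAs only: True when starting a new M-tile row


def interleave(ops, heavy_slots=3, light_slots=2, b_before_a=True):
    """Toy model: build one block per MMA and pack loads around it."""
    global_loads = [op for op in ops if op.kind == "global_load"]
    fragments = {op.name: op for op in ops if op.kind == "fragment_load"}
    mmas = [op for op in ops if op.kind == "mma"]

    first = "B" if b_before_a else "A"
    placed = set()
    schedule = []
    for mma in mmas:
        # Fragment loads go just before their first consumer MMA.
        needed = [fragments[n] for n in mma.uses if n not in placed]
        # Stable sort: the configured operand comes first.
        needed.sort(key=lambda f: 0 if f.operand == first else 1)
        placed.update(f.name for f in needed)

        # Heavy blocks (new M-tile row) get a larger slot budget.
        slots = heavy_slots if mma.new_row else light_slots
        free = max(slots - len(needed), 0)

        # Global loads fill whatever slots remain in this block.
        fill, global_loads = global_loads[:free], global_loads[free:]
        schedule.append(needed + [mma] + fill)

    # Assumption of this sketch: leftover global loads go in the last block.
    if global_loads:
        if schedule:
            schedule[-1].extend(global_loads)
        else:
            schedule.append(list(global_loads))
    return schedule


# Logical iteration in causal order: global loads, fragment loads, MMAs.
ops = [
    Op("global_load", "g0"), Op("global_load", "g1"),
    Op("global_load", "g2"), Op("global_load", "g3"),
    Op("fragment_load", "A0", operand="A"),
    Op("fragment_load", "A1", operand="A"),
    Op("fragment_load", "B0", operand="B"),
    Op("fragment_load", "B1", operand="B"),
    Op("mma", "mma00", uses=("A0", "B0"), new_row=True),
    Op("mma", "mma01", uses=("A0", "B1")),
    Op("mma", "mma10", uses=("A1", "B0"), new_row=True),
    Op("mma", "mma11", uses=("A1", "B1")),
]

names = [[op.name for op in block] for block in interleave(ops)]
for block in names:
    print(block)
```

In this model the first block holds both fragment loads for `mma00` with B ahead of A, and the two heavy blocks absorb more global loads than the light ones. The op set is preserved: every op from the logical iteration appears exactly once in the schedule.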
Returns: Pipe[N]