Mojo function
emit_minimal_barrier_block
emit_minimal_barrier_block(block: MMABlockSpec, wrap_waits: Bool, global_before_frag: Bool = False) -> List[OpDesc]
Emit one block in the "minimal-barrier + cross-stage rotation" shape, for schedules that override build_explicit_blocks.
Layout (per block):
- Sync-group A: [sched_barrier] entry_wait entry_wait_lgkm [sched_barrier]. Fences emitted iff wrap_waits=True and either entry-wait field is present.
- Load section: frags + globals, in order controlled by global_before_frag (False = ping-pong default; True = load-before-frag for kernels like 4-wave inline that benefit).
- Sync-group B: [sched_barrier] pre_sync [barrier post_barrier_lgkm] [sched_barrier]. Fences emitted iff wrap_waits=True and either pre_sync or pre_mma_barrier is present.
- Final mma.
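The layout above can be modeled as a small ordering function. The following is a minimal Python sketch of the emission order only, not the Mojo implementation: BlockModel, emit_block_model, the string op names, and the field types are hypothetical stand-ins for MMABlockSpec and OpDesc. It also assumes that post_barrier_lgkm is emitted only together with the barrier, and that global_before_frag=False places frags before globals.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BlockModel:
    """Hypothetical stand-in for MMABlockSpec; ops are plain strings here."""
    entry_wait: Optional[str] = None
    entry_wait_lgkm: Optional[str] = None
    frags: List[str] = field(default_factory=list)
    globals_: List[str] = field(default_factory=list)
    pre_sync: Optional[str] = None
    pre_mma_barrier: bool = False
    post_barrier_lgkm: Optional[str] = None  # assumed to follow the barrier only
    mma: str = "mma"


def _emit_group(ops: List[str], group: List[str], wrap_waits: bool) -> None:
    """Append a sync group, fenced on both sides iff wrap_waits and non-empty."""
    fenced = wrap_waits and bool(group)
    if fenced:
        ops.append("sched_barrier")
    ops.extend(group)
    if fenced:
        ops.append("sched_barrier")


def emit_block_model(block: BlockModel, wrap_waits: bool,
                     global_before_frag: bool = False) -> List[str]:
    ops: List[str] = []

    # Sync-group A: the entry waits that are present.
    group_a = [w for w in (block.entry_wait, block.entry_wait_lgkm)
               if w is not None]
    _emit_group(ops, group_a, wrap_waits)

    # Load section: order selected by global_before_frag.
    if global_before_frag:
        ops.extend(block.globals_)
        ops.extend(block.frags)
    else:
        ops.extend(block.frags)
        ops.extend(block.globals_)

    # Sync-group B: pre_sync, then the optional barrier and its trailing wait.
    group_b: List[str] = []
    if block.pre_sync is not None:
        group_b.append(block.pre_sync)
    if block.pre_mma_barrier:
        group_b.append("barrier")
        if block.post_barrier_lgkm is not None:
            group_b.append(block.post_barrier_lgkm)
    _emit_group(ops, group_b, wrap_waits)

    # Final mma.
    ops.append(block.mma)
    return ops


# Illustrative block; the op names are made up for the example.
demo = BlockModel(entry_wait="wait_vm", frags=["frag0", "frag1"],
                  globals_=["glob0"], pre_sync="wait_lgkm",
                  pre_mma_barrier=True)
print(emit_block_model(demo, wrap_waits=True))
print(emit_block_model(demo, wrap_waits=False, global_before_frag=True))
```

With wrap_waits=True each non-empty sync group is fenced on both sides; with wrap_waits=False the same ops appear without fences, and global_before_frag=True moves glob0 ahead of the frags.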
Reads wait values, frag/global ops, barrier flags from
block, typically populated by _construct_mma_blocks and
patched by derive_waits_from_blocks. Schedules consume the
derived structure and emit it in their preferred order, without
the conditional template branching of MMABlockSpec.expand.
Bypasses pre_mma_set_prio / post_mma_* / fused_mma / drain
flags; those don't apply under the minimal-barriers pattern.
Schedules with different needs should write their own emitter.
Args:
- block (MMABlockSpec): The block spec to emit.
- wrap_waits (Bool): Wrap each contiguous wait/barrier group with schedule_barrier on both sides.
- global_before_frag (Bool): Load globals before frags inside the block.
Returns:
List[OpDesc]: Ordered op list for the block, ready to be appended to a
per-block emission override.