
Mojo trait

PipelineSchedule

Pipeline schedule definition.

2 required methods define the kernel-specific logic:

  • config(): pipeline structure (depth, MMA grid, etc.)
  • build_body(): the pipelined loop body

5 optional methods have defaults:

  • derive_edges(): dependency edges (default: inferred from ops + config)
  • schedule_config(): tuning knobs (default: ScheduleConfig())
  • transform_kernel(): post-process kernel entries (default: identity)
  • build_explicit_blocks(): per-block explicit op lists (default: empty list, every block uses the flag-driven template)
  • bootstrap_frags(): post-prologue frag-loads (default: empty)

The framework owns all phase derivation (prologue, kernel, epilogue, kernel deps). For single-buffer schedules (depth < 2), it uses the default functions plus the optional transform_kernel hook. For double-buffer schedules (depth >= 2), it builds a PipelineProgram and derives all phases from its block structure.
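The split between required and provided methods can be modelled in a few lines. The sketch below is plain Python, not the Mojo API: PipelineScheduleModel, ToySchedule, uses_program_path, and the single depth field on PipelineConfig are hypothetical stand-ins chosen for illustration, and only two of the provided methods are modelled.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class PipelineConfig:
    # Hypothetical: only the depth field is modelled here.
    depth: int


@dataclass
class OpDesc:
    # Hypothetical stand-in for an op descriptor.
    name: str


class PipelineScheduleModel(ABC):
    """Plain-Python model of the trait shape (not the Mojo trait itself)."""

    # Required methods: kernel-specific logic.
    @abstractmethod
    def config(self) -> PipelineConfig: ...

    @abstractmethod
    def build_body(self) -> list: ...

    # Provided methods: defaults that a schedule may override.
    def transform_kernel(self, ker: list, body: list) -> list:
        return ker  # identity

    def bootstrap_frags(self) -> list:
        return []  # no bootstrap


class ToySchedule(PipelineScheduleModel):
    """Implements only the two required methods and inherits the defaults."""

    def config(self) -> PipelineConfig:
        return PipelineConfig(depth=2)

    def build_body(self) -> list:
        return [OpDesc("load_a"), OpDesc("load_b"), OpDesc("mma")]


def uses_program_path(schedule: PipelineScheduleModel) -> bool:
    # Mirrors the depth rule stated above: depth >= 2 takes the
    # PipelineProgram path, depth < 2 takes the default-function path.
    return schedule.config().depth >= 2
```

A schedule that implements only config and build_body is complete in this model; each provided method is an opt-in override.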

Implemented traits

AnyType

Required methods

config

config(self: _Self) -> PipelineConfig

Pipeline configuration (depth, MMA grid, etc.).

Returns:

PipelineConfig

build_body

build_body(self: _Self) -> List[OpDesc]

Construct the pipelined loop body (raw data ops).

Returns:

List[OpDesc]

Provided methods

derive_edges

derive_edges(self: _Self, body: List[OpDesc]) -> List[DepEdge]

Derive LDG dependency edges. Default: inferred from op types + config.

Returns:

List[DepEdge]

schedule_config

schedule_config(self: _Self) -> ScheduleConfig

Tuning knobs for scheduling and wait derivation. Default: ScheduleConfig().

Returns:

ScheduleConfig

transform_kernel

transform_kernel(self: _Self, ker: List[ScheduleEntry], body: List[OpDesc]) -> List[ScheduleEntry]

Post-process kernel entries (e.g., append AMD schedule hints).

Default: identity (return entries unchanged). Override for kernel-specific transformations that don't fit in the framework.

Returns:

List[ScheduleEntry]
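As a rough illustration of the hook, the plain-Python sketch below contrasts the identity default with an override that appends one hint entry. ScheduleEntry here is a hypothetical one-field stand-in, and the "sched_hint" op name is invented for the example; neither reflects the real Mojo types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduleEntry:
    # Hypothetical stand-in: the real ScheduleEntry has its own fields.
    op: str


def default_transform_kernel(ker, body):
    """Default behaviour: return the entries unchanged."""
    return ker


def hinted_transform_kernel(ker, body):
    """Override sketch: keep every entry and append one hint at the end."""
    # Build a new list so the caller's entries are not mutated.
    return list(ker) + [ScheduleEntry("sched_hint")]  # hypothetical hint op
```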

build_explicit_blocks

build_explicit_blocks(self: _Self, body: List[OpDesc], program: PipelineProgram) -> List[List[OpDesc]]

Returns optional per-block explicit op lists, bypassing MMABlockSpec.expand's flag-driven template.

Return one List[OpDesc] per block (in block order). Empty entries fall back to the template; non-empty entries are emitted verbatim by PipelineProgram.expand_to_list.

Called after the framework has populated program.blocks (frag/load/mma grouping + wait derivation) so the schedule can read out the analyzed block structure when constructing its explicit op lists.

Use when a schedule's emission shape can't be expressed cleanly via the existing flag set (global_before_frag, barrier_before_pre_ops, wrap_waits_with_sched_barrier, etc.). The schedule constructs the exact op sequence it wants; the framework consumes it without templated reordering.

Default: empty list, meaning every block uses the flag-driven template (the current behaviour).

Args:

  • body (List[OpDesc]): Loop body op list, as produced by build_body.
  • program (PipelineProgram): The framework-built PipelineProgram whose blocks have already been populated with frag/load/mma analysis.

Returns:

List[List[OpDesc]]: One List[OpDesc] per program block, or an empty list to fall back entirely to the template path.
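The per-block fallback rule can be sketched as follows. This is a plain-Python model, not the framework's code: select_block_ops is a hypothetical helper, ops are represented as strings, and raising on a length mismatch is an assumption rather than documented behaviour.

```python
def select_block_ops(template_blocks, explicit_blocks):
    """Pick one op list per block: explicit when non-empty, else the template."""
    if not explicit_blocks:
        # Default case: an empty outer list means every block uses the template.
        return [list(block) for block in template_blocks]
    if len(explicit_blocks) != len(template_blocks):
        # Assumption: a non-empty result must supply one entry per block.
        raise ValueError("expected one explicit entry per block")
    return [
        list(explicit) if explicit else list(template)
        for template, explicit in zip(template_blocks, explicit_blocks)
    ]


# Block 0 falls back to the template; block 1 is emitted verbatim.
template = [["wait", "frag0", "mma0"], ["wait", "frag1", "mma1"]]
explicit = [[], ["frag1", "barrier", "mma1"]]
chosen = select_block_ops(template, explicit)
```

In this model, chosen keeps the templated ops for block 0 and the explicit ops for block 1, so a schedule can override only the blocks whose emission shape the flags cannot express.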

bootstrap_frags

bootstrap_frags(self: _Self) -> List[OpDesc]

Returns optional fragment loads to issue at the prologue tail.

Each bootstrap frag is emitted by the framework as wait_vm(N) + barrier + frag, where N partially drains vmcnt so that exactly the prefetch this frag depends on has completed and the rest remain in flight. The i-th bootstrap frag targets the i-th prefetch in prologue order; per-frag drain values are derived from the cumulative prefetch vm_cost.

Use for kernels whose first main-loop iteration expects same-stage leading-quadrant frags to be pre-loaded, e.g. cross-stage rotation patterns where the body's sub=0 frags read the cross stage, so the same-stage versions need an explicit bootstrap.

Only fires when ScheduleConfig.partial_prologue_drain=True.

Default: empty (no bootstrap).

Returns:

List[OpDesc]: Fragment-load ops emitted at the prologue tail, in prologue order.
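One way to read the drain derivation is as simple arithmetic on cumulative costs. The plain-Python sketch below assumes that prefetches complete in issue order and that wait_vm(N) blocks until at most N vm operations are outstanding; both are assumptions about the framework, and bootstrap_drain_values is a hypothetical helper, not part of the API.

```python
from itertools import accumulate


def bootstrap_drain_values(prefetch_vm_costs, num_bootstrap_frags):
    """Return N for each bootstrap frag's wait_vm(N), in prologue order.

    Assumed rule: the i-th frag needs prefetches 0..i completed, so N is
    the total vm cost minus the cumulative cost through prefetch i.
    """
    if num_bootstrap_frags > len(prefetch_vm_costs):
        raise ValueError("each bootstrap frag needs a matching prefetch")
    total = sum(prefetch_vm_costs)
    cumulative = list(accumulate(prefetch_vm_costs))
    return [total - cumulative[i] for i in range(num_bootstrap_frags)]


# Three prefetches costing 2, 2, and 4 vm ops; two bootstrap frags.
drains = bootstrap_drain_values([2, 2, 4], 2)  # [6, 4]
```

With these costs, the first frag waits until 6 vm ops remain outstanding (the first prefetch is done) and the second until 4 remain, leaving the last prefetch in flight.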