
Mojo struct

DeclarativeSchedule

struct DeclarativeSchedule[is_fp8: Bool, lgkm_a: Int, lgkm_b: Int]

Constraint-based pipeline: algorithm declares ops, target supplies costs.

The algorithm specifies WHAT ops exist: just the tag and buffer metadata (stage, subtile, channel, k_offset). No resource kinds, no latencies, no roles.

The TargetProfile specifies HOW the hardware executes them: per-op costs (resource, latency, role) via the cost model, and pipeline structure (depth, MMA grid, buffer strategy) via the pipeline config. One factory call (e.g., mi355x_target()) provides everything.
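The split between the algorithm's declaration and the target's cost model can be pictured with a small sketch. It is written in Python rather than Mojo and is not the library's API: `LogicalOp`, `Cost`, `AnnotatedOp`, `COSTS`, and `annotate` are hypothetical names, and the tags, resources, latencies, and roles are made-up placeholder values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalOp:
    """What the algorithm declares: a tag plus buffer metadata only."""
    tag: str
    stage: int = 0
    subtile: int = 0
    channel: int = 0
    k_offset: int = 0


@dataclass(frozen=True)
class Cost:
    """What the target supplies for each tag (placeholder values below)."""
    resource: str
    latency: int
    role: str


@dataclass(frozen=True)
class AnnotatedOp:
    op: LogicalOp
    cost: Cost


# Hypothetical cost table for one target; another target would swap this out
# without touching the list of logical ops.
COSTS = {
    "load_a": Cost(resource="lds", latency=64, role="fragment_load"),
    "mma": Cost(resource="mma_unit", latency=16, role="compute"),
}


def annotate(ops, costs):
    """Attach target costs to logical ops without changing their order."""
    return [AnnotatedOp(op, costs[op.tag]) for op in ops]


logical = [LogicalOp("load_a", stage=0), LogicalOp("mma", stage=0)]
annotated = annotate(logical, COSTS)
for a in annotated:
    print(a.op.tag, a.cost.resource, a.cost.latency, a.cost.role)
```

The point of the design is that the same `logical` list could be annotated with a different cost table to retarget the schedule.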

The framework then:

  1. Annotates logical ops with the cost model (annotate_ops)
  2. Reorders into MMA-block-interleaved execution order (double_buffer_reorder)
  3. Derives an optimal schedule via CSP backtracking (optimal_schedule_with_halves)
  4. Derives wait counts from schedule order (derive_wait_counts)
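Step 4 can be illustrated with a toy model. Assume the memory operations tracked by one hardware counter complete in issue order. A consumer can then wait until at most N operations remain outstanding, where N is the number of loads issued after the one it needs. The Python sketch below uses hypothetical names (`derive_waits` and the `("load", name)` / `("use", name)` encoding) and is not the actual derive_wait_counts implementation.

```python
def derive_waits(schedule):
    """Compute a wait count for every consumer in a schedule.

    schedule: list of ("load", name) or ("use", name) entries in issue order.
    Returns {position_of_use: wait_count}, where wait_count is how many loads
    may still be in flight when the use executes.

    Assumption: loads retire in issue order, so waiting for "at most N
    outstanding" guarantees every load except the N most recent has finished.
    """
    issued = []  # names of loads issued so far, in order
    waits = {}
    for pos, (kind, name) in enumerate(schedule):
        if kind == "load":
            issued.append(name)
        elif kind == "use":
            # Index of the most recent issue of the load this use depends on.
            idx = len(issued) - 1 - issued[::-1].index(name)
            # Loads issued after it are allowed to remain outstanding.
            waits[pos] = len(issued) - 1 - idx
        else:
            raise ValueError(f"unknown entry kind: {kind}")
    return waits


schedule = [
    ("load", "a0"), ("load", "b0"), ("load", "a1"),
    ("use", "a0"), ("use", "b0"), ("use", "a1"),
]
print(derive_waits(schedule))  # {3: 2, 4: 1, 5: 0}
```

Because the counts fall out of the issue order alone, reordering the schedule changes the waits without any manual retuning.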

Usage: comptime schedule = build_schedule[is_fp8, lgkm_a, lgkm_b]()

Implemented traits

AnyType, ImplicitlyDeletable, PipelineSchedule

Methods

__init__

def __init__(out self, config: ScheduleConfig = ScheduleConfig(scheduling=SchedulingStrategy.CSP, sched_barrier_mask=Int(85), auto_waits=True, drain_lgkm_mask=Int(0), auto_drain=False, lds_contention_penalty=Int(0), wait_lgkm_first=Int(8), wait_vm_last=Int(6), lgkm_per_load_a=Int(0), lgkm_per_load_b=Int(0), lgkm_after_last=False, minimal_barriers=False, omit_mma_set_prio=False, max_vgpr=Int(999999), global_before_frag=False, barrier_before_pre_ops=False, inter_block_lgkm_drain=False, partial_prologue_drain=False, wrap_waits_with_sched_barrier=False), target: TargetProfile = mi355x_target(Int(4), Int(4), Int(1)))

def __init__(out self, config: ScheduleConfig, hw_config: PipelineConfig, cost_model: TargetCostModel)

config

def config(self) -> PipelineConfig

Returns:

PipelineConfig

declare_ops

def declare_ops(self) -> List[OpDesc]

Algorithm description: WHAT ops exist, with buffer metadata only.

Returns 24 logical ops (2 halves x 12 ops) from the ping-pong op table. No resource kinds, no latencies, no roles; those come from the TargetProfile's cost model. See _logical_half() for the table.
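As a rough picture of the "2 halves x 12 ops" structure, the Python sketch below expands a 12-entry per-half table for both ping-pong halves. The tags in `HALF_TABLE`, the `LogicalOp` fields, and the mapping of half to `stage` are all assumptions made for illustration. The real table lives in _logical_half() and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalOp:
    tag: str
    stage: int
    subtile: int


# Placeholder per-half table with 12 entries (2 + 2 + 4 + 4). The tags are
# invented and do not reproduce the table in _logical_half().
HALF_TABLE = (
    [("global_load", s) for s in range(2)]
    + [("lds_store", s) for s in range(2)]
    + [("frag_load", s) for s in range(4)]
    + [("mma", s) for s in range(4)]
)


def declare_ops_sketch():
    """Expand the per-half table for both ping-pong halves.

    Assumption: the half index is stored as the op's buffer stage (0 or 1).
    """
    return [
        LogicalOp(tag, stage=half, subtile=subtile)
        for half in range(2)
        for tag, subtile in HALF_TABLE
    ]


ops = declare_ops_sketch()
print(len(ops))  # 24
```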

Returns:

List[OpDesc]

build_body

def build_body(self) -> List[OpDesc]

Apply cost model to logical ops, then reorder for execution.

  1. declare_ops() → logical ops (no hardware costs)
  2. annotate_ops() → ops with resource/latency/role from cost model
  3. double_buffer_reorder() → MMA-block-interleaved execution order
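To give a feel for what an MMA-block-interleaved order looks like, the Python sketch below groups compute ops into fixed-size blocks and spreads the remaining ops evenly in front of those blocks. This is a toy reorder under stated assumptions and does not claim to match double_buffer_reorder. The function name, the `(tag, role)` tuple encoding, and the even-distribution rule are all hypothetical.

```python
def interleave_by_mma_block(ops, block_size):
    """Toy reorder: emit MMA ops in blocks, with other ops spread between them.

    ops: list of (tag, role) tuples; role "compute" marks an MMA op.
    Relative order is preserved within the MMA ops and within the other ops.
    """
    mmas = [op for op in ops if op[1] == "compute"]
    others = [op for op in ops if op[1] != "compute"]
    blocks = [mmas[i:i + block_size] for i in range(0, len(mmas), block_size)]
    if not blocks:
        return others

    # Hand out the non-MMA ops to blocks as evenly as possible.
    per_block, extra = divmod(len(others), len(blocks))
    out, cursor = [], 0
    for i, block in enumerate(blocks):
        take = per_block + (1 if i < extra else 0)
        out.extend(others[cursor:cursor + take])
        cursor += take
        out.extend(block)
    return out


ops = [
    ("ld0", "memory"), ("ld1", "memory"),
    ("m0", "compute"), ("m1", "compute"), ("m2", "compute"), ("m3", "compute"),
]
print([tag for tag, _ in interleave_by_mma_block(ops, block_size=2)])
# ['ld0', 'm0', 'm1', 'ld1', 'm2', 'm3']
```

Placing memory ops between compute blocks is what lets their latency overlap with MMA work instead of stalling in front of it.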

Returns:

List[OpDesc]

schedule_config

def schedule_config(self) -> ScheduleConfig

Returns the tuning knobs, with the lgkm counts filled in from the struct's type parameters (lgkm_a, lgkm_b).

Returns:

ScheduleConfig