Mojo struct
ScheduleConfig
struct ScheduleConfig
Tunable parameters for schedule generation.
Controls the structural decisions in program builders: scheduling strategy, barrier placement, and drain behavior.
The default configuration auto-derives wait counts from the program structure (Halide-inspired: declare intent, derive consequences). Manual wait overrides are available for testing and experimentation but should not be needed for correct operation.
Fields
- scheduling (`SchedulingStrategy`): The strategy for op ordering (`IDENTITY`, `GREEDY`, or `CSP`).
- sched_barrier_mask (`Int`): The bitmask of blocks that get trailing `schedule_barrier` fences. Default: `0b01010101` (blocks 0, 2, 4, 6).
- auto_waits (`Bool`): Whether to auto-derive wait counts from schedule order.
- drain_lgkm_mask (`Int`): The per-block bitmask for selective LDS drains.
- auto_drain (`Bool`): Whether to auto-derive the drain mask from channel analysis.
- lds_contention_penalty (`Int`): The CSP solver penalty for LDS port overlap.
- wait_lgkm_first (`Int`): The manual `wait_lgkm(N)` override (used when `auto_waits=False`).
- wait_vm_last (`Int`): The manual `wait_vm(N)` override (used when `auto_waits=False`).
- lgkm_per_load_a (`Int`): The number of `lgkmcnt` ops per channel-A load (for wait derivation).
- lgkm_per_load_b (`Int`): The number of `lgkmcnt` ops per channel-B load (for wait derivation).
- lgkm_after_last (`Bool`): Whether to insert `wait_lgkm(0)` after the last block barrier.
- minimal_barriers (`Bool`): Whether to suppress per-block `s_barrier`s and `set_prio` pairs and emit `s_barrier` only at top-of-half and the first cross-stage block.
- omit_mma_set_prio (`Bool`): When `minimal_barriers=True`, whether to drop the pre-MMA `s_setprio[1]` entirely so the LLVM register allocator can reuse VGPRs across it.
- max_vgpr (`Int`): The hint for the cost model on the kernel's VGPR budget. Default is effectively unlimited.
- global_before_frag (`Bool`): Whether to swap the in-block emission order of global loads and fragment loads. Default emits frags first then prefetches.
- barrier_before_pre_ops (`Bool`): Whether to move the per-block `pre_sync` + barrier section before the frag/prefetch section instead of between prefetch and MMA.
- inter_block_lgkm_drain (`Bool`): Whether to populate `entry_wait_lgkm` on non-top, non-cross-stage blocks with `wait_lgkm(0)` so an inter-mini LDS drain fires between consecutive same-half MMAs.
- wrap_waits_with_sched_barrier (`Bool`): Whether to wrap each contiguous wait/barrier group with `schedule_barrier()` on both sides to fence the LLVM machine scheduler.
- partial_prologue_drain (`Bool`): Whether to skip the framework prologue's `wait_vm(0)` drains and inter-stage barrier so prefetches stay in flight on entry to the kernel.
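The per-block masks (`sched_barrier_mask`, `drain_lgkm_mask`) are plain integers, which is why the documented default `0b01010101` appears as `85` in the method signatures. The following is a minimal Python sketch of that arithmetic only, not part of the Mojo API. It assumes bit `i` of the mask corresponds to block `i`, which matches the documented mapping of `0b01010101` to blocks 0, 2, 4, 6. The helper names and the 8-block limit are hypothetical.

```python
# Illustrative only: these helpers are hypothetical and are not part of
# ScheduleConfig. Assumption: bit i of the mask selects block i.

def blocks_in_mask(mask: int, num_blocks: int = 8) -> list[int]:
    """Return the block indices whose bit is set in `mask`."""
    return [i for i in range(num_blocks) if (mask >> i) & 1]


def mask_from_blocks(blocks: list[int]) -> int:
    """Build a bitmask with one bit set per listed block index."""
    mask = 0
    for i in blocks:
        mask |= 1 << i
    return mask


default_mask = 0b01010101
print(default_mask)                    # 85
print(blocks_in_mask(default_mask))    # [0, 2, 4, 6]
print(mask_from_blocks([0, 2, 4, 6]))  # 85
```

Under the same assumption, a mask for a different set of blocks can be built with `mask_from_blocks` and passed as an integer.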
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
ImplicitlyDeletable,
Movable
Methods
__init__
def __init__(out self, *, scheduling: SchedulingStrategy = SchedulingStrategy.IDENTITY, sched_barrier_mask: Int = 85, auto_waits: Bool = True, drain_lgkm_mask: Int = 0, auto_drain: Bool = False, lds_contention_penalty: Int = 0, wait_lgkm_first: Int = 8, wait_vm_last: Int = 6, lgkm_per_load_a: Int = 0, lgkm_per_load_b: Int = 0, lgkm_after_last: Bool = False, minimal_barriers: Bool = False, omit_mma_set_prio: Bool = False, max_vgpr: Int = 999999, global_before_frag: Bool = False, barrier_before_pre_ops: Bool = False, inter_block_lgkm_drain: Bool = False, partial_prologue_drain: Bool = False, wrap_waits_with_sched_barrier: Bool = False)
from_strategies
static def from_strategies(*, scheduling: SchedulingStrategy = SchedulingStrategy.IDENTITY, max_vgpr: Int = 999999, lds_contention_penalty: Int = 0, minimal_barriers: Bool = False, omit_mma_set_prio: Bool = False, sched_barrier_mask: Int = 85, wrap_waits_with_sched_barrier: Bool = False, barrier_before_pre_ops: Bool = False, auto_waits: Bool = True, drain_lgkm_mask: Int = 0, auto_drain: Bool = False, wait_lgkm_first: Int = 8, wait_vm_last: Int = 6, lgkm_after_last: Bool = False, inter_block_lgkm_drain: Bool = False, partial_prologue_drain: Bool = False, global_before_frag: Bool = False, lgkm_per_load_a: Int = 0, lgkm_per_load_b: Int = 0) -> Self
Constructs a ScheduleConfig from grouped strategy values.
Equivalent to the flat-field constructor but groups related flags by phase (barrier / wait / load). `pipeline.strategies` provides named factories (`BarrierStrategy.minimal_no_set_prio` etc.) that callers can spread into this constructor. Existing flat-field callers continue to work unchanged.
Args:
- scheduling (`SchedulingStrategy`): Scheduling strategy for op ordering (`IDENTITY`, `GREEDY`, or `CSP`).
- max_vgpr (`Int`): VGPR budget hint for the cost model.
- lds_contention_penalty (`Int`): Penalty for LDS port overlap.
- minimal_barriers (`Bool`): Suppress per-block `s_barrier`s and `set_prio` pairs.
- omit_mma_set_prio (`Bool`): Drop the pre-MMA `s_setprio[1]` when `minimal_barriers=True`.
- sched_barrier_mask (`Int`): Bitmask of which blocks get trailing `schedule_barrier` fences.
- wrap_waits_with_sched_barrier (`Bool`): Wrap each contiguous wait/barrier group with `schedule_barrier`.
- barrier_before_pre_ops (`Bool`): Move pre_sync + barrier ahead of the frag/global section.
- auto_waits (`Bool`): Auto-derive wait counts from program structure.
- drain_lgkm_mask (`Int`): Per-block bitmask for selective LDS drains.
- auto_drain (`Bool`): Auto-derive `drain_lgkm_mask` from channel analysis.
- wait_lgkm_first (`Int`): Manual `wait_lgkm` override.
- wait_vm_last (`Int`): Manual `wait_vm` override for the last block.
- lgkm_after_last (`Bool`): Insert `wait_lgkm(0)` after the last block's barrier.
- inter_block_lgkm_drain (`Bool`): Emit `wait_lgkm(0)` at non-top, non-cross interior block starts.
- partial_prologue_drain (`Bool`): Skip `wait_vm(0)` drains in the framework prologue.
- global_before_frag (`Bool`): Emit globals before frags in each block.
- lgkm_per_load_a (`Int`): `lgkmcnt` entries per channel-A frag-load.
- lgkm_per_load_b (`Int`): `lgkmcnt` entries per channel-B frag-load.
Returns:
Self: A fully populated ScheduleConfig.