
Mojo function

optimal_schedule_with_halves

def optimal_schedule_with_halves(body: LoopBody, max_globals_per_block: Int = Int(0), lds_contention_penalty: Int = Int(0), max_vgpr: Int = Int(999999)) -> List[Int]

Optimal scheduler with half-isolation constraint.

Positions [0, N/2) draw only from ops [0, N/2), and positions [N/2, N) draw only from ops [N/2, N). This mirrors the half-isolation of greedy_schedule() for the ping-pong kernel, but uses exhaustive backtracking instead of greedy selection.
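The half-isolation rule can be illustrated with a short Python sketch. It assumes the returned list maps each position to an op index and that N is even. The helper names are hypothetical and are not part of the Mojo API.

```python
import math


def respects_half_isolation(schedule):
    """Return True if schedule[pos] keeps each half of the ops in its own half.

    Assumption: schedule is a permutation of range(N), indexed by position.
    """
    n = len(schedule)
    half = n // 2
    if sorted(schedule) != list(range(n)):
        return False  # not a permutation of the op indices
    first_ok = all(op < half for op in schedule[:half])
    second_ok = all(op >= half for op in schedule[half:])
    return first_ok and second_ok


def ordering_counts(n):
    """Orderings without and with half isolation, ignoring dependencies."""
    return math.factorial(n), math.factorial(n // 2) ** 2


print(respects_half_isolation([1, 0, 3, 2]))  # True: each half is only reordered internally
print(respects_half_isolation([0, 2, 1, 3]))  # False: op 2 moved into the first half
print(ordering_counts(8))                     # (40320, 576)
```

Besides preserving the warp-group structure, the constraint shrinks the search space from N! to ((N/2)!)^2 orderings before any dependency or pruning rules apply, which helps make exhaustive backtracking tractable.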

Half isolation preserves the warp-group structure that build_program_from_ldg_ordered() requires: the first N/2 ops map to blocks 0-3 (warp group 0), and the last N/2 ops map to blocks 4-7 (warp group 1).

Pruning optimizations for compile-time feasibility:

Seed best_cost from greedy_schedule(): the greedy result provides a strong upper bound that prunes most branches immediately.
Early termination when best_cost == lower_bound and pressure is at the floor: the schedule is provably optimal, so the search stops.
Incremental VGPR pressure tracking: branches are pruned the moment they exceed max_vgpr.
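The three pruning rules can be sketched in Python as a small branch-and-bound search. This is a toy under stated assumptions, not the scheduler's implementation: the cost model (one stall per adjacent pair of same-kind ops), the per-op vgpr_delta values, and the function name are all hypothetical. The sketch also omits half isolation and the pressure-floor part of the early-termination check.

```python
def branch_and_bound(kinds, vgpr_delta, seed_order, lower_bound=0, max_vgpr=999999):
    """Toy exhaustive search; returns (best_order, best_cost)."""
    n = len(kinds)

    def step_cost(prev_op, op):
        # Hypothetical cost: one stall when two ops of the same kind are adjacent.
        return 1 if prev_op is not None and kinds[prev_op] == kinds[op] else 0

    def evaluate(order):
        cost, pressure, peak, prev = 0, 0, 0, None
        for op in order:
            cost += step_cost(prev, op)
            pressure += vgpr_delta[op]
            peak = max(peak, pressure)
            prev = op
        return cost, peak

    # Rule 1: seed the incumbent from a cheap (greedy) ordering if it fits the budget.
    best_order, best_cost = None, float("inf")
    seed_cost, seed_peak = evaluate(seed_order)
    if seed_peak <= max_vgpr:
        best_order, best_cost = list(seed_order), seed_cost

    prefix, used = [], [False] * n

    def recurse(cost, pressure):
        nonlocal best_order, best_cost
        if len(prefix) == n:
            best_order, best_cost = list(prefix), cost
            # Rule 2: stop the whole search once the lower bound is reached.
            return best_cost == lower_bound
        prev = prefix[-1] if prefix else None
        for op in range(n):
            if used[op]:
                continue
            new_pressure = pressure + vgpr_delta[op]
            if new_pressure > max_vgpr:
                continue  # Rule 3: prune as soon as the budget is exceeded
            new_cost = cost + step_cost(prev, op)
            if new_cost >= best_cost:
                continue  # cannot beat the incumbent
            used[op] = True
            prefix.append(op)
            stop = recurse(new_cost, new_pressure)
            prefix.pop()
            used[op] = False
            if stop:
                return True
        return False

    if best_cost > lower_bound:
        recurse(0, 0)
    return best_order, best_cost


kinds = ["LOAD", "LOAD", "MMA", "MMA"]
vgpr_delta = [2, 2, -1, -1]
print(branch_and_bound(kinds, vgpr_delta, seed_order=[0, 1, 2, 3]))  # ([0, 2, 1, 3], 0)
```

The bound check is only valid because the toy cost never decreases as ops are appended. A real makespan model would need a comparable monotonic partial cost for the same pruning to be sound.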

When max_globals_per_block > 0, an additional structural constraint limits how many GLOBAL_LOAD ops can appear between consecutive COMPUTE (MMA) ops. This ensures uniform global load distribution across MMA blocks after build_double_buffer_program groups by MMA delimiters. For example, max_globals_per_block=1 yields a [1,1,1,1] distribution instead of the unconstrained [1,0,2,1].
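A minimal Python sketch of this structural check follows. It assumes that each COMPUTE op opens a block and that the GLOBAL_LOAD ops after it, up to the next COMPUTE, belong to that block. The exact grouping performed by build_double_buffer_program may differ, and the helper names are hypothetical.

```python
def globals_per_block(kinds):
    """Count GLOBAL_LOAD ops in each block opened by a COMPUTE op.

    Assumption: loads that appear before the first COMPUTE are ignored here.
    """
    counts = []
    for kind in kinds:
        if kind == "COMPUTE":
            counts.append(0)
        elif kind == "GLOBAL_LOAD" and counts:
            counts[-1] += 1
    return counts


def satisfies_limit(kinds, max_globals_per_block):
    """A limit of 0 (the default) disables the constraint."""
    if max_globals_per_block <= 0:
        return True
    return all(c <= max_globals_per_block for c in globals_per_block(kinds))


unconstrained = ["COMPUTE", "GLOBAL_LOAD", "COMPUTE", "COMPUTE",
                 "GLOBAL_LOAD", "GLOBAL_LOAD", "COMPUTE", "GLOBAL_LOAD"]
uniform = ["COMPUTE", "GLOBAL_LOAD"] * 4

print(globals_per_block(unconstrained))   # [1, 0, 2, 1]
print(globals_per_block(uniform))         # [1, 1, 1, 1]
print(satisfies_limit(unconstrained, 1))  # False
print(satisfies_limit(uniform, 1))        # True
```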

When lds_contention_penalty > 0, the resource model adds a shared LDS port constraint: fragment loads (LDS reads) and global loads (LDS writes) contend on the same port, with the given penalty per overlap. This biases the solver toward separating LDS reads from LDS writes in the schedule.
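One way to picture the penalty is the Python sketch below. The source does not define what counts as an overlap, so the sketch assumes that an LDS read and an LDS write placed in adjacent schedule slots overlap. The kind name FRAGMENT_LOAD and the function name are hypothetical.

```python
LDS_READ = "FRAGMENT_LOAD"   # hypothetical kind name for fragment loads
LDS_WRITE = "GLOBAL_LOAD"    # global loads write their data into LDS


def contention_cost(kinds, lds_contention_penalty):
    """Penalty for adjacent read/write pairs on the shared LDS port.

    Assumption: an overlap is an LDS read directly next to an LDS write.
    """
    if lds_contention_penalty <= 0:
        return 0  # the default of 0 disables the constraint
    overlaps = sum(
        1
        for a, b in zip(kinds, kinds[1:])
        if {a, b} == {LDS_READ, LDS_WRITE}
    )
    return overlaps * lds_contention_penalty


separated = [LDS_READ, LDS_READ, "COMPUTE", LDS_WRITE, LDS_WRITE]
interleaved = [LDS_READ, LDS_WRITE, "COMPUTE", LDS_READ, LDS_WRITE]

print(contention_cost(separated, 4))    # 0
print(contention_cost(interleaved, 4))  # 8
```

Under this assumption, an ordering that keeps reads and writes apart pays no penalty, which is the bias the parameter is described as introducing.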

When max_vgpr < 999999, orderings whose peak VGPR pressure exceeds the budget are rejected. Among solutions with equal makespan, the scheduler prefers the one with lower peak pressure (occupancy-aware scheduling).
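The acceptance rule can be sketched in Python as a lexicographic comparison on (makespan, peak pressure). The function name and tuple layout are hypothetical, and the sketch assumes the tie-break applies regardless of the value of max_vgpr.

```python
def is_better(candidate, incumbent, max_vgpr=999999):
    """Decide whether candidate should replace incumbent.

    Both are (makespan, peak_vgpr) tuples; incumbent may be None.
    """
    makespan, peak = candidate
    if peak > max_vgpr:
        return False  # over budget: rejected outright
    if incumbent is None:
        return True
    # Tuple comparison: lower makespan wins, then lower peak pressure.
    return (makespan, peak) < incumbent


print(is_better((10, 96), (10, 128)))                # True: same makespan, lower pressure
print(is_better((11, 64), (10, 128)))                # False: makespan takes priority
print(is_better((9, 200), (10, 128), max_vgpr=128))  # False: exceeds the VGPR budget
```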

Returns:

List[Int]