Mojo module
pingpong_schedule
Ping-pong schedule for AMD GPU matmul kernels.
This module defines the kernel-specific schedule and builds on the generic framework in schedule_framework.mojo. The DeclarativeSchedule struct encapsulates the complete constraint-based pipeline: logical op declaration, cost model annotation, automatic edge derivation, and optimal scheduling by solving a constraint satisfaction problem (CSP).
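The four pipeline stages named above can be illustrated with a small, language-neutral model. The sketch below is written in Python rather than Mojo and is not the module's implementation: the `Op` class, the buffer names, the latencies in `HYPOTHETICAL_COSTS`, and the brute-force `solve` function are all assumptions made for illustration. Only the op names are taken from this page, and the data flow assigned to them (for example, that `MMA_LOAD_A` reads what `LOAD_A` writes) is a guess.

```python
from dataclasses import dataclass, replace
from itertools import permutations


@dataclass(frozen=True)
class Op:
    name: str
    reads: tuple      # buffers this op consumes
    writes: tuple     # buffers this op produces
    latency: int = 0  # filled in later by the cost model


def declare_ops():
    # Stage 1: logical op declaration (data flow only, no costs).
    # The buffer names and the read/write sets are hypothetical.
    return [
        Op("LOAD_A", reads=("global_a",), writes=("lds_a",)),
        Op("LOAD_B", reads=("global_b",), writes=("lds_b",)),
        Op("MMA_LOAD_A", reads=("lds_a",), writes=("reg_a",)),
        Op("MMA_LOAD_B", reads=("lds_b",), writes=("reg_b",)),
        Op("MMA", reads=("reg_a", "reg_b"), writes=("acc",)),
    ]


# Stage 2: cost model annotation. These latencies are invented numbers.
HYPOTHETICAL_COSTS = {
    "LOAD_A": 4, "LOAD_B": 4, "MMA_LOAD_A": 2, "MMA_LOAD_B": 2, "MMA": 1,
}


def annotate(ops, costs):
    return [replace(op, latency=costs[op.name]) for op in ops]


def derive_edges(ops):
    # Stage 3: a producer -> consumer edge exists whenever the consumer
    # reads a buffer that the producer writes.
    edges = set()
    for producer in ops:
        for consumer in ops:
            if producer is not consumer and set(producer.writes) & set(consumer.reads):
                edges.add((producer.name, consumer.name))
    return edges


def makespan(order, by_name, edges):
    # One op issues per cycle; an op cannot issue before the results of
    # all its producers are ready. Returns None if the order is illegal.
    ready = {}
    cycle = 0
    for name in order:
        preds = [p for (p, c) in edges if c == name]
        if any(p not in ready for p in preds):
            return None
        start = max([cycle] + [ready[p] for p in preds])
        ready[name] = start + by_name[name].latency
        cycle = start + 1
    return max(ready.values())


def solve(ops, edges):
    # Stage 4: exhaustive search over issue orders, standing in for a
    # real CSP solver. Feasible only for a handful of ops.
    by_name = {op.name: op for op in ops}
    best = None
    for order in permutations(by_name):
        span = makespan(order, by_name, edges)
        if span is not None and (best is None or span < best[0]):
            best = (span, order)
    return best
```

Calling `solve(annotate(declare_ops(), HYPOTHETICAL_COSTS), derive_edges(declare_ops()))` returns the shortest makespan together with one issue order that achieves it. Keeping the costs in a separate table mirrors the split described for DeclarativeSchedule, where the algorithm declares ops and the target supplies costs.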
The schedule has two halves of four MMA blocks each, with double-buffered global→LDS→register data flow and warp staggering to hide latency.
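As a rough picture of that structure, the following Python sketch lists the events of a double-buffered loop with two halves and four MMA blocks per half. It is a simplified model under stated assumptions, not the kernel's actual instruction order: the mapping of each half to one of two buffers, the event names, and the `pingpong_trace` function are hypothetical, and warp staggering is not modelled.

```python
NUM_HALVES = 2
MMA_BLOCKS_PER_HALF = 4


def pingpong_trace(num_iterations):
    """Return a list of events for a double-buffered ping-pong loop.

    Events are ("prefetch", buffer) or ("mma", buffer, block).
    """
    # Prologue: fill buffer 0 so the first half has data to compute on.
    events = [("prefetch", 0)]
    for _ in range(num_iterations):
        for half in range(NUM_HALVES):
            compute_buf = half           # assumption: half h computes from buffer h
            prefetch_buf = 1 - compute_buf
            # Start loading the other buffer while this half computes.
            # A real kernel would skip this in the final half (epilogue).
            events.append(("prefetch", prefetch_buf))
            for block in range(MMA_BLOCKS_PER_HALF):
                events.append(("mma", compute_buf, block))
    return events
```

In this model every group of four MMA events reads from the buffer that is not currently being prefetched, which is the property double buffering relies on to overlap loads with compute.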
comptime values
COMPUTE
comptime COMPUTE = PingPongOps.COMPUTE.value
LOAD_A
comptime LOAD_A = PingPongOps.LOAD_A.value
LOAD_B
comptime LOAD_B = PingPongOps.LOAD_B.value
MMA
comptime MMA = PingPongOps.MMA.value
MMA_LOAD_A
comptime MMA_LOAD_A = PingPongOps.MMA_LOAD_A.value
MMA_LOAD_B
comptime MMA_LOAD_B = PingPongOps.MMA_LOAD_B.value
Structs
- DeclarativeSchedule: Constraint-based pipeline: algorithm declares ops, target supplies costs.
- PingPongOps: Op tags for the ping-pong double-buffered matmul kernel.
Functions
- build_schedule: Compile the declarative constraint-based schedule.