
Mojo module

pingpong_schedule

Ping-pong schedule for AMD GPU matmul kernels.

A kernel-specific schedule definition that builds on the generic framework in schedule_framework.mojo. The DeclarativeSchedule struct encapsulates the complete constraint-based pipeline: logical op declaration, cost-model annotation, automatic edge derivation, and optimal scheduling via a constraint satisfaction problem (CSP) solver.
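
The four pipeline stages named above can be illustrated with a small Python model. This is only a sketch and not the module's Mojo API: the `Op` record, the resource names (`lds_a`, `reg_a`, and so on), the latency numbers, the single-issue timing model, and the brute-force search standing in for the CSP solver are all assumptions made for illustration. The reading of LOAD_* as global→LDS copies and MMA_LOAD_* as LDS→register copies is likewise an assumption based on the op names.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Op:
    name: str
    reads: tuple    # resources this op consumes (hypothetical names)
    writes: tuple   # resources this op produces (hypothetical names)
    latency: int    # cost-model annotation, in made-up cycles


# Stages 1 and 2: declare logical ops and annotate them with costs.
OPS = [
    Op("LOAD_A", (), ("lds_a",), 8),
    Op("LOAD_B", (), ("lds_b",), 8),
    Op("MMA_LOAD_A", ("lds_a",), ("reg_a",), 4),
    Op("MMA_LOAD_B", ("lds_b",), ("reg_b",), 4),
    Op("MMA", ("reg_a", "reg_b"), ("acc",), 2),
]


def derive_edges(ops):
    """Stage 3: add a producer -> consumer edge when one op writes what another reads."""
    edges = set()
    for producer in ops:
        for consumer in ops:
            if producer is not consumer and set(producer.writes) & set(consumer.reads):
                edges.add((producer.name, consumer.name))
    return edges


def makespan(order, edges, by_name):
    """Toy timing model: ops issue in order, at most one per cycle,
    and an op cannot start before all of its producers have finished."""
    finish = {}
    next_issue = 0
    for name in order:
        ready = max((finish[p] for p, c in edges if c == name), default=0)
        start = max(next_issue, ready)
        finish[name] = start + by_name[name].latency
        next_issue = start + 1
    return max(finish.values())


def solve(ops):
    """Stage 4: exhaustive search over dependency-respecting orders.
    A real CSP solver would prune; brute force is fine for five ops."""
    edges = derive_edges(ops)
    by_name = {op.name: op for op in ops}
    best = None
    for order in permutations(by_name):
        position = {name: i for i, name in enumerate(order)}
        if any(position[p] > position[c] for p, c in edges):
            continue  # violates a derived dependency edge
        cost = makespan(order, edges, by_name)
        if best is None or cost < best[0]:
            best = (cost, order)
    return best


cost, order = solve(OPS)
print(cost, order)
```

In this toy model the best orders issue both global loads first so their latencies overlap, then the register loads, then the MMA.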

The schedule consists of two halves of four MMA blocks each, with double-buffered global→LDS→register data flow and warp staggering for latency hiding.
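
The shape described above (two halves, four MMA blocks per half, two buffers) can be modeled in a few lines of Python. This is a hedged sketch, not the kernel's actual interleaving: the step dictionary layout, the rule that half `h` computes from buffer `h` while refilling buffer `1 - h`, and the reading of "warp staggering" as offsetting a second warp group by one half are all assumptions.

```python
NUM_HALVES = 2
BLOCKS_PER_HALF = 4


def build_schedule():
    """Return one step per MMA block: 2 halves x 4 blocks = 8 steps."""
    steps = []
    for half in range(NUM_HALVES):
        compute_buf = half        # buffer the MMAs read from in this half
        prefetch_buf = 1 - half   # buffer being refilled for the next half
        for block in range(BLOCKS_PER_HALF):
            steps.append(
                {
                    "half": half,
                    "block": block,
                    "mma_reads": compute_buf,
                    "load_writes": prefetch_buf,
                }
            )
    return steps


def stagger(steps, num_warp_groups=2):
    """Offset each warp group by one half (assumed meaning of 'staggering'),
    so that at any time step the groups work on different halves."""
    n = len(steps)
    return [
        [steps[(t + group * BLOCKS_PER_HALF) % n] for t in range(n)]
        for group in range(num_warp_groups)
    ]


steps = build_schedule()
timeline = stagger(steps)
for t in range(len(steps)):
    g0, g1 = timeline[0][t], timeline[1][t]
    print(t, (g0["half"], g0["block"]), (g1["half"], g1["block"]))
```

Because a step never reads and writes the same buffer, loads for the next half can proceed while the current half's MMAs run. That overlap is the point of double buffering.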

comptime values

COMPUTE

comptime COMPUTE = PingPongOps.COMPUTE.value

LOAD_A

comptime LOAD_A = PingPongOps.LOAD_A.value

LOAD_B

comptime LOAD_B = PingPongOps.LOAD_B.value

MMA

comptime MMA = PingPongOps.MMA.value

MMA_LOAD_A

comptime MMA_LOAD_A = PingPongOps.MMA_LOAD_A.value

MMA_LOAD_B

comptime MMA_LOAD_B = PingPongOps.MMA_LOAD_B.value

Structs

Functions
