Mojo module
pingpong_kernel
Structs
- AMDPingPongMatmul: High-level ping-pong matmul implementation for AMD GPUs.
- KernelConfig:
- MmaOp: Encapsulates MMA register tiles and operations for matrix multiplication.
- TileBuffers: Double-buffered LDS tiles and TileLoaders for ping-pong matmul.
- TileLoaderLDS: Encapsulates load_to_lds with pre-computed thread positions and swizzle.
Functions
- chiplet_transform_chunked: Transform work group ID for better chiplet locality.
- load_lds_fragment: Load LDS → registers with MMA access pattern.
- ping_pong_matmul:
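
The double buffering that TileBuffers describes can be illustrated with a small host-side sketch. This is plain Python, not the module's API: `matmul_double_buffered`, `load_tile`, and `tile_k` are hypothetical names chosen for illustration, and the "load" is an ordinary list slice standing in for a global-memory to LDS copy. It only shows the general pattern of computing on one buffer while the next tile is staged in the other.

```python
def matmul_double_buffered(a, b, tile_k=2):
    """Multiply a (M x K) by b (K x N), streaming K through two tile slots."""
    assert a and b and len(a[0]) == len(b), "inner dimensions must match"
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]

    def load_tile(k0):
        # Hypothetical stand-in for copying one K-slice of A and B into LDS.
        k1 = min(k0 + tile_k, k)
        a_tile = [row[k0:k1] for row in a]
        b_tile = [b[kk] for kk in range(k0, k1)]
        return a_tile, b_tile

    starts = list(range(0, k, tile_k))
    buffers = [None, None]              # the "ping" and "pong" slots
    buffers[0] = load_tile(starts[0])   # prologue: fill the first slot

    for step in range(len(starts)):
        cur = step % 2
        nxt = 1 - cur
        # Stage the next tile in the idle slot. On a GPU this load would
        # overlap with the compute below; here it simply runs first.
        if step + 1 < len(starts):
            buffers[nxt] = load_tile(starts[step + 1])

        # Accumulate the partial product from the current slot.
        a_tile, b_tile = buffers[cur]
        for i in range(m):
            for j in range(n):
                for kk in range(len(b_tile)):
                    c[i][j] += a_tile[i][kk] * b_tile[kk][j]
    return c
```

Calling `matmul_double_buffered([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]])` walks the K dimension in two steps, alternating between the two slots. The result matches an ordinary matrix product regardless of `tile_k`.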