Mojo module
buffers_rdna
RDNA-specific buffer implementations for Wave32 WMMA attention.
This module provides buffer management optimized for AMD RDNA consumer GPUs (Radeon RX 7000/8000 series, gfx11xx/gfx12xx) using Wave32 execution.
Key differences from CDNA buffers:
- Wave size: 32 lanes (vs 64 for CDNA)
- MMA shape: 16x16x16 only (vs multiple shapes for CDNA)
- Fragment sizes (per lane): A/B = 16 elements (the full K dimension), C/D = 8 elements
- Wave-cooperative mode: lanes 0-15 provide unique data, lanes 16-31 replicate
- Memory access patterns optimized for 16-lane unique data distribution
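The wave-cooperative bullet above can be illustrated with a small plain-Python sketch. This is not the module's API: `source_lane` and `replication_table` are hypothetical helper names, and the sketch assumes that lane `i + 16` mirrors lane `i`, which is one natural reading of "lanes 16-31 replicate".

```python
WAVE_SIZE = 32      # lanes per wave on RDNA (Wave32)
UNIQUE_LANES = 16   # lanes 0-15 provide unique A/B data


def source_lane(lane: int) -> int:
    """Return the lane whose A/B data `lane` holds.

    Assumption (not confirmed by the module docs): lane i + 16 mirrors lane i.
    """
    if not 0 <= lane < WAVE_SIZE:
        raise ValueError(f"lane must be in [0, {WAVE_SIZE}), got {lane}")
    return lane % UNIQUE_LANES


def replication_table() -> list[int]:
    """Source lane for every lane in the wave, in lane order."""
    return [source_lane(lane) for lane in range(WAVE_SIZE)]


if __name__ == "__main__":
    table = replication_table()
    print("lanes  0-15 read from:", table[:UNIQUE_LANES])
    print("lanes 16-31 read from:", table[UNIQUE_LANES:])
```

Under this assumption only 16 distinct data sources exist per wave, which is why the memory access patterns are described as tuned for a 16-lane unique data distribution.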
comptime values
RDNA_AB_FRAG_SIZE
comptime RDNA_AB_FRAG_SIZE = 16
RDNA_CD_FRAG_SIZE
comptime RDNA_CD_FRAG_SIZE = 8
RDNA_MMA_K
comptime RDNA_MMA_K = 16
RDNA_MMA_M
comptime RDNA_MMA_M = 16
RDNA_MMA_N
comptime RDNA_MMA_N = 16
RDNA_WARP_SIZE
comptime RDNA_WARP_SIZE = 32
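As a consistency check, the fragment sizes above can be related to the MMA shape and wave size. The following plain-Python sketch mirrors the constant values from this page; the helper functions `cd_frag_size` and `ab_replication_factor` are hypothetical and only illustrate the arithmetic, assuming the 16x16 accumulator tile is spread evenly across all 32 lanes.

```python
# Values copied from the comptime constants documented on this page.
RDNA_WARP_SIZE = 32
RDNA_MMA_M = 16
RDNA_MMA_N = 16
RDNA_MMA_K = 16
RDNA_AB_FRAG_SIZE = 16
RDNA_CD_FRAG_SIZE = 8


def cd_frag_size(m: int, n: int, warp_size: int) -> int:
    """Accumulator elements per lane, assuming an even split of the M x N tile."""
    total = m * n
    if total % warp_size != 0:
        raise ValueError("tile does not divide evenly across the wave")
    return total // warp_size


def ab_replication_factor(rows: int, k: int, frag_size: int, warp_size: int) -> int:
    """How many times each A (or B) element is held across the wave."""
    return (frag_size * warp_size) // (rows * k)


if __name__ == "__main__":
    # 16 * 16 accumulator elements over 32 lanes gives 8 per lane.
    print(cd_frag_size(RDNA_MMA_M, RDNA_MMA_N, RDNA_WARP_SIZE))
    # Each lane holds a full K-length slice, so every A element is held twice.
    print(ab_replication_factor(RDNA_MMA_M, RDNA_MMA_K,
                                RDNA_AB_FRAG_SIZE, RDNA_WARP_SIZE))
```

The factor of two for A/B is consistent with lanes 16-31 replicating the data of lanes 0-15.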
Structs
- KBufferRDNA: RDNA-specific K buffer for Wave32 WMMA attention.
- OutputRegisterBufferRDNA: RDNA-specific output register buffer for Wave32 WMMA.
- PRegisterBufferRDNA: RDNA-specific P register buffer for Wave32 WMMA attention.
- QRegisterBufferRDNA: RDNA-specific Q register buffer for Wave32 WMMA attention.
- VBufferRDNA: RDNA-specific V buffer with transpose loads for Wave32 WMMA.
Functions
- get_rdna_fragment_layout: Get the fragment layout for RDNA WMMA output fragments.
- get_rdna_warp_coords: Get warp coordinates for RDNA Wave32.
- get_rdna_warp_layout: Get the warp thread layout for RDNA WMMA operations.