Mojo module
attention
Attention struct for RDNA Wave32 MHA kernels (prefill and decode).
Uses TileTensor throughout, with Wave32 execution, 16x16x16 WMMA, and
wave-cooperative fragments. The constructor surface matches
amd_structured/Attention so that the dispatcher
(nn/attention/gpu/mha.mojo) can branch on _is_amd_rdna() without
restructuring its call sites.
comptime values
RDNA_K_GROUP_SIZE
comptime RDNA_K_GROUP_SIZE = 1
Structs