Mojo module
mha_structured
MHA prefill kernel for gfx950 with structured scheduling.
Supports depth values of 64, 128, and 256. Uses TileTensor throughout for register and shared-memory (SMEM) tile management, with TiledMmaOp for MMA dispatch.
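For orientation, the computation an attention prefill kernel performs per head can be sketched in plain Python. This is a minimal reference under stated assumptions, not the module's implementation: the function name `reference_attention_head`, the list-of-lists layout, and the absence of masking are all hypothetical, and only the supported depth values come from the description above.

```python
import math

# Depth values listed in the module description.
SUPPORTED_DEPTHS = (64, 128, 256)


def reference_attention_head(q, k, v):
    """Naive single-head attention: softmax(q @ k^T / sqrt(depth)) @ v.

    q is [seq_q][depth]; k and v are [seq_k][depth] lists of floats.
    Assumption: no masking is applied; the real kernel may differ.
    """
    depth = len(q[0])
    if depth not in SUPPORTED_DEPTHS:
        raise ValueError(f"unsupported depth {depth}")
    scale = 1.0 / math.sqrt(depth)
    out = []
    for q_row in q:
        # Scaled dot product of this query row against every key row.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        # Subtract the row maximum before exponentiating for numerical stability.
        row_max = max(scores)
        weights = [math.exp(s - row_max) for s in scores]
        total = sum(weights)
        # Weighted average of the value rows.
        out.append(
            [sum(w * v_row[j] for w, v_row in zip(weights, v)) / total
             for j in range(depth)]
        )
    return out
```

A GPU kernel would tile this loop nest across registers and shared memory rather than iterating row by row; the sketch only fixes the arithmetic that any such tiling has to reproduce.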
Functions