
Mojo package

gpu

GPU multi-head attention (MHA), cross-attention, and multi-head latent attention (MLA) kernels. Vendor-specific implementations live in the AMD (amd_rdna, amd_structured) and NVIDIA (nvidia) subpackages listed below.
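As background for the kernels this package provides, the computation they accelerate is scaled dot-product attention applied per head. The sketch below is a plain NumPy reference, not the package's API: the function name, argument layout, and single-sequence (unbatched) shapes are illustrative assumptions for clarity, not part of this package.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(q, k, v, num_heads):
    """Reference multi-head attention for (seq_len, d_model) inputs.

    d_model must be divisible by num_heads. Real GPU kernels tile and
    fuse these steps; this sketch only shows the math they implement.
    """
    seq_len, d_model = q.shape
    d_head = d_model // num_heads
    # Split the model dimension into heads: (num_heads, seq_len, d_head).
    qh = q.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    kh = k.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    vh = v.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    # Per-head attention scores, scaled by sqrt(d_head).
    scores = qh @ kh.transpose(0, 2, 1) / np.sqrt(d_head)
    out = softmax(scores) @ vh  # (num_heads, seq_len, d_head)
    # Merge heads back into (seq_len, d_model).
    return out.transpose(1, 0, 2).reshape(seq_len, d_model)
```

Cross-attention follows the same formula with q taken from one sequence and k, v from another; MLA additionally compresses k and v through a low-rank latent projection before the score computation.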

Packages

  • amd_rdna: TileTensor-native attention kernels for AMD RDNA3+ (gfx11xx/gfx12xx).
  • amd_structured: TileTensor-native attention kernels for AMD gfx950 (MI355X).
  • nvidia: NVIDIA GPU attention kernels and tile-scheduling utilities.

Modules