
gpu (Mojo package)

GPU multi-head attention (MHA), cross-attention, and multi-head latent attention (MLA) kernels. Vendor-specific implementations live under amd/ and nvidia/.

Packages

  • amd: AMD GPU attention kernels for CDNA (GFX942/GFX950) and RDNA architectures.
  • amd_structured: Structured AMD GPU attention kernels (TileTensor hot path).
  • nvidia: NVIDIA GPU attention kernels and tile-scheduling utilities.
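For orientation, the computation all of these kernels accelerate is multi-head scaled dot-product attention. Below is a minimal NumPy reference sketch of that math; the function name, shapes, and layout are illustrative only and are not part of this package's API. Cross-attention is the same computation with `k` and `v` drawn from a different sequence than `q`.

```python
import numpy as np

def multi_head_attention(q, k, v, num_heads):
    """Reference multi-head scaled dot-product attention (illustrative, not this package's API).

    q: (seq_q, d_model), k/v: (seq_kv, d_model); d_model must be divisible by num_heads.
    """
    seq_q, d_model = q.shape
    seq_kv = k.shape[0]
    d_head = d_model // num_heads

    # Split the model dimension into heads: (num_heads, seq, d_head).
    def split(x, seq):
        return x.reshape(seq, num_heads, d_head).transpose(1, 0, 2)

    qh, kh, vh = split(q, seq_q), split(k, seq_kv), split(v, seq_kv)

    # Scaled dot-product scores per head: (num_heads, seq_q, seq_kv).
    scores = qh @ kh.transpose(0, 2, 1) / np.sqrt(d_head)

    # Numerically stable softmax over the key dimension.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)

    # Weighted sum of values, then merge heads back to (seq_q, d_model).
    out = weights @ vh
    return out.transpose(1, 0, 2).reshape(seq_q, d_model)
```

GPU kernels like those under amd/ and nvidia/ fuse these steps (score computation, softmax, value accumulation) into tiled passes so the full score matrix is never materialized; MLA additionally works on compressed key/value representations, a detail this sketch omits.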

