
Mojo package

gpu

GPU multi-head attention (MHA), cross-attention, and multi-head latent attention (MLA) kernels. Vendor-specific implementations live under amd/ and nvidia/.

Packages

  • amd_rdna: TileTensor-native attention kernels for AMD RDNA3+ (gfx11xx/gfx12xx).
  • amd_structured: TileTensor-native attention kernels for AMD gfx950 (MI355X).
  • apple: Apple (Metal) GPU attention kernels.
  • nvidia: NVIDIA GPU attention kernels and tile-scheduling utilities.

Modules