Mojo package
gpu
GPU multi-head attention (MHA), cross-attention, and multi-head latent attention (MLA) kernels. Vendor-specific implementations live under amd/ and nvidia/.
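As a point of reference for what these kernels compute, the following is a minimal NumPy sketch of the multi-head attention forward pass (softmax(QK^T / sqrt(d)) V per head). The function name, shapes, and layout are illustrative assumptions, not the Mojo kernel API.

```python
import numpy as np

def mha_reference(q, k, v, num_heads):
    """Naive multi-head attention reference.

    q, k, v: [seq_len, num_heads * head_dim] arrays.
    Returns: [seq_len, num_heads * head_dim].
    """
    seq_len, model_dim = q.shape
    head_dim = model_dim // num_heads

    # Split into heads: [num_heads, seq_len, head_dim].
    def split(x):
        return x.reshape(seq_len, num_heads, head_dim).transpose(1, 0, 2)

    qh, kh, vh = split(q), split(k), split(v)

    # Scaled dot-product scores per head: [num_heads, seq_len, seq_len].
    scores = qh @ kh.transpose(0, 2, 1) / np.sqrt(head_dim)

    # Numerically stable row-wise softmax.
    scores -= scores.max(axis=-1, keepdims=True)
    probs = np.exp(scores)
    probs /= probs.sum(axis=-1, keepdims=True)

    # Weighted sum of values, then merge heads back.
    out = probs @ vh  # [num_heads, seq_len, head_dim]
    return out.transpose(1, 0, 2).reshape(seq_len, model_dim)
```

The GPU kernels in this package fuse and tile this computation; the sketch only documents the math they implement.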
Packages
- amd: AMD GPU attention kernels for CDNA (GFX942/GFX950) and RDNA architectures.
- amd_structured: Structured AMD GPU attention kernels (TileTensor hot path).
- nvidia: NVIDIA GPU attention kernels and tile-scheduling utilities.
Modules
- mha
- mha_cross
- mha_decode_partition_heuristic
- mla
- mla_graph
- mla_index_fp8: MLA FP8 index kernel for computing attention scores with paged KV cache (see the sketch below).
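To illustrate the paged-KV-cache pattern the mla_index_fp8 description refers to, here is a hypothetical NumPy sketch: keys are stored in fixed-size physical blocks, a per-sequence block table maps logical token positions to those blocks, and attention scores are computed by gathering keys through that indirection. All names, shapes, and the float32 precision are assumptions for illustration, not the kernel's actual FP8 interface.

```python
import numpy as np

def paged_attention_scores(q, key_cache, block_table, seq_len, block_size):
    """Score a single query against the first seq_len cached tokens.

    q:           [head_dim] query vector.
    key_cache:   [num_blocks, block_size, head_dim] paged key storage.
    block_table: physical block index for each logical block of this sequence.
    """
    head_dim = q.shape[0]
    scores = np.empty(seq_len, dtype=np.float32)
    for pos in range(seq_len):
        block = block_table[pos // block_size]   # logical -> physical block
        offset = pos % block_size                # slot within the block
        k = key_cache[block, offset]             # gather the key vector
        scores[pos] = q @ k / np.sqrt(head_dim)  # scaled dot product
    return scores
```

A real kernel would vectorize the gather, work on quantized (FP8) keys, and fuse the softmax, but the block-table indirection shown here is the core of paged KV-cache indexing.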