Mojo package
compute
GPU compute operations package - MMA and tensor core operations.
This package provides GPU tensor core and matrix multiplication operations:
- mma: Unified warp matrix-multiply-accumulate (WMMA) operations
- mma_util: Utility functions for loading/storing MMA operands
- mma_operand_descriptor: Operand descriptor types for MMA
- tensor_ops: Tensor core-based reductions and operations
- tcgen05: 5th generation tensor core operations (Blackwell)
- arch/: Architecture-specific MMA implementations (internal)
  - mma_nvidia: NVIDIA tensor cores (SM70-SM90)
  - mma_nvidia_sm100: NVIDIA Blackwell (SM100)
  - mma_amd: AMD Matrix Cores (CDNA2/3/4)
  - mma_amd_rdna: AMD WMMA (RDNA3/4)
Usage
Import compute operations directly:
    from gpu.compute import mma

    # Automatically dispatches to the correct GPU architecture
    result = mma.mma(a, b, c)

Architecture-specific implementations in arch/ are internal and should not be imported directly by user code.
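For orientation, the operation behind the `mma.mma(a, b, c)` call above is a matrix multiply-accumulate. A minimal statement of it follows, assuming `a` is an M×K tile, `b` is a K×N tile, and `c` and the result are M×N tiles. The dimension names M, N, K and the index names are illustrative only; the tile shapes and element types actually supported depend on the target architecture and are not listed on this page.

```latex
D = A B + C, \qquad
D_{ij} = \sum_{k=1}^{K} A_{ik} B_{kj} + C_{ij},
\quad 1 \le i \le M,\ 1 \le j \le N
```

Here A, B, and C stand for `a`, `b`, and `c`, and D stands for `result`.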
Packages
- arch: Architecture-specific MMA implementations.
Modules
- mma: Utilities for working with the warp matrix-multiply-accumulate (WMMA) instructions.
- mma_operand_descriptor: Operand descriptor types for MMA.
- mma_util: Matrix multiply-accumulate (MMA) utilities for GPU tensor cores.
- tcgen05: Utilities for working with the 5th generation tensor core (tcgen05) instructions.
- tensor_ops: Tensor core operations and utilities for GPU computation.