Mojo struct

TiledMma

struct TiledMma[out_type: DType, in_type: DType, shape: IndexList[Int(3)], group_size: Int]

Stateless MMA computation on TileTensors.

Direct TileTensor port of TiledTensorCore.mma. Iterates over group_size k-steps, indexes the A and B register tiles at each step, and calls gpu_mma. The struct owns no registers and performs no shared-memory (SMEM) loads; it is pure computation.

Parameters

  • out_type (DType): Accumulator data type (typically float32).
  • in_type (DType): Input element data type (bfloat16 or float8).
  • shape (IndexList[Int(3)]): MMA instruction shape [M, N, K].
  • group_size (Int): Number of k-steps per mma() call.

Implemented traits

AnyType, ImplicitlyDeletable

comptime members

a_frag_size

comptime a_frag_size = (Int((mul shape[Int(0)], shape[Int(2)])) // _resolve_warp_size())

c_frag_size

comptime c_frag_size = (Int((mul shape[Int(0)], shape[Int(1)])) // _resolve_warp_size())

MMA_K

comptime MMA_K = shape[Int(2)]

MMA_M

comptime MMA_M = shape[Int(0)]

MMA_N

comptime MMA_N = shape[Int(1)]
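
The fragment-size expressions above reduce to simple integer arithmetic once `mul` is read as multiplication: a_frag_size is M * K divided by the warp size, and c_frag_size is M * N divided by the warp size. The Python sketch below only models that arithmetic; it is not Mojo and not part of this API. The function name `frag_sizes`, the example shapes, and the warp sizes are hypothetical, and reading the results as "elements held per lane" is an assumption based on the formulas.

```python
def frag_sizes(shape, warp_size):
    """Model of a_frag_size and c_frag_size for an MMA shape [M, N, K]."""
    m, n, k = shape
    a_frag_size = (m * k) // warp_size  # mirrors (shape[0] * shape[2]) // warp size
    c_frag_size = (m * n) // warp_size  # mirrors (shape[0] * shape[1]) // warp size
    return a_frag_size, c_frag_size


# Hypothetical inputs: the real warp size comes from _resolve_warp_size().
print(frag_sizes((16, 16, 16), 64))  # (4, 4)
print(frag_sizes((16, 8, 16), 32))   # (8, 4)
```

Under these assumed inputs, a 16x16x16 shape on a 64-lane warp gives 4 A elements and 4 accumulator elements per lane.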

Methods

mma

static def mma[a_layout: TensorLayout, b_layout: TensorLayout, c_layout: TensorLayout](a_reg: TileTensor[in_type, a_layout, address_space=AddressSpace.LOCAL], b_reg: TileTensor[in_type, b_layout, address_space=AddressSpace.LOCAL], c_reg: TileTensor[out_type, c_layout, MutUntrackedOrigin, address_space=AddressSpace.LOCAL])

Execute group_size MMA operations across the K dimension.

Mirrors TiledTensorCore.mma: iterates group_size k-steps, tiles A/B registers per step via vectorize, and accumulates into C.
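
The grouped accumulation described above can be modeled on whole matrices as a loop over k-steps, each consuming one MMA_K-wide slice of A and B. This is a minimal Python reference sketch, not the Mojo implementation: the real method operates on per-lane register fragments and calls gpu_mma, whereas this model uses plain nested lists. The names `mma_step` and `grouped_mma` and the row-major K x N layout of B are assumptions made for illustration.

```python
def mma_step(a, b, c):
    """One MMA step: c[m][n] += sum over k of a[m][k] * b[k][n], in place."""
    for m in range(len(c)):
        for n in range(len(c[0])):
            for k in range(len(b)):
                c[m][n] += a[m][k] * b[k][n]


def grouped_mma(a, b, c, mma_k, group_size):
    """Accumulate group_size MMA steps along the K dimension into c."""
    for step in range(group_size):
        lo = step * mma_k
        a_tile = [row[lo:lo + mma_k] for row in a]  # M x MMA_K slice of A
        b_tile = b[lo:lo + mma_k]                   # MMA_K x N slice of B
        mma_step(a_tile, b_tile, c)


a = [[1, 2, 3, 4], [5, 6, 7, 8]]      # M = 2, total K = 4
b = [[1, 0], [0, 1], [1, 0], [0, 1]]  # total K = 4, N = 2
c = [[0, 0], [0, 0]]                  # accumulator, updated in place
grouped_mma(a, b, c, mma_k=2, group_size=2)
print(c)  # [[4, 6], [12, 14]]
```

Because each step adds into the same accumulator, the result equals a single matrix multiply over the full K extent of group_size * MMA_K.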

Parameters:

  • a_layout (TensorLayout): Inferred layout of A register tile.
  • b_layout (TensorLayout): Inferred layout of B register tile.
  • c_layout (TensorLayout): Inferred layout of C register tile.

Args: