Mojo module

mma

Apple Silicon MMA operation struct for TileTensor.

Simdgroup-level, register-owning MMA abstraction following the AMD MmaOp pattern. Each simdgroup (32 threads) instantiates its own MmaOpApple.

Use mma() for interior tiles, where the caller guarantees that every access is in bounds. Use mma[bounded=True]() for edge tiles; it zero-fills out-of-bounds (OOB) elements. The kernel should choose between the two once per simdgroup, not once per load.
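The interior/edge split described above can be modeled on the host to make the intent concrete. The following is a minimal Python sketch, not the Mojo API: the names tile_is_interior, load_tile, and load_for_simdgroup, and the 8x8 tile size, are all assumptions made for illustration. It only shows the idea of deciding once per tile whether bounds handling is needed, then taking either an unchecked path or a zero-filling path.

```python
# Host-side model only. Every name here is hypothetical and the tile size
# is an assumption; none of this is part of the mma module's API.
TILE = 8  # assumed tile edge length, for illustration


def tile_is_interior(row0, col0, rows, cols, tile=TILE):
    """Return True when the tile starting at (row0, col0) lies fully inside
    a rows x cols matrix."""
    return row0 + tile <= rows and col0 + tile <= cols


def load_tile(matrix, row0, col0, bounded, tile=TILE):
    """Load a tile x tile block from a list-of-lists matrix.

    bounded=False models the interior path: no per-element checks, the
    caller guarantees in-bounds access.
    bounded=True models the edge path: out-of-bounds elements read as 0.
    """
    rows, cols = len(matrix), len(matrix[0])
    if not bounded:
        return [
            [matrix[row0 + r][col0 + c] for c in range(tile)]
            for r in range(tile)
        ]
    return [
        [
            matrix[row0 + r][col0 + c]
            if row0 + r < rows and col0 + c < cols
            else 0
            for c in range(tile)
        ]
        for r in range(tile)
    ]


def load_for_simdgroup(matrix, row0, col0, tile=TILE):
    """Decide once per tile which path to take, then load the whole tile."""
    rows, cols = len(matrix), len(matrix[0])
    bounded = not tile_is_interior(row0, col0, rows, cols, tile)
    return load_tile(matrix, row0, col0, bounded, tile)
```

In this model, a 10x10 matrix has one fully interior tile at (0, 0) and an edge tile at (8, 8) where only the top-left 2x2 corner holds real data and the rest is zero. The point of hoisting the check is that the interior path carries no per-element branching at all.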

Structs
