Mojo module
tmem
Tensor Memory (TMEM) abstractions for SM100 Blackwell GPUs.
TMEM is dedicated memory for MMA accumulators, separate from registers and shared memory. This module provides type-safe abstractions:
- TmemAllocation: Manages TMEM lifecycle (alloc/dealloc)
- TmemTensor: Layout-parameterized typed view over TMEM accumulators
- TmemStage: Represents a pipeline stage for accumulator buffering
- TmemAddress: Simple address wrapper for TMEM load/store operations
comptime values
TMEM_LOWER_ROW_OFFSET
comptime TMEM_LOWER_ROW_OFFSET = UInt32(1048576)
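The value 1048576 equals 16 << 16. As a rough illustration, the sketch below decodes it under the assumption that TMEM addresses follow the PTX tcgen05 convention (lane index in bits 31..16, column index in bits 15..0). Under that assumption the constant reads as an offset of 16 lanes and 0 columns. The helper `split_tmem_address` and the names `LANE_SHIFT` and `COLUMN_MASK` are hypothetical and are not part of this module; the example is written in Python purely to show the arithmetic.

```python
# Assumption: 32-bit TMEM address = (lane << 16) | column,
# following the PTX tcgen05 address convention.
LANE_SHIFT = 16
COLUMN_MASK = 0xFFFF

# Value copied from the module's comptime constant.
TMEM_LOWER_ROW_OFFSET = 1048576


def split_tmem_address(addr):
    """Split a packed address into (lane, column). Hypothetical helper."""
    return addr >> LANE_SHIFT, addr & COLUMN_MASK


lane, column = split_tmem_address(TMEM_LOWER_ROW_OFFSET)
print(lane, column)  # 16 0
```

If that reading holds, adding the constant to a base address would select the same columns 16 lanes further down, which fits the "lower row" naming and the paired upper/lower fragments described for TmemFragments.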
Structs
- BlockScaledTmem: TMEM region for block-scaled matmul with typed tile accessors.
- TmemAddress: Simple TMEM address wrapper for load/store operations.
- TmemAllocation: Handle to allocated Tensor Memory.
- TmemArrayType: Array of tiles in Tensor Memory (TMEM).
- TmemDeallocBarrier: TMEM deallocation synchronization barrier.
- TmemFragments: Paired upper/lower accumulator fragments from TMEM.
- TmemStage: A pipeline stage within TMEM for accumulator buffering.
- TmemTensor: Typed tensor view over Tensor Memory (TMEM) for MMA accumulators.