Mojo struct

TmemAllocation

@register_passable(trivial) struct TmemAllocation[cta_group: Int, max_cols: Int = 512]

Handle to allocated Tensor Memory.

Lifecycle: allocate() → use → release_lock() → wait → deallocate()

Parameters

cta_group (Int): Number of cooperating CTAs (1 or 2).
max_cols (Int): Number of TMEM columns. Defaults to 512 (the value for SM100).

Fields

addr (UInt32): Address of the allocation in Tensor Memory.

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable

`comptime` members

`__copyinit__is_trivial`

comptime __copyinit__is_trivial = True

`__del__is_trivial`

comptime __del__is_trivial = True

`__moveinit__is_trivial`

comptime __moveinit__is_trivial = True

`SmemAddrStorage`

comptime SmemAddrStorage = SMemArrayType[UInt32, 1]

Methods

`__init__`

__init__(addr: UInt32, smem_ptr: LegacyUnsafePointer[UInt32, address_space=AddressSpace.SHARED]) -> Self

`allocate`

static allocate(smem_addr: SMemArrayType[UInt32, 1]) -> Self

Allocate TMEM. Called by the MMA warp. The address is stored in shared memory (`smem_addr`) so the epilogue warp can retrieve it.

`from_shared`

static from_shared(smem_addr: SMemArrayType[UInt32, 1]) -> Self

Get a handle to an existing allocation by reading its address from shared memory. Called by the epilogue warp.

`release_lock`

release_lock(self)

Release allocation lock before waiting for epilogue.

`deallocate`

deallocate(self)

Free TMEM after epilogue completion.
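
The lifecycle order (allocate, use, release_lock, wait, deallocate) and the shared-memory handoff between the MMA and epilogue warps can be illustrated with a small host-side model. This is only a sketch in Python, not the Mojo API: `ToyTmemAllocation`, the `hw` dictionary, the `epilogue_done` flag, and the base address of 0 are all hypothetical stand-ins, and the ordering checks reflect one reading of the lifecycle line rather than documented hardware behavior.

```python
class ToyTmemAllocation:
    """Toy host-side model of the documented lifecycle (hypothetical, not the Mojo API)."""

    def __init__(self, addr, hw):
        self.addr = addr
        self.hw = hw  # shared dict standing in for per-SM hardware state (assumption)

    @staticmethod
    def allocate(smem_addr, hw):
        # MMA warp: reserve TMEM and publish the address through shared memory.
        if hw["phase"] != "free":
            raise RuntimeError("TMEM is already allocated")
        hw["phase"] = "allocated"
        addr = 0  # assumed base address, chosen only for this model
        smem_addr[0] = addr
        return ToyTmemAllocation(addr, hw)

    @staticmethod
    def from_shared(smem_addr, hw):
        # Epilogue warp: rebuild a handle from the published address.
        return ToyTmemAllocation(smem_addr[0], hw)

    def release_lock(self):
        if self.hw["phase"] != "allocated":
            raise RuntimeError("release_lock() must follow allocate()")
        self.hw["phase"] = "lock_released"

    def deallocate(self):
        # Freeing is only allowed after the lock is released and the epilogue is done.
        if self.hw["phase"] != "lock_released" or not self.hw["epilogue_done"]:
            raise RuntimeError("deallocate() must follow release_lock() and the wait")
        self.hw["phase"] = "free"


def run_lifecycle():
    hw = {"phase": "free", "epilogue_done": False}
    smem_addr = [None]  # one-element slot, mirroring SMemArrayType[UInt32, 1]
    phases = []

    mma = ToyTmemAllocation.allocate(smem_addr, hw)
    phases.append(hw["phase"])

    epilogue = ToyTmemAllocation.from_shared(smem_addr, hw)
    assert epilogue.addr == mma.addr  # both handles refer to the same allocation

    mma.release_lock()
    phases.append(hw["phase"])

    hw["epilogue_done"] = True  # stands in for the "wait" step
    mma.deallocate()
    phases.append(hw["phase"])
    return phases
```

Calling `run_lifecycle()` walks one handle through the full sequence. In the model, calling `deallocate()` before `release_lock()` raises a `RuntimeError`, which is one way to make an out-of-order call visible during testing.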
