Mojo struct

TmemAllocation

@register_passable(trivial) struct TmemAllocation[cta_group: Int, max_cols: Int = 512]

Handle to allocated Tensor Memory.

Lifecycle: allocate() → use → release_lock() → wait for epilogue → deallocate()

Parameters

  • cta_group (Int): Number of cooperating CTAs (1 or 2).
  • max_cols (Int): Number of TMEM columns (defaults to 512, the value for SM100).

Fields

  • addr (UInt32): TMEM address of the allocation.

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable

comptime members

__copyinit__is_trivial

comptime __copyinit__is_trivial = True

__del__is_trivial

comptime __del__is_trivial = True

__moveinit__is_trivial

comptime __moveinit__is_trivial = True

SmemAddrStorage

comptime SmemAddrStorage = SMemArrayType[UInt32, 1]

Methods

__init__

__init__(addr: UInt32, smem_ptr: LegacyUnsafePointer[UInt32, address_space=AddressSpace.SHARED]) -> Self

allocate

static allocate(smem_addr: SMemArrayType[UInt32, 1]) -> Self

Allocates TMEM. Called by the MMA warp. The resulting address is stored in shared memory (smem_addr) so the epilogue warp can retrieve it.

from_shared

static from_shared(smem_addr: SMemArrayType[UInt32, 1]) -> Self

Gets a handle to an existing allocation from the address stored in shared memory. Called by the epilogue warp.

release_lock

release_lock(self)

Release allocation lock before waiting for epilogue.

deallocate

deallocate(self)

Free TMEM after epilogue completion.
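
The lifecycle listed at the top of this page can be sketched as two warp-level code paths. This is a minimal, hedged illustration, not a complete kernel: the function names mma_warp_sketch and epilogue_warp_sketch are hypothetical, cta_group=1 is an arbitrary choice, the imports are omitted because this page does not give the module path, and the synchronization between the two warps is left as comments because this page does not describe it.

```mojo
# Sketch only: imports for TmemAllocation and SMemArrayType are omitted
# because this page does not state their module path.

# MMA warp (hypothetical helper): owns the allocation from start to finish.
fn mma_warp_sketch(smem_addr: SMemArrayType[UInt32, 1]):
    # allocate() reserves TMEM and stores the address in smem_addr.
    var tmem = TmemAllocation[1].allocate(smem_addr)

    # ... use: issue work that targets TMEM at tmem.addr ...

    # Release the allocation lock before waiting for the epilogue.
    tmem.release_lock()

    # ... wait until the epilogue warp has finished
    # (the synchronization mechanism is not covered on this page) ...

    # Free TMEM once the epilogue is complete.
    tmem.deallocate()


# Epilogue warp (hypothetical helper): reuses the existing allocation.
# Assumes the MMA warp has already stored the address in smem_addr.
fn epilogue_warp_sketch(smem_addr: SMemArrayType[UInt32, 1]):
    var tmem = TmemAllocation[1].from_shared(smem_addr)

    # ... read results from TMEM at tmem.addr, then signal completion ...
```

In this sketch only the MMA warp calls allocate() and deallocate(). The epilogue warp obtains its handle through from_shared(), which matches the roles given in the method descriptions above.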
