Mojo struct
TmemDeallocBarrier
struct TmemDeallocBarrier[cta_group: Int]
TMEM deallocation synchronization barrier.
Handles cluster-aware synchronization patterns for TMEM deallocation, supporting both single-CTA and multi-CTA (cta_group=2) configurations.
Fields
- barrier (TmemDeallocBarrier[cta_group].BarrierStorage)
Implemented traits
AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDeletable, Movable, RegisterPassable, TrivialRegisterPassable
comptime members
BarrierStorage
comptime BarrierStorage = SMemArray[SharedMemBarrier, Int(1)]
Methods
__init__
def __init__(barrier: SMemArray[SharedMemBarrier, Int(1)]) -> Self
Initialize with shared memory barrier array.
signal_peer
def signal_peer(self)
Signal peer CTA in cluster (cta_group=2 only).
signal_self
def signal_self(self)
Signal own arrival at barrier.
wait
def wait(self)
Wait for barrier completion.
complete_dealloc
def complete_dealloc[max_cols: Int = Int(512)](self, tmem: TmemAllocation[cta_group, max_cols])
Complete TMEM deallocation sequence (MMA warp side).
Releases the allocation lock, waits for epilogue completion, then deallocates the TMEM.
signal_complete
def signal_complete(self)
Signal TMEM consumption complete (Epilogue warp side).
For cta_group=2, signals peer CTA first, then signals self.
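The handshake between complete_dealloc (MMA warp side) and signal_complete (Epilogue warp side) can be illustrated with a small host-side model. This is only a conceptual sketch in Python using threads, not the Mojo API: ModelBarrier, run_model, and the event log are hypothetical names, the allocation-lock release is not modeled, and the assumption that each CTA's barrier expects one arrival per CTA in the group is made for illustration and is not stated on this page.

```python
import threading


class ModelBarrier:
    """Toy stand-in for one CTA's shared memory barrier (hypothetical)."""

    def __init__(self, expected_arrivals):
        self._expected = expected_arrivals
        self._arrived = 0
        self._cond = threading.Condition()

    def arrive(self):
        # Record one arrival and wake any waiter.
        with self._cond:
            self._arrived += 1
            self._cond.notify_all()

    def wait(self):
        # Block until the expected number of arrivals has been seen.
        with self._cond:
            self._cond.wait_for(lambda: self._arrived >= self._expected)


def run_model(cta_group):
    # Assumption: each barrier expects one arrival per CTA in the group.
    barriers = [ModelBarrier(cta_group) for _ in range(cta_group)]
    log = []
    log_lock = threading.Lock()

    def record(event):
        with log_lock:
            log.append(event)

    def epilogue_warp(cta):
        # Models signal_complete: peer first (cta_group=2 only), then self.
        record(("consumed", cta))
        if cta_group == 2:
            barriers[1 - cta].arrive()
        barriers[cta].arrive()

    def mma_warp(cta):
        # Models complete_dealloc: wait for the epilogue, then deallocate.
        barriers[cta].wait()
        record(("dealloc", cta))

    threads = []
    for cta in range(cta_group):
        threads.append(threading.Thread(target=mma_warp, args=(cta,)))
        threads.append(threading.Thread(target=epilogue_warp, args=(cta,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Under these assumptions, every "dealloc" event in the log returned by run_model(1) or run_model(2) appears after all "consumed" events, which is the ordering the barrier is meant to guarantee: no CTA deallocates TMEM while an epilogue warp in the group may still be reading it.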