Mojo struct
TmemDeallocBarrier
struct TmemDeallocBarrier[cta_group: Int]
TMEM deallocation synchronization barrier.
Handles cluster-aware synchronization patterns for TMEM deallocation, supporting both single-CTA and multi-CTA (cta_group=2) configurations.
Fields
- barrier (TmemDeallocBarrier[cta_group].BarrierStorage)
Implemented traits
AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable, RegisterPassable, TrivialRegisterPassable
comptime members
BarrierStorage
comptime BarrierStorage = SMemArray[SharedMemBarrier, 1]
Methods
__init__
__init__(barrier: SMemArray[SharedMemBarrier, 1]) -> Self
Initialize with a shared memory barrier array.
signal_peer
signal_peer(self)
Signal the peer CTA in the cluster (cta_group=2 only).
signal_self
signal_self(self)
Signal own arrival at barrier.
wait
wait(self)
Wait for barrier completion.
complete_dealloc
complete_dealloc[max_cols: Int = 512](self, tmem: TmemAllocation[cta_group, max_cols])
Complete the TMEM deallocation sequence (MMA warp side).
Releases the allocation lock, waits for epilogue completion, then deallocates the TMEM.
signal_complete
signal_complete(self)
Signal that TMEM consumption is complete (epilogue warp side).
For cta_group=2, signals peer CTA first, then signals self.
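
The ordering described for complete_dealloc and signal_complete can be illustrated with a small host-side model. This is only a sketch in Python using threads, not Mojo GPU code and not the library's implementation: ArrivalBarrier, ToyDeallocBarrier, and run are hypothetical names, and the expected arrival count (one per participating CTA) is an assumption made for the model. It shows that each CTA's deallocation happens only after every epilogue in the group has signalled.

```python
import threading


class ArrivalBarrier:
    """Toy stand-in for a shared-memory barrier (not the real SharedMemBarrier)."""

    def __init__(self, expected):
        self.expected = expected  # assumed number of arrivals before wait() returns
        self.count = 0
        self.cond = threading.Condition()

    def arrive(self):
        with self.cond:
            self.count += 1
            self.cond.notify_all()

    def wait(self, timeout=5.0):
        with self.cond:
            return self.cond.wait_for(lambda: self.count >= self.expected, timeout)


class ToyDeallocBarrier:
    """Hypothetical model that mirrors the method names documented above."""

    def __init__(self, cta_group, own, peer=None):
        self.cta_group = cta_group
        self.own = own    # this CTA's barrier
        self.peer = peer  # the peer CTA's barrier (cta_group=2 only)

    def signal_peer(self):
        self.peer.arrive()

    def signal_self(self):
        self.own.arrive()

    def wait(self):
        if not self.own.wait():
            raise TimeoutError("barrier never completed")

    def signal_complete(self):
        # Epilogue side: signal the peer first (cta_group=2 only), then self.
        if self.cta_group == 2:
            self.signal_peer()
        self.signal_self()

    def complete_dealloc(self, cta, tmem, events):
        # MMA side: release the lock, wait for the epilogue(s), then deallocate.
        events.append(("release_lock", cta))
        self.wait()
        tmem["allocated"] = False
        events.append(("dealloc", cta))


def run(cta_group):
    """Run one MMA thread and one epilogue thread per CTA; return the event log."""
    n = cta_group
    barriers = [ArrivalBarrier(expected=n) for _ in range(n)]
    tmems = [{"allocated": True} for _ in range(n)]
    events = []  # list.append is atomic in CPython
    handles = [
        ToyDeallocBarrier(cta_group, barriers[i], barriers[1 - i] if n == 2 else None)
        for i in range(n)
    ]

    def epilogue(i):
        events.append(("epilogue_done", i))
        handles[i].signal_complete()

    threads = []
    for i in range(n):
        threads.append(threading.Thread(
            target=handles[i].complete_dealloc, args=(i, tmems[i], events)))
        threads.append(threading.Thread(target=epilogue, args=(i,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events, tmems
```

Calling run(2) models the two-CTA case: in the returned event log, each ("dealloc", cta) entry appears after both ("epilogue_done", 0) and ("epilogue_done", 1), because each toy barrier only completes once it has received its own arrival and the peer's.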