Mojo struct
TmemDeallocBarrier
@register_passable("trivial")
struct TmemDeallocBarrier[cta_group: Int]
TMEM deallocation synchronization barrier.
Handles cluster-aware synchronization patterns for TMEM deallocation, supporting both single-CTA and multi-CTA (cta_group=2) configurations.
Fields
- barrier (SMemArrayType[SharedMemBarrier, 1])
Implemented traits
AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable
comptime members
__copyinit__is_trivial
comptime __copyinit__is_trivial = True
__del__is_trivial
comptime __del__is_trivial = True
__moveinit__is_trivial
comptime __moveinit__is_trivial = True
Methods
__init__
__init__(barrier: SMemArrayType[SharedMemBarrier, 1]) -> Self
Initialize with shared memory barrier array.
signal_peer
signal_peer(self)
Signal the peer CTA in the cluster (cta_group=2 only).
signal_self
signal_self(self)
Signal own arrival at barrier.
wait
wait(self)
Wait for barrier completion.
complete_dealloc
complete_dealloc[max_cols: Int = 512](self, tmem: TmemAllocation[cta_group, max_cols])
Complete TMEM deallocation sequence (MMA warp side).
Releases the allocation lock, waits for epilogue completion, then deallocates the TMEM.
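As a usage sketch for the MMA warp side, the call below relies only on the signature documented here. The wrapper name `mma_warp_teardown` is hypothetical. The import paths for `TmemDeallocBarrier` and `TmemAllocation` are not given on this page, so they are left out, and the snippet assumes it is called from inside a kernel where both values already exist.

```mojo
# Hypothetical helper for illustration only.
# Imports are omitted because the module paths are not shown on this page.
fn mma_warp_teardown[
    cta_group: Int
](barrier: TmemDeallocBarrier[cta_group], tmem: TmemAllocation[cta_group, 512]):
    # One call covers the documented sequence: release the allocation lock,
    # wait for the epilogue warp to signal completion, then deallocate TMEM.
    # max_cols is passed explicitly and matches the documented default of 512.
    barrier.complete_dealloc[512](tmem)
```

Because `complete_dealloc` waits for epilogue completion, the epilogue warp is expected to call `signal_complete` on the same barrier.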
signal_complete
signal_complete(self)
Signal that TMEM consumption is complete (epilogue warp side).
For cta_group=2, signals peer CTA first, then signals self.
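The sketch below shows the epilogue warp side under the same caveats: both function names are hypothetical, imports are omitted because the module paths are not shown on this page, and the barrier is assumed to be already constructed. The second function spells out the cta_group=2 ordering described above using the documented `signal_peer` and `signal_self` methods. It is one reading of that description, not the library's actual implementation.

```mojo
# Hypothetical helpers for illustration only; imports omitted.

fn epilogue_warp_done[cta_group: Int](barrier: TmemDeallocBarrier[cta_group]):
    # Preferred form: a single call that tells the MMA warp the
    # epilogue has finished reading from TMEM.
    barrier.signal_complete()


fn epilogue_warp_done_manual(barrier: TmemDeallocBarrier[2]):
    # Assumed equivalent ordering for cta_group=2, per the description:
    # signal the peer CTA first, then signal this CTA's own arrival.
    barrier.signal_peer()
    barrier.signal_self()
```

In ordinary use `signal_complete` is the simpler choice, since it keeps the peer-then-self ordering in one place.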