
Mojo struct

TmemDeallocBarrier

@register_passable(trivial) struct TmemDeallocBarrier[cta_group: Int]

TMEM deallocation synchronization barrier.

Handles cluster-aware synchronization patterns for TMEM deallocation, supporting both single-CTA and multi-CTA (cta_group=2) configurations.

Fields

  • barrier (SMemArrayType[SharedMemBarrier, 1]): Single-element shared memory barrier array used for the deallocation synchronization.

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable

comptime members

__copyinit__is_trivial

comptime __copyinit__is_trivial = True

__del__is_trivial

comptime __del__is_trivial = True

__moveinit__is_trivial

comptime __moveinit__is_trivial = True

Methods

__init__

__init__(barrier: SMemArrayType[SharedMemBarrier, 1]) -> Self

Initialize with a shared memory barrier array.

signal_peer

signal_peer(self)

Signal peer CTA in cluster (cta_group=2 only).

signal_self

signal_self(self)

Signal own arrival at barrier.

wait

wait(self)

Wait for barrier completion.

complete_dealloc

complete_dealloc[max_cols: Int = 512](self, tmem: TmemAllocation[cta_group, max_cols])

Complete TMEM deallocation sequence (MMA warp side).

Releases the allocation lock, waits for epilogue completion, then deallocates the TMEM.

signal_complete

signal_complete(self)

Signal TMEM consumption complete (epilogue warp side).

For cta_group=2, signals peer CTA first, then signals self.
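The handshake between signal_complete (epilogue warp) and complete_dealloc (MMA warp) can be illustrated with a small host-side model. This is a minimal sketch in plain Python threads, not the Mojo API: ToyBarrier, run_teardown, and the event names are hypothetical, and the expected arrival count per barrier (one per CTA in the group) is an assumption made only so that the model terminates. It shows the ordering the descriptions above imply: each CTA's TMEM is deallocated only after every epilogue in the group has signaled.

```python
import threading


class ToyBarrier:
    """Hypothetical counting barrier: wait() returns after `expected` arrivals."""

    def __init__(self, expected):
        self.expected = expected
        self.arrived = 0
        self.cond = threading.Condition()

    def arrive(self):
        with self.cond:
            self.arrived += 1
            self.cond.notify_all()

    def wait(self):
        with self.cond:
            self.cond.wait_for(lambda: self.arrived >= self.expected)


def run_teardown(cta_group):
    """Simulate the teardown for 1 or 2 CTAs and return the ordered event log."""
    log = []
    log_lock = threading.Lock()

    def record(event):
        with log_lock:
            log.append(event)

    # Assumption: one barrier per CTA, expecting one arrival per CTA in the group.
    barriers = [ToyBarrier(expected=cta_group) for _ in range(cta_group)]

    def epilogue_warp(cta):
        # Models signal_complete: peer first (cta_group=2 only), then self.
        record(("consumed", cta))
        if cta_group == 2:
            barriers[1 - cta].arrive()  # models signal_peer
        barriers[cta].arrive()          # models signal_self

    def mma_warp(cta):
        # Models complete_dealloc: release lock, wait for epilogue, deallocate.
        record(("lock_released", cta))
        barriers[cta].wait()
        record(("deallocated", cta))

    threads = []
    for cta in range(cta_group):
        threads.append(threading.Thread(target=mma_warp, args=(cta,)))
        threads.append(threading.Thread(target=epilogue_warp, args=(cta,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for event in run_teardown(cta_group=2):
        print(event)
```

In this model, signaling the peer's barrier is what prevents one CTA from deallocating while the other CTA's epilogue may still be reading TMEM. The real barrier semantics and arrival counts are defined by the library and may differ.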
