Mojo struct
Signal
@register_passable(trivial)
struct Signal
A synchronization primitive for coordinating GPU thread blocks across multiple devices.
This struct provides counter-based synchronization between thread blocks on different GPUs. It maintains two sets of counters:
- self_counter: Used by blocks on the current GPU to signal their progress
- peer_counter: Used to track progress of blocks on other GPUs
Note: The counters are unsigned 32-bit integers that may overflow. This is safe because unsigned integer overflow has well-defined behavior: values wrap around modulo 2^32.
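The wraparound behavior in the note can be illustrated with a small host-side model. This is a minimal sketch in Python, not the Mojo implementation: `next_value` is a hypothetical helper, and the sketch assumes that both sides advance their counters by one per phase and compare them for equality, which this page does not state explicitly.

```python
UINT32_MASK = 0xFFFFFFFF  # largest uint32 value, 2**32 - 1


def next_value(counter: int) -> int:
    """Increment a counter with uint32 wraparound semantics (hypothetical helper)."""
    return (counter + 1) & UINT32_MASK


# Start both counters just below the overflow boundary.
local = 0xFFFFFFFE
peer = 0xFFFFFFFE

history = []
for _ in range(3):
    local = next_value(local)
    peer = next_value(peer)
    # Assumption: the waiting side compares for equality, so a wrap to 0
    # on both sides keeps them in agreement.
    history.append((local, peer, local == peer))
```

Because both counters wrap at the same point, the equality check keeps succeeding across the boundary. An ordering comparison such as `local < peer` would not have this property without extra care.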
Fields
- self_counter (StaticTuple[StaticTuple[SIMD[uint32, 1], 8], 512]): A 2D array of counters with shape (MAX_NUM_BLOCKS_UPPER_BOUND, MAX_GPUS). Each counter tracks the progress of a specific thread block on the current GPU. Thread blocks increment their corresponding counter to signal completion of a phase, allowing other GPUs to detect when synchronization points are reached. The counters use atomic operations to ensure proper synchronization across devices.
- peer_counter (StaticTuple[StaticTuple[StaticTuple[SIMD[uint32, 1], 8], 512], 2]): A 3D array of counters with shape (2, MAX_NUM_BLOCKS_UPPER_BOUND, MAX_GPUS). Contains two sets of counters to handle two synchronization points safely. The dual counter design prevents race conditions where a peer block arrives at the second sync point before the current block passes the first sync point.
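The race that the dual counter design avoids can be reproduced with a deterministic, single-threaded model. This is a sketch under stated assumptions, not the library's code: the helpers `make_peer_counter`, `signal`, and `has_arrived` are hypothetical, the block dimension is omitted, and the model assumes that the counter set is chosen by the counter value modulo 2 and that the waiting block checks for an exact value. None of these details are confirmed by this page.

```python
MAX_GPUS = 8  # innermost dimension of the field types on this page


def make_peer_counter(slots: int) -> list:
    """Create `slots` counter sets, one counter per peer rank (block dimension omitted)."""
    return [[0] * MAX_GPUS for _ in range(slots)]


def signal(peer_counter: list, slots: int, sender_rank: int, value: int) -> None:
    """Peer `sender_rank` announces that it reached sync point `value`."""
    # Assumption: the counter set is selected by the value modulo the slot count.
    peer_counter[value % slots][sender_rank] = value


def has_arrived(peer_counter: list, slots: int, sender_rank: int, value: int) -> bool:
    """The waiting block checks whether the peer announced exactly `value`."""
    return peer_counter[value % slots][sender_rank] == value


# Scenario: the peer signals sync point 1, passes it, and signals sync point 2
# before the current block has checked for sync point 1.

# With a single counter set, the second signal overwrites the first.
single = make_peer_counter(slots=1)
signal(single, 1, sender_rank=1, value=1)
signal(single, 1, sender_rank=1, value=2)
single_ok = has_arrived(single, 1, sender_rank=1, value=1)  # the 1 was lost

# With two counter sets, consecutive sync points land in different sets.
dual = make_peer_counter(slots=2)
signal(dual, 2, sender_rank=1, value=1)
signal(dual, 2, sender_rank=1, value=2)
dual_ok = has_arrived(dual, 2, sender_rank=1, value=1)  # the 1 is still visible
```

In this model two sets are enough because the peer cannot reach a third sync point until the current block has signaled the second one, which it only does after passing the first. Since 2^32 is even, the parity of the counter value also keeps alternating across a uint32 wraparound.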
Implemented traits