IMPORTANT: To view this page as Markdown, append `.md` to the URL (e.g. /max/get-started.md). For the complete documentation index, see llms.txt.

Mojo struct

Signal

struct Signal

A synchronization primitive for coordinating GPU thread blocks across multiple devices.

This struct provides counter-based synchronization between thread blocks on different GPUs. It maintains two sets of counters:

  1. self_counter: Used by blocks on the current GPU to signal their progress
  2. peer_counter: Used to track progress of blocks on other GPUs

Note: The counters are unsigned 32-bit integers (UInt32) and may overflow. This is safe because unsigned integer overflow has well-defined behavior: the value wraps around modulo 2^32.
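To illustrate why wraparound is harmless, here is a minimal sketch in plain Python that emulates a UInt32 counter with a bit mask. The helper names `increment_u32` and `phases_elapsed` are hypothetical and are not part of the Mojo API; the sketch only models the arithmetic, not the GPU-side atomics.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit unsigned counter


def increment_u32(counter):
    """Add one to the counter, wrapping around modulo 2**32."""
    return (counter + 1) & MASK32


def phases_elapsed(current, earlier):
    """Distance from `earlier` to `current`, computed modulo 2**32.

    The result stays correct even if the counter wrapped in between,
    as long as fewer than 2**32 increments happened.
    """
    return (current - earlier) & MASK32


# A counter close to the maximum value wraps through zero.
start = 0xFFFFFFFE
value = start
for _ in range(3):
    value = increment_u32(value)

print(value)                         # 1
print(phases_elapsed(value, start))  # 3
```

Because equality checks and modular differences behave the same before and after the wrap, a waiter that compares against an expected counter value is unaffected by overflow.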

Fields

  • self_counter (StaticTuple[StaticTuple[UInt32, Int(8)], Int(512)]): A 2D array of counters with shape (MAX_NUM_BLOCKS_UPPER_BOUND, MAX_GPUS), which is 512 x 8 according to the field type. Each counter tracks the progress of a specific thread block on the current GPU. A thread block increments its counter to signal that it has completed a phase, which lets other GPUs detect when a synchronization point has been reached. The counters are updated with atomic operations to keep synchronization across devices correct.
  • peer_counter (StaticTuple[StaticTuple[StaticTuple[UInt32, Int(8)], Int(512)], Int(2)]): A 3D array of counters with shape (2, MAX_NUM_BLOCKS_UPPER_BOUND, MAX_GPUS), which is 2 x 512 x 8 according to the field type. It holds two sets of counters so that two consecutive synchronization points can be handled safely. The dual-counter design prevents a race condition in which a peer block arrives at the second sync point before the current block has passed the first sync point.
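The race that the dual-counter design avoids can be modeled on the host. The sketch below is plain Python, not Mojo, and it is an assumption about how such a barrier could use the two fields: the slot is chosen as `val % 2`, each arrival is written into the peer's `peer_counter`, and a waiter compares for equality. The names `SignalModel`, `arrive`, `peer_has_arrived`, and `slow_block_sees_first_sync` are hypothetical, and ordinary list writes stand in for the atomic operations.

```python
MAX_GPUS = 8                      # inner dimension from the field types
MAX_NUM_BLOCKS_UPPER_BOUND = 512  # outer dimension from the field types
MASK32 = 0xFFFFFFFF               # emulate UInt32 wraparound


class SignalModel:
    """Plain-Python stand-in for one GPU's Signal (no real atomics)."""

    def __init__(self, num_slots=2):
        self.num_slots = num_slots
        self.self_counter = [
            [0] * MAX_GPUS for _ in range(MAX_NUM_BLOCKS_UPPER_BOUND)
        ]
        self.peer_counter = [
            [[0] * MAX_GPUS for _ in range(MAX_NUM_BLOCKS_UPPER_BOUND)]
            for _ in range(num_slots)
        ]


def arrive(signals, my_rank, peer_rank, block):
    """Record that `block` on GPU `my_rank` reached its next sync point."""
    mine = signals[my_rank]
    val = (mine.self_counter[block][peer_rank] + 1) & MASK32
    mine.self_counter[block][peer_rank] = val
    # Assumed slot selection: alternate between the counter sets.
    theirs = signals[peer_rank]
    theirs.peer_counter[val % theirs.num_slots][block][my_rank] = val
    return val


def peer_has_arrived(signals, my_rank, peer_rank, block, val):
    """Check whether the peer announced arrival at sync point `val`."""
    mine = signals[my_rank]
    return mine.peer_counter[val % mine.num_slots][block][peer_rank] == val


def slow_block_sees_first_sync(num_slots):
    """GPU 1 races ahead to sync point 2 before GPU 0 checks sync point 1."""
    signals = [SignalModel(num_slots), SignalModel(num_slots)]
    block = 0
    first = arrive(signals, 0, 1, block)  # GPU 0 reaches sync point 1
    arrive(signals, 1, 0, block)          # GPU 1 reaches sync point 1
    arrive(signals, 1, 0, block)          # GPU 1 already at sync point 2
    # Only now does the slow block on GPU 0 look for the peer's first arrival.
    return peer_has_arrived(signals, 0, 1, block, first)
```

In this model, `slow_block_sees_first_sync(2)` returns `True` because the peer's second arrival lands in the other counter set, while `slow_block_sees_first_sync(1)` returns `False` because the single slot was overwritten and the slow block would wait forever.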

Implemented traits

AnyType, ImplicitlyDeletable

comptime members

flag_t

comptime flag_t = DType.uint32