Mojo function

syncwarp

syncwarp(mask: Int = -1)

Synchronizes threads within a warp using a barrier.

This function creates a synchronization point: each thread in the warp waits there until every thread selected by the mask has reached it. The behavior is platform-specific, as described in the note below.

Note:

On NVIDIA GPUs, this maps to the nvvm.bar.warp.sync intrinsic.
On AMD GPUs, this is a no-op since threads execute in lock-step.
Threads not participating in the sync must still execute the instruction.

Args:

mask (Int): An integer bitmask specifying which lanes (threads) in the warp should be synchronized. Each bit corresponds to a lane, with bit i controlling lane i. A bit set to 1 means the lane participates in the sync; a bit set to 0 means it does not. The default value of -1 (all bits set) synchronizes all lanes.
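
Example:

The following is a minimal sketch of how a mask might be built and passed to syncwarp. It is not taken from the library documentation: the import path gpu.sync is an assumption that may differ between Mojo versions, and lanes_mask, sync_lower_half, and sync_all are hypothetical names introduced only for illustration. The functions are meant to be called from device code.

```mojo
# Assumed import path; adjust it to match your Mojo version.
from gpu.sync import syncwarp


fn lanes_mask(first: Int, count: Int) -> Int:
    """Hypothetical helper: sets `count` consecutive bits starting at bit `first`."""
    return ((1 << count) - 1) << first


fn sync_lower_half():
    # Bits 0 through 15 are set (0xFFFF), so lanes 0..15 participate.
    syncwarp(lanes_mask(0, 16))


fn sync_all():
    # The default mask of -1 has every bit set, so all lanes participate.
    syncwarp()
```

Building the mask with shifts keeps the "bit i controls lane i" mapping explicit, which tends to be easier to audit than a hand-written hexadecimal constant.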