Mojo function
syncwarp
syncwarp(mask: Int = -1)
Synchronizes threads within a warp using a barrier.
This function creates a synchronization point where threads in a warp must wait until all threads specified by the mask reach this point. On NVIDIA GPUs, it uses warp-level synchronization primitives. On AMD GPUs, this is a no-op since threads execute in lock-step.
Note:
- On NVIDIA GPUs, this maps to the nvvm.bar.warp.sync intrinsic.
- On AMD GPUs, this is a no-op since threads execute in lock-step.
- Threads not participating in the sync must still execute the instruction.
Args:
- mask (Int): An integer bitmask specifying which lanes (threads) in the warp should be synchronized. Each bit corresponds to a lane, with bit i controlling lane i. A value of 1 means the lane participates in the sync, 0 means it does not. Default value of -1 (all bits set) synchronizes all lanes.
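The bit layout of mask can be illustrated with a small host-side sketch. The helpers contiguous_lane_mask and lane_participates are hypothetical names introduced only for this illustration and are not part of the Mojo standard library. The sketch only builds and inspects mask values, it does not launch a kernel, and it assumes a 32-lane warp.

```mojo
fn contiguous_lane_mask(first_lane: Int, lane_count: Int) -> Int:
    # Hypothetical helper: set `lane_count` consecutive bits,
    # starting at bit `first_lane` (bit i controls lane i).
    return ((1 << lane_count) - 1) << first_lane


fn lane_participates(mask: Int, lane: Int) -> Bool:
    # Hypothetical helper: a lane takes part in the sync
    # when its bit in the mask is 1.
    return ((mask >> lane) & 1) == 1


def main():
    # Lanes 0-15 only: the low 16 bits are set (0xFFFF).
    var lower_half = contiguous_lane_mask(0, 16)
    print(lower_half)

    # Lane 3 is inside the mask, lane 20 is not.
    print(lane_participates(lower_half, 3))
    print(lane_participates(lower_half, 20))

    # The default of -1 has every bit set, so every lane participates.
    print(lane_participates(-1, 31))

    # Inside a GPU kernel, a mask built this way would be passed as:
    #     syncwarp(lower_half)
```

A mask computed this way is passed as the single argument to syncwarp from kernel code. Calling syncwarp() with no argument keeps the default of -1 and synchronizes the whole warp.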