Mojo function
shuffle_xor
shuffle_xor[type: DType, simd_width: Int, //](val: SIMD[type, simd_width], offset: SIMD[uint32, 1]) -> SIMD[type, simd_width]
Exchanges values between threads in a warp using a butterfly pattern.
Each thread swaps values with the thread whose lane ID equals its own lane ID XORed with the given offset. This butterfly communication pattern is useful for parallel reductions and scans.
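To make the pairing concrete, the sketch below runs on the host and only prints which lanes would be partnered for a given offset. It does not call `shuffle_xor`. The helper name `xor_partner` and the 8-lane range are illustrative assumptions, not part of `gpu.warp`.

```mojo
# Hypothetical helper for illustration only: computes the lane that a
# given lane would exchange with under an XOR offset.
fn xor_partner(lane: Int, offset: Int) -> Int:
    return lane ^ offset


def main():
    # Show the pairing for the first 8 lanes with an offset of 4.
    for lane in range(8):
        print(lane, "<->", xor_partner(lane, 4))
```

With an offset of 4, lane 0 pairs with lane 4, lane 1 with lane 5, and so on. Applying the same offset twice returns to the original lane, so the pairing is symmetric.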
Parameters:
- type (DType): The data type of the SIMD elements (e.g. float32, int32).
- simd_width (Int): The number of elements in each SIMD vector.
Args:
- val (SIMD[type, simd_width]): The SIMD value to be exchanged with another thread.
- offset (SIMD[uint32, 1]): The lane offset to XOR with the current thread's lane ID to determine the exchange partner. Common values are powers of 2 for butterfly patterns.
Returns:
The SIMD value from the thread at lane (current_lane XOR offset).
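As a rough illustration of the reduction use case, the sketch below sums one value per lane across a warp by halving the offset at each step. It assumes a 32-lane warp with every lane active, and is meant to be called from GPU kernel code. The function name `butterfly_sum` is hypothetical and not part of `gpu.warp`.

```mojo
from gpu.warp import shuffle_xor


# Hypothetical helper (not part of gpu.warp): butterfly all-reduce sum.
# Assumes a 32-lane warp in which all lanes execute this function.
fn butterfly_sum(val: Float32) -> Float32:
    var acc = val
    var offset: UInt32 = 16  # Half of the assumed 32-lane warp
    while offset > 0:
        # Add the partial sum held by the partner lane (lane XOR offset).
        acc += shuffle_xor(acc, offset)
        offset >>= 1
    return acc
```

Because each step exchanges in both directions, every lane ends up holding the full sum after the five steps (offsets 16, 8, 4, 2, 1), not just lane 0. On hardware with a different warp width the starting offset would need to change accordingly.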
shuffle_xor[type: DType, simd_width: Int, //](mask: UInt, val: SIMD[type, simd_width], offset: SIMD[uint32, 1]) -> SIMD[type, simd_width]
Exchanges values between threads in a warp using a butterfly pattern with masking.
Each thread swaps values with the thread whose lane ID equals its own lane ID XORed with the given offset. The mask argument controls which threads participate in the exchange.
Example:
```mojo
from gpu.warp import shuffle_xor
# Exchange values between even-numbered threads 4 lanes apart
var mask = UInt(0x55555555)  # Bits 0, 2, 4, ... set: even lanes only
var val = SIMD[DType.float32, 16](42.0)  # Example value
var result = shuffle_xor(mask, val, 4)  # offset is an integer lane offset
```
Parameters:
- type (DType): The data type of the SIMD elements (e.g. float32, int32).
- simd_width (Int): The number of elements in each SIMD vector.
Args:
- mask (UInt): A bit mask specifying which threads participate in the exchange. Only threads with their corresponding bit set in the mask will exchange values.
- val (SIMD[type, simd_width]): The SIMD value to be exchanged with another thread.
- offset (SIMD[uint32, 1]): The lane offset to XOR with the current thread's lane ID to determine the exchange partner. Common values are powers of 2 for butterfly patterns.
Returns:
The SIMD value from the thread at lane (current_lane XOR offset) if both threads are enabled by the mask; otherwise, the calling thread's original value is preserved.