Mojo function

shuffle_xor

shuffle_xor[type: DType, simd_width: Int, //](val: SIMD[type, simd_width], offset: SIMD[uint32, 1]) -> SIMD[type, simd_width]

Exchanges values between threads in a warp using a butterfly pattern.

Performs a butterfly exchange in which each thread swaps values with the thread whose lane ID equals its own lane ID XORed with the given offset. This communication pattern is useful for parallel reductions and scans.

Parameters:

  • type (DType): The data type of the SIMD elements (e.g. float32, int32).
  • simd_width (Int): The number of elements in each SIMD vector.

Args:

  • val (SIMD[type, simd_width]): The SIMD value to be exchanged with another thread.
  • offset (SIMD[uint32, 1]): The lane offset to XOR with the current thread's lane ID to determine the exchange partner. Common values are powers of 2 for butterfly patterns.

Returns:

The SIMD value from the thread at lane (current_lane XOR offset).
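
As an illustration of the reduction use case mentioned above, the following is a minimal sketch of a butterfly all-reduce built on `shuffle_xor`. The helper name `warp_all_reduce_sum` is hypothetical and not part of `gpu.warp`. The sketch assumes a 32-lane warp and that every lane of the warp calls the function together from GPU kernel code.

```mojo
from gpu.warp import shuffle_xor

# Hypothetical helper (not part of the documented API).
# Assumes a 32-lane warp and that all lanes call it together.
fn warp_all_reduce_sum(val: Float32) -> Float32:
    var total = val
    var offset: UInt32 = 16  # half of the assumed 32-lane warp
    while offset > 0:
        # Add the running total held by the partner lane (lane_id XOR offset).
        total += shuffle_xor(total, offset)
        offset >>= 1  # 16, 8, 4, 2, 1
    return total
```

Because each step pairs lanes symmetrically, every lane ends up with the same sum after the five steps, so no separate broadcast is needed under these assumptions.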

shuffle_xor[type: DType, simd_width: Int, //](mask: UInt, val: SIMD[type, simd_width], offset: SIMD[uint32, 1]) -> SIMD[type, simd_width]

Exchanges values between threads in a warp using a butterfly pattern with masking.

Performs a butterfly exchange in which each thread swaps values with the thread whose lane ID equals its own lane ID XORed with the given offset. The mask argument controls which threads participate in the exchange.

Example:

```mojo
from gpu.warp import shuffle_xor

# Exchange values between even-numbered threads 4 lanes apart
var mask: UInt = 0x55555555 # Even lanes only (bits 0, 2, 4, ...)
var val = SIMD[DType.float32, 16](42.0) # Example value
var result = shuffle_xor(mask, val, 4) # The offset is an integer lane offset
```

Parameters:

  • type (DType): The data type of the SIMD elements (e.g. float32, int32).
  • simd_width (Int): The number of elements in each SIMD vector.

Args:

  • mask (UInt): A bit mask specifying which threads participate in the exchange. Only threads with their corresponding bit set in the mask will exchange values.
  • val (SIMD[type, simd_width]): The SIMD value to be exchanged with another thread.
  • offset (SIMD[uint32, 1]): The lane offset to XOR with the current thread's lane ID to determine the exchange partner. Common values are powers of 2 for butterfly patterns.

Returns:

The SIMD value from the thread at lane (current_lane XOR offset) if both threads are enabled by the mask; otherwise, the original value.
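
To make the relationship between mask bits and lanes concrete, here is a small sketch that builds a mask selecting the even lanes. It assumes a 32-lane warp and that bit i of the mask corresponds to lane i, which is the usual convention for warp shuffle masks but is not stated explicitly above. The helper name `even_lane_mask` is hypothetical.

```mojo
# Hypothetical helper (not part of gpu.warp).
# Assumes a 32-lane warp where bit i of the mask stands for lane i.
fn even_lane_mask() -> UInt:
    var mask: UInt = 0
    for lane in range(0, 32, 2):
        mask |= UInt(1) << UInt(lane)  # set the bit for an even lane
    return mask  # bits 0, 2, ..., 30 set, i.e. 0x55555555
```

Under the same assumption, the complementary constant `0xAAAAAAAA` selects the odd lanes (bits 1, 3, ..., 31).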