Mojo function
lane_group_reduce
lane_group_reduce[val_type: DType, simd_width: Int, //, shuffle: fn[DType, Int](val: SIMD[$0, $1], offset: SIMD[uint32, 1]) -> SIMD[$0, $1], func: fn[DType, Int](SIMD[$0, $1], SIMD[$0, $1]) capturing -> SIMD[$0, $1], num_lanes: Int, *, stride: Int = 1](val: SIMD[val_type, simd_width]) -> SIMD[val_type, simd_width]
Performs a generic warp-level reduction operation using shuffle operations.
This function implements a parallel reduction across threads in a warp using a butterfly pattern. It allows customizing both the shuffle operation and reduction function.
Example:
```mojo
from gpu.warp import lane_group_reduce, shuffle_down

# Compute sum across 16 threads using shuffle down
@parameter
fn add[type: DType, width: Int](x: SIMD[type, width], y: SIMD[type, width]) -> SIMD[type, width]:
    return x + y

var val = SIMD[DType.float32, 16](42.0)
var result = lane_group_reduce[shuffle_down, add, num_lanes=16](val)
```
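To make the reduction pattern easier to follow, here is a minimal host-side model in plain Python. It is a sketch, not the library's implementation. It assumes a 32-lane warp, that the reduction halves the shuffle offset at each step (from `num_lanes / 2` down to 1) and scales it by `stride`, and that a shuffle-down past the end of the warp returns the lane's own value. The names `shuffle_down` (re-implemented here on a list), `lane_group_reduce_model`, and `WARP_SIZE` are illustrative only.

```python
WARP_SIZE = 32  # assumed warp width for this model


def shuffle_down(vals, offset):
    """Model of a shuffle-down: lane i reads the value held by lane i + offset.

    Lanes that would read past the end of the warp keep their own value
    (an assumption of this model).
    """
    n = len(vals)
    return [vals[i + offset] if i + offset < n else vals[i] for i in range(n)]


def lane_group_reduce_model(vals, shuffle, func, num_lanes, stride=1):
    """Combine each lane with a shuffled copy, halving the offset each step."""
    # num_lanes must be a power of 2, as the parameter description requires.
    assert num_lanes > 0 and num_lanes & (num_lanes - 1) == 0
    res = list(vals)
    offset = num_lanes // 2
    while offset >= 1:
        shuffled = shuffle(res, offset * stride)
        res = [func(a, b) for a, b in zip(res, shuffled)]
        offset //= 2
    return res


# One scalar per lane; lane i holds the value i.
lanes = [float(i) for i in range(WARP_SIZE)]
out = lane_group_reduce_model(lanes, shuffle_down, lambda x, y: x + y, num_lanes=16)

print(out[0])   # sum of lanes 0..15
print(out[16])  # sum of lanes 16..31
```

In this model, with a shuffle-down, only the first lane of each 16-lane group ends up holding that group's full sum. The other lanes hold partial results that mix in values from the neighbouring group.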
Parameters:
- `val_type` (`DType`): The data type of the SIMD elements (e.g. float32, int32).
- `simd_width` (`Int`): The number of elements in the SIMD vector.
- `shuffle` (`fn[DType, Int](val: SIMD[$0, $1], offset: SIMD[uint32, 1]) -> SIMD[$0, $1]`): A function that performs the warp shuffle operation. Takes a SIMD value and offset and returns the shuffled result.
- `func` (`fn[DType, Int](SIMD[$0, $1], SIMD[$0, $1]) capturing -> SIMD[$0, $1]`): A binary function that combines two SIMD values during reduction. This defines the reduction operation (e.g. add, max, min).
- `num_lanes` (`Int`): The number of lanes in a group. The reduction is done within each group. Must be a power of 2.
- `stride` (`Int`): The stride between lanes participating in the reduction.
Args:
- `val` (`SIMD[val_type, simd_width]`): The SIMD value to reduce. Each lane contributes its value.
Returns:
A SIMD value containing the reduction result.
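The effect of `stride` on which lanes take part can be illustrated with a small Python model that tracks lane indices rather than values. This sketch assumes the shuffle offset at each step is `stride` times a power of 2 (from `num_lanes / 2` down to 1) and that a shuffle-down past the last lane returns the lane's own value. Neither assumption is confirmed by this page, and `contributing_lanes` is a hypothetical helper.

```python
def shuffle_down(vals, offset):
    """Model of a shuffle-down: lane i reads from lane i + offset.

    Out-of-range reads return the lane's own value (an assumption).
    """
    n = len(vals)
    return [vals[i + offset] if i + offset < n else vals[i] for i in range(n)]


def contributing_lanes(total_lanes, num_lanes, stride=1):
    """Return, for each lane, the sorted source lanes folded into its result."""
    # Each lane starts out holding only its own index.
    res = [frozenset([i]) for i in range(total_lanes)]
    offset = num_lanes // 2
    while offset >= 1:
        shuffled = shuffle_down(res, offset * stride)
        # Set union stands in for the reduction function.
        res = [a | b for a, b in zip(res, shuffled)]
        offset //= 2
    return [sorted(s) for s in res]


strided = contributing_lanes(total_lanes=8, num_lanes=4, stride=2)
print(strided[0])  # [0, 2, 4, 6]
print(strided[1])  # [1, 3, 5, 7]

contiguous = contributing_lanes(total_lanes=8, num_lanes=4, stride=1)
print(contiguous[0])  # [0, 1, 2, 3]
```

Under these assumptions, a group of `num_lanes` lanes starting at lane `i` consists of lanes `i`, `i + stride`, and so on up to `i + (num_lanes - 1) * stride`, and the full result lands in the group's first lane.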