Mojo function
reduce
reduce[val_type: DType, simd_width: Int, //, shuffle: fn[DType, Int](val: SIMD[$0, $1], offset: SIMD[uint32, 1]) -> SIMD[$0, $1], func: fn[DType, Int](SIMD[$0, $1], SIMD[$0, $1]) capturing -> SIMD[$0, $1]](val: SIMD[val_type, simd_width]) -> SIMD[val_type, simd_width]
Performs a generic warp-wide reduction operation using shuffle operations.
This is a convenience wrapper around lane_group_reduce that operates on the entire warp. It allows customizing both the shuffle operation and reduction function.
Example:
from gpu.warp import reduce, shuffle_down
# Compute warp-wide sum using shuffle down
@parameter
fn add[type: DType, width: Int](x: SIMD[type, width], y: SIMD[type, width]) capturing -> SIMD[type, width]:
    return x + y
val = SIMD[DType.float32, 4](2.0, 4.0, 6.0, 8.0)
result = reduce[shuffle_down, add](val)
Parameters:
- val_type (DType): The data type of the SIMD elements (e.g. float32, int32).
- simd_width (Int): The number of elements in the SIMD vector.
- shuffle (fn[DType, Int](val: SIMD[$0, $1], offset: SIMD[uint32, 1]) -> SIMD[$0, $1]): A function that performs the warp shuffle operation. Takes a SIMD value and offset and returns the shuffled result.
- func (fn[DType, Int](SIMD[$0, $1], SIMD[$0, $1]) capturing -> SIMD[$0, $1]): A binary function that combines two SIMD values during reduction. This defines the reduction operation (e.g. add, max, min).
Args:
- val (SIMD[val_type, simd_width]): The SIMD value to reduce. Each lane contributes its value.
Returns:
A SIMD value containing the reduction result broadcast to all lanes in the warp.
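To make the roles of the shuffle and func parameters concrete, here is a minimal CPU-side sketch in Python (not Mojo, and not the library's implementation). It assumes a log2-step tree reduction over a 32-lane warp, models each lane as one list element, and assumes a down-shuffle leaves a lane's value unchanged when the source lane is out of range. The names warp_reduce, shuffle_down, shuffle_xor, and WARP_SIZE are hypothetical stand-ins defined only for this illustration.

```python
from typing import Callable, List

WARP_SIZE = 32  # assumption: 32 lanes per warp; must be a power of two here


def shuffle_down(vals: List[float], offset: int) -> List[float]:
    """Model: lane i reads from lane i + offset, or keeps its own value
    when that source lane is out of range (an assumption of this sketch)."""
    n = len(vals)
    return [vals[i + offset] if i + offset < n else vals[i] for i in range(n)]


def shuffle_xor(vals: List[float], offset: int) -> List[float]:
    """Model: lane i reads from lane i XOR offset (butterfly exchange)."""
    return [vals[i ^ offset] for i in range(len(vals))]


def warp_reduce(
    vals: List[float],
    shuffle: Callable[[List[float], int], List[float]],
    func: Callable[[float, float], float],
) -> List[float]:
    """Hypothetical model: combine each lane with a shuffled partner,
    halving the offset each step (16, 8, 4, 2, 1 for 32 lanes)."""
    res = list(vals)
    offset = len(res) // 2
    while offset >= 1:
        shuffled = shuffle(res, offset)
        res = [func(a, b) for a, b in zip(res, shuffled)]
        offset //= 2
    return res


# Each lane contributes its own lane index: 0.0, 1.0, ..., 31.0.
lane_values = [float(i) for i in range(WARP_SIZE)]

down = warp_reduce(lane_values, shuffle_down, lambda a, b: a + b)
xor = warp_reduce(lane_values, shuffle_xor, lambda a, b: a + b)

print(down[0])   # lane 0 after the down-shuffle reduction
print(set(xor))  # distinct values across lanes after the xor reduction
```

In this model, the down-shuffle variant leaves the complete sum in lane 0, while the xor variant leaves the same sum in every lane. Which lanes end up holding the full result on real hardware therefore depends on the shuffle passed in, so it is worth checking against the behavior of the actual shuffle function you choose.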