Mojo function
prefix_sum
prefix_sum[type: DType, //, intermediate_type: DType = type, *, output_type: DType = type, exclusive: Bool = False](x: SIMD[type, 1]) -> SIMD[output_type, 1]
Computes a warp-level prefix sum (scan) operation.
Performs an inclusive or exclusive prefix sum across threads in a warp using a parallel scan algorithm with warp shuffle operations. This implements an efficient parallel scan with logarithmic complexity.
For example, if we have a warp with the following elements:
The prefix sum is:
Parameters:
- type (
DType
): The data type of the input SIMD elements. - intermediate_type (
DType
): Type used for intermediate calculations (defaults to input type). - output_type (
DType
): The desired output data type (defaults to input type). - exclusive (
Bool
): If True, performs exclusive scan where each thread receives the sum of all previous threads. If False (default), performs inclusive scan where each thread receives the sum including its own value.
Args:
- x (
SIMD[type, 1]
): The SIMD value to include in the prefix sum.
Returns:
A scalar containing the prefix sum at the current thread's position in the warp, cast to the specified output type.
Was this page helpful?
Thank you! We'll create more content like this.
Thank you for helping us improve!