Mojo function
sum
sum[type: DType, width: Int, //, *, block_size: Int, broadcast: Bool = True](val: SIMD[type, width]) -> SIMD[type, width]
Computes the sum of values across all threads in a block.
Performs a parallel reduction using warp-level operations and shared memory to compute the sum across all threads in the block.
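The two-stage pattern described above (reduce within each warp, then combine the per-warp partial sums through shared memory) can be modeled on the CPU. The sketch below is plain Python, not Mojo GPU code, and is not taken from the library's implementation. `WARP_SIZE`, `warp_sum`, and `block_sum` are hypothetical names chosen for illustration, a warp width of 32 is assumed, and each thread contributes a scalar (a SIMD width of 1).

```python
WARP_SIZE = 32  # assumed warp width; the real value depends on the hardware


def warp_sum(lanes):
    """Model a shuffle-down tree reduction over one warp; lane 0 ends with the sum."""
    assert len(lanes) <= WARP_SIZE
    # Pad inactive lanes with the additive identity.
    vals = list(lanes) + [0] * (WARP_SIZE - len(lanes))
    offset = WARP_SIZE // 2
    while offset > 0:
        # Every lane i adds the value currently held by lane i + offset.
        vals = [
            vals[i] + (vals[i + offset] if i + offset < WARP_SIZE else 0)
            for i in range(WARP_SIZE)
        ]
        offset //= 2
    return vals[0]


def block_sum(thread_vals):
    """Model a block-wide sum: per-warp reduction, then a reduction of the partials."""
    # Stage 1: split the block into warps and reduce each one independently.
    warps = [
        thread_vals[i:i + WARP_SIZE]
        for i in range(0, len(thread_vals), WARP_SIZE)
    ]
    # This list stands in for shared memory: one slot per warp.
    shared = [warp_sum(w) for w in warps]
    # Stage 2: a single warp reduces the per-warp partial sums.
    return warp_sum(shared)
```

For example, `block_sum(list(range(256)))` models a 256-thread block in which thread `i` contributes the value `i`. The model only handles blocks of up to `WARP_SIZE * WARP_SIZE` threads, because stage 2 uses a single warp.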
Parameters:
- type (DType): The data type of the SIMD elements.
- width (Int): The number of elements in each SIMD vector.
- block_size (Int): The total number of threads in the block.
- broadcast (Bool): If True, the final sum is broadcast to all threads in the block. If False, only the first thread will have the complete sum.
Args:
- val (SIMD[type, width]): The SIMD value to reduce. Each thread contributes its value to the sum.
Returns:
If broadcast is True, each thread in the block will receive the final sum. Otherwise, only the first thread will have the complete sum.
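The effect of `broadcast` on what each thread sees can be illustrated with a small CPU-side model. This is a Python sketch rather than Mojo code, `block_sum_results` is a hypothetical helper, and each thread is assumed to contribute a scalar. The page does not say what the other threads hold when `broadcast` is False, so the model marks those entries as `None` to mean "not guaranteed".

```python
def block_sum_results(thread_vals, broadcast=True):
    """Return the value each thread would observe, indexed by thread position."""
    total = sum(thread_vals)
    if broadcast:
        # Every thread in the block receives the final sum.
        return [total] * len(thread_vals)
    # Only the first thread is guaranteed to hold the complete sum.
    # None marks threads whose result the documentation leaves unspecified.
    return [total] + [None] * (len(thread_vals) - 1)
```

With four threads contributing 1, 2, 3, and 4, the broadcasting form yields the total in every position, while `broadcast=False` yields it only in position 0. Code that reads the result on threads other than the first should therefore keep `broadcast` at its default.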