Mojo function
block_reduce_sum_and_max
block_reduce_sum_and_max[dtype: DType, max_warps_per_block: Int](sum_val: Scalar[dtype], max_val: Scalar[dtype]) -> Tuple[Scalar[dtype], Scalar[dtype]]
Combined block reduction for sum and max in a single barrier pass.
Performs both the sum and the max reduction across the block using only 2 barriers, compared with 4 for separate block.sum and block.max calls with broadcast.
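The barrier count above can be illustrated with a small CPU-side model. This is a sketch in plain Python, not the Mojo implementation: it assumes the usual two-stage layout (per-warp reduction, then a reduction over the per-warp partials), a warp width of 32, and a hypothetical helper name, model_block_reduce_sum_and_max. The point it shows is that sharing each synchronization point between the two reductions keeps the total at 2 barriers.

```python
def model_block_reduce_sum_and_max(thread_sums, thread_maxes, warp_size=32):
    """CPU model of a combined block reduction (hypothetical helper).

    thread_sums[i] and thread_maxes[i] stand in for the sum_val and max_val
    that thread i would pass in. Returns (block_sum, block_max, barriers).
    """
    assert len(thread_sums) == len(thread_maxes) and thread_sums
    barriers = 0

    # Stage 1: each warp reduces its own lanes (on a GPU this would use
    # warp-level primitives and needs no barrier). One lane per warp then
    # stores BOTH partials to shared memory.
    shared_sum, shared_max = [], []
    for start in range(0, len(thread_sums), warp_size):
        shared_sum.append(sum(thread_sums[start:start + warp_size]))
        shared_max.append(max(thread_maxes[start:start + warp_size]))

    barriers += 1  # barrier 1: both sets of partials become visible together

    # Stage 2: the partials are reduced and both results are stored once.
    block_sum = sum(shared_sum)
    block_max = max(shared_max)

    barriers += 1  # barrier 2: both results are broadcast to every thread

    return block_sum, block_max, barriers


# 128 model threads, i.e. 4 warps under the assumed warp width of 32.
values = list(range(128))
total, biggest, barriers = model_block_reduce_sum_and_max(values, values)
```

Running the two reductions separately would repeat both synchronization points once per reduction, which is where the figure of 4 barriers comes from.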
Returns:
Tuple[Scalar[dtype], Scalar[dtype]]: the reduced sum and max values for the block.