Mojo function
prefix_sum
prefix_sum[dtype: DType, //, *, block_size: Int, exclusive: Bool = False](val: Scalar[dtype]) -> Scalar[dtype]
Performs a prefix sum (scan) operation across all threads in a 1D block.
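This function implements a block-level inclusive or exclusive scan: each thread contributes one value and receives the cumulative sum of the values from the threads up to its own thread index (including its own value for an inclusive scan, excluding it for an exclusive scan).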
Parameters:
- dtype (DType): The data type of the Scalar elements.
- block_size (Int): The total number of threads in the block.
- exclusive (Bool): If True, perform an exclusive scan instead of an inclusive one.
Args:
- val (Scalar): The Scalar value from each thread to include in the scan.
Returns:
Scalar: A Scalar value containing the result of the scan operation for each thread.
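To make the difference between the inclusive and exclusive modes concrete, here is a minimal sequential reference model. It is plain Python running on the CPU, not Mojo kernel code, and it says nothing about how the block-level implementation works internally. The helper name `block_prefix_sum` is hypothetical, and the model assumes the conventional definition in which thread 0 receives 0 (the additive identity) in an exclusive scan.

```python
def block_prefix_sum(values, exclusive=False):
    """Sequential model of a 1D block scan.

    values[i] stands for the `val` passed by the thread with index i;
    the returned list holds the result each thread would receive.
    """
    results = []
    running = 0  # assumed starting value: the additive identity
    for v in values:
        if exclusive:
            # Exclusive: thread i sees the sum of threads 0..i-1.
            results.append(running)
            running += v
        else:
            # Inclusive: thread i sees the sum of threads 0..i.
            running += v
            results.append(running)
    return results


vals = [3, 1, 4, 1, 5]
print(block_prefix_sum(vals))                  # [3, 4, 8, 9, 14]
print(block_prefix_sum(vals, exclusive=True))  # [0, 3, 4, 8, 9]
```

In this model the exclusive result is the inclusive result shifted right by one thread, which is why exclusive scans are commonly used to compute per-thread output offsets.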
prefix_sum[dtype: DType, //, *, block_dim_x: Int, block_dim_y: Int, block_dim_z: Int = 1, exclusive: Bool = False](val: Scalar[dtype]) -> Scalar[dtype]
Performs a prefix sum (scan) operation across all threads in a multi-dimensional block.
This function implements a block-level inclusive or exclusive scan for 2D
and 3D thread blocks. Thread IDs are linearized in row-major order:
x + y * dim_x + z * dim_x * dim_y,
where dim_x and dim_y are the block_dim_x and block_dim_y parameters.
Parameters:
- dtype (DType): The data type of the Scalar elements.
- block_dim_x (Int): The number of threads along the X dimension.
- block_dim_y (Int): The number of threads along the Y dimension.
- block_dim_z (Int): The number of threads along the Z dimension (default: 1).
- exclusive (Bool): If True, perform an exclusive scan instead of an inclusive one.
Args:
- val (Scalar): The Scalar value from each thread to include in the scan.
Returns:
Scalar: A Scalar value containing the result of the scan operation for each thread.
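The linearization rule determines which threads come "before" a given thread in the scan. The following sketch models that ordering sequentially in plain Python (again, a CPU-side illustration rather than Mojo kernel code). The names `linear_thread_id` and `block_prefix_sum_nd` are hypothetical, and the model assumes the scan visits threads in order of increasing linear ID, starting an exclusive scan from 0.

```python
def linear_thread_id(x, y, z, dim_x, dim_y):
    # Formula from the description above: x varies fastest, then y, then z.
    return x + y * dim_x + z * dim_x * dim_y


def block_prefix_sum_nd(values, dim_x, dim_y, exclusive=False):
    """Sequential model of a multi-dimensional block scan.

    values maps (x, y, z) thread coordinates to the contributed value;
    the result maps the same coordinates to each thread's scan result.
    """
    order = sorted(
        values,
        key=lambda c: linear_thread_id(c[0], c[1], c[2], dim_x, dim_y),
    )
    results = {}
    running = 0  # assumed starting value: the additive identity
    for coord in order:
        if exclusive:
            results[coord] = running
            running += values[coord]
        else:
            running += values[coord]
            results[coord] = running
    return results


# A 3 x 2 block (dim_z = 1) in which every thread contributes 1.
dim_x, dim_y = 3, 2
ones = {(x, y, 0): 1 for y in range(dim_y) for x in range(dim_x)}
scan = block_prefix_sum_nd(ones, dim_x, dim_y)

print(linear_thread_id(1, 1, 0, dim_x, dim_y))  # 4
print(scan[(1, 1, 0)])                          # 5
```

With every thread contributing 1, the inclusive result for a thread equals its linear ID plus one, which gives a quick way to sanity-check the ordering.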