Reductions

Module

Implements SIMD reductions.

all_true

all_true[simd_width: Int, size: Dim, type: DType](src: Buffer[size, type]) -> Bool

Returns True if all the elements in a buffer are True and False otherwise.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.

Args:

  • src (Buffer[size, type]): The buffer.

Returns:

True if all of the elements of the buffer are True and False otherwise.

any_true

any_true[simd_width: Int, size: Dim, type: DType](src: Buffer[size, type]) -> Bool

Returns True if any of the elements in a buffer are True and False otherwise.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.

Args:

  • src (Buffer[size, type]): The buffer.

Returns:

True if any of the elements of the buffer are True and False otherwise.

map_reduce

map_reduce[simd_width: Int, size: Dim, type: DType, acc_type: DType, input_gen_fn: fn[DType, Int](Int) capturing -> SIMD[*(0,0), *(0,1)], reduce_vec_to_vec_fn: fn[Int, DType, DType](SIMD[*(0,1), *(0,0)], SIMD[*(0,2), *(0,0)]) capturing -> SIMD[*(0,1), *(0,0)], reduce_vec_to_scalar_fn: fn[Int, DType](SIMD[*(0,1), *(0,0)]) -> SIMD[*(0,1), 1]](dst: Buffer[size, type], init: SIMD[acc_type, 1]) -> SIMD[acc_type, 1]

Store the result of calling input_gen_fn in dst and simultaneously reduce the result using a custom reduction function.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.
  • acc_type (DType): The dtype of the reduction accumulator.
  • input_gen_fn (fn[DType, Int](Int) capturing -> SIMD[*(0,0), *(0,1)]): A function that generates inputs to reduce.
  • reduce_vec_to_vec_fn (fn[Int, DType, DType](SIMD[*(0,1), *(0,0)], SIMD[*(0,2), *(0,0)]) capturing -> SIMD[*(0,1), *(0,0)]): A mapping function. This function is used to combine (accumulate) two chunks of input data: e.g. we load two 8xf32 vectors of elements and need to reduce them into a single 8xf32 vector.
  • reduce_vec_to_scalar_fn (fn[Int, DType](SIMD[*(0,1), *(0,0)]) -> SIMD[*(0,1), 1]): A reduction function. This function is used to reduce a vector to a scalar, for example to reduce an 8xf32 vector to an f32 scalar.

Args:

  • dst (Buffer[size, type]): The output buffer.
  • init (SIMD[acc_type, 1]): The initial value to use in accumulator.

Returns:

The computed reduction value.
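The flow described above can be modelled in plain Python. This is only a sketch of the semantics, not the Mojo API: `map_reduce_model`, its argument order, the list-based buffer, and the scalar handling of the leftover elements are all assumptions made for illustration. It also assumes `init` is the identity of the reduction (0 for a sum), because every lane of the vector accumulator starts from it.

```python
from typing import Callable, List

Vec = List[float]


def map_reduce_model(
    simd_width: int,
    dst: List[float],
    init: float,
    input_gen_fn: Callable[[int, int], Vec],
    reduce_vec_to_vec_fn: Callable[[Vec, Vec], Vec],
    reduce_vec_to_scalar_fn: Callable[[Vec], float],
) -> float:
    """Fill dst from input_gen_fn and reduce the generated values in one pass."""
    size = len(dst)
    vector_end = size - size % simd_width

    # Vector part: generate a chunk, store it, fold it into the accumulator.
    acc_vec = [init] * simd_width
    for start in range(0, vector_end, simd_width):
        chunk = input_gen_fn(simd_width, start)
        dst[start:start + simd_width] = chunk
        acc_vec = reduce_vec_to_vec_fn(acc_vec, chunk)

    # Collapse the vector accumulator to a scalar.
    acc = reduce_vec_to_scalar_fn(acc_vec)

    # Leftover elements (assumed to be handled one at a time, width 1).
    for index in range(vector_end, size):
        chunk = input_gen_fn(1, index)
        dst[index] = chunk[0]
        acc = reduce_vec_to_vec_fn([acc], chunk)[0]
    return acc


# Hypothetical usage: store the squares 0..81 in dst and sum them.
dst = [0.0] * 10
total = map_reduce_model(
    4,
    dst,
    0.0,
    input_gen_fn=lambda width, start: [float((start + k) ** 2) for k in range(width)],
    reduce_vec_to_vec_fn=lambda acc, val: [a + v for a, v in zip(acc, val)],
    reduce_vec_to_scalar_fn=sum,
)
```

After the call, `dst` holds the generated squares and `total` holds their sum, so the generated values never need to be read back for a separate reduction pass.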

max

max[simd_width: Int, size: Dim, type: DType](src: Buffer[size, type]) -> SIMD[type, 1]

Computes the max element in a buffer.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.

Args:

  • src (Buffer[size, type]): The buffer.

Returns:

The maximum of the buffer elements.

max[simd_width: Int, rank: Int, input_shape: DimList, output_shape: DimList, type: DType, reduce_axis: Int](src: NDBuffer[rank, input_shape, type], dst: NDBuffer[rank, output_shape, type])

Computes the max across reduce_axis of an NDBuffer.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • rank (Int): The rank of the input/output buffers.
  • input_shape (DimList): The input buffer shape.
  • output_shape (DimList): The output buffer shape.
  • type (DType): The buffer elements dtype.
  • reduce_axis (Int): The axis to reduce across.

Args:

  • src (NDBuffer[rank, input_shape, type]): The input buffer.
  • dst (NDBuffer[rank, output_shape, type]): The output buffer.

mean

mean[simd_width: Int, size: Dim, type: DType](src: Buffer[size, type]) -> SIMD[type, 1]

Computes the mean value of the elements in a buffer.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The size of the input buffer.
  • type (DType): The type of the elements of the input buffer and output SIMD vector.

Args:

  • src (Buffer[size, type]): The buffer of elements for which the mean is computed.

Returns:

The mean value of the elements in the given buffer.

mean[simd_width: Int, rank: Int, input_shape: DimList, output_shape: DimList, type: DType, reduce_axis: Int](src: NDBuffer[rank, input_shape, type], dst: NDBuffer[rank, output_shape, type])

Computes the mean across reduce_axis of an NDBuffer.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • rank (Int): The rank of the input/output buffers.
  • input_shape (DimList): The input buffer shape.
  • output_shape (DimList): The output buffer shape.
  • type (DType): The buffer elements dtype.
  • reduce_axis (Int): The axis to reduce across.

Args:

  • src (NDBuffer[rank, input_shape, type]): The input buffer.
  • dst (NDBuffer[rank, output_shape, type]): The output buffer.

min

min[simd_width: Int, size: Dim, type: DType](src: Buffer[size, type]) -> SIMD[type, 1]

Computes the min element in a buffer.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.

Args:

  • src (Buffer[size, type]): The buffer.

Returns:

The minimum of the buffer elements.

min[simd_width: Int, rank: Int, input_shape: DimList, output_shape: DimList, type: DType, reduce_axis: Int](src: NDBuffer[rank, input_shape, type], dst: NDBuffer[rank, output_shape, type])

Computes the min across reduce_axis of an NDBuffer.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • rank (Int): The rank of the input/output buffers.
  • input_shape (DimList): The input buffer shape.
  • output_shape (DimList): The output buffer shape.
  • type (DType): The buffer elements dtype.
  • reduce_axis (Int): The axis to reduce across.

Args:

  • src (NDBuffer[rank, input_shape, type]): The input buffer.
  • dst (NDBuffer[rank, output_shape, type]): The output buffer.

none_true

none_true[simd_width: Int, size: Dim, type: DType](src: Buffer[size, type]) -> Bool

Returns True if none of the elements in a buffer are True and False otherwise.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.

Args:

  • src (Buffer[size, type]): The buffer.

Returns:

True if none of the elements of the buffer are True and False otherwise.

product

product[simd_width: Int, size: Dim, type: DType](src: Buffer[size, type]) -> SIMD[type, 1]

Computes the product of the buffer elements.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.

Args:

  • src (Buffer[size, type]): The buffer.

Returns:

The product of the buffer elements.

product[simd_width: Int, rank: Int, input_shape: DimList, output_shape: DimList, type: DType, reduce_axis: Int](src: NDBuffer[rank, input_shape, type], dst: NDBuffer[rank, output_shape, type])

Computes the product across reduce_axis of an NDBuffer.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • rank (Int): The rank of the input/output buffers.
  • input_shape (DimList): The input buffer shape.
  • output_shape (DimList): The output buffer shape.
  • type (DType): The buffer elements dtype.
  • reduce_axis (Int): The axis to reduce across.

Args:

  • src (NDBuffer[rank, input_shape, type]): The input buffer.
  • dst (NDBuffer[rank, output_shape, type]): The output buffer.

reduce

reduce[simd_width: Int, size: Dim, type: DType, acc_type: DType, map_fn: fn[Int, DType, DType](SIMD[*(0,1), *(0,0)], SIMD[*(0,2), *(0,0)]) capturing -> SIMD[*(0,1), *(0,0)], reduce_fn: fn[Int, DType](SIMD[*(0,1), *(0,0)]) -> SIMD[*(0,1), 1]](src: Buffer[size, type], init: SIMD[acc_type, 1]) -> SIMD[acc_type, 1]

Compute a custom reduction of buffer elements.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.
  • acc_type (DType): The dtype of the reduction accumulator.
  • map_fn (fn[Int, DType, DType](SIMD[*(0,1), *(0,0)], SIMD[*(0,2), *(0,0)]) capturing -> SIMD[*(0,1), *(0,0)]): A mapping function. This function is used to combine (accumulate) two chunks of input data. For example, two 8xf32 vectors of elements are loaded and reduced to a single 8xf32 vector.
  • reduce_fn (fn[Int, DType](SIMD[*(0,1), *(0,0)]) -> SIMD[*(0,1), 1]): A reduction function. This function is used to reduce a vector to a scalar, for example to reduce an 8xf32 vector to 1xf32.

Args:

  • src (Buffer[size, type]): The input buffer.
  • init (SIMD[acc_type, 1]): The initial value to use in accumulator.

Returns:

The computed reduction value.
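The two function parameters play different roles, which the 8xf32 example from their descriptions makes concrete. The Python sketch below is illustrative only: `add_lanes` and `horizontal_add` are hypothetical stand-ins for map_fn and reduce_fn, and Python lists stand in for SIMD vectors.

```python
from typing import List


def add_lanes(acc: List[float], val: List[float]) -> List[float]:
    """map_fn role: combine two 8-lane vectors into one 8-lane vector."""
    return [a + v for a, v in zip(acc, val)]


def horizontal_add(vec: List[float]) -> float:
    """reduce_fn role: collapse one 8-lane vector into a scalar."""
    return sum(vec)


src = [float(k) for k in range(1, 17)]  # 16 elements, i.e. two 8-lane chunks
acc = [0.0] * 8                         # init placed in every lane
acc = add_lanes(acc, src[0:8])          # fold in the first chunk
acc = add_lanes(acc, src[8:16])         # fold in the second chunk
result = horizontal_add(acc)            # single scalar result
```

The lane-wise step runs once per chunk, while the vector-to-scalar step runs once at the end.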

reduce[simd_width: Int, rank: Int, input_shape: DimList, output_shape: DimList, type: DType, acc_type: DType, map_fn: fn[Int, DType, DType](SIMD[*(0,1), *(0,0)], SIMD[*(0,2), *(0,0)]) capturing -> SIMD[*(0,1), *(0,0)], reduce_fn: fn[Int, DType](SIMD[*(0,1), *(0,0)]) -> SIMD[*(0,1), 1], reduce_axis: Int](src: NDBuffer[rank, input_shape, type], dst: NDBuffer[rank, output_shape, acc_type], init: SIMD[acc_type, 1])

Performs a reduction across reduce_axis of an NDBuffer (src) and stores the result in an NDBuffer (dst).

First, src is reshaped into a 3D tensor. Without loss of generality, the three axes are referred to as [H, W, C], where W is the axis to reduce across, the axes before the reduce axis are packed into H, and the axes after it are packed into C. That is, a tensor with dims [D1, D2, …, Di, …, Dn] reduced across axis i is packed into a 3D tensor with dims [H, W, C], where H = prod(D1, …, Di-1), W = Di, and C = prod(Di+1, …, Dn).

Parameters:

  • simd_width (Int): The vector width for the computation.
  • rank (Int): The rank of the input/output buffers.
  • input_shape (DimList): The input buffer shape.
  • output_shape (DimList): The output buffer shape.
  • type (DType): The buffer elements dtype.
  • acc_type (DType): The dtype of the reduction accumulator.
  • map_fn (fn[Int, DType, DType](SIMD[*(0,1), *(0,0)], SIMD[*(0,2), *(0,0)]) capturing -> SIMD[*(0,1), *(0,0)]): A mapping function. This function is used to combine (accumulate) two chunks of input data. For example, two 8xf32 vectors of elements are loaded and reduced to a single 8xf32 vector.
  • reduce_fn (fn[Int, DType](SIMD[*(0,1), *(0,0)]) -> SIMD[*(0,1), 1]): A reduction function. This function is used to reduce a vector to a scalar, for example to reduce an 8xf32 vector to 1xf32.
  • reduce_axis (Int): The axis to reduce across.

Args:

  • src (NDBuffer[rank, input_shape, type]): The input buffer.
  • dst (NDBuffer[rank, output_shape, acc_type]): The output buffer.
  • init (SIMD[acc_type, 1]): The initial value to use in accumulator.
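The [H, W, C] packing can be checked with a small Python model. The sketch assumes a row-major flat layout and a 0-based `axis` (the description above counts dims from D1); `pack_hwc` and `sum_across_axis` are hypothetical helper names, and a plain sum stands in for the custom reduction.

```python
from math import prod
from typing import List, Sequence, Tuple


def pack_hwc(dims: Sequence[int], axis: int) -> Tuple[int, int, int]:
    """Collapse dims into (H, W, C) around the 0-based reduce axis."""
    # prod of an empty slice is 1, which covers the first and last axis.
    return prod(dims[:axis]), dims[axis], prod(dims[axis + 1:])


def sum_across_axis(flat: Sequence[float], dims: Sequence[int], axis: int) -> List[float]:
    """Sum a row-major tensor across one axis using the [H, W, C] view."""
    h, w, c = pack_hwc(dims, axis)
    out = [0.0] * (h * c)
    for i in range(h):
        for k in range(c):
            acc = 0.0
            for j in range(w):
                # Element (i, j, k) of the packed 3D view.
                acc += flat[(i * w + j) * c + k]
            out[i * c + k] = acc
    return out


dims = (2, 3, 4)
flat = [float(v) for v in range(24)]
middle = sum_across_axis(flat, dims, axis=1)
```

For `dims = (2, 3, 4)`, reducing across the middle axis gives H = 2, W = 3, C = 4, and the output holds H * C = 8 values.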

reduce_boolean

reduce_boolean[simd_width: Int, size: Dim, type: DType, reduce_fn: fn[Int, DType](SIMD[*(0,1), *(0,0)]) capturing -> Bool, continue_fn: fn(Bool) capturing -> Bool](src: Buffer[size, type], init: Bool) -> Bool

Compute a bool reduction of buffer elements. The reduction exits early if continue_fn returns False.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.
  • reduce_fn (fn[Int, DType](SIMD[*(0,1), *(0,0)]) capturing -> Bool): A boolean reduction function. This function is used to reduce a vector to a scalar, for example to reduce an 8xf32 vector to a bool.
  • continue_fn (fn(Bool) capturing -> Bool): A function to indicate whether we want to continue processing the rest of the iterations. This takes the result of the reduce_fn and returns True to continue processing and False to early exit.

Args:

  • src (Buffer[size, type]): The input buffer.
  • init (Bool): The initial value to use.

Returns:

The computed reduction value.
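The early-exit behaviour can be sketched in plain Python. This is a model under stated assumptions, not the Mojo implementation: `reduce_boolean_model` is a hypothetical name, the last chunk is simply allowed to be shorter than simd_width, and returning `init` when no chunk triggers an exit is an assumption.

```python
from typing import Callable, List, Sequence


def reduce_boolean_model(
    simd_width: int,
    src: Sequence[float],
    init: bool,
    reduce_fn: Callable[[Sequence[float]], bool],
    continue_fn: Callable[[bool], bool],
) -> bool:
    """Reduce src chunk by chunk, stopping as soon as continue_fn says so."""
    for start in range(0, len(src), simd_width):
        result = reduce_fn(src[start:start + simd_width])
        if not continue_fn(result):
            return result  # early exit: the answer is already decided
    return init  # assumption: init is returned when nothing forced an exit


calls: List[List[float]] = []


def all_nonzero(chunk: Sequence[float]) -> bool:
    calls.append(list(chunk))  # record which chunks were actually visited
    return all(x != 0 for x in chunk)


src = [1, 1, 1, 0, 1, 1, 1, 1]

# all_true-style reduction: keep going only while every chunk is all True.
all_result = reduce_boolean_model(2, src, True, all_nonzero, lambda r: r)

# any_true-style reduction: stop at the first chunk containing a True.
any_result = reduce_boolean_model(
    2, src, False, lambda chunk: any(x != 0 for x in chunk), lambda r: not r
)
```

In this run the all_true-style reduction stops after the second chunk, because the zero at index 3 already decides the result, so the remaining chunks are never inspected.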

sum

sum[simd_width: Int, size: Dim, type: DType](src: Buffer[size, type]) -> SIMD[type, 1]

Computes the sum of buffer elements.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.

Args:

  • src (Buffer[size, type]): The buffer.

Returns:

The sum of the buffer elements.

sum[simd_width: Int, rank: Int, input_shape: DimList, output_shape: DimList, type: DType, reduce_axis: Int](src: NDBuffer[rank, input_shape, type], dst: NDBuffer[rank, output_shape, type])

Computes the sum across reduce_axis of an NDBuffer.

Parameters:

  • simd_width (Int): The vector width for the computation.
  • rank (Int): The rank of the input/output buffers.
  • input_shape (DimList): The input buffer shape.
  • output_shape (DimList): The output buffer shape.
  • type (DType): The buffer elements dtype.
  • reduce_axis (Int): The axis to reduce across.

Args:

  • src (NDBuffer[rank, input_shape, type]): The input buffer.
  • dst (NDBuffer[rank, output_shape, type]): The output buffer.

variance

variance[simd_width: Int, size: Dim, type: DType](src: Buffer[size, type], mean_value: SIMD[type, 1], correction: Int) -> SIMD[type, 1]

Given a mean, computes the variance of elements in a buffer.

The mean value is used to avoid a second pass over the data:

variance = sum((x - E(x))^2) / (size - correction)

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.

Args:

  • src (Buffer[size, type]): The buffer.
  • mean_value (SIMD[type, 1]): The mean value of the buffer.
  • correction (Int): Normalize variance by size - correction.

Returns:

The variance value of the elements in a buffer.

variance[simd_width: Int, size: Dim, type: DType](src: Buffer[size, type], correction: Int) -> SIMD[type, 1]

Computes the variance value of the elements in a buffer.

variance(src) = sum((x - E(x))^2) / (size - correction)

Parameters:

  • simd_width (Int): The vector width for the computation.
  • size (Dim): The buffer size.
  • type (DType): The buffer elements dtype.

Args:

  • src (Buffer[size, type]): The buffer.
  • correction (Int): Normalize variance by size - correction (Default=1).

Returns:

The variance value of the elements in a buffer.
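The effect of correction in the formula above is easiest to see on a small data set. The Python sketch below is a direct transcription of the formula, not the Mojo function: `variance_model` and its keyword arguments are hypothetical, and the optional `mean_value` mirrors the overload that takes a precomputed mean.

```python
from typing import Optional, Sequence


def variance_model(
    src: Sequence[float],
    correction: int = 1,
    mean_value: Optional[float] = None,
) -> float:
    """variance = sum((x - E(x))^2) / (size - correction)."""
    size = len(src)
    if mean_value is None:
        # No mean supplied: compute it with one extra pass over the data.
        mean_value = sum(src) / size
    return sum((x - mean_value) ** 2 for x in src) / (size - correction)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean is 5.0

population = variance_model(data, correction=0)             # divides by size
sample = variance_model(data, correction=1, mean_value=5.0)  # divides by size - 1
```

With correction=0 the sum of squared deviations (32) is divided by 8, giving 4.0. With correction=1 it is divided by 7, which is the usual unbiased sample estimate.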