Mojo function

max

max[dtype: DType, width: Int, //, *, block_size: Int, broadcast: Bool = True](val: SIMD[dtype, width]) -> SIMD[dtype, width]

Computes the maximum value across all threads in a block.

Performs a parallel reduction in two stages, using warp-level operations and then shared memory, to find the block-wide maximum of the values contributed by the threads in the block.

Parameters:

  • dtype (DType): The data type of the SIMD elements.
  • width (Int): The number of elements in each SIMD vector.
  • block_size (Int): The total number of threads in the block.
  • broadcast (Bool): If True, the final reduced value is broadcast to all threads in the block. If False, only the first thread holds the complete result; other threads are not guaranteed to hold it.

Args:

  • val (SIMD): The SIMD value to reduce. Each thread contributes its value to find the maximum.

Returns:

SIMD: If broadcast is True, every thread in the block receives the maximum value across the entire block. Otherwise, only the first thread holds the complete result.
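The two-stage reduction and the effect of broadcast can be modeled on the CPU. The sketch below is plain Python, not Mojo, and does not call the function documented here. It rests on three assumptions: a warp size of 32 (this varies by hardware), a lane-by-lane reduction (suggested by the matching input and output SIMD widths), and None as a stand-in for the unspecified values on the other threads when broadcast is False. The names model_block_max and lane_max are hypothetical.

```python
# CPU-side model only: this is not Mojo and does not call the GPU API.
WARP_SIZE = 32  # assumption: warp width differs across hardware


def lane_max(a, b):
    """Element-wise maximum of two equal-width vectors (tuples)."""
    return tuple(max(x, y) for x, y in zip(a, b))


def model_block_max(values, broadcast=True):
    """Model a block-wide max over per-thread vectors.

    values[i] is the vector contributed by thread i; len(values)
    plays the role of block_size.
    """
    # Stage 1: each warp reduces its own threads to one partial maximum.
    partials = []
    for start in range(0, len(values), WARP_SIZE):
        warp = values[start:start + WARP_SIZE]
        acc = warp[0]
        for v in warp[1:]:
            acc = lane_max(acc, v)
        partials.append(acc)

    # Stage 2: combine the per-warp partials (held in shared memory on a GPU).
    result = partials[0]
    for p in partials[1:]:
        result = lane_max(result, p)

    if broadcast:
        return [result] * len(values)
    # None marks "not specified by the documentation" for the other threads.
    return [result] + [None] * (len(values) - 1)


# 64 threads, each contributing a 2-wide vector.
vals = [(i, 100 - i) for i in range(64)]
all_threads = model_block_max(vals, broadcast=True)
first_only = model_block_max(vals, broadcast=False)
```

In this model every entry of all_threads holds the same reduced vector, while first_only carries it only at index 0.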

max[dtype: DType, width: Int, //, *, block_dim_x: Int, block_dim_y: Int, block_dim_z: Int = 1, broadcast: Bool = True](val: SIMD[dtype, width]) -> SIMD[dtype, width]

Computes the maximum value across all threads in a multi-dimensional block.

Performs a parallel reduction using warp-level operations and shared memory to find the block-wide maximum of the values contributed by the threads in the block. Thread IDs are linearized in row-major order, with x varying fastest: x + y * dim_x + z * dim_x * dim_y, where dim_x and dim_y are block_dim_x and block_dim_y.

Parameters:

  • dtype (DType): The data type of the SIMD elements.
  • width (Int): The number of elements in each SIMD vector.
  • block_dim_x (Int): The number of threads along the X dimension.
  • block_dim_y (Int): The number of threads along the Y dimension.
  • block_dim_z (Int): The number of threads along the Z dimension (default: 1).
  • broadcast (Bool): If True, the final reduced value is broadcast to all threads in the block. If False, only the first thread holds the complete result; other threads are not guaranteed to hold it.

Args:

  • val (SIMD): The SIMD value to reduce. Each thread contributes its value to find the maximum.

Returns:

SIMD: If broadcast is True, every thread in the block receives the maximum value across the entire block. Otherwise, only the first thread holds the complete result.
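The row-major linearization used by this overload can be checked with a short CPU-side sketch. It is plain Python rather than Mojo, the helper name linear_thread_id is hypothetical, and the 8 x 4 x 2 block shape is chosen only for illustration.

```python
def linear_thread_id(x, y, z, dim_x, dim_y):
    """Row-major linear index: x + y * dim_x + z * dim_x * dim_y."""
    return x + y * dim_x + z * dim_x * dim_y


# Hypothetical block shape: block_dim_x=8, block_dim_y=4, block_dim_z=2.
dim_x, dim_y, dim_z = 8, 4, 2

# Enumerate threads with x varying fastest, then y, then z.
ids = [
    linear_thread_id(x, y, z, dim_x, dim_y)
    for z in range(dim_z)
    for y in range(dim_y)
    for x in range(dim_x)
]

# Thread (x=3, y=2, z=1) maps to 3 + 2*8 + 1*8*4.
example_id = linear_thread_id(3, 2, 1, dim_x, dim_y)
```

Enumerating the threads in this order yields each index from 0 to dim_x * dim_y * dim_z - 1 exactly once, so the linearized block covers the same number of threads that a one-dimensional block_size would describe.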
