Mojo function
tc_reduce_gevm_8x
tc_reduce_gevm_8x[out_type: DType, in_type: DType, simd_width: Int](val1: SIMD[in_type, simd_width], val2: SIMD[in_type, simd_width]) -> SIMD[out_type, simd_width]
Performs an 8x GEVM reduction using tensor cores.
Note: Currently only supports bfloat16 input to float32 output conversion. Uses tensor core matrix multiply-accumulate (MMA) operations for reduction.
Parameters:
- out_type (DType): The output data type for the reduction result (must be float32).
- in_type (DType): The input data type of the vectors to reduce (must be bfloat16).
- simd_width (Int): The width of the SIMD vectors.
Args:
- val1 (SIMD[in_type, simd_width]): First input SIMD vector to reduce.
- val2 (SIMD[in_type, simd_width]): Second input SIMD vector to reduce.
Returns:
SIMD vector containing the reduced result.
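To make the parameter order concrete, here is a minimal sketch of a call site. It is not taken from the library's own examples. The import path `gpu.tensor_ops`, the wrapper name `reduce_pair`, and the choice of `simd_width = 4` are assumptions made for illustration; only the function signature comes from this page. Since the reduction relies on tensor core MMA operations, the call presumably has to run inside GPU device code on hardware with tensor cores.

```mojo
# Assumed module path; adjust to wherever tc_reduce_gevm_8x lives in your Mojo release.
from gpu.tensor_ops import tc_reduce_gevm_8x


# Hypothetical helper intended to be called from GPU device code.
# The width of 4 is an illustrative assumption, not a documented requirement.
fn reduce_pair(
    a: SIMD[DType.bfloat16, 4],
    b: SIMD[DType.bfloat16, 4],
) -> SIMD[DType.float32, 4]:
    # Parameter order follows the signature: out_type, in_type, simd_width.
    # out_type must be float32 and in_type must be bfloat16.
    return tc_reduce_gevm_8x[DType.float32, DType.bfloat16, 4](a, b)
```

Both inputs share the same `in_type` and `simd_width`, and the result keeps the same width while widening the element type from bfloat16 to float32.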