
Mojo function

rms_norm_gpu_block

rms_norm_gpu_block[
    dtype: DType, //,
    simd_width: Int,
    max_warps_per_block: Int,
    input_fn: fn[Int](row: Int, col: Int) capturing -> SIMD[dtype, $0],
    output_fn: fn[Int, Int](row: Int, col: Int, val: SIMD[dtype, $0]) capturing -> None,
    multiply_before_cast: Bool
](
    gamma: NDBuffer[dtype, 1, MutableAnyOrigin],
    epsilon: SIMD[dtype, 1],
    weight_offset: SIMD[dtype, 1],
    num_cols: Int
)
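The signature does not spell out the arithmetic, so the following is a minimal CPU reference sketch in plain Python. It is not the Mojo kernel. It assumes the conventional RMSNorm definition, in which each row is divided by the square root of its mean square plus `epsilon` and then scaled by `gamma + weight_offset`. The helper name `rms_norm_row` is hypothetical, and the interpretation of `weight_offset` is an assumption based on the parameter name rather than on this page.

```python
import math


def rms_norm_row(row, gamma, epsilon, weight_offset=0.0):
    """Reference RMS normalization of a single row.

    Assumed semantics (not confirmed by the signature above):
        out[c] = row[c] / sqrt(mean(row^2) + epsilon) * (gamma[c] + weight_offset)
    """
    if len(row) != len(gamma):
        raise ValueError("row and gamma must have the same length (num_cols)")

    # Mean of squares over the row, i.e. over num_cols elements.
    mean_sq = sum(x * x for x in row) / len(row)

    # Reciprocal root-mean-square; epsilon guards against division by zero.
    inv_rms = 1.0 / math.sqrt(mean_sq + epsilon)

    # Scale every element by the per-column weight plus the assumed offset.
    return [x * inv_rms * (g + weight_offset) for x, g in zip(row, gamma)]


if __name__ == "__main__":
    print(rms_norm_row([3.0, 4.0], gamma=[1.0, 1.0], epsilon=1e-6))
```

In the GPU function, `input_fn` and `output_fn` play the roles of reading `row[c]` and writing `out[c]`, and `num_cols` corresponds to `len(row)`. The sketch works in a single precision, so it does not model `multiply_before_cast`, which by its name appears to control whether the multiplication by `gamma` happens before or after a cast between dtypes.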
