Mojo function

rms_norm_gpu_block

rms_norm_gpu_block[
    mut: Bool,
    LayoutType: TensorLayout,
    origin: Origin[mut=mut],
    dtype: DType,
    //,
    simd_width: Int,
    max_warps_per_block: Int,
    input_fn: def[width: Int](row: Int, col: Int) capturing -> SIMD[dtype, width],
    output_fn: def[width: Int, alignment: Int](row: Int, col: Int, val: SIMD[dtype, width]) capturing -> None,
    multiply_before_cast: Bool
](
    gamma: TileTensor[dtype, LayoutType, origin],
    epsilon: Scalar[dtype],
    weight_offset: Scalar[dtype],
    num_cols: Int
)
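
This page gives only the signature, so the following is a rough reference sketch of the math an RMS normalization over one row usually performs. It is plain Python, not the Mojo GPU kernel. The helper name `rms_norm_row` is hypothetical, and the sketch assumes that `weight_offset` is added to `gamma` before scaling and that `epsilon` is added to the mean of squares. Neither detail is stated on this page, and block-level reduction, `simd_width`, and `multiply_before_cast` are not modeled.

```python
import math


def rms_norm_row(row, gamma, epsilon, weight_offset=0.0):
    """Reference RMS normalization of a single row (hypothetical helper).

    Assumption: each output is x / sqrt(mean(x^2) + epsilon) * (gamma + weight_offset).
    """
    # Mean of squared elements across the row (the "num_cols" dimension).
    mean_sq = sum(x * x for x in row) / len(row)
    # Reciprocal of the root mean square, stabilized by epsilon.
    inv_rms = 1.0 / math.sqrt(mean_sq + epsilon)
    # Scale each normalized element by its (offset) weight.
    return [x * inv_rms * (g + weight_offset) for x, g in zip(row, gamma)]


if __name__ == "__main__":
    print(rms_norm_row([3.0, 4.0], [1.0, 1.0], epsilon=1e-6))
```

With unit weights and a negligible `epsilon`, the output row has a mean square of approximately 1, which is a convenient sanity check when comparing against a kernel's results.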
