
Mojo function

rms_norm_fused_residual_add

rms_norm_fused_residual_add[
    dtype: DType,
    rank: Int,
    //,
    input_0_fn: fn[width: Int, rank: Int](IndexList[rank]) capturing -> SIMD[dtype, width],
    input_1_fn: fn[width: Int, rank: Int](IndexList[rank]) capturing -> SIMD[dtype, width],
    output_0_fn: fn[width: Int, rank: Int, alignment: Int](idx: IndexList[rank], val: SIMD[dtype, width]) capturing -> None,
    output_residual_fn: fn[width: Int, rank: Int, alignment: Int](IndexList[rank], SIMD[dtype, width]) capturing -> None,
    /,
    target: StringSlice[StaticConstantOrigin] = "cpu",
    multiply_before_cast: Bool = True
](
    shape: IndexList[rank],
    gamma1: TileTensor[dtype, LayoutType, origin, address_space=address_space, linear_idx_type=linear_idx_type, element_shape_types=element_shape_types],
    epsilon1: Scalar[dtype],
    weight_offset1: Scalar[dtype],
    gamma2: TileTensor[dtype, LayoutType, origin, address_space=address_space, linear_idx_type=linear_idx_type, element_shape_types=element_shape_types],
    epsilon2: Scalar[dtype],
    weight_offset2: Scalar[dtype],
    ctx: DeviceContextPtr
)
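This page gives only the signature, so the following NumPy sketch is a reading aid rather than a description of the kernel. It assumes one plausible interpretation of the parameter names: `input_0_fn` supplies the input, `input_1_fn` supplies a residual, the first RMS norm (`gamma1`, `epsilon1`, `weight_offset1`) is applied to the input, the residual is added and written through `output_residual_fn`, and a second RMS norm (`gamma2`, `epsilon2`, `weight_offset2`) of that sum is written through `output_0_fn`. The helper names `rms_norm` and `rms_norm_fused_residual_add_ref` are hypothetical, and the `multiply_before_cast` and `target` parameters are not modeled.

```python
import numpy as np


def rms_norm(x, gamma, epsilon, weight_offset):
    """Plain RMS norm over the last axis (assumed semantics, not the Mojo kernel)."""
    # Mean of squares along the normalized (last) dimension.
    mean_sq = np.mean(np.square(x), axis=-1, keepdims=True)
    # Scale by (gamma + weight_offset); the offset handling is an assumption.
    return x / np.sqrt(mean_sq + epsilon) * (gamma + weight_offset)


def rms_norm_fused_residual_add_ref(
    x, residual,
    gamma1, epsilon1, weight_offset1,
    gamma2, epsilon2, weight_offset2,
):
    """Hypothetical unfused reference: norm, add residual, norm again."""
    # Stage 1: normalize the input with the first set of weights.
    stage1 = rms_norm(x, gamma1, epsilon1, weight_offset1)
    # Residual add; assumed to correspond to what output_residual_fn receives.
    residual_out = stage1 + residual
    # Stage 2: normalize the sum; assumed to correspond to output_0_fn.
    out = rms_norm(residual_out, gamma2, epsilon2, weight_offset2)
    return out, residual_out
```

A fused kernel would produce both results in a single pass over each row instead of materializing `stage1`; the sketch keeps the stages separate only so the assumed data flow is easy to follow.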
