Mojo struct

ReduceRMSNormFusedResidualAdd

struct ReduceRMSNormFusedResidualAdd

Registers the mo.composite.rms_norm_fused_residual_add graph op with the graph compiler.

Implemented traits

AnyType, ImplicitlyDeletable

Methods

`execute`

static def execute[dtype: DType, rank: Int, target: StringSlice[ImmStaticOrigin], multiply_before_cast: Bool = True](output: ManagedTensorSlice[IOSpec[_, _].Output, static_spec=output.static_spec], residual_output: ManagedTensorSlice[IOSpec[_, _].Output, static_spec=residual_output.static_spec], input: ManagedTensorSlice[IOSpec[_, _].FusedInput, static_spec=input.static_spec], residual_input: ManagedTensorSlice[IOSpec[_, _].FusedInput, static_spec=residual_input.static_spec], gamma1: ManagedTensorSlice[IOSpec[_, _].Input, static_spec=gamma1.static_spec], gamma2: ManagedTensorSlice[IOSpec[_, _].Input, static_spec=gamma2.static_spec], epsilon1: Float32, epsilon2: Float32, weight_offset1: Scalar[dtype], weight_offset2: Scalar[dtype], ctx: DeviceContext)

Executes the mo.composite.rms_norm_fused_residual_add graph op.

Parameters:

dtype (DType): Element type of the input and output tensors.
rank (Int): Tensor rank of the input and output tensors.
target (StringSlice[ImmStaticOrigin]): Compilation target string.
multiply_before_cast (Bool): See the graph op signature.

Args:

output (ManagedTensorSlice[IOSpec[_, _].Output, static_spec=output.static_spec]): Output tensor receiving the result.
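residual_output (ManagedTensorSlice[IOSpec[_, _].Output, static_spec=residual_output.static_spec]): Second output tensor, likely receiving the result of the residual add (the normalized input plus residual_input). See the graph op signature for the exact semantics.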
input (ManagedTensorSlice[IOSpec[_, _].FusedInput, static_spec=input.static_spec]): Input tensor to reduce.
residual_input (ManagedTensorSlice[IOSpec[_, _].FusedInput, static_spec=residual_input.static_spec]): Residual tensor added to the normalized input.
gamma1 (ManagedTensorSlice[IOSpec[_, _].Input, static_spec=gamma1.static_spec]): Scale weights for the first normalization.
gamma2 (ManagedTensorSlice[IOSpec[_, _].Input, static_spec=gamma2.static_spec]): Scale weights for the second normalization.
epsilon1 (Float32): Stability constant for the first normalization.
epsilon2 (Float32): Stability constant for the second normalization.
weight_offset1 (Scalar[dtype]): Scalar offset for the first weight.
weight_offset2 (Scalar[dtype]): Scalar offset for the second weight.
ctx (DeviceContext): Device context used to enqueue the kernel.

Raises:

Error: If the operation parameters are invalid.
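
The argument descriptions above suggest a two-stage computation: normalize the input, add the residual, then normalize again. The following plain-Python sketch is a reference for that reading, not the kernel's implementation. It makes several assumptions that this page does not confirm: normalization runs over the last dimension (a single row here), each weight offset is added to its gamma before scaling, `residual_output` holds the sum from the residual add, and `output` holds the second normalization of that sum. The helper names `rms_norm` and `rms_norm_fused_residual_add_reference` are hypothetical, and the sketch ignores `dtype` casting and therefore `multiply_before_cast`.

```python
import math


def rms_norm(row, gamma, epsilon, weight_offset):
    """RMS-normalize one row, then scale by (gamma + weight_offset).

    Assumed formula: x / sqrt(mean(x^2) + epsilon) * (gamma + weight_offset).
    """
    mean_sq = sum(x * x for x in row) / len(row)
    inv_rms = 1.0 / math.sqrt(mean_sq + epsilon)
    return [x * inv_rms * (g + weight_offset) for x, g in zip(row, gamma)]


def rms_norm_fused_residual_add_reference(
    input_row,
    residual_row,
    gamma1,
    gamma2,
    epsilon1,
    epsilon2,
    weight_offset1,
    weight_offset2,
):
    """Hypothetical reference for the assumed fused semantics (single row)."""
    # First normalization, using gamma1 / epsilon1 / weight_offset1.
    normed = rms_norm(input_row, gamma1, epsilon1, weight_offset1)
    # Residual add: assumed to be what residual_output receives.
    residual_output = [n + r for n, r in zip(normed, residual_row)]
    # Second normalization, using gamma2 / epsilon2 / weight_offset2.
    output = rms_norm(residual_output, gamma2, epsilon2, weight_offset2)
    return output, residual_output


if __name__ == "__main__":
    out, res = rms_norm_fused_residual_add_reference(
        input_row=[3.0, 4.0],
        residual_row=[0.5, -0.5],
        gamma1=[1.0, 1.0],
        gamma2=[1.0, 1.0],
        epsilon1=1e-6,
        epsilon2=1e-6,
        weight_offset1=0.0,
        weight_offset2=0.0,
    )
    print("output:", out)
    print("residual_output:", res)
```

A reference like this can help when checking the fused op against an unfused graph in tests, provided the assumed semantics are first confirmed against the graph op signature.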
