IMPORTANT: To view this page as Markdown, append `.md` to the URL (e.g. /max/get-started.md). For the complete documentation index, see llms.txt.

Mojo struct

RowMeanOfSquaresQK

struct RowMeanOfSquaresQK

Fused per-row mean of squares for two operands Q and K.

Given q of shape [M, Nq] and k of shape [M, Nk] (same row count, possibly different column counts), computes out[m, 0] = mean_n(q[m, n]^2) and out[m, 1] = mean_n(k[m, n]^2), writing into an [M, 2] output. Squaring and accumulation always run in float32. This is a single-launch fusion of two mo.reduce.row_mean_of_squares ops plus a concat, used for cross-head QK-RMSNorm statistics under tensor parallelism.
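The semantics described above can be sketched as a small reference function. This is an illustrative pure-Python model, not the kernel itself: the name `row_mean_of_squares_qk` is hypothetical, inputs are plain nested lists, and arithmetic uses Python floats (double precision) rather than the float32 accumulation the kernel uses.

```python
def row_mean_of_squares_qk(q, k):
    """Reference model: return an [M, 2] list of per-row mean-of-squares stats.

    q is [M, Nq] and k is [M, Nk]; the two may differ in column count
    but must share the same number of rows.
    """
    if len(q) != len(k):
        raise ValueError("q and k must have the same number of rows")
    out = []
    for q_row, k_row in zip(q, k):
        # Column 0: mean over n of q[m, n]^2
        q_stat = sum(x * x for x in q_row) / len(q_row)
        # Column 1: mean over n of k[m, n]^2
        k_stat = sum(x * x for x in k_row) / len(k_row)
        out.append([q_stat, k_stat])
    return out


# Example with M = 2, Nq = 4, Nk = 2.
q = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 2.0, 2.0]]
k = [[3.0, 4.0], [1.0, 1.0]]
stats = row_mean_of_squares_qk(q, k)
print(stats)  # [[7.5, 12.5], [2.0, 1.0]]
```

A reference like this is mainly useful for checking kernel output on small inputs; results from the real op may differ slightly because of float32 rounding.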

Implemented traits

AnyType, ImplicitlyDeletable

Methods

execute

static def execute[target: StringSlice[StaticConstantOrigin]](output: ManagedTensorSlice[Output, static_spec=output.static_spec], q: ManagedTensorSlice[Input, static_spec=q.static_spec], k: ManagedTensorSlice[Input, static_spec=k.static_spec], ctx: DeviceContext)