Python class

LayerNorm

class max.experimental.nn.norm.LayerNorm(dim, eps=1e-05, *, keep_dtype=True, elementwise_affine=True, use_bias=True)

Bases: Module

Layer normalization over the last dimension of the input.

dim is the integer size of the input's last dimension, and the layer always reduces over that last axis. By default the reduction runs in the input dtype. Pass keep_dtype=False to upcast to float32 for the reduction and cast the result back to the input dtype, which trades a small amount of throughput for better numerical stability on float16 or bfloat16 inputs.

For example:

from max.dtype import DType
from max.experimental.nn.norm import LayerNorm
from max.experimental.realization_context import (
    GraphRealizationContext,
    realization_context,
)
from max.experimental.tensor import Tensor
from max.graph import DeviceRef, Graph, TensorType

graph = Graph(
    "ln",
    input_types=[
        TensorType(DType.float32, ("batch", "seq", 2048), DeviceRef.GPU()),
    ],
)
ctx = GraphRealizationContext(graph)
with realization_context(ctx), ctx:
    x = Tensor.from_graph_value(graph.inputs[0])
    norm = LayerNorm(2048)
    y = norm(x)
    graph.output(y)
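
To illustrate why upcasting the reduction can help half-precision inputs, here is a small standard-library sketch that is independent of MAX. The helpers to_float16, mean_in_float16, and mean_with_upcast are hypothetical names. Python's 64-bit float stands in for float32, and the sequential loop is a deliberate simplification, not a description of how a real reduction kernel is scheduled.

```python
import struct


def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def mean_in_float16(values):
    """Mean with a naive sequential sum, rounded to float16 after every add."""
    total = 0.0
    for v in values:
        total = to_float16(total + to_float16(v))
    return to_float16(total / len(values))


def mean_with_upcast(values):
    """Mean accumulated in a wider type, cast back to float16 at the end."""
    total = 0.0
    for v in values:
        total += to_float16(v)
    return to_float16(total / len(values))


values = [1.0] * 4096
low = mean_in_float16(values)    # accumulator is limited to float16 precision
wide = mean_with_upcast(values)  # accumulator keeps full precision
print(low, wide)
```

With 4096 ones, the float16 accumulator stalls at 2048 because 2049 is not representable in half precision, so the low-precision mean comes out as 0.5 while the upcast mean is 1.0.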

Parameters:

  • dim (int) – The size of the last dimension of the input.
  • eps (float) – A small positive constant added to the variance for numerical stability. Defaults to 1e-5.
  • keep_dtype (bool) – Whether to run the reduction in the input dtype. Pass False to upcast to float32 for the reduction and cast back. Defaults to True.
  • elementwise_affine (bool) – Whether to learn a per-element scale (and optional bias). When False, no parameters are created and the normalized output is returned directly. Defaults to True.
  • use_bias (bool) – Whether to learn an additive bias. Only effective when elementwise_affine is True. Defaults to True.
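
The parameters above map onto the conventional layer normalization formula. The following pure-Python sketch assumes that convention (biased variance over the last dimension, eps added inside the square root). layer_norm_reference is a hypothetical helper, not part of MAX, and the MAX kernel may differ in detail. Passing weight=None and bias=None corresponds to elementwise_affine=False, and passing only bias=None corresponds to use_bias=False.

```python
import math


def layer_norm_reference(row, weight=None, bias=None, eps=1e-5):
    """Normalize one vector (the last dimension) to zero mean and unit variance."""
    n = len(row)
    mean = sum(row) / n
    # Biased (population) variance: divide by n, not n - 1.
    var = sum((v - mean) ** 2 for v in row) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    out = [(v - mean) * inv_std for v in row]
    if weight is not None:
        # Per-element scale of length n.
        out = [o * w for o, w in zip(out, weight)]
    if bias is not None:
        # Per-element additive bias of length n.
        out = [o + b for o, b in zip(out, bias)]
    return out


x = [1.0, 2.0, 3.0, 4.0]
plain = layer_norm_reference(x)
affine = layer_norm_reference(x, weight=[2.0] * 4, bias=[0.5] * 4)
print(plain)   # approximately [-1.342, -0.447, 0.447, 1.342]
print(affine)  # each entry is 2 * plain[i] + 0.5
```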

bias

bias: Tensor | None

The learned per-element bias of shape [dim], or None when elementwise_affine is False or use_bias is False.

forward()

forward(x)

Returns x normalized over its last dimension.

Parameters:

  • x (Tensor) – The input tensor, whose last dimension has size dim.

Return type:

Tensor

weight

weight: Tensor | None

The learned per-element scale of shape [dim], or None when elementwise_affine is False.
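
As a quick summary of how the two flags interact, the sketch below restates the documented rules for weight and bias as a plain function. created_parameters is a hypothetical helper written only for illustration. It returns shapes rather than tensors and does not call into MAX.

```python
def created_parameters(dim, elementwise_affine=True, use_bias=True):
    """Return the documented shape (or None) of each learned parameter."""
    weight_shape = [dim] if elementwise_affine else None
    # use_bias only takes effect when elementwise_affine is True.
    bias_shape = [dim] if (elementwise_affine and use_bias) else None
    return {"weight": weight_shape, "bias": bias_shape}


default = created_parameters(2048)
no_bias = created_parameters(2048, use_bias=False)
no_affine = created_parameters(2048, elementwise_affine=False)
print(default)    # {'weight': [2048], 'bias': [2048]}
print(no_bias)    # {'weight': [2048], 'bias': None}
print(no_affine)  # {'weight': None, 'bias': None}
```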