Basic operations
When you build a neural network model, you define the computations at each step: multiplying inputs by weights, applying activation functions, reducing across batch dimensions. Operations are the functions that perform these computations on tensors.
MAX provides three ways to call operations on tensors:
- Python operators: Use standard operators like `+`, `-`, `*`, `/`, `@`, and `**` for common arithmetic and linear algebra operations.
- Tensor methods: Call operations directly on `Tensor` objects, like `x.sum(axis=0)`, `x.reshape([2, 3])`, or `x.transpose(0, 1)`.
- Functional API: Call operations from `max.experimental.functional` that take tensors as input, such as `F.relu(x)` or `F.concat([a, b])`.
When to use the functional API
Tensor methods cover most arithmetic and shape operations. You'll need the functional API for activation functions, multi-tensor operations, and operations with no tensor method equivalent:
- Activation functions: `F.relu()`, `F.sigmoid()`, and `F.tanh()` don't have tensor method equivalents.
- Multi-tensor operations: `F.concat()` operates on multiple tensors at once.
- Selected reductions: `F.min()` is functional-only; `sum`, `mean`, and `max` have both tensor method and functional forms.
Perform arithmetic operations
The `+`, `-`, `*`, and `/` operators perform element-wise operations on tensors of matching shapes. The following example adds and multiplies two 1-D tensors:
```python
from max.experimental.tensor import Tensor

a = Tensor([1.0, 2.0, 3.0])
b = Tensor([4.0, 5.0, 6.0])

addition = a + b
subtraction = a - b
multiplication = a * b
division = a / b

print(addition)
print(multiplication)
```

The expected output is:

```
Tensor([5 7 9], dtype=DType.float32, device=Device(type=cpu,id=0))
Tensor([ 4 10 18], dtype=DType.float32, device=Device(type=cpu,id=0))
```

For operations without built-in Python equivalents, use the functional API.
`abs()` finds the absolute value and `**` performs exponentiation. `F.sqrt()` uses the functional API since there's no built-in function or tensor method for square root:
```python
import max.experimental.functional as F
from max.experimental.tensor import Tensor

x = Tensor([1.0, -4.0, 9.0, -16.0])

absolute = abs(x)
power = x ** 2
square_root = F.sqrt(abs(x))

print(f"Absolute value: {absolute}")
print(f"Power (x**2): {power}")
print(f"Square root: {square_root}")
```

The expected output is:

```
Absolute value: Tensor([ 1 4 9 16], dtype=DType.float32, device=Device(type=cpu,id=0))
Power (x**2): Tensor([ 1 16 81 256], dtype=DType.float32, device=Device(type=cpu,id=0))
Square root: Tensor([1 2 3 4], dtype=DType.float32, device=Device(type=cpu,id=0))
```

Manipulate tensor shapes
Shape operations reorganize tensor data without changing the underlying values. These operations are essential for preparing data for different layers in neural networks.
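To see why reorganizing a shape never changes the values, it helps to picture a tensor as a flat sequence that gets regrouped. The following plain-Python sketch (the `reshape_2d` helper is hypothetical, written only for this illustration; MAX's `reshape()` does this natively) regroups a 12-element list into rows:

```python
# Reshape regroups a flat sequence without changing element order.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

def reshape_2d(data, rows, cols):
    # Hypothetical helper for illustration only.
    assert rows * cols == len(data)
    return [data[r * cols:(r + 1) * cols] for r in range(rows)]

matrix = reshape_2d(flat, 3, 4)
print(matrix)  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
```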
Reshape tensors
`reshape()` changes the shape of a tensor while preserving the total number of elements. The following example transforms a 12-element vector into a 3×4 matrix and a 2×2×3 cube:
```python
from max.experimental.tensor import Tensor

x = Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
print(f"Original shape: {x.shape}")

matrix = x.reshape([3, 4])
print(f"Reshaped to 3x4: {matrix.shape}")
print(matrix)

cube = x.reshape([2, 2, 3])
print(f"Reshaped to 2x2x3: {cube.shape}")
```

The expected output is:

```
Original shape: [Dim(12)]
Reshaped to 3x4: [Dim(3), Dim(4)]
Tensor([ 1 2 3 4
5 6 7 8
9 10 11 12], dtype=DType.float32, device=Device(type=cpu,id=0))
Reshaped to 2x2x3: [Dim(2), Dim(2), Dim(3)]
```

Transpose tensors
`transpose()` swaps two dimensions of a tensor. The following example converts a 2×3 matrix to 3×2:
```python
from max.experimental.tensor import Tensor

x = Tensor([[1, 2, 3], [4, 5, 6]])
print(f"Original shape: {x.shape}")
print(x)

y = x.transpose(0, 1)
print(f"Transposed shape: {y.shape}")
print(y)
```

The expected output is:

```
Original shape: [Dim(2), Dim(3)]
Tensor([1 2 3
4 5 6], dtype=DType.float32, device=Device(type=cpu,id=0))
Transposed shape: [Dim(3), Dim(2)]
Tensor([1 4
2 5
3 6], dtype=DType.float32, device=Device(type=cpu,id=0))
```

The element at position `[i, j]` in the original tensor moves to position `[j, i]` in the transposed tensor.
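That index mapping can be demonstrated in plain Python, with nested lists standing in for tensors:

```python
x = [[1, 2, 3],
     [4, 5, 6]]

rows, cols = len(x), len(x[0])
# The transposed element y[j][i] is read from x[i][j].
y = [[x[i][j] for i in range(rows)] for j in range(cols)]

print(y)  # [[1, 4], [2, 5], [3, 6]]
```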
For the common case of transposing the last two dimensions, use the `.T` property:
```python
from max.experimental.tensor import Tensor

x = Tensor([[1, 2, 3], [4, 5, 6]])
y = x.T
print(f"Transposed shape: {y.shape}")
print(y)
```

The expected output is:

```
Transposed shape: [Dim(3), Dim(2)]
Tensor([1 4
2 5
3 6], dtype=DType.float32, device=Device(type=cpu,id=0))
```

`.T` is equivalent to `transpose(-1, -2)` and works on tensors of any rank.
When you need to rearrange more than two dimensions, `permute()` specifies a new order for all dimensions as a list of indices. The following example converts a `(batch, channels, length)` tensor to `(batch, length, channels)`:
```python
from max.experimental.tensor import Tensor

# (batch=2, channels=3, length=4)
x = Tensor([[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
            [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]])
print(f"Original shape: {x.shape}")

# Rearrange to (batch, length, channels)
y = x.permute([0, 2, 1])
print(f"Permuted shape: {y.shape}")
```

The expected output is:

```
Original shape: [Dim(2), Dim(3), Dim(4)]
Permuted shape: [Dim(2), Dim(4), Dim(3)]
```

Concatenate tensors
`F.concat()` joins multiple tensors along a specified dimension. The following example concatenates two 2×2 matrices along each axis:
```python
import max.experimental.functional as F
from max.experimental.tensor import Tensor

a = Tensor([[1, 2], [3, 4]])
b = Tensor([[5, 6], [7, 8]])

vertical = F.concat([a, b], axis=0)
horizontal = F.concat([a, b], axis=1)

print(f"Concatenated along axis 0: {vertical.shape}")
print(vertical)
print(f"Concatenated along axis 1: {horizontal.shape}")
print(horizontal)
```

The expected output is:

```
Concatenated along axis 0: [Dim(4), Dim(2)]
Tensor([1 2
3 4
5 6
7 8], dtype=DType.float32, device=Device(type=cpu,id=0))
Concatenated along axis 1: [Dim(2), Dim(4)]
Tensor([1 2 5 6
3 4 7 8], dtype=DType.float32, device=Device(type=cpu,id=0))
```

Axis 0 stacks tensors vertically (adding rows); axis 1 joins them horizontally (adding columns). `F.concat()` requires the functional API since it operates on multiple tensors.
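The two axes can be illustrated in plain Python, with nested lists standing in for tensors:

```python
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# Axis 0: stack vertically by appending b's rows after a's rows.
vertical = a + b

# Axis 1: join horizontally by extending each row of a with the matching row of b.
horizontal = [row_a + row_b for row_a, row_b in zip(a, b)]

print(vertical)    # [[1, 2], [3, 4], [5, 6], [7, 8]]
print(horizontal)  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```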
Apply reduction operations
Reduction operations aggregate values along a specified dimension, producing a smaller tensor. Each reduction accepts an `axis` argument; if you omit it, the reduction applies along the last axis. All reductions keep the reduced dimension in the output shape (unlike NumPy and PyTorch, which drop it by default).
The following example shows `sum`, `mean`, `max`, and `min` along axis 0, summing across rows to get per-column results:
```python
import max.experimental.functional as F
from max.experimental.tensor import Tensor

x = Tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

sum_axis0 = x.sum(axis=0)  # sum each column across rows
sum_axis1 = x.sum(axis=1)  # sum each row across columns
print(f"Sum along axis 0: {sum_axis0}")
print(f"Sum along axis 1: {sum_axis1}")

mean_val = x.mean(axis=0)
max_val = x.max(axis=0)
min_val = F.min(x, axis=0)
print(f"Mean along axis 0: {mean_val}")
print(f"Max along axis 0: {max_val}")
print(f"Min along axis 0: {min_val}")
```

The expected output is:

```
Sum along axis 0: Tensor([5 7 9], dtype=DType.float32, device=Device(type=cpu,id=0))
Sum along axis 1: Tensor([ 6
15], dtype=DType.float32, device=Device(type=cpu,id=0))
Mean along axis 0: Tensor([2.5 3.5 4.5], dtype=DType.float32, device=Device(type=cpu,id=0))
Max along axis 0: Tensor([4 5 6], dtype=DType.float32, device=Device(type=cpu,id=0))
Min along axis 0: Tensor([1 2 3], dtype=DType.float32, device=Device(type=cpu,id=0))
```

`sum(axis=0)` and `sum(axis=1)` return tensors with shape `[Dim(1), Dim(3)]` and `[Dim(2), Dim(1)]` respectively, with the reduced dimension kept at size 1. `F.min()` is functional-only; `sum`, `mean`, and `max` are also available as `F.sum()`, `F.mean()`, and `F.max()`.
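For comparison, the same reduction in NumPy drops the reduced axis unless you pass `keepdims=True`. A quick sketch:

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# By default, NumPy removes the reduced axis from the shape...
dropped = x.sum(axis=0)
print(dropped.shape)  # (3,)

# ...while keepdims=True retains it at size 1, matching MAX's default behavior.
kept = x.sum(axis=0, keepdims=True)
print(kept.shape)  # (1, 3)
```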
Perform matrix operations
The `@` operator multiplies two matrices. The following example multiplies a 2×2 input matrix by a 2×2 weight matrix:
```python
from max.experimental.tensor import Tensor

x = Tensor([[1.0, 2.0], [3.0, 4.0]])
w = Tensor([[5.0, 6.0], [7.0, 8.0]])

result = x @ w
print("Matrix multiplication result:")
print(result)
```

The expected output is:

```
Matrix multiplication result:
Tensor([19 22
43 50], dtype=DType.float32, device=Device(type=cpu,id=0))
```

Matrix multiplication is also available as `F.matmul()`.
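Each entry in the result is the dot product of a row of `x` with a column of `w`. The values above can be checked with a plain-Python sketch, using nested lists in place of tensors:

```python
x = [[1.0, 2.0], [3.0, 4.0]]
w = [[5.0, 6.0], [7.0, 8.0]]

# result[i][j] = sum over k of x[i][k] * w[k][j]
result = [[sum(x[i][k] * w[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]

print(result)  # [[19.0, 22.0], [43.0, 50.0]]
```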
Add activation functions
Activation functions are available through the functional API only. `F.relu()` sets negative values to zero, `F.sigmoid()` maps values to (0, 1), and `F.tanh()` maps values to (-1, 1):
```python
import max.experimental.functional as F
from max.experimental.tensor import Tensor

x = Tensor([[-2.0, -1.0, 0.0], [1.0, 2.0, 3.0]])

relu_output = F.relu(x)
sigmoid_output = F.sigmoid(x)
tanh_output = F.tanh(x)

print(f"ReLU: {relu_output}")
print(f"Sigmoid: {sigmoid_output}")
print(f"Tanh: {tanh_output}")
```

The expected output is:

```
ReLU: Tensor([0 0 0
1 2 3], dtype=DType.float32, device=Device(type=cpu,id=0))
Sigmoid: Tensor([0.1192 0.2689 0.5
0.7311 0.8808 0.9526], dtype=DType.float32, device=Device(type=cpu,id=0))
Tanh: Tensor([ -0.964 -0.7616 0
0.7616 0.964 0.9951], dtype=DType.float32, device=Device(type=cpu,id=0))
```

Generate random tensors
The `max.experimental.random` module provides functions for creating tensors with random values. `random.uniform()` generates values uniformly distributed over a range; `random.normal()` generates values from a Gaussian distribution with the specified mean and standard deviation. Values vary each run:
```python
from max.experimental import random

uniform_tensor = random.uniform([3, 3], range=(0.0, 1.0))
normal_tensor = random.normal([3, 3], mean=0.0, std=1.0)

print("Uniform distribution:")
print(uniform_tensor)
print("\nNormal distribution:")
print(normal_tensor)
```

Build layers
You can combine operations to implement neural network layers. The following example shows a linear layer using `@` for matrix multiplication, `+` for bias addition, and `F.relu()` for the activation step. Pre-built layers like `nn.Linear` work this way internally. Understanding these operations gives you the foundation for custom layers when standard ones don't fit:
```python
import max.experimental.functional as F
from max.experimental import random
from max.experimental.tensor import Tensor

def linear_layer(x: Tensor, weights: Tensor, bias: Tensor) -> Tensor:
    return F.relu(x @ weights + bias)

# input: (batch=2, features=4)
x = Tensor([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]])
weights = random.normal([4, 3])  # (features=4, output=3)
bias = Tensor.zeros([3])

output = linear_layer(x, weights, bias)
print(f"Output shape: {output.shape}")
```

The expected output is (weight values vary):

```
Output shape: [Dim(2), Dim(3)]
```

Next steps
Now that you understand tensor operations, continue learning with these topics:
- Build a model graph with Module: Use operations in your model's `__call__()` method to define computation.
- Neural network modules: Explore pre-built layers in `max.nn` such as `Linear`, `Conv2d`, and `ReLU`.
- Custom operations: Implement your own operations in Mojo when built-in operations don't meet your needs.