Basic operations
When you build a neural network model, you define the computations at each step: multiplying inputs by weights, applying activation functions, reducing across batch dimensions. Operations are the functions that perform these computations on tensors.
MAX provides three ways to call operations on tensors:
- Python operators: Use standard operators like `+`, `-`, `*`, `/`, `@`, and `**` for common arithmetic and linear algebra operations.
- Tensor methods: Call operations directly on `Tensor` objects, like `x.sum(axis=0)`, `x.reshape([2, 3])`, or `x.transpose(0, 1)`.
- Functional API: Call operations from `max.experimental.functional` that take tensors as input, such as `F.relu(x)` or `F.concat([a, b])`.
When to use the functional API
Tensor methods cover most arithmetic and shape operations. You'll need the functional API for activation functions, multi-tensor operations, and operations with no tensor method equivalent:
- Activation functions: `F.relu()`, `F.sigmoid()`, and `F.tanh()` don't have tensor method equivalents.
- Multi-tensor operations: `F.concat()` operates on multiple tensors at once.
- Selected reductions: `F.min()` is functional-only; `sum`, `mean`, and `max` have both tensor method and functional forms.
Perform arithmetic operations
The `+`, `-`, `*`, and `/` operators perform element-wise operations on tensors of matching shapes. The following example adds and multiplies two 1-D tensors:
```python
from max.experimental.tensor import Tensor

a = Tensor([1.0, 2.0, 3.0])
b = Tensor([4.0, 5.0, 6.0])

addition = a + b
subtraction = a - b
multiplication = a * b
division = a / b

print(addition)
print(multiplication)
```

The expected output is:

```
Tensor([5 7 9], dtype=DType.float32, device=Device(type=cpu,id=0))
Tensor([ 4 10 18], dtype=DType.float32, device=Device(type=cpu,id=0))
```

For operations without built-in Python equivalents, use the functional API.
`abs()` finds the absolute value and `**` performs exponentiation. `F.sqrt()` uses the functional API since there's no built-in function or tensor method for square root:
```python
import max.experimental.functional as F
from max.experimental.tensor import Tensor

x = Tensor([1.0, -4.0, 9.0, -16.0])

absolute = abs(x)
power = x ** 2
square_root = F.sqrt(abs(x))

print(f"Absolute value: {absolute}")
print(f"Power (x**2): {power}")
print(f"Square root: {square_root}")
```

The expected output is:

```
Absolute value: Tensor([ 1 4 9 16], dtype=DType.float32, device=Device(type=cpu,id=0))
Power (x**2): Tensor([ 1 16 81 256], dtype=DType.float32, device=Device(type=cpu,id=0))
Square root: Tensor([1 2 3 4], dtype=DType.float32, device=Device(type=cpu,id=0))
```

Manipulate tensor shapes
Shape operations reorganize tensor data without changing the underlying values. These operations are essential for preparing data for different layers in neural networks.
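To see why reorganizing a shape never changes the values, it helps to picture a tensor as a flat sequence that gets regrouped. The following plain-Python sketch (the `reshape_2d` helper is hypothetical, written only for this illustration; MAX's `reshape()` does this natively) regroups a 12-element list into rows:

```python
# Reshape regroups a flat sequence without changing element order.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

def reshape_2d(data, rows, cols):
    # Hypothetical helper for illustration only.
    assert rows * cols == len(data)
    return [data[r * cols:(r + 1) * cols] for r in range(rows)]

matrix = reshape_2d(flat, 3, 4)
print(matrix)  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
```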
Reshape tensors
`reshape()` changes the shape of a tensor while preserving the total number of elements. The following example transforms a 12-element vector into a 3×4 matrix and a 2×2×3 cube:
```python
from max.experimental.tensor import Tensor

x = Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
print(f"Original shape: {x.shape}")

matrix = x.reshape([3, 4])
print(f"Reshaped to 3x4: {matrix.shape}")
print(matrix)

cube = x.reshape([2, 2, 3])
print(f"Reshaped to 2x2x3: {cube.shape}")
```

The expected output is:

```
Original shape: [Dim(12)]
Reshaped to 3x4: [Dim(3), Dim(4)]
Tensor([ 1 2 3 4
5 6 7 8
9 10 11 12], dtype=DType.float32, device=Device(type=cpu,id=0))
Reshaped to 2x2x3: [Dim(2), Dim(2), Dim(3)]
```

Transpose tensors
`transpose()` swaps two dimensions of a tensor. The following example converts a 2×3 matrix to 3×2:
```python
from max.experimental.tensor import Tensor

x = Tensor([[1, 2, 3], [4, 5, 6]])
print(f"Original shape: {x.shape}")
print(x)

y = x.transpose(0, 1)
print(f"Transposed shape: {y.shape}")
print(y)
```

The expected output is:

```
Original shape: [Dim(2), Dim(3)]
Tensor([1 2 3
4 5 6], dtype=DType.float32, device=Device(type=cpu,id=0))
Transposed shape: [Dim(3), Dim(2)]
Tensor([1 4
2 5
3 6], dtype=DType.float32, device=Device(type=cpu,id=0))
```

The element at position `[i, j]` in the original tensor moves to position `[j, i]` in the transposed tensor.
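That index mapping can be demonstrated in plain Python, with nested lists standing in for tensors:

```python
x = [[1, 2, 3],
     [4, 5, 6]]

rows, cols = len(x), len(x[0])
# The transposed element y[j][i] is read from x[i][j].
y = [[x[i][j] for i in range(rows)] for j in range(cols)]

print(y)  # [[1, 4], [2, 5], [3, 6]]
```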
For the common case of transposing the last two dimensions, use the `.T` property:
```python
from max.experimental.tensor import Tensor

x = Tensor([[1, 2, 3], [4, 5, 6]])
y = x.T
print(f"Transposed shape: {y.shape}")
print(y)
```

The expected output is:

```
Transposed shape: [Dim(3), Dim(2)]
Tensor([1 4
2 5
3 6], dtype=DType.float32, device=Device(type=cpu,id=0))
```

`.T` is equivalent to `transpose(-1, -2)` and works on tensors of any rank.
When you need to rearrange more than two dimensions, `permute()` specifies a new order for all dimensions as a list of indices. The following example converts a `(batch, channels, length)` tensor to `(batch, length, channels)`:
```python
from max.experimental.tensor import Tensor

# (batch=2, channels=3, length=4)
x = Tensor([[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
            [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]])
print(f"Original shape: {x.shape}")

# Rearrange to (batch, length, channels)
y = x.permute([0, 2, 1])
print(f"Permuted shape: {y.shape}")
```

The expected output is:

```
Original shape: [Dim(2), Dim(3), Dim(4)]
Permuted shape: [Dim(2), Dim(4), Dim(3)]
```

Concatenate tensors
`F.concat()` joins multiple tensors along a specified dimension. The following example concatenates two 2×2 matrices along each axis:
```python
import max.experimental.functional as F
from max.experimental.tensor import Tensor

a = Tensor([[1, 2], [3, 4]])
b = Tensor([[5, 6], [7, 8]])

vertical = F.concat([a, b], axis=0)
horizontal = F.concat([a, b], axis=1)

print(f"Concatenated along axis 0: {vertical.shape}")
print(vertical)
print(f"Concatenated along axis 1: {horizontal.shape}")
print(horizontal)
```

The expected output is:

```
Concatenated along axis 0: [Dim(4), Dim(2)]
Tensor([1 2
3 4
5 6
7 8], dtype=DType.float32, device=Device(type=cpu,id=0))
Concatenated along axis 1: [Dim(2), Dim(4)]
Tensor([1 2 5 6
3 4 7 8], dtype=DType.float32, device=Device(type=cpu,id=0))
```

Axis 0 stacks tensors vertically (adding rows); axis 1 joins them horizontally (adding columns). `F.concat()` requires the functional API since it operates on multiple tensors.
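The two axes can be illustrated in plain Python, with nested lists standing in for tensors:

```python
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# Axis 0: stack vertically by appending b's rows after a's rows.
vertical = a + b

# Axis 1: join horizontally by extending each row of a with the matching row of b.
horizontal = [row_a + row_b for row_a, row_b in zip(a, b)]

print(vertical)    # [[1, 2], [3, 4], [5, 6], [7, 8]]
print(horizontal)  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```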
Apply reduction operations
Reduction operations aggregate values along a specified dimension, producing a smaller tensor. Each reduction accepts an `axis` argument; if you omit it, the reduction applies along the last axis. All reductions keep the reduced dimension in the output shape (unlike NumPy and PyTorch, which drop it by default).
The following example shows `sum`, `mean`, `max`, and `min` along axis 0, summing across rows to get per-column results:
```python
import max.experimental.functional as F
from max.experimental.tensor import Tensor

x = Tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

sum_axis0 = x.sum(axis=0)  # sum each column across rows
sum_axis1 = x.sum(axis=1)  # sum each row across columns
print(f"Sum along axis 0: {sum_axis0}")
print(f"Sum along axis 1: {sum_axis1}")

mean_val = x.mean(axis=0)
max_val = x.max(axis=0)
min_val = F.min(x, axis=0)
print(f"Mean along axis 0: {mean_val}")
print(f"Max along axis 0: {max_val}")
print(f"Min along axis 0: {min_val}")
```

The expected output is:

```
Sum along axis 0: Tensor([5 7 9], dtype=DType.float32, device=Device(type=cpu,id=0))
Sum along axis 1: Tensor([ 6
15], dtype=DType.float32, device=Device(type=cpu,id=0))
Mean along axis 0: Tensor([2.5 3.5 4.5], dtype=DType.float32, device=Device(type=cpu,id=0))
Max along axis 0: Tensor([4 5 6], dtype=DType.float32, device=Device(type=cpu,id=0))
Min along axis 0: Tensor([1 2 3], dtype=DType.float32, device=Device(type=cpu,id=0))
```

`sum(axis=0)` and `sum(axis=1)` return tensors with shape `[Dim(1), Dim(3)]` and `[Dim(2), Dim(1)]` respectively, with the reduced dimension kept at size 1. `F.min()` is functional-only; `sum`, `mean`, and `max` are also available as `F.sum()`, `F.mean()`, and `F.max()`.
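For comparison, the same reduction in NumPy drops the reduced axis unless you pass `keepdims=True`. A quick sketch:

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# By default, NumPy removes the reduced axis from the shape...
dropped = x.sum(axis=0)
print(dropped.shape)  # (3,)

# ...while keepdims=True retains it at size 1, matching MAX's default behavior.
kept = x.sum(axis=0, keepdims=True)
print(kept.shape)  # (1, 3)
```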
Perform matrix operations
The `@` operator multiplies two matrices. The following example multiplies a 2×2 input matrix by a 2×2 weight matrix:
```python
from max.experimental.tensor import Tensor

x = Tensor([[1.0, 2.0], [3.0, 4.0]])
w = Tensor([[5.0, 6.0], [7.0, 8.0]])

result = x @ w
print("Matrix multiplication result:")
print(result)
```

The expected output is:

```
Matrix multiplication result:
Tensor([19 22
43 50], dtype=DType.float32, device=Device(type=cpu,id=0))
```

Matrix multiplication is also available as `F.matmul()`.
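Each entry in the result is the dot product of a row of `x` with a column of `w`. The values above can be checked with a plain-Python sketch, using nested lists in place of tensors:

```python
x = [[1.0, 2.0], [3.0, 4.0]]
w = [[5.0, 6.0], [7.0, 8.0]]

# result[i][j] = sum over k of x[i][k] * w[k][j]
result = [[sum(x[i][k] * w[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]

print(result)  # [[19.0, 22.0], [43.0, 50.0]]
```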
Add activation functions
Activation functions are available through the functional API only. `F.relu()` sets negative values to zero, `F.sigmoid()` maps values to (0, 1), and `F.tanh()` maps values to (-1, 1):
```python
import max.experimental.functional as F
from max.experimental.tensor import Tensor

x = Tensor([[-2.0, -1.0, 0.0], [1.0, 2.0, 3.0]])

relu_output = F.relu(x)
sigmoid_output = F.sigmoid(x)
tanh_output = F.tanh(x)

print(f"ReLU: {relu_output}")
print(f"Sigmoid: {sigmoid_output}")
print(f"Tanh: {tanh_output}")
```

The expected output is:

```
ReLU: Tensor([0 0 0
1 2 3], dtype=DType.float32, device=Device(type=cpu,id=0))
Sigmoid: Tensor([0.1192 0.2689 0.5
0.7311 0.8808 0.9526], dtype=DType.float32, device=Device(type=cpu,id=0))
Tanh: Tensor([ -0.964 -0.7616 0
0.7616 0.964 0.9951], dtype=DType.float32, device=Device(type=cpu,id=0))
```

Generate random tensors
The `max.experimental.random` module provides functions for creating tensors with random values. `random.uniform()` generates values uniformly distributed over a range; `random.normal()` generates values from a Gaussian distribution with the specified mean and standard deviation. Values vary each run:
```python
from max.experimental import random

uniform_tensor = random.uniform([3, 3], range=(0.0, 1.0))
normal_tensor = random.normal([3, 3], mean=0.0, std=1.0)

print("Uniform distribution:")
print(uniform_tensor)
print("\nNormal distribution:")
print(normal_tensor)
```

Build layers
You can combine operations to implement neural network layers. The following example shows a linear layer using `@` for matrix multiplication, `+` for bias addition, and `F.relu()` for the activation step. Pre-built layers like `nn.Linear` work this way internally. Understanding these operations gives you the foundation for custom layers when standard ones don't fit:
```python
import max.experimental.functional as F
from max.experimental import random
from max.experimental.tensor import Tensor

def linear_layer(x: Tensor, weights: Tensor, bias: Tensor) -> Tensor:
    return F.relu(x @ weights + bias)

# input: (batch=2, features=4)
x = Tensor([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]])
weights = random.normal([4, 3])  # (features=4, output=3)
bias = Tensor.zeros([3])

output = linear_layer(x, weights, bias)
print(f"Output shape: {output.shape}")
```

The expected output is (weight values vary):

```
Output shape: [Dim(2), Dim(3)]
```

Next steps
Now that you understand tensor operations, continue learning with these topics:
- Build a model graph with Module: Use operations in your model's `__call__()` method to define computation.
- Neural network modules: Explore pre-built layers in `max.nn` such as `Linear`, `Conv2d`, and `ReLU`.
- Custom operations: Implement your own operations in Mojo when built-in operations don't meet your needs.