APIs to build inference graphs for MAX Engine.

The MAX Graph API provides a low-level Mojo programming interface for building high-performance inference graphs. It is an API for graph building only and does not support training.

To get started, instantiate a Graph and specify its input and output types (each a TensorType that gives the element type and shape). Then build a sequence of ops, using ops provided in the graph.ops package or your own custom ops, and add them to the graph by setting the output op(s) with Graph.output().

For example:

    from max.graph import Graph, TensorType, ops
    from tensor import Tensor, TensorShape

    def build_model() -> Graph:
        # One float32 input of shape 2x6 and one float32 output of shape 2x1.
        var graph = Graph(
            in_types=TensorType(DType.float32, 2, 6),
            out_types=TensorType(DType.float32, 2, 1),
        )

        # A 6x1 constant tensor to multiply the input by.
        var matmul_constant_value = Tensor[DType.float32](TensorShape(6, 1), 0.15)
        var matmul_constant = graph.constant(matmul_constant_value)

        # graph[0] is the graph's first input.
        var matmul = graph[0] @ matmul_constant
        var relu = ops.elementwise.relu(matmul)
        var softmax = ops.softmax(relu)

        # Set the final op as the graph output.
        graph.output(softmax)

        return graph

You can then load the Graph into MAX Engine with InferenceSession.load().
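
As a minimal sketch of that loading step, the snippet below passes the graph to InferenceSession.load(). It assumes that InferenceSession is importable from max.engine and that build_model() from the example above is defined in the same file. The variable names are illustrative, and the exact import path may differ between MAX versions.

```mojo
# Assumption: InferenceSession lives in the max.engine package.
from max.engine import InferenceSession


def main():
    # build_model() is the function from the example above,
    # assumed to be defined in this file.
    var graph = build_model()

    # Create a session and load (compile) the graph into MAX Engine.
    var session = InferenceSession()
    var model = session.load(graph)
```

The returned model object is what you would then run inference with. The execution call is left out here because its signature is not described in this section.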

For more detail, see the tutorial about how to build a graph with MAX Graph.


  • checkpoint: APIs to save and load checkpoints for MAX graphs.
  • ops: Implements various ops used when building a graph.
  • quantization: APIs to quantize graph tensors.