Build a graph with MAX Graph
Welcome to the quickstart tutorial for the MAX Graph API.
In this brief tutorial, you'll learn the basics about how to build a neural network graph using the MAX Graph API, and then execute it with the MAX Engine API, all in Mojo. To learn more about what MAX Graph is, see the MAX Graph intro.
Set up the project environment
After you install Magic, create a new Mojo project:
magic init graph-project --format mojoproject
Then activate a shell into the virtual environment:
cd graph-project && magic shell
This installs everything you need to get started with MAX and the Mojo APIs. You can verify your Mojo version like this:
mojo --version
Import Mojo packages
We need to import max.graph to build the graph and max.engine to execute it:
from max.engine import InferenceSession
from max.graph import Graph, TensorType, ops
from max.tensor import Tensor, TensorShape
Create the graph
The Graph object is your starting point. It's a lot like a function: it has a name, it takes arguments (inputs), it performs calculations (feeds data through the graph), and it returns values (outputs). We can instantiate a Graph that takes one input like this:
def main():
    graph = Graph(TensorType(DType.float32, 2, 6))
    print(graph)
We're printing the graph just to satisfy curiosity, but what we get isn't useful because there's nothing connecting the input and output types, so we don't know the graph shape yet. This is basically an intermediate debug format for now.
When you initialize a Graph, you need to specify the data type and shape for the input using TensorType. In this case, the input is a 2x6 tensor of float32 values.
If your model takes multiple inputs or returns multiple outputs, you can pass a list of TensorType values like this (although we're still using just one item in each list to match the model we're building):
def main():
    graph = Graph(
        in_types=List[Type](TensorType(DType.float32, 2, 6)),
        out_types=List[Type](TensorType(DType.float32, 2, 1)),
    )
Add some ops
All ops receive inputs from either graph inputs, constants, or other op outputs. To build a sequence of ops, call each op function and pass it the appropriate inputs, which usually includes the output from a previous op.
For example, we'll now add three simple ops to our graph:
- A matrix-multiplication op, which takes the graph input and a constant.
- A ReLU activation function, which takes the matmul output.
- A softmax activation function, which takes the ReLU output.
To close the graph, we then pass the final softmax op as the output:
# Create a constant for usage in the matmul op below:
matmul_constant_value = Tensor[DType.float32](TensorShape(6, 1), 0.15)
matmul_constant = graph.constant(matmul_constant_value)
# Start adding a sequence of operator calls to build the graph.
# We use the index accessor to get the graph's first input tensor:
matmul = graph[0] @ matmul_constant
relu = ops.relu(matmul)
softmax = ops.softmax(relu)
# Add the sequence of ops as the graph output:
graph.output(softmax)
Notice that we get the graph input using graph[0], which denotes the graph's first input (it's the first, and in this case the only, TensorType passed to the Graph constructor's in_types). Then we perform a matrix-multiply with the constant, using the @ matrix-multiply operator, which is equivalent to calling ops.matmul().
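Because the @ operator is equivalent to ops.matmul(), the matmul line above could also be written explicitly (a sketch; both forms add the same op to the graph):

```mojo
# Equivalent to `graph[0] @ matmul_constant` above:
matmul = ops.matmul(graph[0], matmul_constant)
```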
The value returned into each variable (matmul, relu, and softmax) is a Symbol value. Each one is a symbolic handle for the output of an op, and not a real value/tensor. Because we're building a static graph, real values won't exist until execution time, and we can't execute the graph until we compile it with MAX Engine. The only concrete value in the above code is matmul_constant_value, which holds static weights that we then convert into a Symbol with Graph.constant().
To finish the graph, we pass the entire sequence of ops to Graph.output().
We can print the graph again, but it's still not pretty. At this point, the graph has an output, but it's shown in an intermediate representation that uses a lot of ops, because no optimization passes have run yet. We'll improve this output soon to make it more useful to you.
print(graph)
Execute the model
Now we can load the graph into a MAX Engine InferenceSession. Before we feed the model with inputs, we need to know the names of the input and output tensors, so let's print those now:
session = InferenceSession()
model = session.load(graph)

in_names = model.get_model_input_names()
for name in in_names:
    print("Input:", name[])

out_names = model.get_model_output_names()
for name in out_names:
    print("Output:", name[])
Now that we know the tensor names, we can create our input as a Tensor, pass it into the graph, execute it, and get the output:
input = Tensor[DType.float32](TensorShape(2, 6), 0.5)
results = model.execute("input0", input^)
output = results.get[DType.float32]("output0")
print(output)
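As a sanity check on that output, this graph's math is simple enough to reproduce outside MAX. Here's a plain NumPy sketch (not MAX code), assuming ops.softmax normalizes over the last dimension:

```python
import numpy as np

# Mirror the graph: a (2, 6) input of 0.5s and a (6, 1) constant of 0.15s.
x = np.full((2, 6), 0.5, dtype=np.float32)
w = np.full((6, 1), 0.15, dtype=np.float32)

matmul = x @ w                  # shape (2, 1); each element is 6 * (0.5 * 0.15) = 0.45
relu = np.maximum(matmul, 0.0)  # 0.45 is already positive, so unchanged
softmax = np.exp(relu) / np.exp(relu).sum(axis=-1, keepdims=True)

print(softmax)  # each row has a single element, so softmax yields 1.0 per row
```

Because the output shape is 2x1, softmax over each one-element row produces 1.0, which is a handy way to confirm the shapes are wired the way you expect.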
That's it! You just built a model with the MAX Graph API and ran it with MAX Engine. But this was just a brief introduction, using only ops that are built into the MAX library.
And this is just the beginning of the Graph API. To stay up to date on what's coming, sign up for our newsletter.
For a larger code example, check out our MAX Graph implementation of Llama2.
Full code example
Here's all the code from above (also available on GitHub):
from max.engine import InferenceSession
from max.graph import Graph, TensorType, ops
from max.tensor import Tensor, TensorShape

def main():
    graph = Graph(TensorType(DType.float32, 2, 6))

    # Create a constant for usage in the matmul op below:
    matmul_constant_value = Tensor[DType.float32](TensorShape(6, 1), 0.15)
    matmul_constant = graph.constant(matmul_constant_value)

    # Start adding a sequence of operator calls to build the graph.
    # We can use the subscript notation to get the graph's first input tensor:
    matmul = graph[0] @ matmul_constant
    relu = ops.relu(matmul)
    softmax = ops.softmax(relu)
    graph.output(softmax)

    # Load the graph:
    session = InferenceSession()
    model = session.load(graph)

    # Print the input/output names:
    in_names = model.get_model_input_names()
    for name in in_names:
        print("Input:", name[])
    out_names = model.get_model_output_names()
    for name in out_names:
        print("Output:", name[])

    # Execute the model:
    input = Tensor[DType.float32](TensorShape(2, 6), 0.5)
    results = model.execute("input0", input^)
    output = results.get[DType.float32]("output0")
    print(output)