MAX changelog
The MAX platform is a unified set of tools and libraries that unlock performance, programmability, and portability for your AI inference pipeline. It comprises several products, including MAX Engine, MAX Serving, and the Mojo programming language.
This page describes all the changes in each version of the MAX platform.
To learn more about the platform, read What is MAX.
If you already have MAX, see how to update. If you don't have it yet, see the get started guide.
v24.3 (2024-05-02)
🔥 Legendary
You can now write custom ops for your models with Mojo!
Learn more about MAX extensibility.
🦋 Changed
- Added support for named dynamic dimensions. This means you can specify when two or more dimensions in your model's input are dynamic but their sizes at run time must match each other. By specifying each of these dimension sizes with a name (instead of using None to indicate a dynamic size), the MAX Engine compiler can perform additional optimizations. See the notes below for the corresponding API changes that support named dimensions.
- Simplified all the APIs to load input specs for models, making them more consistent.
MAX Engine performance
- Compared to v24.2, MAX Engine v24.3 shows an average speedup of 10% on PyTorch models, and an average 20% speedup on dynamically quantized ONNX transformers.
MAX Graph API
The max.graph APIs are still changing rapidly, but starting to stabilize.
See the updated guide to build a graph with MAX Graph.
- AnyMoType renamed to Type, MOTensor renamed to TensorType, and MOList renamed to ListType.
- Removed ElementType in favor of using DType.
- Removed TypeTuple in favor of using List[Type].
- Removed the Module type so you can now start building a graph by directly instantiating a Graph.
- Some new ops in max.ops, including support for custom ops. See how to create a custom op in MAX Graph.
MAX Engine Python API
- Redesigned InferenceSession.load() to replace the confusing options argument with a custom_ops_path argument for use when loading a custom op, and an input_specs argument for use when loading TorchScript models. As a result, CommonLoadOptions, TorchLoadOptions, and TensorFlowLoadOptions have all been removed.
- TorchInputSpec now supports named dynamic dimensions (previously, dynamic dimension sizes could be specified only as None). This lets you tell MAX which dynamic dimensions are required to have the same size, which helps MAX better optimize your model.
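To make the idea of named dynamic dimensions concrete, here is a minimal, library-independent sketch in plain Python. It does not use the MAX Engine API: the `check_shapes` function and the `Dim` alias are hypothetical names invented for illustration, and the way specs are written here (tuples of int, str, or None) is only an assumption about how such a constraint could be modeled. The sketch shows the constraint a name expresses: every dimension that shares a name must have the same size at run time, whereas None leaves a dimension unconstrained.

```python
from typing import Dict, Sequence, Union

# Hypothetical model of a dimension spec (not the MAX API):
#   int  -> fixed size
#   str  -> named dynamic size (same name means same run-time size)
#   None -> unnamed dynamic size (no constraint)
Dim = Union[int, str, None]


def check_shapes(
    specs: Sequence[Sequence[Dim]],
    shapes: Sequence[Sequence[int]],
) -> Dict[str, int]:
    """Check run-time shapes against specs and return the size bound to each name."""
    if len(specs) != len(shapes):
        raise ValueError(f"expected {len(specs)} inputs, got {len(shapes)}")

    bound: Dict[str, int] = {}
    for i, (spec, shape) in enumerate(zip(specs, shapes)):
        if len(spec) != len(shape):
            raise ValueError(f"input {i}: rank mismatch")
        for dim, size in zip(spec, shape):
            if dim is None:
                continue  # unnamed dynamic dimension: any size is accepted
            if isinstance(dim, int):
                if dim != size:
                    raise ValueError(f"input {i}: expected size {dim}, got {size}")
            elif bound.setdefault(dim, size) != size:
                # The name was already bound to a different size by an earlier dimension.
                raise ValueError(
                    f"input {i}: dimension '{dim}' is {size}, "
                    f"but was {bound[dim]} elsewhere"
                )
    return bound
```

For example, with specs `[("batch", "seq"), ("batch", "seq")]`, the shapes `[(2, 7), (2, 7)]` pass, while `[(2, 7), (3, 7)]` are rejected because the two "batch" sizes differ. With None in place of the names, both would be accepted, which is the information a compiler cannot exploit.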
MAX Engine Mojo API
- InferenceSession.load_model() was renamed to load().
- Redesigned InferenceSession.load() to replace the confusing config argument with a custom_ops_path argument for use when loading a custom op, and an input_specs argument for use when loading TorchScript models. Doing so removed LoadOptions and introduced the new InputSpec type to define the input shape/type of a model (instead of LoadOptions).
- New ShapeElement type to allow for named dynamic dimensions (in InputSpec).
- max.engine.engine module was renamed to max.engine.info.
MAX Engine C API
- M_newTorchInputSpec() now supports named dynamic dimensions (via new dimNames argument).
❌ Removed
Removed TensorFlow support in the MAX SDK, so you can no longer load a TensorFlow SavedModel for inference. However, TensorFlow is still available for enterprise customers.
We removed TensorFlow because industry-wide TensorFlow usage has declined significantly, especially for the latest AI innovations. Removing TensorFlow also cuts our package size by over 50% and accelerates the development of other customer-requested features. If you have a production use-case for a TensorFlow model, please contact us.
- Removed the Python CommonLoadOptions, TorchLoadOptions, and TensorFlowLoadOptions classes. See note above about InferenceSession.load() changes.
- Removed the Mojo LoadOptions type. See the note above about InferenceSession.load() changes.
v24.2.1 (2024-04-11)
You can now import more MAX Graph functions from max.graph.ops instead of using max.graph.ops.elementwise. For example:

    from max.graph import ops
    var relu = ops.relu(matmul)
v24.2 (2024-03-28)
MAX Engine now supports TorchScript models with dynamic input shapes.
No matter what the input shapes are, you still need to specify the input specs for all TorchScript models.
The Mojo standard library is now open source!
Read more about it in this blog post.
And, of course, lots of Mojo updates, including implicit traits, support for keyword arguments in Python calls, a new List type (previously DynamicVector), some refactoring that might break your code, and much more.
For details, see the Mojo changelog.
v24.1.1 (2024-03-18)
This is a minor release that improves error reports.
v24.1 (2024-02-29)
The first release of the MAX platform is here! 🚀
This is a preview version of the MAX platform. That means it is not ready for production deployment and is designed only for local development and evaluation.
Because this is a preview, some API libraries are still in development and subject to change, and some features that we previously announced are not quite ready yet. But there is a lot that you can do in this release!
This release includes our flagship developer tools, currently for Linux only:
MAX Engine: Our state-of-the-art graph compiler and runtime library that executes models from PyTorch and ONNX, with incredible inference speed on a wide range of hardware.
- API libraries in Python, C, and Mojo to run inference with your existing models. See the API references.
- The max benchmark tool, which runs MLPerf benchmarks on any compatible model without writing any code.
- The max visualize tool, which allows you to visualize your model in Netron after partially lowering in MAX Engine.
- An early look at the MAX Graph API, our low-level library for building high-performance inference graphs in Mojo.
MAX Serving: A preview of our serving wrapper for MAX Engine that provides full interoperability with existing AI serving systems (such as Triton) and that seamlessly deploys within existing container infrastructure (such as Kubernetes).
- A Docker image that runs MAX Engine as a backend for NVIDIA Triton Inference Server. Try it now.
Mojo: The world's first programming language built from the ground-up for AI developers, with cutting-edge compiler technology that delivers unparalleled performance and programmability for any hardware.
- The latest version of Mojo, the standard library, and the mojo command line tool. These are always included in MAX, so you don't need to download any separate packages.
The Mojo changes in each release are often quite long, so we're going to continue sharing those in the existing Mojo changelog.
Additionally, we've started a new GitHub repo for MAX, where we currently share a bunch of code examples for our API libraries, including some large model pipelines such as Stable Diffusion in Mojo and Llama2 built with MAX Graph. You can also use this repo to report issues with MAX.
To get a peek at what's coming soon, and learn about some of the bugs we're working on right now, see the MAX roadmap & known issues.