MAX changelog

The MAX platform is a unified set of tools and libraries that unlock performance, programmability, and portability for your AI inference pipeline. It comprises several products, including MAX Engine, MAX Serving, and the Mojo programming language.

This page describes all the changes in each version of the MAX platform.

To learn more about the platform, read What is MAX.


If you already have MAX, see how to update. If you don't have it yet, see the get started guide.

v24.2.1 (2024-04-11)

  • You can now import more MAX Graph functions from max.graph.ops instead of using max.graph.ops.elementwise. For example:

    from max.graph import ops
    var relu = ops.relu(matmul)

v24.2 (2024-03-28)

  • MAX Engine now supports TorchScript models with dynamic input shapes.

    Whether the input shapes are static or dynamic, you still need to specify the input specs for all TorchScript models.

  • The Mojo standard library is now open source!

    Read more about it in this blog post.

  • And, of course, lots of Mojo updates, including implicit traits, support for keyword arguments in Python calls, a new List type (previously DynamicVector), some refactoring that might break your code, and much more.

    For details, see the Mojo changelog.

v24.1.1 (2024-03-18)

This is a minor release that improves error reports.

v24.1 (2024-02-29)

The first release of the MAX platform is here! 🚀

This is a preview version of the MAX SDK Developer Edition. That means it is not ready for production deployment and is designed only for local development and evaluation.

Because this is a preview, some API libraries are still in development and subject to change, and some features that we previously announced are not quite ready yet. But there is a lot that you can do in this release!

This release includes our flagship developer tools, currently for Linux only:

  • MAX Engine: Our state-of-the-art graph compiler and runtime library that executes models from TensorFlow, PyTorch, and ONNX, with incredible inference speed on a wide range of hardware.

    • API libraries in Python, C, and Mojo to run inference with your existing models. See the API references.

    • The max benchmark tool, which runs MLPerf benchmarks on any compatible model without writing any code.

    • The max visualize tool, which allows you to visualize your model in Netron after partially lowering it in MAX Engine.

    • An early look at the MAX Graph API, our low-level library for building high-performance inference graphs in Mojo.

  • MAX Serving: A preview of our serving wrapper for MAX Engine that provides full interoperability with existing AI serving systems (such as Triton) and that seamlessly deploys within existing container infrastructure (such as Kubernetes).

    • A Docker image that runs MAX Engine as a backend for NVIDIA Triton Inference Server. Try it now.

  • Mojo: The world's first programming language built from the ground up for AI developers, with cutting-edge compiler technology that delivers unparalleled performance and programmability for any hardware.

    • The latest version of Mojo, the standard library, and the mojo command line tool. These are always included in MAX, so you don't need to download any separate packages.

    • The Mojo changes in each release are often quite long, so we're going to continue sharing those in the existing Mojo changelog.

Additionally, we've started a new GitHub repo for MAX, where we currently share a bunch of code examples for our API libraries, including some large model pipelines such as Stable Diffusion in Mojo and Llama2 built with MAX Graph. You can also use this repo to report issues with MAX.

To get a peek at what's coming soon, and learn about some of the bugs we're working on right now, see the MAX roadmap & known issues.