Welcome to

MAX simplifies the process to deploy your own AI endpoint. Try it now:

Run MAX now

Deploy an LLM on a GPU

Explore

Tutorials

Step-by-step programming guides using MAX APIs.

Models

Ready-to-deploy GenAI models accelerated with MAX.

APIs

Python, C, and Mojo API libraries for MAX.

Concepts

MAX Engine

An introduction to the features and technology in MAX Engine.

MAX Serve

An introduction to our model serving library called MAX Serve.

MAX container

A guide to our official Docker container for MAX deployments.

MAX changelog

A summary of changes in each MAX release.

Tutorials

Model Repository

We’re on a mission to make open source AI models as fast and easy to use as they can be. Check out the 400+ AI models that run on MAX - each with step-by-step install instructions for CPU, GPU, and Cloud.

Go to site

Latest blog posts

Go to blog

Modverse #46: MAX 25.1, MAX Builds, and Democratizing AI Compute

Democratizing AI Compute, Part 4: CUDA is the incumbent, but is it any good?

MAX 25.1 - Introducing MAX Builds

Democratizing AI Compute, Part 3: How did CUDA succeed?

Paged Attention & Prefix Caching Now Available in MAX Serve

Democratizing AI Compute, Part 2: What exactly is “CUDA”?