
MAX provides a unified and extensible platform that includes everything you need to deploy low-latency, high-throughput AI inference pipelines into production.

What you can do with MAX

- One AI runtime for any model, from any ML framework, on any hardware
- Unparalleled performance for generative and traditional AI models
- Compatibility with the tools and technologies you already use in production
- Model extensibility through custom ops and kernels written in Mojo