MAX changelog
This page describes all the changes in each version of the MAX platform.
See how to update MAX with `magic`.
v24.5 (2024-09-13)
🔥 Legendary
- Mojo and MAX are magical! We've created a new package and virtual environment manager, `magic`, for MAX and Mojo. Check it out!
- New Llama 3.1 pipeline built with the new MAX Graph Python API.
- We have not one, but two new Python APIs that we're introducing in this release: the MAX Graph Python API and the MAX Driver Python API, described below.
⭐️ New
- Added `repeat_interleave` graph op.
- Added caching for MAX graph models. This means that graph compilation is cached and the executable model is retrieved from the cache on the second and subsequent runs. Note that the model cache is architecture-specific and isn't portable across different targets.
- Support for Python 3.12.
MAX Graph Python API
This Python API will ultimately provide the same low-level programming interface for high-performance inference graphs as the Mojo API. As with the Mojo API, it's an API for graph-building only, and it does not implement support for training.
You can take a look at how the API works in the MAX Graph Python API reference.
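To give a feel for graph-building, here is a minimal sketch of staging and running a one-op graph. The module paths and the exact `execute()` calling convention shown here are assumptions based on the v24.5 examples; the MAX Graph Python API reference is the authoritative source.

```python
# A minimal sketch (not the authoritative example): build a graph that
# adds two float32 vectors, then compile and run it with MAX Engine.
import numpy as np

from max import engine
from max.dtype import DType
from max.graph import Graph, TensorType, ops

input_type = TensorType(dtype=DType.float32, shape=(1,))

# Graph-building only: stage an add op between the two graph inputs.
with Graph("simple_add", input_types=(input_type, input_type)) as graph:
    lhs, rhs = graph.inputs
    graph.output(ops.add(lhs, rhs))

# Graph compilation is cached (see the note above), so the second run
# of this script retrieves the executable model from the cache.
session = engine.InferenceSession()
model = session.load(graph)
result = model.execute(
    np.array([1.0], dtype=np.float32),
    np.array([2.0], dtype=np.float32),
)  # exact execute() signature and return format: see the API reference
```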
MAX Driver Python API
The MAX Driver API allows you to interact with devices (such as CPUs and GPUs) and allocate memory directly onto them. With this API, you interact with this memory as tensors.
Note that this API is still under development, with support for non-host devices, such as GPUs, planned for a future release.
To learn more, check out the MAX Driver Python API reference.
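As a rough illustration only: the snippet below assumes `max.driver` exposes a `CPU` device and a `Tensor` type with a NumPy bridge. That matches the description above, but the names and signatures are assumptions; verify them against the MAX Driver Python API reference.

```python
# An illustrative sketch under assumed names/signatures (CPU, Tensor,
# Tensor.from_numpy, Tensor.to) -- verify against the API reference.
import numpy as np

from max.driver import CPU, Tensor

device = CPU()  # host device; non-host devices such as GPUs are planned

# Wrap host memory as a driver tensor, then place it on the device.
data = np.arange(4, dtype=np.float32)
tensor = Tensor.from_numpy(data).to(device)

print(tensor.shape, tensor.dtype)
```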
MAX C API
New APIs for adding torch metadata libraries:
- `M_setTorchMetadataLibraryPath`
- `M_setTorchMetadataLibraryPtr`
🦋 Changed
MAX Engine performance
- Compared to v24.4, MAX Engine v24.5 generates tokens for Llama an average of 15%-48% faster.
MAX C API
Simplified the API for adding torch library paths, which now takes only one path per API call but can be called multiple times to add paths to the config:
- `M_setTorchLibraries` -> `M_setTorchLibraryPath`