v24.4 (2024-06-07)

🔥 Legendary

  • MAX is now available on macOS! Try it now.

  • New quantization APIs for MAX Graph. You can now build high-performance graphs in Mojo that use the latest quantization techniques, which gives large models faster performance and lets them run on a wider range of systems.

    Learn more in the guide to quantize your graph weights.

⭐️ New​

MAX Mojo APIs​

  • Added AI pipeline examples in the max repo, with Mojo implementations for common transformer layers, including quantization support.

    • New Llama3 pipeline built with MAX Graph.

    • New Replit Code pipeline built with MAX Graph.

    • New TinyStories pipeline (based on TinyLlama) that offers a simple demo of the MAX Graph quantization API.

  • Added max.graph.checkpoint package to save and load model weights.

    All weights are stored in a TensorDict. You can save a TensorDict to disk and load it back with the save() and load() functions.

  • Added MAX Graph quantization APIs:

    • Added quantization encodings BFloat16Encoding, Q4_0Encoding, Q4_KEncoding, and Q6_KEncoding.
    • Added the QuantizationEncoding trait so you can build custom quantization encodings.
    • Added Graph.quantize() to create a quantized tensor node.
    • Added qmatmul() to perform matrix multiplication between a float32 matrix and a quantized matrix.
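As a rough illustration of what a 4-bit block encoding and a quantized matmul do, here is a minimal Python sketch. It is not the MAX Graph API. It assumes Q4_0 follows the GGML-style scheme of the same name (blocks of 32 values, one scale per block, 4-bit codes offset by 8), and the names `quantize_q4_0`, `dequantize_q4_0`, and `qmatmul_sketch` are hypothetical. It also assumes the quantized operand is stored row by row and multiplied as a transposed right-hand side.

```python
BLOCK_SIZE = 32  # assumption: values per block, as in GGML-style Q4_0


def quantize_q4_0(values):
    """Quantize a flat list of floats into a list of (scale, codes) blocks."""
    if len(values) % BLOCK_SIZE != 0:
        raise ValueError("length must be a multiple of BLOCK_SIZE")
    blocks = []
    for start in range(0, len(values), BLOCK_SIZE):
        block = values[start:start + BLOCK_SIZE]
        # The largest-magnitude value sets the scale; dividing by -8
        # maps that value to code 0, so it is reproduced exactly.
        extreme = max(block, key=abs)
        scale = extreme / -8.0
        inv = 1.0 / scale if scale != 0.0 else 0.0
        # Round to the nearest code and clamp to the 4-bit range 0..15.
        codes = [min(15, max(0, int(v * inv + 8.5))) for v in block]
        blocks.append((scale, codes))
    return blocks


def dequantize_q4_0(blocks):
    """Recover approximate float values from (scale, codes) blocks."""
    out = []
    for scale, codes in blocks:
        out.extend((c - 8) * scale for c in codes)
    return out


def qmatmul_sketch(lhs, rhs_rows_q):
    """Multiply float rows by quantized rows: result[i][j] = lhs[i] . rhs[j].

    Assumption: each entry of rhs_rows_q is one quantized row, so the
    product is effectively lhs @ transpose(rhs).
    """
    rhs_rows = [dequantize_q4_0(row) for row in rhs_rows_q]
    return [[sum(a * b for a, b in zip(l, r)) for r in rhs_rows] for l in lhs]


if __name__ == "__main__":
    weights = [float(i - 16) for i in range(BLOCK_SIZE)]
    quantized = [quantize_q4_0(weights)]
    print(qmatmul_sketch([[1.0] * BLOCK_SIZE], quantized))
```

The sketch dequantizes the whole operand before multiplying, which is the simplest way to show the arithmetic. A production kernel would typically work on the packed blocks directly.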
  • Added new MAX Graph ops:

    • avg_pool()
    • max_pool()
    • conv2d()
    • conv3d()
    • layer_norm()
    • tile()
    • select()
  • Added a layer() context manager and current_layer() function to aid in debugging during graph construction. For example:

    with graph.layer("foo"):
        with graph.layer("bar"):
            print(graph.current_layer())  # prints "foo.bar"
            x = graph.constant[DType.int64](1)
            graph.output(x)

    This attaches the path foo.bar to the nodes added inside the block, and that path is reported in error messages.
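To make the path mechanism concrete, the following Python sketch models how nested layer scopes could produce a dotted path. It is a conceptual model only, not the MAX Graph implementation: the `GraphSketch` class and its `add_node()` method are hypothetical, and only the names `layer()` and `current_layer()` mirror the API described above.

```python
from contextlib import contextmanager


class GraphSketch:
    """Hypothetical stand-in for a graph builder; not the MAX Graph API."""

    def __init__(self):
        self._layers = []  # stack of active layer names
        self.nodes = []    # (layer_path, node_name) for every node added

    @contextmanager
    def layer(self, name):
        # Push the layer name on entry and pop it on exit,
        # even if the body raises.
        self._layers.append(name)
        try:
            yield
        finally:
            self._layers.pop()

    def current_layer(self):
        # Join the active layer names into a dotted path.
        return ".".join(self._layers)

    def add_node(self, node_name):
        # Record the path that was active when the node was created.
        self.nodes.append((self.current_layer(), node_name))


graph = GraphSketch()
with graph.layer("foo"):
    with graph.layer("bar"):
        graph.add_node("constant")
    graph.add_node("output")
```

In this model each node remembers the path that was active when it was created, so an error raised for a node can name the layer it came from.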

  • Added format_system_stack() function to format the stack trace, which we use to print better error messages from error().

  • Added TensorMap.keys() to get all the tensor key names.

MAX C API​

Miscellaneous new APIs:

  • M_cloneCompileConfig()
  • M_copyAsyncTensorMap()
  • M_tensorMapKeys() and M_deleteTensorMapKeys()
  • M_setTorchLibraries()

🦋 Changed

MAX Mojo API​

  • The EngineNumpyView.data() and EngineTensorView.data() functions, which return a type-erased pointer, were renamed to unsafe_ptr().

  • TensorMap now conforms to the CollectionElement trait, which makes it copyable and movable.

  • custom_nv() was removed, and its functionality was moved into custom() as a function overload, so custom() can now output a list of tensor symbols.