Install MAX with pip
You can install everything you need to build and deploy MAX models using pip. However, if you want to develop with Mojo, we recommend using Magic or conda.
Get started using pip
Here's how to install the Modular platform APIs and tools with pip, and then deploy a GenAI model on a local endpoint:
1. Start a Python virtual environment and install MAX, using either pip or uv.

   Using pip:

   1. Create a project folder:

      mkdir modular && cd modular

   2. Create and activate a virtual environment:

      python3 -m venv .venv/modular \
        && source .venv/modular/bin/activate

   3. Install the modular Python package, from either the nightly or stable channel:

      Nightly:

      pip install modular \
        --index-url https://download.pytorch.org/whl/cpu \
        --extra-index-url https://dl.modular.com/public/nightly/python/simple/

      Stable:

      pip install modular \
        --index-url https://download.pytorch.org/whl/cpu
   Using uv:

   1. Install uv:

      curl -LsSf https://astral.sh/uv/install.sh | sh

      Then restart your terminal to make uv accessible.

   2. Create a project:

      uv init modular && cd modular

   3. Create and start a virtual environment:

      uv venv && source .venv/bin/activate

   4. Install the modular Python package, from either the nightly or stable channel:

      Nightly:

      uv pip install modular \
        --index-url https://download.pytorch.org/whl/cpu \
        --extra-index-url https://dl.modular.com/public/nightly/python/simple/

      Stable:

      uv pip install modular \
        --index-url https://download.pytorch.org/whl/cpu
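   Whichever tool you used, you can quickly confirm the install from Python before moving on. This is a minimal sanity check, assuming the MAX Python API is importable as max.engine, as it is in recent releases:

      # sanity_check.py -- confirm that the modular package installed correctly.
      # Assumes the MAX Python API exposes the max.engine module.
      from max import engine

      print("MAX Python API imported successfully:", engine.__name__)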
2. Start a local endpoint for Llama 3:

   max serve --model-path=modularai/Llama-3.1-8B-Instruct-GGUF

   In addition to starting a local server, this downloads the model weights and compiles the model, which might take some time.

   The endpoint is ready when you see this URI printed in your terminal:

   Server ready on http://0.0.0.0:8000 (Press CTRL+C to quit)
3. Open another terminal and send a request using curl (the grep, sed, and tr pipeline at the end just extracts the streamed message text from the response chunks):

   curl -N http://0.0.0.0:8000/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d '{
       "model": "modularai/Llama-3.1-8B-Instruct-GGUF",
       "stream": true,
       "messages": [
         {"role": "system", "content": "You are a helpful assistant."},
         {"role": "user", "content": "Who won the World Series in 2020?"}
       ]
     }' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\\n/\n/g'
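Because the server implements the OpenAI chat completions API, you can send the same request from Python instead of curl. Here's a minimal sketch using the openai client package (an extra dependency you'd install with pip install openai; the api_key value is a placeholder, since the local server shouldn't require a real key):

# chat_request.py -- stream a chat completion from the local MAX endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

stream = client.chat.completions.create(
    model="modularai/Llama-3.1-8B-Instruct-GGUF",
    stream=True,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the World Series in 2020?"},
    ],
)

# Print each token as it arrives from the server.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()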
Now check out the tutorials to learn more about how to accelerate your GenAI models with MAX.
What's included
The modular Python package installs the following:
- MAX tools and libraries
- Mojo tools and libraries