Function calling and tool use
Function calling is a feature available with some large language models (LLMs) that allows them to call external program functions (or tools). This lets the model interact with external systems to retrieve new data for use as input or to execute other tasks. It's a foundational building block for agentic AI applications, in which an LLM chains together various functions to achieve complex objectives.
Function calling is also called "tool use" because you tell the LLM which functions are available with a `tools` parameter in the request body.
When to use function calling
You should use function calling when you want your LLM to:
- Fetch data: Such as weather data, stock prices, or news updates from a database. The model calls a function to get information, and then incorporates that data into its final response.
- Perform actions: Such as modifying application state, invoking workflows, or calling other AI systems. The model calls a tool to perform an action, effectively handing off the request after it determines what the user wants. (One example of each category is sketched below.)
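For example, one tool from each category might be declared like this (a sketch; `get_stock_price` and `create_support_ticket` are hypothetical functions, not part of any particular API):

```python
# Sketch: one data-fetching tool and one action-performing tool (both hypothetical).
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",  # fetch data: the result is fed back to the model
            "description": "Get the latest price for a stock ticker symbol.",
            "parameters": {
                "type": "object",
                "properties": {
                    "ticker": {"type": "string", "description": "Ticker symbol, e.g. NVDA"}
                },
                "required": ["ticker"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "create_support_ticket",  # perform an action: no follow-up model call needed
            "description": "Open a support ticket on behalf of the user.",
            "parameters": {
                "type": "object",
                "properties": {
                    "summary": {"type": "string", "description": "One-line summary of the issue"}
                },
                "required": ["summary"]
            }
        }
    }
]
```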
How function calling works
When you send an inference request to a model that supports function calling, you can specify which functions are available to the model using the `tools` body parameter. The `tools` parameter provides information that allows the LLM to understand:

- What each function can do
- How to call each function (the arguments it accepts/requires)

For example, here's a request with the chat completions API that declares an available function named `get_weather()`:
```python
from openai import OpenAI

def get_weather(city: str) -> str:
    # Placeholder implementation: a real version would call a weather API.
    print("Get weather:", city)
    return f"The weather in {city} is sunny."

client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country e.g. Bogotá, Colombia"
                }
            },
            "required": ["location"],
            "additionalProperties": False
        },
        "strict": True
    }
}]

messages = [
    {
        "role": "user",
        "content": "What's the weather like in San Francisco today?"
    }
]

completion = client.chat.completions.create(
    model="modularai/Llama-3.1-8B-Instruct-GGUF",
    messages=messages,
    tools=tools
)
```
Let's take a closer look at each parameter shown in the `tools` property:

- `type`: Currently this is always `function`
- `function`: Definition of the function
  - `name`: The function name used by the LLM to call it
  - `description`: A function description that helps the LLM understand when to use it
  - `parameters`: Definition of the function parameters
    - `type`: Defines this as an object containing parameters
    - `properties`: Lists all possible function arguments and their types
    - `required`: Specifies which function arguments are required
This format follows the OpenAI function calling specification for declaring functions as tools that a model can use.
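For instance, a function with one required and one optional argument might be declared like this (a sketch; `get_forecast` and its `days` parameter are hypothetical, not part of any specific API):

```python
# Sketch: a hypothetical tool with one required and one optional argument.
tools = [{
    "type": "function",
    "function": {
        "name": "get_forecast",
        "description": "Get a multi-day weather forecast for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country, e.g. Bogotá, Colombia"
                },
                "days": {
                    "type": "integer",
                    "description": "Number of days to forecast (the model may omit this)"
                }
            },
            # Only "location" must be supplied; "days" is optional.
            "required": ["location"]
        }
    }
}]
```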
Using this information, the model will decide whether to call any functions specified in `tools`. In this case, we expect the model to call `get_weather()` and incorporate that information into its final response. So, the initial `completion` response from above includes a `tool_calls` parameter like this:
```python
print(completion.choices[0].message.tool_calls)
```
```
[ChatCompletionMessageToolCall(
    id='call_a175692d9ff54554',
    function=Function(
        arguments='{"location": "San Francisco, USA"}',
        name='get_weather'
    ),
    type='function'
)]
```
From here, you must parse the `tool_calls` body and execute the function as appropriate. For example:
```python
import json

tool_call = completion.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = get_weather(args["location"])
```
If the function is designed to fetch data for the model, you should call the function and then call the model again with the function results appended as a message using the `tool` role. If the function is designed to perform an action, then you don't need to call the model again.

For details about how to execute the function and feed the results back to the model, see the OpenAI docs about handling function calls.
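As an illustration, here's a minimal sketch of that second request, reusing the `client`, `completion`, `messages`, `tools`, `tool_call`, and `result` variables from the code above:

```python
# Sketch: append the model's tool call and your function result, then ask again.
messages.append(completion.choices[0].message)  # assistant message containing tool_calls
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,  # links the result to the call that requested it
    "content": result,             # the string returned by get_weather()
})

# Second request: the model incorporates the tool result into its final answer.
follow_up = client.chat.completions.create(
    model="modularai/Llama-3.1-8B-Instruct-GGUF",
    messages=messages,
    tools=tools,
)
print(follow_up.choices[0].message.content)
```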
The OpenAI function calling spec is compatible with multiple agent frameworks, such as AutoGen, CrewAI, and more.
Supported models
Function calling is model-dependent and produces valid output only if the model was trained to return tool-use responses. Here are a few models we've verified work with function calling:
- `modularai/Llama-3.1-8B-Instruct-GGUF`
- Meta's Llama 3.1 models & evals collection
- Meta's Llama 3.2 language models & evals collection
Quickstart
Here's how you can quickly try the example code from above using a locally-hosted endpoint:
- Create a virtual environment and install the `max` CLI, using your preferred tool (pixi, uv, pip, or conda):

  pixi:

  - If you don't have it, install `pixi`:

    ```sh
    curl -fsSL https://pixi.sh/install.sh | sh
    ```

    Then restart your terminal for the changes to take effect.

  - Create a project:

    ```sh
    pixi init function-calling \
      -c https://conda.modular.com/max-nightly/ -c conda-forge \
      && cd function-calling
    ```

  - Install the `modular` conda package:

    Nightly:

    ```sh
    pixi add modular
    ```

    Stable:

    ```sh
    pixi add "modular=25.4"
    ```

  - Start the virtual environment:

    ```sh
    pixi shell
    ```

  uv:

  - If you don't have it, install `uv`:

    ```sh
    curl -LsSf https://astral.sh/uv/install.sh | sh
    ```

    Then restart your terminal to make `uv` accessible.

  - Create a project:

    ```sh
    uv init function-calling && cd function-calling
    ```

  - Create and start a virtual environment:

    ```sh
    uv venv && source .venv/bin/activate
    ```

  - Install the `modular` Python package:

    Nightly:

    ```sh
    uv pip install modular \
      --extra-index-url https://download.pytorch.org/whl/cpu \
      --index-url https://dl.modular.com/public/nightly/python/simple/ \
      --index-strategy unsafe-best-match --prerelease allow
    ```

    Stable:

    ```sh
    uv pip install modular \
      --extra-index-url https://download.pytorch.org/whl/cpu \
      --extra-index-url https://modular.gateway.scarf.sh/simple/ \
      --index-strategy unsafe-best-match
    ```

  pip:

  - Create a project folder:

    ```sh
    mkdir function-calling && cd function-calling
    ```

  - Create and activate a virtual environment:

    ```sh
    python3 -m venv .venv/function-calling \
      && source .venv/function-calling/bin/activate
    ```

  - Install the `modular` Python package:

    Nightly:

    ```sh
    pip install --pre modular \
      --extra-index-url https://download.pytorch.org/whl/cpu \
      --index-url https://dl.modular.com/public/nightly/python/simple/
    ```

    Stable:

    ```sh
    pip install modular \
      --extra-index-url https://download.pytorch.org/whl/cpu \
      --extra-index-url https://modular.gateway.scarf.sh/simple/
    ```

  conda:

  - If you don't have it, install conda. A common choice is with `brew`:

    ```sh
    brew install miniconda
    ```

  - Initialize `conda` for shell interaction:

    ```sh
    conda init
    ```

    If you're on a Mac, instead use:

    ```sh
    conda init zsh
    ```

    Then restart your terminal for the changes to take effect.

  - Create a project:

    ```sh
    conda create -n function-calling
    ```

  - Start the virtual environment:

    ```sh
    conda activate function-calling
    ```

  - Install the `modular` conda package:

    Nightly:

    ```sh
    conda install -c conda-forge -c https://conda.modular.com/max-nightly/ modular
    ```

    Stable:

    ```sh
    conda install -c conda-forge -c https://conda.modular.com/max/ modular
    ```
- Start an endpoint with a model that supports function calling:

  ```sh
  max serve --model-path=modularai/Llama-3.1-8B-Instruct-GGUF
  ```
- Wait until you see this message:

  ```
  Server ready on http://0.0.0.0:8000 (Press CTRL+C to quit)
  ```

  Then open a new terminal and send a request with the `tools` parameter, using either Python or curl.

  Python:

  First install the `openai` Python package (make sure your current working directory is still the `function-calling` directory):

  pixi:

  ```sh
  pixi add openai
  ```

  uv:

  ```sh
  uv add openai
  ```

  pip:

  ```sh
  pip install openai
  ```

  conda:

  ```sh
  conda install openai
  ```
  Then, create a program to send a request specifying the available `get_weather()` function:

  function-calling.py:

  ```python
  from openai import OpenAI
  import json

  def get_weather(city: str) -> str:
      # Placeholder implementation: print the argument and return a stub result.
      print("Get weather:", city)
      return f"The weather in {city} is sunny."

  client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

  tools = [{
      "type": "function",
      "function": {
          "name": "get_weather",
          "description": "Get current temperature for a given location.",
          "parameters": {
              "type": "object",
              "properties": {
                  "location": {
                      "type": "string",
                      "description": "City and country e.g. Bogotá, Colombia"
                  }
              },
              "required": ["location"],
              "additionalProperties": False
          },
          "strict": True
      }
  }]

  messages = [
      {
          "role": "user",
          "content": "What's the weather like in San Francisco today?"
      }
  ]

  completion = client.chat.completions.create(
      model="modularai/Llama-3.1-8B-Instruct-GGUF",
      messages=messages,
      tools=tools
  )

  tool_call = completion.choices[0].message.tool_calls[0]
  args = json.loads(tool_call.function.arguments)
  result = get_weather(args["location"])
  ```

  Run it and the `get_weather()` function should print the argument received (make sure you're in the virtual environment; for example, first run `pixi shell`):

  ```sh
  python function-calling.py
  ```

  ```
  Get weather: San Francisco, USA
  ```
  curl:

  Use the following `curl` command to send a request specifying the available `get_weather()` function:

  ```sh
  curl -N http://0.0.0.0:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "modularai/Llama-3.1-8B-Instruct-GGUF",
      "stream": false,
      "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the weather like in Boston today?"}
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. Los Angeles, CA"
                }
              },
              "required": ["location"]
            }
          }
        }
      ],
      "tool_choice": "auto"
    }'
  ```

  You should receive a response similar to this:
"tool_calls": [
{
"id": "call_ac73df14fe184349",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\": \"Boston, MA\"}"
}
}
]"tool_calls": [
{
"id": "call_ac73df14fe184349",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\": \"Boston, MA\"}"
}
}
]
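  To pull just the tool call out of the raw response, you can filter it with `jq` (a sketch; it assumes you have `jq` installed and have saved the same request body shown above to a hypothetical `request.json` file):

  ```sh
  # Sketch: send the saved request body, then extract only the tool call with jq.
  curl -sN http://0.0.0.0:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d @request.json \
    | jq '.choices[0].message.tool_calls'
  ```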
For a more complete walkthrough of how to handle a `tool_calls` response and send the function results back to the LLM as input, see the OpenAI docs about handling function calls.
Next steps
Now that you know the basics of function calling, you can get started with MAX on GPUs.