Function calling and tool use with MAX Serve

Function calling enables AI models to dynamically interact with external systems, retrieve up-to-date data, and execute tasks. This capability is a foundational building block for agentic GenAI applications, where models call different functions to achieve specific objectives.

You may want to define functions for the following purposes:

  • To fetch data: Access APIs, knowledge bases, or external services to retrieve up-to-date information and augment model responses
  • To perform actions: Execute predefined tasks like modifying application states, invoking workflows, or integrating with custom business logic

Based on the system prompt and messages, the model may decide to call these functions instead of or in addition to generating text. Developers then handle the function calls, execute them, and return the results to the model, which integrates the function call results into its final response.
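The developer-side half of this loop can be sketched as a simple dispatch table that maps tool names to local callables. This is a minimal illustration, not part of the MAX API; `get_weather`, `TOOL_REGISTRY`, and `dispatch_tool_call` are hypothetical names:

```python
import json

# Hypothetical local tool implementation (illustration only)
def get_weather(location: str) -> str:
    return f"Getting the weather for {location} ..."

# Map tool names to callables so the model's tool calls can be dispatched
TOOL_REGISTRY = {"get_weather": get_weather}

def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Invoke the named tool with its JSON-encoded arguments and return
    the result to be sent back to the model."""
    args = json.loads(arguments_json)
    return TOOL_REGISTRY[name](**args)

print(dispatch_tool_call("get_weather", '{"location": "Paris, France"}'))
```

The registry pattern keeps the dispatch code independent of any particular tool, so adding a new function only requires registering it under the name declared in the tool specification.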

Named function specifications

MAX supports the OpenAI function calling specification, which lets you register developer-defined functions as tools the model can use to augment prompts. This gives you more control over model behavior and lets you trigger actions directly from user input.

The following example defines a function, registers that function as a tool, and sends a request to the chat completion client.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="<your-api-key>")

# Define a function that the model can call
def get_weather(location: str):
    return f"Getting the weather for {location} ..."

# Register your function as an available tool
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state, e.g., 'Los Angeles, CA'"
                }
            },
            "required": ["location"]
        }
    }
}]

# Generate a response with the chat completion client with access to tools
response = client.chat.completions.create(
    model="modularai/llama-3.1",
    messages=[{"role": "user", "content": "What's the weather like in Paris today?"}],
    tools=tools,
    stream=False
)

# Print the model's selected function call
print(response.choices[0].message.tool_calls)

At this stage of the function calling workflow, the model responds with the selected tool to use along with detected function inputs:

[{
    "id": "call_12345xyz",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": "{\"location\":\"Paris, France\"}"
    }
}]

From here, you must execute the function call and supply the model with the results in order to augment the model response.
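As a rough sketch of that round trip, the following uses plain dicts shaped like the response above; with the Python client you would read the same fields via attribute access, and the commented-out final request assumes the `client` and `tools` objects from the earlier example:

```python
import json

# Stand-in for the tool defined earlier; a real app would call a weather API
def get_weather(location: str) -> str:
    return f"Getting the weather for {location} ..."

# A tool call shaped like the model's response above
tool_call = {
    "id": "call_12345xyz",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": "{\"location\":\"Paris, France\"}",
    },
}

# 1. Execute the function with the model-supplied arguments
args = json.loads(tool_call["function"]["arguments"])
result = get_weather(**args)

# 2. Append the result as a `tool` message, keyed by the tool call's id,
#    then send the conversation back so the model can produce its final answer
followup_messages = [
    {"role": "user", "content": "What's the weather like in Paris today?"},
    {"role": "assistant", "tool_calls": [tool_call]},
    {"role": "tool", "tool_call_id": tool_call["id"], "content": result},
]
# response = client.chat.completions.create(
#     model="modularai/llama-3.1", messages=followup_messages, tools=tools
# )
```

The `tool` message must reference the `id` of the tool call it answers so the model can match each result to the function it requested.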

The OpenAI function calling spec is compatible with multiple agent frameworks, including AutoGen and CrewAI.

Quickstart

Use MAX to serve a model that is compatible with function calling and test it out locally.

  1. Follow the steps in Get started with MAX to set up a GenAI endpoint.
  2. Next, open a new window and send a request to the endpoint specifying the available tools.
curl -N http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "modularai/llama-3.1",
    "stream": false,
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the weather like in Boston today?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. Los Angeles, CA"
              }
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'

Within the generated response, you should see that the model selected the get_weather function as a tool call, with its inputs extracted from the original prompt.

"tool_calls": [
{
"id": "call_ac73df14fe184349",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\": \"Boston, MA\"}"
}
}
]
"tool_calls": [
{
"id": "call_ac73df14fe184349",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\": \"Boston, MA\"}"
}
}
]

Supported models

MAX Serve supports several LLMs optimized for function calling, such as the Llama 3.1 model used in the examples above.

Next steps

Now that you know the basics of function calling, you can get started with MAX Serve on GPUs.