# Tool Calling

Tool calling (also known as function calling) allows LLMs to interact with external tools, APIs, and functions. Test tool calling in the vLLM Playground or integrate via the OpenAI-compatible API.
## Overview

With tool calling, the LLM can:
- Understand when a tool is needed based on the conversation
- Generate structured arguments for the tool
- Process tool results and continue the conversation
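The three steps above form a loop: the model either answers directly or requests a tool, and tool results are fed back until it can answer. A minimal conceptual sketch (`llm_respond` and `execute_tool` are illustrative placeholders, not SDK functions; the concrete OpenAI-compatible calls are shown later on this page):

```python
# Conceptual tool-calling loop. llm_respond stands in for "call the model",
# execute_tool for "run the named tool"; both are hypothetical.
def tool_calling_loop(llm_respond, execute_tool, conversation):
    while True:
        reply = llm_respond(conversation)
        if not reply.get("tool_calls"):       # plain answer: we're done
            return reply["content"]
        for call in reply["tool_calls"]:      # run each requested tool
            result = execute_tool(call["name"], call["arguments"])
            conversation.append({"role": "tool", "content": str(result)})

# Demo with a scripted "model" that first requests a tool, then answers:
replies = [
    {"tool_calls": [{"name": "get_weather", "arguments": {"location": "Bangkok"}}]},
    {"tool_calls": None, "content": "It is 32°C in Bangkok."},
]
answer = tool_calling_loop(
    lambda conv: replies.pop(0),
    lambda name, args: {"temperature": 32},
    [{"role": "user", "content": "What's the weather in Bangkok?"}],
)
print(answer)  # It is 32°C in Bangkok.
```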
## Supported Models

Deploy models with tool calling support via GPU Instance > Create Instance > One-Click Deployment:
| Model | Provider | Capabilities |
|---|---|---|
| GPT-OSS-120B | OpenAI | text, reasoning, tools, grammar |
| GPT-OSS-20B | OpenAI | text, reasoning, tools, grammar |
| Qwen3-VL-235B-A22B-Instruct | Alibaba | text, vision, tools, grammar |
| Qwen3-VL-30B-A3B-Instruct | Alibaba | text, vision, tools, grammar |
| Qwen3-VL-32B-Instruct | Alibaba | text, vision, tools, grammar |
| Qwen3-VL-8B-Instruct | Alibaba | text, vision, tools, grammar |
| Llama 3.3 70B Instruct | Meta | text, tools, grammar |
| Typhoon2.5-qwen3-30b-a3b | SCB10X | text, tools, grammar |
| GLM 4.7 Flash | ZAI | text, tools, grammar |
## Testing in vLLM Playground

Test tool calling with pre-configured example tools in the Playground.
### Accessing Tool Calling

1. Deploy a model with tool calling support
2. Navigate to GPU Instance > Instances
3. Click View on your vLLM instance
4. Select the Playground tab
5. Toggle Tool Calling to ON
### Tool Choice Options

| Option | Description |
|---|---|
| Auto | Model decides when to use tools |
| Required | Model must use a tool |
| None | Disable tool usage |
### Available Example Tools

The Playground includes three pre-configured tools for testing:
| Tool | Description | Example Prompt |
|---|---|---|
| Weather | Get weather information for a location | "What's the weather in Bangkok?" |
| Calculator | Perform mathematical calculations | "Calculate 25 * 4" |
| Search | Search for information | "Search for Thai restaurants" |
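Each Playground tool maps to a server-side function. As a rough sketch (hypothetical; the Playground's actual implementation is not published here), the Calculator tool could be backed by a safe AST-based evaluator rather than `eval()`:

```python
import ast
import operator

# Binary operators the hypothetical "calculate" tool supports.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def calculate(expression: str) -> float:
    """Safely evaluate a basic arithmetic expression without eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"Unsupported expression: {expression!r}")
    return _eval(ast.parse(expression, mode="eval"))

print(calculate("25 * 4"))  # 100
```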
## API Usage

Integrate tool calling via the OpenAI-compatible API.

### Defining Tools

Define tools using the JSON Schema format:
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g., Bangkok"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    }
]
```
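The model returns tool arguments as a JSON string, so it is worth checking them against your schema before executing anything. A minimal hand-rolled check for the `get_weather` schema above (`validate_weather_args` is an illustrative helper, not part of any SDK):

```python
import json

def validate_weather_args(raw_arguments: str) -> dict:
    """Parse and validate arguments against the get_weather schema sketched above."""
    args = json.loads(raw_arguments)
    if "location" not in args:                      # "required": ["location"]
        raise ValueError("missing required parameter: location")
    unit = args.get("unit", "celsius")              # default when omitted
    if unit not in ("celsius", "fahrenheit"):       # "enum" constraint
        raise ValueError(f"invalid unit: {unit}")
    args["unit"] = unit
    return args

print(validate_weather_args('{"location": "Bangkok"}'))
# {'location': 'Bangkok', 'unit': 'celsius'}
```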
### Making the API Call

```python
from openai import OpenAI

client = OpenAI(
    api_key="not-needed",
    base_url="https://proxy-instance.float16.cloud/{instance_id}/3900/v1"
)

messages = [
    {"role": "user", "content": "What's the weather in Bangkok?"}
]

response = client.chat.completions.create(
    model="your-model-name",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)
```
### Handling Tool Calls

```python
import json

message = response.choices[0].message

if message.tool_calls:
    # Append the assistant's tool-call message once, then one result per call
    messages.append(message)

    for tool_call in message.tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)

        # Execute your function
        if function_name == "get_weather":
            result = get_weather(**arguments)

        # Send the result back to the model
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result)
        })

    # Get the final response
    final_response = client.chat.completions.create(
        model="your-model-name",
        messages=messages,
        tools=tools
    )
```
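The snippet above assumes a `get_weather` implementation exists. A minimal stub might look like this (static placeholder data; a real implementation would query a weather API):

```python
def get_weather(location: str, unit: str = "celsius") -> dict:
    """Hypothetical tool implementation returning placeholder weather data."""
    return {
        "location": location,
        "temperature": 32 if unit == "celsius" else 90,
        "unit": unit,
        "conditions": "partly cloudy",
    }

print(get_weather("Bangkok"))
```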
### Tool Choice API Options

Control how the model uses tools:

| Option | Behavior |
|---|---|
| `"auto"` | Model decides when to use tools |
| `"none"` | Model won't use any tools |
| `"required"` | Model must use at least one tool |
| `{"type": "function", "function": {"name": "..."}}` | Force a specific tool |
### Multiple Tools

Define multiple tools for the model to choose from:

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {...}
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform mathematical calculations",
            "parameters": {...}
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search for information",
            "parameters": {...}
        }
    }
]
```
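With several tools registered, dispatching by function name keeps the handling code flat. A sketch (the stub implementations and `execute_tool` helper are illustrative, not part of the OpenAI SDK):

```python
# Hypothetical implementations for the three tools defined above.
def get_weather(location, unit="celsius"):
    return {"location": location, "temperature": 32, "unit": unit}

def calculate(expression):
    return {"result": "not implemented"}  # placeholder

def search(query):
    return {"results": []}  # placeholder

# Map each tool name to its implementation.
TOOL_REGISTRY = {
    "get_weather": get_weather,
    "calculate": calculate,
    "search": search,
}

def execute_tool(function_name: str, arguments: dict):
    """Route a model-issued tool call to the matching implementation."""
    if function_name not in TOOL_REGISTRY:
        raise ValueError(f"unknown tool: {function_name}")
    return TOOL_REGISTRY[function_name](**arguments)

print(execute_tool("get_weather", {"location": "Bangkok"}))
```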
## Best Practices

### Write Clear Descriptions

Good descriptions help the model understand when to use tools:

```python
# Good
"description": "Get the current weather conditions including temperature, humidity, and forecast for a specific city"

# Bad
"description": "Weather"
```
### Use Specific Parameter Types

```python
"parameters": {
    "type": "object",
    "properties": {
        "location": {
            "type": "string",
            "description": "City name, e.g., Bangkok, Tokyo"
        },
        "unit": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit"
        }
    },
    "required": ["location"]
}
```
### Handle Errors Gracefully

```python
try:
    result = execute_tool(function_name, arguments)
    tool_response = {"success": True, "data": result}
except Exception as e:
    tool_response = {"success": False, "error": str(e)}

messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": json.dumps(tool_response)
})
```
## Pricing

Tool calling uses GPU Instance pricing:
| Instance | On-Demand | Spot (Save 50%) | Storage |
|---|---|---|---|
| H100 | $4.32/hr | $2.16/hr | $1.00/GB/mo |
View current pricing at GPU Instance > Pricing.
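As a quick sanity check on the table above, a month of continuous H100 usage works out as follows (720 hours ≈ 24 × 30 days; the 50 GB storage figure is just an example):

```python
HOURS_PER_MONTH = 24 * 30           # 720 hours
STORAGE_GB = 50                     # example volume size (assumption)

on_demand = 4.32 * HOURS_PER_MONTH  # $3110.40
spot = 2.16 * HOURS_PER_MONTH       # $1555.20
storage = 1.00 * STORAGE_GB         # $50.00

print(f"on-demand: ${on_demand + storage:,.2f}/mo")  # on-demand: $3,160.40/mo
print(f"spot:      ${spot + storage:,.2f}/mo")       # spot:      $1,605.20/mo
```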
## Next Steps

- vLLM Playground - Test tool calling interactively
- Structured Outputs - Generate structured responses
- LLM Deployment - Deploy models with tool support