# Tool Calling

Tool calling (also known as function calling) allows LLMs to interact with external tools, APIs, and functions. Test tool calling in the vLLM Playground or integrate via the OpenAI-compatible API.
## Overview

With tool calling, the LLM can:
- Understand when a tool is needed based on the conversation
- Generate structured arguments for the tool
- Process tool results and continue the conversation
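The three steps above form a loop: the model either answers directly or requests a tool, and tool results are fed back until it can answer. A minimal conceptual sketch (`llm_respond` and `execute_tool` are illustrative placeholders, not SDK functions; the concrete OpenAI-compatible calls are shown later on this page):

```python
# Conceptual tool-calling loop. llm_respond stands in for "call the model",
# execute_tool for "run the named tool"; both are hypothetical.
def tool_calling_loop(llm_respond, execute_tool, conversation):
    while True:
        reply = llm_respond(conversation)
        if not reply.get("tool_calls"):       # plain answer: we're done
            return reply["content"]
        for call in reply["tool_calls"]:      # run each requested tool
            result = execute_tool(call["name"], call["arguments"])
            conversation.append({"role": "tool", "content": str(result)})

# Demo with a scripted "model" that first requests a tool, then answers:
replies = [
    {"tool_calls": [{"name": "get_weather", "arguments": {"location": "Bangkok"}}]},
    {"tool_calls": None, "content": "It is 32°C in Bangkok."},
]
answer = tool_calling_loop(
    lambda conv: replies.pop(0),
    lambda name, args: {"temperature": 32},
    [{"role": "user", "content": "What's the weather in Bangkok?"}],
)
print(answer)  # It is 32°C in Bangkok.
```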
## Supported Models

Deploy models with tool calling support via GPU Instance > Create Instance > One-Click Deployment:
| Model | Provider | Capabilities |
|---|---|---|
| GPT-OSS-120B | OpenAI | text, reasoning, tools, grammar |
| GPT-OSS-20B | OpenAI | text, reasoning, tools, grammar |
| Qwen3-VL-235B-A22B-Instruct | Alibaba | text, vision, tools, grammar |
| Qwen3-VL-30B-A3B-Instruct | Alibaba | text, vision, tools, grammar |
| Qwen3-VL-32B-Instruct | Alibaba | text, vision, tools, grammar |
| Qwen3-VL-8B-Instruct | Alibaba | text, vision, tools, grammar |
| Llama 3.3 70B Instruct | Meta | text, tools, grammar |
| Typhoon2.5-qwen3-30b-a3b | SCB10X | text, tools, grammar |
| GLM 4.7 Flash | ZAI | text, tools, grammar |
## Testing in vLLM Playground

Test tool calling with pre-configured example tools in the Playground.
### Accessing Tool Calling

1. Deploy a model with tool calling support
2. Navigate to GPU Instance > Instances
3. Click View on your vLLM instance
4. Select the Playground tab
5. Toggle Tool Calling to ON
### Tool Choice Options

| Option | Description |
|---|---|
| Auto | Model decides when to use tools |
| Required | Model must use a tool |
| None | Disable tool usage |
### Available Example Tools

The Playground includes three pre-configured tools for testing:
| Tool | Description | Example Prompt |
|---|---|---|
| Weather | Get weather information for a location | "What's the weather in Bangkok?" |
| Calculator | Perform mathematical calculations | "Calculate 25 * 4" |
| Search | Search for information | "Search for Thai restaurants" |
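Each Playground tool maps to a server-side function. As a rough sketch (hypothetical; the Playground's actual implementation is not published here), the Calculator tool could be backed by a safe AST-based evaluator rather than `eval()`:

```python
import ast
import operator

# Binary operators the hypothetical "calculate" tool supports.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def calculate(expression: str) -> float:
    """Safely evaluate a basic arithmetic expression without eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"Unsupported expression: {expression!r}")
    return _eval(ast.parse(expression, mode="eval"))

print(calculate("25 * 4"))  # 100
```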
## API Usage

Integrate tool calling via the OpenAI-compatible API.

### Defining Tools

Define tools using the JSON Schema format:
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g., Bangkok"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    }
]
```
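The model returns tool arguments as a JSON string, so it is worth checking them against your schema before executing anything. A minimal hand-rolled check for the `get_weather` schema above (`validate_weather_args` is an illustrative helper, not part of any SDK):

```python
import json

def validate_weather_args(raw_arguments: str) -> dict:
    """Parse and validate arguments against the get_weather schema sketched above."""
    args = json.loads(raw_arguments)
    if "location" not in args:                      # "required": ["location"]
        raise ValueError("missing required parameter: location")
    unit = args.get("unit", "celsius")              # default when omitted
    if unit not in ("celsius", "fahrenheit"):       # "enum" constraint
        raise ValueError(f"invalid unit: {unit}")
    args["unit"] = unit
    return args

print(validate_weather_args('{"location": "Bangkok"}'))
# {'location': 'Bangkok', 'unit': 'celsius'}
```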
### Making the API Call

```python
from openai import OpenAI

client = OpenAI(
    api_key="not-needed",
    base_url="https://proxy-instance.float16.cloud/{instance_id}/3900/v1"
)

messages = [
    {"role": "user", "content": "What's the weather in Bangkok?"}
]

response = client.chat.completions.create(
    model="your-model-name",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)
```
### Handling Tool Calls

```python
import json

message = response.choices[0].message

if message.tool_calls:
    # Append the assistant's tool-call message once, then one result per call
    messages.append(message)

    for tool_call in message.tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)

        # Execute your function
        if function_name == "get_weather":
            result = get_weather(**arguments)

        # Send the result back to the model
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result)
        })

    # Get the final response
    final_response = client.chat.completions.create(
        model="your-model-name",
        messages=messages,
        tools=tools
    )
```
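The snippet above assumes a `get_weather` implementation exists. A minimal stub might look like this (static placeholder data; a real implementation would query a weather API):

```python
def get_weather(location: str, unit: str = "celsius") -> dict:
    """Hypothetical tool implementation returning placeholder weather data."""
    return {
        "location": location,
        "temperature": 32 if unit == "celsius" else 90,
        "unit": unit,
        "conditions": "partly cloudy",
    }

print(get_weather("Bangkok"))
```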
### Tool Choice API Options

Control how the model uses tools:

| Option | Behavior |
|---|---|
| `"auto"` | Model decides when to use tools |
| `"none"` | Model won't use any tools |
| `"required"` | Model must use at least one tool |
| `{"type": "function", "function": {"name": "..."}}` | Force a specific tool |
### Multiple Tools

Define multiple tools for the model to choose from:

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {...}
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform mathematical calculations",
            "parameters": {...}
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search for information",
            "parameters": {...}
        }
    }
]
```
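With several tools registered, dispatching by function name keeps the handling code flat. A sketch (the stub implementations and `execute_tool` helper are illustrative, not part of the OpenAI SDK):

```python
# Hypothetical implementations for the three tools defined above.
def get_weather(location, unit="celsius"):
    return {"location": location, "temperature": 32, "unit": unit}

def calculate(expression):
    return {"result": "not implemented"}  # placeholder

def search(query):
    return {"results": []}  # placeholder

# Map each tool name to its implementation.
TOOL_REGISTRY = {
    "get_weather": get_weather,
    "calculate": calculate,
    "search": search,
}

def execute_tool(function_name: str, arguments: dict):
    """Route a model-issued tool call to the matching implementation."""
    if function_name not in TOOL_REGISTRY:
        raise ValueError(f"unknown tool: {function_name}")
    return TOOL_REGISTRY[function_name](**arguments)

print(execute_tool("get_weather", {"location": "Bangkok"}))
```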
## Best Practices

### Write Clear Descriptions

Good descriptions help the model understand when to use tools:

```python
# Good
"description": "Get the current weather conditions including temperature, humidity, and forecast for a specific city"

# Bad
"description": "Weather"
```
### Use Specific Parameter Types

```python
"parameters": {
    "type": "object",
    "properties": {
        "location": {
            "type": "string",
            "description": "City name, e.g., Bangkok, Tokyo"
        },
        "unit": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit"
        }
    },
    "required": ["location"]
}
```
### Handle Errors Gracefully

```python
try:
    result = execute_tool(function_name, arguments)
    tool_response = {"success": True, "data": result}
except Exception as e:
    tool_response = {"success": False, "error": str(e)}

messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": json.dumps(tool_response)
})
```
## Pricing

Tool calling uses GPU Instance pricing:
| Instance | On-Demand | Spot (Save 50%) | Storage |
|---|---|---|---|
| H100 | $4.32/hr | $2.16/hr | $1.00/GB/mo |
View current pricing at GPU Instance > Pricing.
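As a quick sanity check on the table above, a month of continuous H100 usage works out as follows (720 hours ≈ 24 × 30 days; the 50 GB storage figure is just an example):

```python
HOURS_PER_MONTH = 24 * 30           # 720 hours
STORAGE_GB = 50                     # example volume size (assumption)

on_demand = 4.32 * HOURS_PER_MONTH  # $3110.40
spot = 2.16 * HOURS_PER_MONTH       # $1555.20
storage = 1.00 * STORAGE_GB         # $50.00

print(f"on-demand: ${on_demand + storage:,.2f}/mo")  # on-demand: $3,160.40/mo
print(f"spot:      ${spot + storage:,.2f}/mo")       # spot:      $1,605.20/mo
```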
## Next Steps

- vLLM Playground - Test tool calling interactively
- Structured Outputs - Generate structured responses
- LLM Deployment - Deploy models with tool support