Structured Outputs
Structured Outputs enable LLMs to generate responses that conform to specific formats. Test structured outputs in the vLLM Playground or integrate via the OpenAI-compatible API.
Overview
With structured outputs, you can:
- Guarantee valid JSON that matches your schema
- Constrain outputs to match regex patterns
- Limit responses to specific choices
- Eliminate parsing errors in your applications
Supported Models
Deploy a model with structured output support (listed as the grammar capability in the table below) via GPU Instance > Create Instance > One-Click Deployment:
| Model | Provider | Capabilities |
|---|---|---|
| GPT-OSS-120B | OpenAI | text, reasoning, tools, grammar |
| GPT-OSS-20B | OpenAI | text, reasoning, tools, grammar |
| Qwen3-VL-235B-A22B-Instruct | Alibaba | text, vision, tools, grammar |
| Qwen3-VL-30B-A3B-Instruct | Alibaba | text, vision, tools, grammar |
| Qwen3-VL-32B-Instruct | Alibaba | text, vision, tools, grammar |
| Qwen3-VL-8B-Instruct | Alibaba | text, vision, tools, grammar |
| Llama 3.3 70B Instruct | Meta | text, tools, grammar |
| Typhoon2.5-qwen3-30b-a3b | SCB10X | text, tools, grammar |
| GLM 4.7 Flash | ZAI | text, tools, grammar |
Testing in vLLM Playground
Test structured outputs with pre-configured format presets in the Playground.
Accessing Structured Outputs
1. Deploy a model with grammar support
2. Navigate to GPU Instance > Instances
3. Click View on your vLLM instance
4. Select the Playground tab
5. Toggle Structured Output to ON
Output Format Types
| Type | Description |
|---|---|
| JSON Schema | Define object structure with properties and types |
| Regex | Constrain output to match a regex pattern |
| Choice | Limit output to specific options |
Available Presets
The Playground includes pre-configured format presets:
| Preset | Type | Description |
|---|---|---|
| Person Info | JSON Schema | Extract name, age, occupation, email |
| Sentiment Analysis | JSON Schema | Analyze sentiment with confidence score |
| Product Review | JSON Schema | Extract product review details |
| Simple JSON | JSON Schema | Basic key-value structure |
| Email Pattern | Regex | Match email format |
| Yes/No Choice | Choice | Binary response constraint |
| Rating Choice | Choice | Rating scale constraint |
Try It
- "Extract info: John Smith is a 32-year-old software engineer at john.smith@example.com"
- "Analyze sentiment: This product is amazing, I love it!"
- "Rate this: The service was okay but could be better"
API Usage
Integrate structured outputs via the OpenAI-compatible API.
JSON Schema Format
from openai import OpenAI

# Replace {instance_id} with the ID of your deployed vLLM instance.
client = OpenAI(
    api_key="not-needed",
    base_url="https://proxy-instance.float16.cloud/{instance_id}/3900/v1"
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {"role": "user", "content": "Extract info: John is 30 years old and works as an engineer"}
    ],
    # The schema is sent as an extra request field alongside the standard parameters.
    extra_body={
        "guided_json": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
                "occupation": {"type": "string"}
            },
            "required": ["name", "age", "occupation"]
        }
    }
)

print(response.choices[0].message.content)
# {"name": "John", "age": 30, "occupation": "engineer"}
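The reply arrives as a JSON string in the message content, so it is usually decoded before use. The following is a minimal sketch, assuming the reply has the shape shown above; `parse_person` and `REQUIRED_KEYS` are hypothetical names, and the sample string stands in for a real response.

```python
import json

# Keys listed under "required" in the schema above.
REQUIRED_KEYS = ("name", "age", "occupation")

def parse_person(content: str) -> dict:
    """Decode the reply and confirm the required keys are present."""
    # json.loads raises json.JSONDecodeError if the text is not valid JSON,
    # for example when generation was cut off by a token limit.
    data = json.loads(content)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data

# Sample string standing in for response.choices[0].message.content
person = parse_person('{"name": "John", "age": 30, "occupation": "engineer"}')
print(person["name"], person["age"])
```

Keeping the check in one helper means the rest of the application can treat the result as a plain dictionary.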
Regex Pattern
response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {"role": "user", "content": "Generate an email address for John Smith"}
    ],
    extra_body={
        "guided_regex": r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
    }
)

print(response.choices[0].message.content)
# john.smith@example.com
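It can help to try a pattern locally before sending it with a request. The sketch below uses Python's `re` module on the email pattern from the example; it assumes the server interprets the pattern the same way `re` does, which may not hold for every regex feature. `matches_pattern` is a hypothetical helper name.

```python
import re

# Same pattern as in the request above.
EMAIL_PATTERN = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"

def matches_pattern(text: str, pattern: str = EMAIL_PATTERN) -> bool:
    """Return True only if the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches_pattern("john.smith@example.com"))  # True
print(matches_pattern("not an email"))            # False
```

`re.fullmatch` is used rather than `re.search` because a constrained reply should match the pattern from start to end, not merely contain a match.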
Choice Constraint
response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {"role": "user", "content": "Is this review positive, negative, or neutral: 'Great product!'"}
    ],
    extra_body={
        "guided_choice": ["positive", "negative", "neutral"]
    }
)

print(response.choices[0].message.content)
# positive
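When the same options are used both in the request and in the code that handles the reply, defining them once avoids drift between the two. One possible approach, sketched here with a hypothetical `Sentiment` enum, is to derive the choice list from an `Enum` and convert the reply back into a member.

```python
from enum import Enum

class Sentiment(str, Enum):
    """Hypothetical enum holding the allowed labels."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"

# List to pass as the "guided_choice" value in extra_body.
choices = [member.value for member in Sentiment]

# Sample string standing in for response.choices[0].message.content.
# Sentiment(...) raises ValueError if the string is not one of the labels.
label = Sentiment("positive")
print(choices, label is Sentiment.POSITIVE)
```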
Schema Examples
Sentiment Analysis
schema = {
    "type": "object",
    "properties": {
        "sentiment": {
            "type": "string",
            "enum": ["positive", "negative", "neutral"]
        },
        "confidence": {
            "type": "number"
        },
        "summary": {
            "type": "string"
        }
    },
    # "summary" is optional, so the model may omit it
    "required": ["sentiment", "confidence"]
}

response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {"role": "user", "content": "Analyze: This product exceeded my expectations!"}
    ],
    extra_body={"guided_json": schema}
)

print(response.choices[0].message.content)
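Since `summary` is not in the required list, code that reads the reply should tolerate its absence. The sketch below maps the reply onto a small dataclass; `SentimentResult` and `parse_sentiment` are hypothetical names, and the sample string stands in for a real response.

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class SentimentResult:
    sentiment: str
    confidence: float
    summary: Optional[str] = None  # optional in the schema, so it may be absent

def parse_sentiment(content: str) -> SentimentResult:
    """Decode the reply into a typed result object."""
    data = json.loads(content)
    return SentimentResult(
        sentiment=data["sentiment"],
        confidence=float(data["confidence"]),
        summary=data.get("summary"),  # None when the key is missing
    )

# Sample string standing in for response.choices[0].message.content
result = parse_sentiment('{"sentiment": "positive", "confidence": 0.95}')
print(result.sentiment, result.confidence, result.summary)
```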
Entity Extraction
schema = {
    "type": "object",
    "properties": {
        "entities": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "type": {
                        "type": "string",
                        "enum": ["person", "organization", "location", "date"]
                    }
                },
                "required": ["text", "type"]
            }
        }
    },
    "required": ["entities"]
}
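A reply that follows this schema is a flat list of entities, which is often easier to work with once grouped by type. A minimal sketch, assuming a reply shaped like the schema above; `group_entities` is a hypothetical helper and the sample data is made up for illustration.

```python
import json
from collections import defaultdict

def group_entities(content: str) -> dict:
    """Group entity texts by their type, preserving reply order."""
    data = json.loads(content)
    grouped = defaultdict(list)
    for entity in data["entities"]:
        grouped[entity["type"]].append(entity["text"])
    return dict(grouped)

# Made-up sample standing in for response.choices[0].message.content
sample = (
    '{"entities": ['
    '{"text": "John Smith", "type": "person"}, '
    '{"text": "Bangkok", "type": "location"}, '
    '{"text": "Jane Doe", "type": "person"}]}'
)
print(group_entities(sample))
```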
Product Review
schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "rating": {"type": "integer"},
        "pros": {
            "type": "array",
            "items": {"type": "string"}
        },
        "cons": {
            "type": "array",
            "items": {"type": "string"}
        },
        "recommendation": {"type": "boolean"}
    },
    "required": ["product_name", "rating", "recommendation"]
}
Best Practices
Keep Schemas Simple
Start with minimal required fields:
# Good - Start simple
schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"}
    },
    "required": ["answer"]
}
Use Enums for Categorical Values
# Good - Use enums
"status": {
    "type": "string",
    "enum": ["pending", "approved", "rejected"]
}
Handle Optional Fields
{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "nickname": {"type": ["string", "null"]}
    },
    "required": ["name"]
}
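With this schema, `nickname` can be missing entirely or present with a null value, and client code should treat both cases the same way. A short sketch, using a hypothetical `display_name` helper and made-up sample replies:

```python
import json

def display_name(content: str) -> str:
    """Prefer the nickname when one is given, otherwise fall back to the name."""
    person = json.loads(content)
    # .get returns None both when the key is absent and when it is JSON null.
    nickname = person.get("nickname")
    return nickname or person["name"]

print(display_name('{"name": "Jonathan"}'))
print(display_name('{"name": "Jonathan", "nickname": null}'))
print(display_name('{"name": "Jonathan", "nickname": "Jon"}'))
```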
Comparison of Output Formats
| Feature | JSON Schema | Regex | Choice |
|---|---|---|---|
| Structured data | Yes | No | No |
| Pattern matching | Limited | Yes | No |
| Fixed options | Via enum | No | Yes |
| Nested objects | Yes | No | No |
| Arrays | Yes | No | No |
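The three formats differ only in which key is sent in `extra_body`, so a small helper can select it. This is a sketch under that assumption: `build_extra_body` and the `format_type` labels are hypothetical, while the `guided_*` keys are the ones used in the API examples on this page.

```python
# Maps a hypothetical format label to the request key used on this page.
FORMAT_KEYS = {
    "json_schema": "guided_json",
    "regex": "guided_regex",
    "choice": "guided_choice",
}

def build_extra_body(format_type: str, spec) -> dict:
    """Return the extra_body dict for one output format."""
    if format_type not in FORMAT_KEYS:
        raise ValueError(f"unknown format type: {format_type}")
    return {FORMAT_KEYS[format_type]: spec}

print(build_extra_body("choice", ["yes", "no"]))
print(build_extra_body("regex", r"\d{4}-\d{2}-\d{2}"))
```

The returned dictionary can be passed directly as the `extra_body` argument of `client.chat.completions.create`.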
Pricing
Structured outputs use GPU Instance pricing:
| Instance | On-Demand | Spot (Save 50%) | Storage |
|---|---|---|---|
| H100 | $4.32/hr | $2.16/hr | $1.00/GB/mo |
View current pricing at GPU Instance > Pricing.
Next Steps
- vLLM Playground - Test structured outputs interactively
- Tool Calling - Implement function calling
- LLM Deployment - Deploy models with grammar support