Structured Outputs

Generate responses in specific formats using JSON Schema, Regex, or Choice constraints

Structured Outputs constrain an LLM's generation so that every response conforms to a format you specify: a JSON Schema, a regular expression, or a fixed set of choices. Test them in the vLLM Playground or integrate them through the OpenAI-compatible API.

Overview

With structured outputs, you can:

  1. Guarantee valid JSON that matches your schema
  2. Constrain outputs to match regex patterns
  3. Limit responses to specific choices
  4. Eliminate parsing errors in your applications

Supported Models

Deploy a model with structured output support via GPU Instance > Create Instance > One-Click Deployment. Models that list the grammar capability support structured outputs:

Model | Provider | Capabilities
GPT-OSS-120B | OpenAI | text, reasoning, tools, grammar
GPT-OSS-20B | OpenAI | text, reasoning, tools, grammar
Qwen3-VL-235B-A22B-Instruct | Alibaba | text, vision, tools, grammar
Qwen3-VL-30B-A3B-Instruct | Alibaba | text, vision, tools, grammar
Qwen3-VL-32B-Instruct | Alibaba | text, vision, tools, grammar
Qwen3-VL-8B-Instruct | Alibaba | text, vision, tools, grammar
Llama 3.3 70B Instruct | Meta | text, tools, grammar
Typhoon2.5-qwen3-30b-a3b | SCB10X | text, tools, grammar
GLM 4.7 Flash | ZAI | text, tools, grammar

Testing in vLLM Playground

Test structured outputs with pre-configured format presets in the Playground.

Accessing Structured Outputs

  1. Deploy a model with grammar support
  2. Navigate to GPU Instance > Instances
  3. Click View on your vLLM instance
  4. Select the Playground tab
  5. Toggle Structured Output to ON

Output Format Types

Type | Description
JSON Schema | Define object structure with properties and types
Regex | Constrain output to match a regex pattern
Choice | Limit output to specific options

Available Presets

The Playground includes pre-configured format presets:

Preset | Type | Description
Person Info | JSON Schema | Extract name, age, occupation, email
Sentiment Analysis | JSON Schema | Analyze sentiment with confidence score
Product Review | JSON Schema | Extract product review details
Simple JSON | JSON Schema | Basic key-value structure
Email Pattern | Regex | Match email format
Yes/No Choice | Choice | Binary response constraint
Rating Choice | Choice | Rating scale constraint

Try It

  • "Extract info: John Smith is a 32-year-old software engineer at john.smith@example.com"
  • "Analyze sentiment: This product is amazing, I love it!"
  • "Rate this: The service was okay but could be better"

API Usage

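Integrate structured outputs via the OpenAI-compatible API. Each constraint is passed through the extra_body argument, using the guided_json, guided_regex, or guided_choice key.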

JSON Schema Format

from openai import OpenAI

# Replace {instance_id} with the ID of your deployed vLLM instance.
client = OpenAI(
    api_key="not-needed",
    base_url="https://proxy-instance.float16.cloud/{instance_id}/3900/v1"
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {"role": "user", "content": "Extract info: John is 30 years old and works as an engineer"}
    ],
    extra_body={
        "guided_json": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
                "occupation": {"type": "string"}
            },
            "required": ["name", "age", "occupation"]
        }
    }
)

print(response.choices[0].message.content)
# {"name": "John", "age": 30, "occupation": "engineer"}
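
The content comes back as a JSON string, so it still has to be decoded before use. The following is a minimal sketch under that assumption; parse_person is a hypothetical helper name used for illustration, not part of the OpenAI client or vLLM.

```python
import json

# Mirrors the "required" list in the schema above.
REQUIRED_KEYS = ("name", "age", "occupation")


def parse_person(content: str) -> dict:
    """Decode the model's JSON string and confirm the required keys are present."""
    data = json.loads(content)  # raises json.JSONDecodeError on incomplete JSON
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# In practice the argument would be response.choices[0].message.content.
person = parse_person('{"name": "John", "age": 30, "occupation": "engineer"}')
print(person["name"], person["age"])
```

The extra check is defensive: a response that stops early, for example because it reached a token limit, may not be complete JSON even when the schema constraint is active.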

Regex Pattern

# Reuses the client created in the JSON Schema example.
response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {"role": "user", "content": "Generate an email address for John Smith"}
    ],
    extra_body={
        "guided_regex": r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
    }
)

print(response.choices[0].message.content)
# john.smith@example.com
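
To confirm on the client side that a returned string satisfies the pattern, the same regex can be applied with Python's re module. This is a sketch that assumes the constraint is meant to cover the entire output; is_full_match is a hypothetical helper name.

```python
import re

# Same pattern as the one passed to guided_regex above.
EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def is_full_match(text: str) -> bool:
    """Return True only if the whole string matches the pattern."""
    return EMAIL_PATTERN.fullmatch(text) is not None


print(is_full_match("john.smith@example.com"))        # True
print(is_full_match("Sure! john.smith@example.com"))  # False
```

fullmatch is used rather than search so that extra text around the address counts as a failure.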

Choice Constraint

# Reuses the client created in the JSON Schema example.
response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {"role": "user", "content": "Is this review positive or negative: 'Great product!'"}
    ],
    extra_body={
        "guided_choice": ["positive", "negative", "neutral"]
    }
)

print(response.choices[0].message.content)
# positive
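
Because the output is one of a fixed set of strings, it can be mapped directly onto an enum. The sketch below assumes that approach; the Sentiment class is a hypothetical name introduced for illustration.

```python
from enum import Enum


class Sentiment(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


# The enum values double as the list passed to guided_choice.
choices = [member.value for member in Sentiment]

# In practice the argument would be response.choices[0].message.content.
label = Sentiment("positive")  # raises ValueError for any string outside the enum
print(choices, label)
```

Keeping the choices in one enum means the request and the code that interprets the response cannot drift apart.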

Schema Examples

Sentiment Analysis

schema = {
    "type": "object",
    "properties": {
        "sentiment": {
            "type": "string",
            "enum": ["positive", "negative", "neutral"]
        },
        "confidence": {
            "type": "number"
        },
        "summary": {
            "type": "string"
        }
    },
    "required": ["sentiment", "confidence"]
}

response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {"role": "user", "content": "Analyze: This product exceeded my expectations!"}
    ],
    extra_body={"guided_json": schema}
)
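
The schema above types confidence as a plain number, so it does not by itself bound the value. A minimal sketch of a client-side check follows, assuming a 0 to 1 scale (the source does not state the scale); parse_sentiment is a hypothetical helper name.

```python
import json


def parse_sentiment(content: str) -> dict:
    """Decode a sentiment response and range-check the confidence score."""
    data = json.loads(content)
    confidence = data["confidence"]
    # Assumption: confidence is meant to lie in [0, 1]. The schema only says
    # "number", so the range is checked here on the client.
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    # "summary" is not listed in "required", so it may be absent.
    data.setdefault("summary", None)
    return data


# Hypothetical response content, for illustration only.
result = parse_sentiment('{"sentiment": "positive", "confidence": 0.97}')
print(result["sentiment"], result["confidence"], result["summary"])
```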

Entity Extraction

schema = {
    "type": "object",
    "properties": {
        "entities": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "type": {
                        "type": "string",
                        "enum": ["person", "organization", "location", "date"]
                    }
                },
                "required": ["text", "type"]
            }
        }
    },
    "required": ["entities"]
}
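
Once a response matching this schema is decoded, the entities can be grouped by type. The following sketch uses a made-up sample string in place of a real response; group_entities is a hypothetical helper name.

```python
import json
from collections import defaultdict


def group_entities(content: str) -> dict[str, list[str]]:
    """Group extracted entity texts by their "type" field."""
    data = json.loads(content)
    grouped = defaultdict(list)
    for entity in data["entities"]:
        grouped[entity["type"]].append(entity["text"])
    return dict(grouped)


# Hypothetical response content, for illustration only.
sample = (
    '{"entities": ['
    '{"text": "Ada Lovelace", "type": "person"}, '
    '{"text": "London", "type": "location"}]}'
)
print(group_entities(sample))
# {'person': ['Ada Lovelace'], 'location': ['London']}
```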

Product Review

schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "rating": {"type": "integer"},
        "pros": {
            "type": "array",
            "items": {"type": "string"}
        },
        "cons": {
            "type": "array",
            "items": {"type": "string"}
        },
        "recommendation": {"type": "boolean"}
    },
    "required": ["product_name", "rating", "recommendation"]
}
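
Since pros and cons are not in the required list, code that consumes the response should tolerate their absence. One way to do that, sketched here with a hypothetical ProductReview dataclass and sample input:

```python
import json
from dataclasses import dataclass, field


@dataclass
class ProductReview:
    product_name: str
    rating: int
    recommendation: bool
    pros: list[str] = field(default_factory=list)
    cons: list[str] = field(default_factory=list)


def parse_review(content: str) -> ProductReview:
    """Decode a review response; optional lists default to empty."""
    data = json.loads(content)
    return ProductReview(
        product_name=data["product_name"],
        rating=data["rating"],
        recommendation=data["recommendation"],
        pros=data.get("pros", []),
        cons=data.get("cons", []),
    )


# Hypothetical response content, for illustration only.
review = parse_review(
    '{"product_name": "Desk Lamp", "rating": 4, "recommendation": true, "pros": ["bright"]}'
)
print(review)
```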

Best Practices

Keep Schemas Simple

Start with the smallest set of required fields, and add more once the output is reliable:

# Good - Start simple
schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"}
    },
    "required": ["answer"]
}

Use Enums for Categorical Values

# Good - Constrain categorical fields with an enum
properties = {
    "status": {
        "type": "string",
        "enum": ["pending", "approved", "rejected"]
    }
}

Handle Optional Fields

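# Leave optional fields out of "required" and allow null,
# so a missing value can be omitted or returned as null.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "nickname": {"type": ["string", "null"]}
    },
    "required": ["name"]
}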
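
With a schema like this, the optional field may be missing or null in the response, and the reading code should treat both cases the same way. A minimal sketch, with display_name as a hypothetical helper:

```python
import json


def display_name(content: str) -> str:
    """Prefer the nickname when present, otherwise fall back to the name."""
    data = json.loads(content)
    nickname = data.get("nickname")  # None when the key is absent or JSON null
    return nickname if nickname else data["name"]


print(display_name('{"name": "Jonathan", "nickname": "Jon"}'))  # Jon
print(display_name('{"name": "Jonathan", "nickname": null}'))   # Jonathan
print(display_name('{"name": "Jonathan"}'))                     # Jonathan
```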

Comparison of Output Formats

Feature | JSON Schema | Regex | Choice
Structured data | Yes | No | No
Pattern matching | Limited | Yes | No
Fixed options | Via enum | No | Yes
Nested objects | Yes | No | No
Arrays | Yes | No | No
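
The three formats map onto the three extra_body keys used in the API examples. The sketch below wraps that mapping in a hypothetical build_extra_body helper and assumes a request carries exactly one constraint at a time.

```python
def build_extra_body(*, json_schema=None, regex=None, choices=None) -> dict:
    """Return an extra_body dict containing exactly one constraint."""
    options = {
        "guided_json": json_schema,
        "guided_regex": regex,
        "guided_choice": choices,
    }
    selected = {key: value for key, value in options.items() if value is not None}
    # Assumption: only one constraint type is sent per request.
    if len(selected) != 1:
        raise ValueError("provide exactly one of json_schema, regex, or choices")
    return selected


print(build_extra_body(choices=["yes", "no"]))
# {'guided_choice': ['yes', 'no']}
```

The result can be passed as extra_body=build_extra_body(...) in client.chat.completions.create.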

Pricing

Structured outputs use GPU Instance pricing:

Instance | On-Demand | Spot (Save 50%) | Storage
H100 | $4.32/hr | $2.16/hr | $1.00/GB/mo

View current pricing at GPU Instance > Pricing.
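
For a rough budget, the rates in the table can be combined as shown below. This is a sketch only: it assumes cost scales linearly with hours and with GB-months, which the source does not confirm, and estimate_cost is a hypothetical helper.

```python
# Rates copied from the table above; they may change over time.
ON_DEMAND_PER_HOUR = 4.32
SPOT_PER_HOUR = 2.16
STORAGE_PER_GB_MONTH = 1.00


def estimate_cost(hours: float, storage_gb: float = 0.0,
                  months: float = 0.0, spot: bool = False) -> float:
    """Estimate H100 cost in USD, assuming linear billing for compute and storage."""
    hourly_rate = SPOT_PER_HOUR if spot else ON_DEMAND_PER_HOUR
    total = hours * hourly_rate + storage_gb * months * STORAGE_PER_GB_MONTH
    return round(total, 2)


print(estimate_cost(10))             # 10 on-demand hours
print(estimate_cost(10, spot=True))  # 10 spot hours
```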

Last updated: February 1, 2025