
Structured Outputs with Pydantic: Type-Safe Agent Responses

Learn how to use Pydantic models with the OpenAI Agents SDK output_type parameter to get type-safe, validated, structured JSON responses from your agents.

Why Structured Outputs Matter

By default, agents return free-form text. That works for chatbots and creative writing, but most production applications need structured data they can programmatically consume — JSON objects that match a specific schema, with validated fields, correct types, and no missing data.

The OpenAI Agents SDK integrates deeply with Pydantic to provide structured outputs. You define a Pydantic model, set it as the agent's output_type, and the SDK guarantees the agent's response conforms to your schema.

Basic Structured Output

Define a Pydantic model and pass it as output_type:

from pydantic import BaseModel
from agents import Agent, Runner

class MovieReview(BaseModel):
    title: str
    rating: float
    pros: list[str]
    cons: list[str]
    summary: str

agent = Agent(
    name="Movie Critic",
    instructions="You are a movie critic. Analyze the given movie and provide a structured review.",
    output_type=MovieReview,
)

result = Runner.run_sync(agent, "Review the movie Inception (2010)")
review: MovieReview = result.final_output_as(MovieReview)

print(f"Title: {review.title}")
print(f"Rating: {review.rating}/10")
print(f"Pros: {', '.join(review.pros)}")
print(f"Cons: {', '.join(review.cons)}")
print(f"Summary: {review.summary}")

The LLM is instructed to respond with JSON matching the MovieReview schema. The SDK parses and validates the response before returning it. If the response does not match the schema, the SDK can retry by feeding the validation error back to the model.
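The parse-and-validate step maps directly onto Pydantic's own API. A minimal sketch of what happens to the raw JSON, assuming Pydantic v2 (`model_validate_json` and `ValidationError.errors()`):

```python
from pydantic import BaseModel, ValidationError

class MovieReview(BaseModel):
    title: str
    rating: float
    pros: list[str]
    cons: list[str]
    summary: str

# What the SDK does with the model's raw JSON: parse and validate in one step.
raw = '{"title": "Inception", "rating": 8.8, "pros": ["visuals"], "cons": ["dense plot"], "summary": "A heist inside dreams."}'
review = MovieReview.model_validate_json(raw)

# A response missing required fields raises a ValidationError naming each one,
# which is exactly the message that can be fed back to the model on a retry.
try:
    MovieReview.model_validate_json('{"title": "Inception"}')
    missing_fields = []
except ValidationError as exc:
    missing_fields = [err["loc"][0] for err in exc.errors()]
```

The error message lists every missing or mistyped field, so a retry prompt can be precise rather than "try again".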

How It Works Under the Hood

When you set output_type, the SDK:

  1. Converts the Pydantic model to a JSON schema
  2. Includes the schema in the LLM request as the response_format
  3. The model generates JSON that conforms to the schema (using constrained decoding)
  4. The SDK parses the JSON into a Pydantic model instance
  5. Pydantic validates all field types, constraints, and required fields
  6. The validated model instance is available via result.final_output_as()

This is not "hoping the model returns valid JSON" — the Responses API uses constrained decoding to guarantee the output matches the schema structurally.
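Step 1 of that pipeline is observable directly, assuming Pydantic v2: `model_json_schema()` produces the JSON schema that gets attached to the request, including types, required fields, and any `Field` constraints:

```python
from pydantic import BaseModel, Field

class MovieReview(BaseModel):
    title: str
    rating: float = Field(description="Score from 0 to 10", ge=0, le=10)
    pros: list[str]

# The model class becomes the JSON schema sent as the response_format.
schema = MovieReview.model_json_schema()
# ge/le become JSON Schema "minimum"/"maximum"; fields without defaults
# are listed under "required"; descriptions ride along per property.
```

Printing `schema` is a quick way to check exactly what the model will see before you run the agent.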

Field Descriptions and Constraints

Use Pydantic's Field to add descriptions (which guide the LLM) and constraints (which validate the output):

from pydantic import BaseModel, Field

class LeadScore(BaseModel):
    company_name: str = Field(
        description="The name of the company being scored"
    )
    score: int = Field(
        description="Lead quality score from 0 to 100",
        ge=0,
        le=100,
    )
    confidence: float = Field(
        description="Confidence in the score from 0.0 to 1.0",
        ge=0.0,
        le=1.0,
    )
    reasoning: str = Field(
        description="Brief explanation of why this score was assigned"
    )
    recommended_action: str = Field(
        description="Next best action: 'nurture', 'qualify', 'close', or 'disqualify'",
        pattern="^(nurture|qualify|close|disqualify)$",
    )
    tags: list[str] = Field(
        default_factory=list,
        description="Relevant tags like 'enterprise', 'startup', 'high-value'",
    )

Field descriptions are included in the JSON schema sent to the model, so they act as inline instructions for each field.
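The constraints are enforced at validation time, not just suggested to the model. A small sketch with a trimmed-down `LeadScore` showing a valid instance passing and an out-of-range score being rejected:

```python
from pydantic import BaseModel, Field, ValidationError

class LeadScore(BaseModel):
    score: int = Field(description="Lead quality score from 0 to 100", ge=0, le=100)
    recommended_action: str = Field(pattern="^(nurture|qualify|close|disqualify)$")

ok = LeadScore(score=85, recommended_action="qualify")

# An out-of-range value is rejected before it ever reaches application code.
try:
    LeadScore(score=150, recommended_action="qualify")
    caught = False
except ValidationError:
    caught = True
```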

Nested Models

Complex data structures are supported through nested Pydantic models:

from pydantic import BaseModel, Field

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str
    country: str = "US"

class ContactInfo(BaseModel):
    email: str
    phone: str = Field(default="", description="Phone number with country code")
    address: Address

class ExtractedContact(BaseModel):
    first_name: str
    last_name: str
    job_title: str
    company: str
    contact_info: ContactInfo
    notes: str = Field(description="Any additional context from the source text")

agent = Agent(
    name="Contact Extractor",
    instructions="Extract structured contact information from the provided text. If a field is not mentioned, use reasonable defaults.",
    output_type=ExtractedContact,
)

result = Runner.run_sync(agent, """
Please extract the contact from this email signature:
John Smith | VP of Engineering
Acme Corp
[email protected] | (555) 123-4567
123 Main St, San Francisco, CA 94102
""")

contact = result.final_output_as(ExtractedContact)
print(f"{contact.first_name} {contact.last_name} at {contact.company}")
print(f"Email: {contact.contact_info.email}")
print(f"City: {contact.contact_info.address.city}")
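Validation recurses through nested models: nested objects in the model's JSON become fully typed model instances, not plain dicts. A minimal sketch with trimmed-down versions of the classes above:

```python
from pydantic import BaseModel

class Address(BaseModel):
    city: str
    state: str

class ContactInfo(BaseModel):
    email: str
    address: Address

# Nested objects validate recursively into model instances.
contact = ContactInfo.model_validate({
    "email": "john@acmecorp.com",
    "address": {"city": "San Francisco", "state": "CA"},
})
```

So `contact.contact_info.address.city` in the agent example above is a real attribute access on a typed `Address`, with IDE completion and type checking all the way down.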

Enum Fields for Constrained Choices

Use Python enums or Literal types to restrict field values:

from enum import Enum
from typing import Literal
from pydantic import BaseModel

class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

class BugReport(BaseModel):
    title: str
    severity: Severity
    category: Literal["ui", "api", "database", "auth", "performance"]
    steps_to_reproduce: list[str]
    expected_behavior: str
    actual_behavior: str
    affected_users: int

The enum values are included in the JSON schema, so the model knows exactly which values are valid.
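You can verify this by inspecting the generated schema (Pydantic v2): the `Enum` lands under `$defs` and the `Literal` becomes an inline `enum` on the property:

```python
from enum import Enum
from typing import Literal
from pydantic import BaseModel

class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

class BugReport(BaseModel):
    severity: Severity
    category: Literal["ui", "api", "database", "auth", "performance"]

schema = BugReport.model_json_schema()
# Both constrained fields expose their allowed values in the schema
# the model receives, so invalid choices are ruled out up front.
severity_values = schema["$defs"]["Severity"]["enum"]
category_values = schema["properties"]["category"]["enum"]
```

Subclassing `str` alongside `Enum` matters: it makes the enum serialize as plain strings in JSON.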


Optional Fields

Use Optional for fields that may not always be present:

from typing import Optional
from pydantic import BaseModel

class AnalysisResult(BaseModel):
    main_topic: str
    sentiment: str
    language: str
    detected_entities: list[str]
    translation: Optional[str] = None  # Only present if text is non-English
    profanity_detected: bool
    profanity_examples: Optional[list[str]] = None  # Only if profanity found
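Omitted optional fields fall back to their defaults instead of failing validation, which is what makes this pattern safe for variable source data. A quick sketch with a trimmed-down `AnalysisResult`:

```python
from typing import Optional
from pydantic import BaseModel

class AnalysisResult(BaseModel):
    main_topic: str
    translation: Optional[str] = None  # Only present if text is non-English

# No "translation" key: the default (None) is used, validation still passes.
english = AnalysisResult.model_validate({"main_topic": "pricing"})
translated = AnalysisResult.model_validate(
    {"main_topic": "pricing", "translation": "Pricing details"}
)
```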

TypedDict and Dataclass Support

While Pydantic models are recommended, the SDK also supports TypedDict and dataclass for output types:

from typing import TypedDict

class WeatherData(TypedDict):
    city: str
    temperature: float
    conditions: str
    humidity: int

agent = Agent(
    name="Weather Agent",
    instructions="Provide weather information.",
    output_type=WeatherData,
)

However, Pydantic models offer richer validation, field descriptions, and better error messages. Prefer Pydantic unless you have a specific reason to use TypedDict.
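The practical difference shows up at runtime: `TypedDict` annotations are static-only, while a Pydantic model actually checks the data. A minimal sketch of the contrast:

```python
from typing import TypedDict
from pydantic import BaseModel, ValidationError

class WeatherDict(TypedDict):
    city: str
    temperature: float

class WeatherModel(BaseModel):
    city: str
    temperature: float

# A TypedDict performs no runtime checks: a wrong type slips straight through.
loose: WeatherDict = {"city": "Austin", "temperature": "hot"}  # type: ignore[typeddict-item]

# The equivalent Pydantic model rejects the same payload at runtime.
try:
    WeatherModel.model_validate({"city": "Austin", "temperature": "hot"})
    rejected = False
except ValidationError:
    rejected = True
```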

Combining Structured Output with Tools

Agents can use tools to gather information and then produce a structured output:

from pydantic import BaseModel, Field
from agents import Agent, Runner, function_tool

@function_tool
def get_stock_price(ticker: str) -> str:
    """Get the current stock price.

    Args:
        ticker: Stock ticker symbol.
    """
    prices = {"AAPL": 187.50, "GOOGL": 141.25, "MSFT": 415.80}
    price = prices.get(ticker, 0)
    return f"{ticker}: ${price}"

@function_tool
def get_company_financials(ticker: str) -> str:
    """Get key financial metrics for a company.

    Args:
        ticker: Stock ticker symbol.
    """
    return f"{ticker}: Revenue $394B, Net Income $97B, P/E 29.5, Market Cap $2.9T"

class InvestmentAnalysis(BaseModel):
    ticker: str
    current_price: float
    recommendation: str = Field(
        description="One of: strong_buy, buy, hold, sell, strong_sell"
    )
    target_price: float
    risk_level: str = Field(description="low, medium, or high")
    key_factors: list[str]
    summary: str

analyst = Agent(
    name="Investment Analyst",
    instructions="""Analyze stocks using available tools. Provide a structured
investment analysis based on the data you gather.""",
    tools=[get_stock_price, get_company_financials],
    output_type=InvestmentAnalysis,
)

result = Runner.run_sync(analyst, "Analyze Apple stock (AAPL)")
analysis = result.final_output_as(InvestmentAnalysis)
print(f"Recommendation: {analysis.recommendation}")
print(f"Target Price: ${analysis.target_price}")

The agent loop works exactly the same — the agent calls tools, gathers data, and when it is ready to produce a final response, it formats it according to the Pydantic schema.

Handling Validation Errors

If the model produces JSON that does not validate against the Pydantic model, the SDK can retry by feeding the validation error back to the model:

agent = Agent(
    name="Strict Extractor",
    instructions="Extract data precisely. All fields are required and must be accurate.",
    output_type=StrictDataModel,
)

# The SDK automatically retries with the validation error message
# if the first attempt fails validation

In practice, constrained decoding on the Responses API makes structural validation failures extremely rare. Most validation failures come from semantic issues (wrong value in a field) rather than structural issues (missing field or wrong type).

Real-World Example: Resume Parser

Here is a production-realistic example that extracts structured data from unstructured text:

import asyncio
from pydantic import BaseModel, Field
from agents import Agent, Runner

class Education(BaseModel):
    institution: str
    degree: str
    field_of_study: str
    graduation_year: int = Field(ge=1950, le=2030)

class WorkExperience(BaseModel):
    company: str
    title: str
    start_year: int
    end_year: int = Field(default=0, description="0 means current/present")
    highlights: list[str]

class ParsedResume(BaseModel):
    full_name: str
    email: str
    phone: str = ""
    location: str = ""
    summary: str = Field(description="Professional summary in 2-3 sentences")
    skills: list[str]
    education: list[Education]
    experience: list[WorkExperience]
    years_of_experience: int
    seniority_level: str = Field(
        description="junior, mid, senior, or lead"
    )

resume_parser = Agent(
    name="Resume Parser",
    instructions="""Parse the provided resume text into structured data.
Extract all available information. For missing fields, use empty strings or
reasonable defaults. Calculate total years of experience from work history.""",
    output_type=ParsedResume,
    model="gpt-4o",
)

async def parse_resume(resume_text: str) -> ParsedResume:
    result = await Runner.run(resume_parser, resume_text)
    return result.final_output_as(ParsedResume)

This pattern is extremely common in production AI applications: take unstructured input, use an LLM to extract and structure the data, and get a validated Pydantic model that your application can consume with full type safety.

Best Practices

  1. Add Field descriptions for every field. They guide the LLM and serve as documentation.

  2. Use constraints (ge, le, pattern, min_length) to catch invalid values before they enter your application.

  3. Keep models focused. A model with 30 fields is harder for the LLM to fill correctly than three models with 10 fields each.

  4. Use Optional for truly optional fields. Do not make fields required if the source data might not contain them.

  5. Test with edge cases. Try inputs where fields are ambiguous or missing to see how the model handles them.


Source: OpenAI Agents SDK — Structured Outputs

