
Structured Outputs with Pydantic: Type-Safe Agent Responses

Learn how to use Pydantic models with the OpenAI Agents SDK output_type parameter to get type-safe, validated, structured JSON responses from your agents.

Why Structured Outputs Matter

By default, agents return free-form text. That works for chatbots and creative writing, but most production applications need structured data they can programmatically consume — JSON objects that match a specific schema, with validated fields, correct types, and no missing data.

The OpenAI Agents SDK integrates deeply with Pydantic to provide structured outputs. You define a Pydantic model, set it as the agent's output_type, and the SDK guarantees the agent's response conforms to your schema.

Basic Structured Output

Define a Pydantic model and pass it as output_type:

from pydantic import BaseModel
from agents import Agent, Runner

class MovieReview(BaseModel):
    title: str
    rating: float
    pros: list[str]
    cons: list[str]
    summary: str

agent = Agent(
    name="Movie Critic",
    instructions="You are a movie critic. Analyze the given movie and provide a structured review.",
    output_type=MovieReview,
)

result = Runner.run_sync(agent, "Review the movie Inception (2010)")
review: MovieReview = result.final_output_as(MovieReview)

print(f"Title: {review.title}")
print(f"Rating: {review.rating}/10")
print(f"Pros: {', '.join(review.pros)}")
print(f"Cons: {', '.join(review.cons)}")
print(f"Summary: {review.summary}")

The LLM is instructed to respond with JSON matching the MovieReview schema. The SDK parses and validates the response before returning it. If the response does not match the schema, the SDK can retry by feeding the validation error back to the model.
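The parse-and-validate step maps directly onto Pydantic's own API. A minimal sketch of what happens to the raw JSON, assuming Pydantic v2 (`model_validate_json` and `ValidationError.errors()`):

```python
from pydantic import BaseModel, ValidationError

class MovieReview(BaseModel):
    title: str
    rating: float
    pros: list[str]
    cons: list[str]
    summary: str

# What the SDK does with the model's raw JSON: parse and validate in one step.
raw = '{"title": "Inception", "rating": 8.8, "pros": ["visuals"], "cons": ["dense plot"], "summary": "A heist inside dreams."}'
review = MovieReview.model_validate_json(raw)

# A response missing required fields raises a ValidationError naming each one,
# which is exactly the message that can be fed back to the model on a retry.
try:
    MovieReview.model_validate_json('{"title": "Inception"}')
    missing_fields = []
except ValidationError as exc:
    missing_fields = [err["loc"][0] for err in exc.errors()]
```

The error message lists every missing or mistyped field, so a retry prompt can be precise rather than "try again".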

How It Works Under the Hood

When you set output_type, the SDK:

  1. Converts the Pydantic model to a JSON schema
  2. Includes the schema in the LLM request as the response_format
  3. The model generates JSON that conforms to the schema (using constrained decoding)
  4. The SDK parses the JSON into a Pydantic model instance
  5. Pydantic validates all field types, constraints, and required fields
  6. The validated model instance is available via result.final_output_as()

This is not "hoping the model returns valid JSON" — the Responses API uses constrained decoding to guarantee the output matches the schema structurally.
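Step 1 of that pipeline is observable directly, assuming Pydantic v2: `model_json_schema()` produces the JSON schema that gets attached to the request, including types, required fields, and any `Field` constraints:

```python
from pydantic import BaseModel, Field

class MovieReview(BaseModel):
    title: str
    rating: float = Field(description="Score from 0 to 10", ge=0, le=10)
    pros: list[str]

# The model class becomes the JSON schema sent as the response_format.
schema = MovieReview.model_json_schema()
# ge/le become JSON Schema "minimum"/"maximum"; fields without defaults
# are listed under "required"; descriptions ride along per property.
```

Printing `schema` is a quick way to check exactly what the model will see before you run the agent.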

Field Descriptions and Constraints

Use Pydantic's Field to add descriptions (which guide the LLM) and constraints (which validate the output):

from pydantic import BaseModel, Field

class LeadScore(BaseModel):
    company_name: str = Field(
        description="The name of the company being scored"
    )
    score: int = Field(
        description="Lead quality score from 0 to 100",
        ge=0,
        le=100,
    )
    confidence: float = Field(
        description="Confidence in the score from 0.0 to 1.0",
        ge=0.0,
        le=1.0,
    )
    reasoning: str = Field(
        description="Brief explanation of why this score was assigned"
    )
    recommended_action: str = Field(
        description="Next best action: 'nurture', 'qualify', 'close', or 'disqualify'",
        pattern="^(nurture|qualify|close|disqualify)$",
    )
    tags: list[str] = Field(
        default_factory=list,
        description="Relevant tags like 'enterprise', 'startup', 'high-value'",
    )

Field descriptions are included in the JSON schema sent to the model, so they act as inline instructions for each field.
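The constraints are enforced at validation time, not just suggested to the model. A small sketch with a trimmed-down `LeadScore` showing a valid instance passing and an out-of-range score being rejected:

```python
from pydantic import BaseModel, Field, ValidationError

class LeadScore(BaseModel):
    score: int = Field(description="Lead quality score from 0 to 100", ge=0, le=100)
    recommended_action: str = Field(pattern="^(nurture|qualify|close|disqualify)$")

ok = LeadScore(score=85, recommended_action="qualify")

# An out-of-range value is rejected before it ever reaches application code.
try:
    LeadScore(score=150, recommended_action="qualify")
    caught = False
except ValidationError:
    caught = True
```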

Nested Models

Complex data structures are supported through nested Pydantic models:

from pydantic import BaseModel, Field

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str
    country: str = "US"

class ContactInfo(BaseModel):
    email: str
    phone: str = Field(default="", description="Phone number with country code")
    address: Address

class ExtractedContact(BaseModel):
    first_name: str
    last_name: str
    job_title: str
    company: str
    contact_info: ContactInfo
    notes: str = Field(description="Any additional context from the source text")

agent = Agent(
    name="Contact Extractor",
    instructions="Extract structured contact information from the provided text. If a field is not mentioned, use reasonable defaults.",
    output_type=ExtractedContact,
)

result = Runner.run_sync(agent, """
Please extract the contact from this email signature:
John Smith | VP of Engineering
Acme Corp
[email protected] | (555) 123-4567
123 Main St, San Francisco, CA 94102
""")

contact = result.final_output_as(ExtractedContact)
print(f"{contact.first_name} {contact.last_name} at {contact.company}")
print(f"Email: {contact.contact_info.email}")
print(f"City: {contact.contact_info.address.city}")
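Validation recurses through nested models: nested objects in the model's JSON become fully typed model instances, not plain dicts. A minimal sketch with trimmed-down versions of the classes above:

```python
from pydantic import BaseModel

class Address(BaseModel):
    city: str
    state: str

class ContactInfo(BaseModel):
    email: str
    address: Address

# Nested objects validate recursively into model instances.
contact = ContactInfo.model_validate({
    "email": "john@acmecorp.com",
    "address": {"city": "San Francisco", "state": "CA"},
})
```

So `contact.contact_info.address.city` in the agent example above is a real attribute access on a typed `Address`, with IDE completion and type checking all the way down.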

Enum Fields for Constrained Choices

Use Python enums or Literal types to restrict field values:

from enum import Enum
from typing import Literal
from pydantic import BaseModel

class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

class BugReport(BaseModel):
    title: str
    severity: Severity
    category: Literal["ui", "api", "database", "auth", "performance"]
    steps_to_reproduce: list[str]
    expected_behavior: str
    actual_behavior: str
    affected_users: int

The enum values are included in the JSON schema, so the model knows exactly which values are valid.
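You can verify this by inspecting the generated schema (Pydantic v2): the `Enum` lands under `$defs` and the `Literal` becomes an inline `enum` on the property:

```python
from enum import Enum
from typing import Literal
from pydantic import BaseModel

class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

class BugReport(BaseModel):
    severity: Severity
    category: Literal["ui", "api", "database", "auth", "performance"]

schema = BugReport.model_json_schema()
# Both constrained fields expose their allowed values in the schema
# the model receives, so invalid choices are ruled out up front.
severity_values = schema["$defs"]["Severity"]["enum"]
category_values = schema["properties"]["category"]["enum"]
```

Subclassing `str` alongside `Enum` matters: it makes the enum serialize as plain strings in JSON.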


Optional Fields

Use Optional for fields that may not always be present:

from typing import Optional
from pydantic import BaseModel

class AnalysisResult(BaseModel):
    main_topic: str
    sentiment: str
    language: str
    detected_entities: list[str]
    translation: Optional[str] = None  # Only present if text is non-English
    profanity_detected: bool
    profanity_examples: Optional[list[str]] = None  # Only if profanity found
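Omitted optional fields fall back to their defaults instead of failing validation, which is what makes this pattern safe for variable source data. A quick sketch with a trimmed-down `AnalysisResult`:

```python
from typing import Optional
from pydantic import BaseModel

class AnalysisResult(BaseModel):
    main_topic: str
    translation: Optional[str] = None  # Only present if text is non-English

# No "translation" key: the default (None) is used, validation still passes.
english = AnalysisResult.model_validate({"main_topic": "pricing"})
translated = AnalysisResult.model_validate(
    {"main_topic": "pricing", "translation": "Pricing details"}
)
```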

TypedDict and Dataclass Support

While Pydantic models are recommended, the SDK also supports TypedDict and dataclass for output types:

from typing import TypedDict

class WeatherData(TypedDict):
    city: str
    temperature: float
    conditions: str
    humidity: int

agent = Agent(
    name="Weather Agent",
    instructions="Provide weather information.",
    output_type=WeatherData,
)

However, Pydantic models offer richer validation, field descriptions, and better error messages. Prefer Pydantic unless you have a specific reason to use TypedDict.
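The practical difference shows up at runtime: `TypedDict` annotations are static-only, while a Pydantic model actually checks the data. A minimal sketch of the contrast:

```python
from typing import TypedDict
from pydantic import BaseModel, ValidationError

class WeatherDict(TypedDict):
    city: str
    temperature: float

class WeatherModel(BaseModel):
    city: str
    temperature: float

# A TypedDict performs no runtime checks: a wrong type slips straight through.
loose: WeatherDict = {"city": "Austin", "temperature": "hot"}  # type: ignore[typeddict-item]

# The equivalent Pydantic model rejects the same payload at runtime.
try:
    WeatherModel.model_validate({"city": "Austin", "temperature": "hot"})
    rejected = False
except ValidationError:
    rejected = True
```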

Combining Structured Output with Tools

Agents can use tools to gather information and then produce a structured output:

from pydantic import BaseModel, Field
from agents import Agent, Runner, function_tool

@function_tool
def get_stock_price(ticker: str) -> str:
    """Get the current stock price.

    Args:
        ticker: Stock ticker symbol.
    """
    prices = {"AAPL": 187.50, "GOOGL": 141.25, "MSFT": 415.80}
    price = prices.get(ticker, 0)
    return f"{ticker}: ${price}"

@function_tool
def get_company_financials(ticker: str) -> str:
    """Get key financial metrics for a company.

    Args:
        ticker: Stock ticker symbol.
    """
    return f"{ticker}: Revenue $394B, Net Income $97B, P/E 29.5, Market Cap $2.9T"

class InvestmentAnalysis(BaseModel):
    ticker: str
    current_price: float
    recommendation: str = Field(
        description="One of: strong_buy, buy, hold, sell, strong_sell"
    )
    target_price: float
    risk_level: str = Field(description="low, medium, or high")
    key_factors: list[str]
    summary: str

analyst = Agent(
    name="Investment Analyst",
    instructions="""Analyze stocks using available tools. Provide a structured
investment analysis based on the data you gather.""",
    tools=[get_stock_price, get_company_financials],
    output_type=InvestmentAnalysis,
)

result = Runner.run_sync(analyst, "Analyze Apple stock (AAPL)")
analysis = result.final_output_as(InvestmentAnalysis)
print(f"Recommendation: {analysis.recommendation}")
print(f"Target Price: ${analysis.target_price}")

The agent loop works exactly the same — the agent calls tools, gathers data, and when it is ready to produce a final response, it formats it according to the Pydantic schema.

Handling Validation Errors

If the model produces JSON that does not validate against the Pydantic model, the SDK can retry by feeding the validation error back to the model:

agent = Agent(
    name="Strict Extractor",
    instructions="Extract data precisely. All fields are required and must be accurate.",
    output_type=StrictDataModel,
)

# The SDK automatically retries with the validation error message
# if the first attempt fails validation

In practice, constrained decoding on the Responses API makes structural validation failures extremely rare. Most validation failures come from semantic issues (wrong value in a field) rather than structural issues (missing field or wrong type).

Real-World Example: Resume Parser

Here is a production-realistic example that extracts structured data from unstructured text:

import asyncio
from pydantic import BaseModel, Field
from agents import Agent, Runner

class Education(BaseModel):
    institution: str
    degree: str
    field_of_study: str
    graduation_year: int = Field(ge=1950, le=2030)

class WorkExperience(BaseModel):
    company: str
    title: str
    start_year: int
    end_year: int = Field(default=0, description="0 means current/present")
    highlights: list[str]

class ParsedResume(BaseModel):
    full_name: str
    email: str
    phone: str = ""
    location: str = ""
    summary: str = Field(description="Professional summary in 2-3 sentences")
    skills: list[str]
    education: list[Education]
    experience: list[WorkExperience]
    years_of_experience: int
    seniority_level: str = Field(
        description="junior, mid, senior, or lead"
    )

resume_parser = Agent(
    name="Resume Parser",
    instructions="""Parse the provided resume text into structured data.
Extract all available information. For missing fields, use empty strings or
reasonable defaults. Calculate total years of experience from work history.""",
    output_type=ParsedResume,
    model="gpt-4o",
)

async def parse_resume(resume_text: str) -> ParsedResume:
    result = await Runner.run(resume_parser, resume_text)
    return result.final_output_as(ParsedResume)

This pattern is extremely common in production AI applications: take unstructured input, use an LLM to extract and structure the data, and get a validated Pydantic model that your application can consume with full type safety.

Best Practices

  1. Add Field descriptions for every field. They guide the LLM and serve as documentation.

  2. Use constraints (ge, le, pattern, min_length) to catch invalid values before they enter your application.

  3. Keep models focused. A model with 30 fields is harder for the LLM to fill correctly than three models with 10 fields each.

  4. Use Optional for truly optional fields. Do not make fields required if the source data might not contain them.

  5. Test with edge cases. Try inputs where fields are ambiguous or missing to see how the model handles them.


Source: OpenAI Agents SDK — Structured Outputs

