Learn Agentic AI

Pydantic Models for LLM Output: Type-Safe AI Responses in Python

Learn how to use Pydantic BaseModel, Field validators, and nested models to parse and validate LLM responses into type-safe Python objects. Build AI pipelines that fail loudly and predictably on malformed output instead of crashing downstream.

Why Type Safety Matters for LLM Outputs

Large language models return strings. Sometimes that string is valid JSON, sometimes it is almost-valid JSON with trailing commas, and sometimes the model ignores your formatting instructions entirely. If your application blindly calls json.loads() on raw LLM output, you are one creative hallucination away from a runtime crash.

Pydantic solves this by letting you define a Python class that describes exactly what your data should look like. When you parse LLM output through a Pydantic model, you get automatic type coercion, validation, and clear error messages when the data does not match your expectations.

Defining a Basic Output Model

Start with a simple model that describes a structured answer from an LLM:

from pydantic import BaseModel, Field
from typing import List, Optional

class AnalysisResult(BaseModel):
    sentiment: str = Field(description="positive, negative, or neutral")
    confidence: float = Field(ge=0.0, le=1.0, description="Confidence score")
    key_phrases: List[str] = Field(description="Important phrases from the text")
    summary: Optional[str] = Field(default=None, description="Brief summary")

The Field function adds constraints and descriptions. The ge and le parameters enforce that confidence stays between 0 and 1. The description strings serve double duty: they document your code and they can be fed back to the LLM as schema instructions.

Parsing Raw LLM Responses

Here is how you parse a JSON string from an LLM into your model:

import json

raw_response = '''
{
  "sentiment": "positive",
  "confidence": 0.92,
  "key_phrases": ["excellent product", "fast shipping"],
  "summary": "Customer is satisfied with purchase."
}
'''

result = AnalysisResult.model_validate_json(raw_response)
print(result.sentiment)      # "positive"
print(result.confidence)     # 0.92
print(result.key_phrases)    # ["excellent product", "fast shipping"]

If the LLM returns a confidence of 1.5, Pydantic raises a ValidationError with a clear message explaining the constraint violation. No silent failures.
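A minimal sketch of catching that failure, reusing the AnalysisResult model from above (redeclared here so the snippet runs standalone):

```python
from typing import List, Optional
from pydantic import BaseModel, Field, ValidationError

class AnalysisResult(BaseModel):
    sentiment: str = Field(description="positive, negative, or neutral")
    confidence: float = Field(ge=0.0, le=1.0)
    key_phrases: List[str]
    summary: Optional[str] = None

bad_response = '{"sentiment": "positive", "confidence": 1.5, "key_phrases": []}'

try:
    AnalysisResult.model_validate_json(bad_response)
except ValidationError as e:
    # Each error names the field, the violated constraint, and the input
    print(e.errors()[0]["loc"])   # ('confidence',)
    print(e.errors()[0]["type"])  # 'less_than_equal'
```

Inspecting errors() programmatically like this is what makes the retry-with-feedback pattern (shown later) possible.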

Nested Models for Complex Structures

Real-world extraction often requires nested data. Define models that compose together:

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str = Field(pattern=r"^\d{5}(-\d{4})?$")

class Person(BaseModel):
    name: str
    age: Optional[int] = Field(default=None, ge=0, le=150)
    email: Optional[str] = None
    address: Optional[Address] = None

class ExtractionResult(BaseModel):
    people: List[Person]
    document_type: str
    extraction_confidence: float = Field(ge=0.0, le=1.0)

When you call ExtractionResult.model_validate_json(llm_output), Pydantic recursively validates every nested object. The zip code regex runs automatically. Ages outside 0-150 are rejected.
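Here is that recursive validation in action, with the models redeclared so the snippet runs standalone (the sample document values are illustrative):

```python
from typing import List, Optional
from pydantic import BaseModel, Field

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str = Field(pattern=r"^\d{5}(-\d{4})?$")

class Person(BaseModel):
    name: str
    age: Optional[int] = Field(default=None, ge=0, le=150)
    email: Optional[str] = None
    address: Optional[Address] = None

class ExtractionResult(BaseModel):
    people: List[Person]
    document_type: str
    extraction_confidence: float = Field(ge=0.0, le=1.0)

llm_output = '''
{
  "people": [
    {"name": "Ada Lovelace",
     "age": 36,
     "address": {"street": "12 St James Sq", "city": "London",
                 "state": "LDN", "zip_code": "90210"}}
  ],
  "document_type": "letter",
  "extraction_confidence": 0.88
}
'''

# One call validates the whole tree: Person, Address, zip regex, age bounds
result = ExtractionResult.model_validate_json(llm_output)
print(result.people[0].address.city)  # "London"
```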


Custom Validators for Domain Logic

Add custom validators when built-in constraints are not enough:

from pydantic import field_validator, model_validator

class InvoiceItem(BaseModel):
    description: str
    quantity: int = Field(gt=0)
    unit_price: float = Field(gt=0)
    total: float

    @field_validator("description")
    @classmethod
    def description_not_empty(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("Description cannot be blank")
        return v.strip()

    @model_validator(mode="after")
    def check_total(self) -> "InvoiceItem":
        expected = round(self.quantity * self.unit_price, 2)
        if abs(self.total - expected) > 0.01:
            raise ValueError(
                f"Total {self.total} does not match "
                f"quantity * unit_price = {expected}"
            )
        return self

The field_validator runs on a single field. The model_validator with mode="after" runs after all fields are parsed, so you can do cross-field checks like verifying that the total equals quantity times price.
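To see both validators fire, here is a quick sketch using the InvoiceItem model exactly as defined above:

```python
from pydantic import (
    BaseModel, Field, ValidationError, field_validator, model_validator
)

class InvoiceItem(BaseModel):
    description: str
    quantity: int = Field(gt=0)
    unit_price: float = Field(gt=0)
    total: float

    @field_validator("description")
    @classmethod
    def description_not_empty(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("Description cannot be blank")
        return v.strip()

    @model_validator(mode="after")
    def check_total(self) -> "InvoiceItem":
        expected = round(self.quantity * self.unit_price, 2)
        if abs(self.total - expected) > 0.01:
            raise ValueError(f"Total {self.total} != {expected}")
        return self

# A consistent item passes; the field validator normalizes whitespace
ok = InvoiceItem(description=" Widget ", quantity=3, unit_price=9.99, total=29.97)
print(ok.description)  # "Widget"

# An inconsistent total is rejected by the model_validator
try:
    InvoiceItem(description="Widget", quantity=3, unit_price=9.99, total=25.00)
except ValidationError as e:
    print("rejected:", e.errors()[0]["msg"])
```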

Generating JSON Schema for the LLM Prompt

One of Pydantic's most powerful features is automatic JSON schema generation. Pass the schema directly to the LLM so it knows exactly what to produce:

schema = AnalysisResult.model_json_schema()
print(json.dumps(schema, indent=2))

prompt = f"""Analyze the following customer review and return your
analysis as JSON matching this exact schema:

{json.dumps(schema, indent=2)}

Review: "The product arrived quickly and works perfectly."
"""

This creates a tight feedback loop: the model sees the schema, generates matching JSON, and Pydantic validates the result. If validation fails, you can retry with the error message included in the prompt.

Handling Partial and Malformed Output

LLMs sometimes return JSON wrapped in markdown code fences or with extra text. Write a helper to clean up common issues:

import re

def parse_llm_json(raw: str, model_class: type[BaseModel]):
    """Extract JSON from LLM output and parse with Pydantic."""
    # Strip markdown code fences, with or without a "json" language tag
    cleaned = re.sub(r"```(?:json)?", "", raw)
    cleaned = cleaned.strip()

    try:
        return model_class.model_validate_json(cleaned)
    except Exception as e:
        # Fall back to Python-literal parsing (handles trailing commas and
        # single quotes, but not JSON's true/false/null keywords)
        try:
            import ast
            data = ast.literal_eval(cleaned)
            return model_class.model_validate(data)
        except Exception:
            raise ValueError(f"Could not parse LLM output: {e}") from e

This two-stage approach handles the most common failure modes: markdown wrapping and minor JSON syntax issues.
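For example, a fenced response parses cleanly through a condensed version of the helper. The Answer model here is just an illustrative stand-in:

```python
import ast
import re
from pydantic import BaseModel

class Answer(BaseModel):
    city: str
    population: int

def parse_llm_json(raw: str, model_class: type[BaseModel]):
    """Extract JSON from LLM output and parse with Pydantic."""
    cleaned = re.sub(r"```(?:json)?", "", raw).strip()
    try:
        return model_class.model_validate_json(cleaned)
    except Exception as e:
        try:
            # Python-literal fallback for trailing commas, single quotes
            return model_class.model_validate(ast.literal_eval(cleaned))
        except Exception:
            raise ValueError(f"Could not parse LLM output: {e}") from e

fenced = '```json\n{"city": "Paris", "population": 2100000}\n```'
result = parse_llm_json(fenced, Answer)
print(result.city)  # "Paris"
```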

FAQ

How does Pydantic v2 differ from v1 for LLM output parsing?

Pydantic v2 introduced model_validate_json() which parses JSON strings directly without an intermediate json.loads() call. It is also significantly faster thanks to the Rust-based core. Use model_validate() for dictionaries and model_validate_json() for raw JSON strings.
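A side-by-side sketch of the two entry points, using a throwaway Item model:

```python
import json
from pydantic import BaseModel

class Item(BaseModel):
    name: str
    price: float

# model_validate_json: raw JSON string goes straight in
a = Item.model_validate_json('{"name": "pen", "price": 1.5}')

# model_validate: an already-decoded dict
b = Item.model_validate(json.loads('{"name": "pen", "price": 1.5}'))

assert a == b  # both yield the same validated model
```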

What happens when the LLM returns fields not in my schema?

By default, Pydantic v2 ignores extra fields. If you want strict parsing, add model_config = ConfigDict(extra="forbid") to your model class. This causes validation to fail if the LLM includes unexpected fields.
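A minimal sketch of strict parsing with a toy model:

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class StrictAnswer(BaseModel):
    model_config = ConfigDict(extra="forbid")
    answer: str

# An extra field the LLM invented now fails instead of passing silently
try:
    StrictAnswer.model_validate_json('{"answer": "42", "reasoning": "..."}')
except ValidationError as e:
    print(e.errors()[0]["type"])  # "extra_forbidden"
```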

Can I use Pydantic models with streaming LLM responses?

Not directly, because streaming delivers partial JSON that is not valid until complete. You need a partial JSON parser to handle incremental tokens. Libraries like instructor handle this by buffering the stream and validating once the JSON object is complete.


#Pydantic #StructuredOutputs #Python #TypeSafety #LLM #AgenticAI #LearnAI #AIEngineering

Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.

