AI Engineering

Pydantic AI in 2026: When Type-Safe Agents Beat Everything Else

Pydantic AI v1.85 ships with MCP, dependency injection, and 16k+ stars. We break down where strict typing pays for itself and where it gets in the way of shipping.

TL;DR — Pydantic AI treats every LLM output as a Pydantic model. That sounds boring until you've shipped your fourth agent and realized validation, dependency injection, and structured outputs were the actual hard parts. v1.85.1 (April 22, 2026) makes it production-default for type-first Python teams.

Why type safety matters for agents

flowchart LR
  User --> Triage["Triage / Supervisor"]
  Triage -->|tool A| A["Specialist A"]
  Triage -->|tool B| B["Specialist B"]
  Triage -->|tool C| C["Specialist C"]
  A --> Mem[(Shared memory · mem0/Letta)]
  B --> Mem
  C --> Mem
  Mem --> Final["Final response"]
CallSphere reference architecture

Most agent failures in production are not "the model hallucinated a tool name" — those get caught instantly. The failures that hurt are:

  • The model returned valid JSON but the amount field is a string instead of a Decimal.
  • A nested object is missing a required field.
  • An enum value is one of the 30 allowed values, but cased differently from your DB enum.

Pydantic AI's pitch is that the same library you already use for FastAPI request validation and DB schemas should validate LLM outputs too. Define a pydantic.BaseModel, hand it to the agent, and the framework retries the model on validation failure with a clear error message.
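The loop can be sketched framework-free; this is a minimal illustration of the mechanism, not Pydantic AI's API (the names ALLOWED, validate, and run_with_retries are made up for the example):

```python
import json

ALLOWED = {"qualified", "not_interested", "callback", "voicemail"}

def validate(raw: str) -> dict:
    # Stand-in for Pydantic model validation.
    data = json.loads(raw)
    if data.get("disposition") not in ALLOWED:
        raise ValueError(f"disposition must be one of {sorted(ALLOWED)}")
    return data

def run_with_retries(call_model, prompt: str, max_retries: int = 2) -> dict:
    messages = [prompt]
    for _ in range(max_retries + 1):
        raw = call_model(messages)
        try:
            return validate(raw)
        except (ValueError, json.JSONDecodeError) as exc:
            # The validation error becomes the corrective follow-up message.
            messages.append(f"Fix your output: {exc}")
    raise RuntimeError("model never produced a valid output")

# Stub model: wrong enum casing on the first try, corrected after feedback.
replies = iter(['{"disposition": "Qualified"}', '{"disposition": "qualified"}'])
result = run_with_retries(lambda msgs: next(replies), "Summarize the call.")
```

The framework does the same thing with a real model call and a real Pydantic ValidationError in place of the stubs.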

What's in Pydantic AI 2026

The library reached its 1.x stable API in late 2025 and now sits at v1.85.1 (April 22, 2026) with 16.5k+ GitHub stars. Highlights:

  • Model-agnostic: OpenAI, Anthropic, Gemini, DeepSeek, Grok, Cohere, Mistral, Bedrock, Vertex AI, Ollama, LiteLLM — same API.
  • Dependency injection via the deps_type parameter — type-safe context for tools and unit tests.
  • MCP client + server built-in. Three transports: MCPServerStreamableHTTP, MCPServerSSE, MCPServerStdio.
  • Capability library ("Pydantic AI Harness") for web search, chain-of-thought, and MCP without bolt-ons.
  • Structured outputs end-to-end with native function calling, JSON schema fallback, or grammar constraints depending on the provider.

Where it shines

Financial/health/regulated workflows. When the cost of a malformed response is a wrong charge or a wrong dose, Pydantic AI's "validate or retry" loop is the cheapest insurance you can buy.

Legacy Python codebases. If your team already runs Pydantic v2 in FastAPI services, the agent code reads like normal application code. No new mental model.

Evaluation pipelines. Because every output is a typed model, snapshot testing and contract testing are trivial.
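Why snapshot testing becomes trivial: a typed output serializes deterministically, so the test is plain equality against a golden value. A sketch, with a dataclass standing in for the validated Pydantic model:

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class CallRecord:
    disposition: str
    next_action: str

# The golden snapshot would normally live in a file checked into the repo.
GOLDEN = {"disposition": "qualified", "next_action": "send pricing deck"}

def matches_snapshot(record: CallRecord) -> bool:
    # Typed outputs serialize deterministically, so equality is the test.
    return asdict(record) == GOLDEN

ok = matches_snapshot(CallRecord("qualified", "send pricing deck"))
```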


Where it gets in the way

Voice / streaming. The validate-or-retry loop adds latency that voice agents can't afford. CallSphere's voice surface uses OpenAI Realtime with lightweight Zod validation in TypeScript instead.

Heavy code-gen agents. If your agent's "output" is hundreds of lines of Python, validating with a Pydantic schema is the wrong tool. Use smolagents or deepagents instead.

Highly experimental loops. When you don't yet know what the output should look like, strict typing fights you. Prototype with raw openai or anthropic SDKs first, then port to Pydantic AI when the schema stabilizes.

How CallSphere uses it

For our healthcare and behavioral-health verticals, the non-voice parts of the workflow — intake form parsing, insurance eligibility extraction, prior-auth letter drafting — run on Pydantic AI agents. The validators reject any output where, say, policy_number doesn't match a known carrier's format, and the agent retries with a corrective system message. That's saved us from at least three "we wrote the wrong number to the EHR" incidents during pilot.

Our Sales browser dialer post-call summary pipeline is also Pydantic AI: the model emits a typed CallSummary with required disposition, next_action, mentioned_competitors[], and buying_signals[]. Downstream systems (Salesforce, HubSpot, our internal pipeline) consume the validated object directly.


Build steps — type-safe call-summary agent

  1. pip install "pydantic-ai[mcp]".
  2. Define your output as a BaseModel with fields, enums, and validators.
  3. Create the agent: agent = Agent("openai:gpt-5", output_type=CallSummary, deps_type=Deps).
  4. Inject deps into tools via the @agent.tool decorator and RunContext[Deps].
  5. Write a unit test using agent.override(model=TestModel()) — no API calls needed.
  6. Mount MCP servers if needed: agent = Agent(..., toolsets=[MCPServerStreamableHTTP(url=...)]).
  7. Deploy behind your existing FastAPI service; the agent is a normal async function.

Code: typed CallSummary agent

from datetime import datetime
from typing import Literal

from pydantic import BaseModel
from pydantic_ai import Agent

class CallSummary(BaseModel):
    disposition: Literal["qualified", "not_interested", "callback", "voicemail"]
    next_action: str
    mentioned_competitors: list[str]
    buying_signals: list[str]
    follow_up_at: datetime | None = None

agent = Agent(
    "openai:gpt-5",
    output_type=CallSummary,
    system_prompt="Summarize the call into the typed schema.",
)

result = await agent.run(transcript)  # inside an async function; or agent.run_sync(transcript)
assert isinstance(result.output, CallSummary)

MCP server side — Pydantic AI as the server

Less talked about: Pydantic AI also lets your own application become an MCP server. You write tools as typed Python functions, expose them over MCPServerStreamableHTTP or MCPServerStdio, and any MCP-aware client (Claude Desktop, Cursor, Cline, your own agent) can mount them.

For CallSphere this means we can ship a "CallSphere MCP server" to customers' internal agents. The server exposes typed tools like get_call_summary(call_id), book_callback(at, contact_id), update_disposition(call_id, value) — same Pydantic models we use internally, automatically reflected as MCP tool schemas. Customers point their Claude Desktop or their internal agents at the server and our entire surface becomes available with strict typing.
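The "schemas reflected automatically" part is worth seeing in miniature. A framework-free sketch of deriving a JSON-Schema-like tool description from a typed Python signature; MCP frameworks do a much richer version of this for you:

```python
import inspect
from typing import get_type_hints

# Minimal Python-type to JSON-Schema-type mapping for the sketch.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn) -> dict:
    """Derive a minimal tool schema from a function's signature."""
    hints = get_type_hints(fn)
    params = {
        name: {"type": PY_TO_JSON.get(hints.get(name), "object")}
        for name in inspect.signature(fn).parameters
    }
    return {"name": fn.__name__, "description": fn.__doc__, "parameters": params}

def get_call_summary(call_id: str):
    """Return the validated summary for a call."""

schema = tool_schema(get_call_summary)
```

The same reflection is why a typed tool and its MCP schema can never drift apart: there is only one source of truth, the Python signature.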


from pydantic_ai.mcp import MCPServerStreamableHTTP

# CallSummary is the Pydantic model defined above; db is your data-access layer.
server = MCPServerStreamableHTTP("/mcp")

@server.tool
async def get_call_summary(call_id: str) -> CallSummary:
    """Return the validated summary for a CallSphere call."""
    return await db.fetch_call_summary(call_id)

The big benefit: every MCP client gets the same JSON Schema we use server-side, with no hand-written schema duplication. That alone has saved us hours per integration.

Sampling — agents inside MCP servers

Agents can also run inside an MCP server and use MCPSamplingModel to route their LLM calls back through the connected MCP client. This is the inverse pattern: the server delegates generation to whatever model the client chose. Useful when you want to ship tools but stay model-agnostic, since your tool's logic runs on the user's API key, not yours.

FAQ

Does it work with Anthropic prompt caching? Yes — the Anthropic provider sets cache breakpoints automatically around stable system prompts and tool definitions.

How does the dependency injection compare to FastAPI's? Same mental model. deps flows through the run, RunContext[Deps] is your handle inside tools and validators.

Can I stream typed outputs? Yes via agent.run_stream — partial validation runs as fields arrive.

How do I demo this on CallSphere? Pick the 14-day trial, choose the Sales product, and the post-call summary pipeline is the demo.

Does it support graph workflows like LangGraph? No — Pydantic AI is single-agent first. For graph topology, mix Pydantic AI agents inside LangGraph nodes. We do this in our healthcare deployment.

What's the testing story? First-class. Use agent.override(model=TestModel()) to swap a deterministic test model in unit tests, no API calls.

Is it FastAPI-friendly? Extremely. Pydantic models flow seamlessly between FastAPI endpoints and Pydantic AI agents — same dependency-injection mental model.
