Deep Agents vs Traditional ReAct Loops: Which One CallSphere Picks, and When

LangChain's deepagents harness brings planning, filesystems, and subagents on top of LangGraph. Here is when to pick deep agents vs a classic ReAct loop.

Deep Agents is LangChain's harness for complex agentic tasks: planning tools, filesystem backend, and the ability to spawn subagents. It uses the same core tool-calling loop as ReAct but adds primitives that traditional ReAct loops lack.

What changed

The 2022 ReAct paper from Yao et al. (Princeton + Google) formalized the core agent loop: think, act, observe, repeat. Until 2025, almost every production agent ran some variant of this loop. The pattern works, but it has limits — long-horizon tasks blow out the context window, tool selection degrades once an agent carries 20+ tools, and there is no native way to spawn subagents.

Deep agents are not a new loop. They are a harness on top of the ReAct loop with three additions:

  1. Planning tool. The agent can write a plan to a virtual scratchpad, refer back to it, and update it. This stabilizes long-horizon work.
  2. Filesystem backend. The agent gets a virtual filesystem to read and write intermediate work products, replacing context bloat with explicit state.
  3. Subagent spawning. The agent can spawn focused subagents for sub-tasks (search, summarize, refactor). Each subagent has its own context.
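The three primitives above need no heavyweight machinery. A minimal sketch in plain Python — class and method names here are illustrative, not the deepagents API:

```python
from dataclasses import dataclass, field

@dataclass
class AgentHarness:
    """Illustrative sketch of the three deep-agent primitives:
    a planning scratchpad, a virtual filesystem, and subagent spawning."""
    plan: list = field(default_factory=list)   # planning-tool state
    fs: dict = field(default_factory=dict)     # virtual filesystem

    # Planning tool: the agent writes a plan it can re-read and update.
    def write_plan(self, steps):
        self.plan = list(steps)

    def check_off(self, step):
        self.plan = [s for s in self.plan if s != step]

    # Filesystem backend: intermediate work products live here,
    # not in the model's context window.
    def write_file(self, path, content):
        self.fs[path] = content

    def read_file(self, path):
        return self.fs.get(path, "")

    # Subagent spawning: a child harness with its own empty context.
    def spawn_subagent(self):
        return AgentHarness()

harness = AgentHarness()
harness.write_plan(["research vendors", "compare", "write summary"])
harness.write_file("notes/vendor_a.md", "Pricing: ...")
child = harness.spawn_subagent()
assert child.fs == {}  # subagent starts with focused, empty state
```

The point of the sketch: all three additions are explicit state outside the prompt, which is why they stabilize long-horizon work.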

Anthropic also published an Agent SDK alongside Claude 4.6 with similar primitives — extended thinking, computer use, MCP integration, persistent memory. Two implementations, same idea: agents need more than the bare ReAct loop for hard tasks.

Why it matters for production agent teams

Three concrete cases where deep agents beat ReAct:

Long-horizon tasks (10+ minutes of agent work). ReAct loops drown in their own context. Deep agents externalize state to the virtual filesystem; the model sees a stable scratchpad rather than a growing context.

Branching exploration. "Research these 5 candidate vendors and compare them" — five subagents work in parallel, each with focused context. ReAct would serialize this and run out of context.

Iterative refinement. "Draft this document, then review it, then revise" — deep agents naturally split into draft / review / revise subagents. Each subagent prompt is small and focused.

For shorter, single-domain tasks (the vast majority of voice agent turns) the bare ReAct loop is faster and cheaper.
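The split described above can be captured in a simple routing heuristic. The thresholds below are illustrative assumptions for the sketch, not CallSphere's actual values:

```python
def pick_harness(estimated_minutes, parallel_branches, needs_revision_loop):
    """Illustrative router: bare ReAct for short single-domain turns,
    a deep-agent harness for long-horizon, branching, or iterative work.
    Thresholds are assumptions, not measured production values."""
    if estimated_minutes >= 10:      # long-horizon task
        return "deep_agent"
    if parallel_branches > 1:        # branching exploration
        return "deep_agent"
    if needs_revision_loop:          # iterative refinement
        return "deep_agent"
    return "react"

pick_harness(0.5, 1, False)   # voice turn -> "react"
pick_harness(12, 5, False)    # vendor comparison -> "deep_agent"
```

The useful property is that the expensive harness is opt-in per task, so the common case (a short voice turn) never pays deep-agent overhead.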

How CallSphere applies this

CallSphere uses both patterns deliberately:

  • All voice conversation turns: classic ReAct via the OpenAI Agents SDK. Single specialist, focused tool surface, sub-second turn time. Adding a deep-agent harness here would make the conversation slower with no quality benefit.
  • GTM lead-research workflows: deep agents with subagent spawning. "Research this prospect" spawns a Web Research subagent, a Public Filings subagent, and a Synthesis subagent in parallel. Each subagent gets focused context.
  • Internal due-diligence pipelines: deep agents for multi-step refactoring of internal docs and KB articles. Plan + filesystem + iterative refinement is exactly the right shape for this work.
  • Behavioral health intake forms: classic ReAct. The conversation is too time-sensitive for deep agent overhead.

The mental model: deep agents are for tasks that humans would say "give me 10 minutes to think." ReAct is for tasks that humans would say "let me grab that for you."

Migration / build steps

  1. Classify your workloads. Sub-minute / single-domain = ReAct. Multi-minute / multi-step = deep agents.
  2. Don't rebuild the world. ReAct works for 80% of voice agent turns. Reach for deep agents only when ReAct demonstrably falls down.
  3. Start with the deepagents library. It is a thin layer on LangGraph; the learning curve is small.
  4. Define a planning tool first. Even before subagents, the planning scratchpad alone delivers most of the long-horizon win.
  5. Bound subagent depth. Allow at most 1-2 levels of subagent spawning. Unbounded recursion is a cost trap.
The resulting shape, as a Mermaid diagram:

graph TD
    A[Master Deep Agent] --> B[Planning Tool]
    A --> C[Filesystem]
    A -->|spawn| D[Subagent: Web Research]
    A -->|spawn| E[Subagent: Public Filings]
    A -->|spawn| F[Subagent: Synthesis]
    D --> G[Tool Calls + Findings]
    E --> G
    F --> H[Final Output]
    B -.->|consults| A
    C -.->|persists| A
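Step 5 above — bounding subagent depth — is worth enforcing in code rather than in the prompt. A sketch of a depth-capped spawner (all names illustrative):

```python
class DepthLimitError(RuntimeError):
    """Raised when an agent tries to spawn past the allowed depth."""

class Subagent:
    """Illustrative depth-capped subagent: spawning past max_depth raises,
    turning unbounded recursion (a cost trap) into an explicit error."""
    def __init__(self, name, depth=0, max_depth=2):
        self.name = name
        self.depth = depth
        self.max_depth = max_depth

    def spawn(self, name):
        if self.depth + 1 > self.max_depth:
            raise DepthLimitError(
                f"{self.name} tried to spawn {name} past depth {self.max_depth}")
        return Subagent(name, self.depth + 1, self.max_depth)

master = Subagent("master")
research = master.spawn("web-research")     # depth 1: allowed
detail = research.spawn("filings-detail")   # depth 2: allowed
# detail.spawn("too-deep") would raise DepthLimitError
```

Failing loudly at the limit is deliberate: a silent no-op would let the master agent keep planning around subagents that never ran.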

FAQ

Are deep agents just multi-agent under another name? Partly. The differentiator is the harness — planning tool, filesystem, controlled subagent lifecycle. Multi-agent without those primitives is brittle.

Does this work with Claude? Yes. The deepagents library supports any model that supports tool calling. Claude 4.6 / 4.7 are particularly strong on the underlying capabilities (extended thinking, MCP).

Should voice conversation turns ever use deep agents? Almost never. The latency is wrong for voice. Use deep agents for the async work that surrounds voice conversations.

What about Anthropic's official Agent SDK? Same idea, different implementation. Most teams pick based on their existing stack — LangGraph users go with deepagents; Anthropic-first teams use the Agent SDK.

Where do I start? Pick a single async workflow that already runs longer than 5 minutes and rebuild it as a deep agent. Compare cost, latency, and output quality. See our demo for live agent examples and our trial for a tenant to experiment in.
