By Sagar Shankaran, Founder of CallSphere
Subgraphs are the LangGraph equivalent of microservice decomposition. We unpack namespace isolation, per-subgraph checkpointers, and the MultipleSubgraphsError trap.
Key takeaways
TL;DR — A LangGraph subgraph is the agent equivalent of a microservice. It owns its state schema, can have its own checkpointer, and communicates with the parent through a tightly typed input/output contract. Get the namespace isolation right and you can deploy, test, and re-use them independently.
flowchart TD
Client[MCP client · Claude Desktop] --> MCP[MCP server]
MCP --> Tool1[Tool: Calendar]
MCP --> Tool2[Tool: CRM]
MCP --> Tool3[Tool: KB search]
Tool1 --> SaaS1[(Calendly)]
Tool2 --> SaaS2[(Salesforce)]
Tool3 --> SaaS3[(Notion)]
You want a subgraph when a section of your workflow is independently testable, has its own state, and might be re-used or deployed elsewhere. A research workflow might decompose into a retrieval subgraph, a synthesis subgraph, and a review subgraph, each with its own schema, unit tests, and (critically) its own checkpointer.
You do not want a subgraph when the section is just a few nodes that share state with the parent. That's a function, not a subgraph.
If you call subgraphs from inside a node, LangGraph assigns checkpoint namespaces by call order. If two of your subgraph instances accidentally share a namespace, their checkpoints overwrite each other. This is the root cause of the dreaded MultipleSubgraphsError in GitHub discussion #2095.
The fix: when you have multiple instances of the same subgraph (e.g., one per tenant, one per call, one per user), each one needs its own storage namespace so checkpoints don't collide.
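To make the collision concrete, here is a toy model in plain Python. It is not LangGraph's storage code: `ToyCheckpointStore` and its (thread_id, namespace) key layout are assumptions made purely for illustration.

```python
from typing import Optional


class ToyCheckpointStore:
    """Hypothetical stand-in for a checkpointer: one slot per (thread_id, namespace)."""

    def __init__(self) -> None:
        self._slots = {}

    def put(self, thread_id: str, namespace: str, state: dict) -> None:
        # A later write to the same key silently replaces the earlier one.
        self._slots[(thread_id, namespace)] = dict(state)

    def get(self, thread_id: str, namespace: str) -> Optional[dict]:
        return self._slots.get((thread_id, namespace))


store = ToyCheckpointStore()

# Two subgraph instances that accidentally share one namespace.
store.put("call-1", "research", {"query": "tenant A"})
store.put("call-1", "research", {"query": "tenant B"})
collided = store.get("call-1", "research")

# Giving each instance its own namespace keeps both checkpoints.
store.put("call-1", "research:tenant-a", {"query": "tenant A"})
store.put("call-1", "research:tenant-b", {"query": "tenant B"})
kept_a = store.get("call-1", "research:tenant-a")
kept_b = store.get("call-1", "research:tenant-b")
```

In the shared-namespace case only the second write survives; with distinct namespaces both states remain readable.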
Two patterns work:
A) Pass a unique thread_id per invocation:
config = {"configurable": {"thread_id": f"tenant-{tenant_id}-call-{call_id}"}}
result = parent_graph.invoke(input, config=config)
B) Compile each subgraph instance with a dedicated checkpointer:
research_graph = StateGraph(ResearchState).compile(checkpointer=research_saver)
synthesis_graph = StateGraph(SynthesisState).compile(checkpointer=synthesis_saver)
Pattern (A) is the right default. Pattern (B) is for when the subgraph has fundamentally different durability requirements (e.g., synthesis needs Postgres, retrieval is happy with SQLite).
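For pattern (A), one way to keep thread IDs unique is to build the config in a single place. The sketch below is a minimal example; `make_run_config` is a hypothetical helper, and only the `{"configurable": {"thread_id": ...}}` shape comes from the snippet above.

```python
import uuid
from typing import Optional


def make_run_config(tenant_id: str, call_id: Optional[str] = None) -> dict:
    """Build a config dict whose thread_id is unique per tenant and call."""
    if call_id is None:
        # No caller-supplied id: fall back to a random one so that
        # two invocations never share a thread.
        call_id = uuid.uuid4().hex
    return {"configurable": {"thread_id": f"tenant-{tenant_id}-call-{call_id}"}}


config_a = make_run_config("acme", "42")
config_b = make_run_config("acme")
config_c = make_run_config("acme")
```

It would then be passed as `parent_graph.invoke(payload, config=make_run_config(tenant_id, call_id))`. Supplying the call ID explicitly keeps the thread resumable; the random fallback trades resumability for guaranteed uniqueness.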
Shared state: subgraph reuses the parent's state schema. Simpler, automatic communication, but cross-contamination risk — the subgraph can stomp parent fields.
Isolated state: subgraph has its own schema, independent from the parent. You write explicit transformations at the boundaries. More boilerplate but proper encapsulation.
For production, default to isolated state. The boilerplate is a 5-line input/output transformer per subgraph; the safety is permanent. Shared state is a prototyping shortcut, not a production pattern.
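The input/output transformer can be sketched in plain Python. The helper names (`to_research_input`, `from_research_output`, `make_research_node`, `fake_research`) are hypothetical, and `fake_research` stands in for a compiled subgraph's `invoke` so the sketch runs without LangGraph installed.

```python
from typing import Callable, TypedDict


class ParentState(TypedDict):
    user_input: str
    research_query: str
    research_summary: str


class ResearchState(TypedDict):
    query: str
    docs: list
    summary: str


def to_research_input(parent: ParentState) -> ResearchState:
    # Only the fields the subgraph needs cross the boundary.
    return {"query": parent["research_query"], "docs": [], "summary": ""}


def from_research_output(sub: ResearchState) -> dict:
    # Only the summary flows back; docs stay private to the subgraph.
    return {"research_summary": sub["summary"]}


def make_research_node(run_subgraph: Callable[[ResearchState], ResearchState]):
    """Wrap any ResearchState -> ResearchState callable as a parent node."""

    def node(state: ParentState) -> dict:
        return from_research_output(run_subgraph(to_research_input(state)))

    return node


def fake_research(state: ResearchState) -> ResearchState:
    # Stand-in for research_graph.invoke in this sketch.
    return {
        "query": state["query"],
        "docs": ["doc-1"],
        "summary": f"summary of {state['query']}",
    }


node = make_research_node(fake_research)
update = node(
    {"user_input": "hi", "research_query": "zoning rules", "research_summary": ""}
)
```

Because the node returns only `research_summary`, the subgraph has no path to overwrite `user_input` or any other parent field.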
Each subgraph can have its own checkpointer — useful when:
In a multi-agent system this also means each agent can keep its own internal scratchpad without leaking into the supervisor's state. That's a security and a clarity win.
CallSphere's Real Estate OneRoof deployment is the canonical example. The supervisor agent is a parent LangGraph that handles routing, escalation, and human-in-the-loop. It calls 10 specialist subgraphs — Buyer, Seller, Renter, Investor, Commercial, Land, Mortgage, Inspection, Listing, Showing — each with isolated state, its own checkpointer, and its own observability project in LangSmith.
When a buyer subgraph fails (say, an MLS API outage), the supervisor sees a clean failure boundary and can re-route or retry without dragging the rest of the conversation into a half-state. We learned the hard way that without isolation a single API failure could corrupt the entire call's checkpoint and force a hard restart.
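A clean failure boundary can be approximated with a small wrapper around the subgraph call. This is a sketch under stated assumptions, not CallSphere's implementation: the `failed_agent` and `error` keys, `with_failure_boundary`, and `route_after_buyer` are all hypothetical names.

```python
from typing import Callable


def with_failure_boundary(name: str, run_subgraph: Callable[[dict], dict]):
    """Wrap a subgraph call so exceptions become a state update, not a crash."""

    def node(state: dict) -> dict:
        try:
            return run_subgraph(state)
        except Exception as exc:  # broad on purpose: this is the boundary
            return {"failed_agent": name, "error": str(exc)}

    return node


def flaky_buyer(state: dict) -> dict:
    # Stand-in for a buyer subgraph hitting an MLS API outage.
    raise TimeoutError("MLS API unavailable")


def route_after_buyer(state: dict) -> str:
    # Supervisor-side routing: escalate when the boundary reports a failure.
    return "escalate" if state.get("failed_agent") else "continue"


buyer_node = with_failure_boundary("buyer", flaky_buyer)
result = buyer_node({"user_input": "3-bed in Austin"})
next_step = route_after_buyer(result)
```

The supervisor only ever sees a small, well-formed update, so a failed specialist cannot leave the parent state half-written.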
This same pattern runs in our healthcare deployment (14 specialist subgraphs for intake, eligibility, scheduling, refills, prior auth) and our after-hours product (7 agents with explicit escalation).
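Each subgraph is a StateGraph with its own TypedDict state schema. A compiled subgraph can be attached to the parent directly with builder.add_node("research", research_graph), or called from inside a node function when the schemas differ:
from typing import TypedDict

from langgraph.graph import StateGraph

# Subgraph: owns its own schema.
class ResearchState(TypedDict):
    query: str
    docs: list[str]
    summary: str

research = StateGraph(ResearchState)
# ...add nodes
# research_saver is a checkpointer instance created elsewhere.
research_graph = research.compile(checkpointer=research_saver)

# Parent: a separate schema, so state is mapped explicitly at the boundary.
class ParentState(TypedDict):
    user_input: str
    research_query: str
    research_summary: str

def call_research(state: ParentState) -> dict:
    # Translate parent state into the subgraph's input schema.
    sub = research_graph.invoke({"query": state["research_query"], "docs": [], "summary": ""})
    # Return only the field the parent should receive.
    return {"research_summary": sub["summary"]}

parent = StateGraph(ParentState)
parent.add_node("research", call_research)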
Two production patterns that subgraphs make clean:
Streaming. When you call parent_graph.astream(..., subgraphs=True), you get a unified stream of events from the parent and all nested subgraphs. Each event includes its namespace, so the UI can render which subgraph is currently working. This is how OneRoof's UI shows "Buyer agent is checking listings..." mid-call.
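Rendering a "which agent is working" label from the stream could look like the sketch below. It assumes each streamed item is a (namespace, chunk) pair whose namespace is a tuple of "node_name:task_id" strings, empty for the parent graph; `label_for` is a hypothetical helper, and the events are simulated so the sketch runs without LangGraph.

```python
def label_for(namespace: tuple) -> str:
    """Turn a stream namespace into a UI label."""
    if not namespace:
        # An empty namespace means the event came from the parent graph.
        return "supervisor"
    # The last entry is the innermost subgraph that emitted the event.
    node_name = namespace[-1].split(":", 1)[0]
    return f"{node_name} agent"


# Simulated (namespace, chunk) pairs standing in for a real stream.
events = [
    ((), {"router": {"next": "buyer"}}),
    (("buyer:3f2a",), {"search_listings": {"count": 12}}),
]
labels = [label_for(namespace) for namespace, _chunk in events]
```

Against a real graph, the same function would be applied inside a loop such as `async for namespace, chunk in parent_graph.astream(inputs, stream_mode="updates", subgraphs=True)`.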
Human-in-the-loop. Each subgraph can independently use LangGraph's interrupt primitive to pause for human approval. The parent doesn't need to know — when the subgraph resumes, control flows back. We use this for high-stakes writes (sending a quote, scheduling a property tour) where the rep must approve before the agent commits.
We tag each subgraph with its own LangSmith project name. Trace traversal becomes much faster: instead of scrolling through a 200-span supervisor trace looking for the buyer subgraph's behavior, you open the buyer-agent project and see only buyer spans. When something fails in production, the right team's dashboard lights up.
The cost of this discipline is small (a single env var per subgraph). The benefit compounds with every incident.
Two anti-patterns we've watched teams fall into:
Can I stream from a subgraph? Yes — stream_mode="values" on the parent surfaces subgraph state updates if you set subgraphs=True.
How do I avoid MultipleSubgraphsError? Always pass a unique thread_id per invocation. If you must call the same subgraph multiple times in one parent run, give each call a distinct config.
Should every node be a subgraph? No. Use subgraphs for cohesive, independently testable units. Otherwise you're just reinventing function calls with extra ceremony.
Where do I see this on CallSphere? Run a demo of OneRoof and ask to see the subgraph trace in LangSmith — happy to walk through it.
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.