LangGraph Subgraphs in Production: Isolation, Checkpointing, Namespaces
Subgraphs are the LangGraph equivalent of microservice decomposition. We unpack namespace isolation, per-subgraph checkpointers, and the MultipleSubgraphsError trap.
TL;DR — A LangGraph subgraph is the agent equivalent of a microservice. It owns its state schema, can have its own checkpointer, and communicates with the parent through a tightly typed input/output contract. Get the namespace isolation right and you can deploy, test, and re-use them independently.
When to reach for a subgraph
You want a subgraph when a section of your workflow is independently testable, has its own state, and might be re-used or deployed elsewhere. A research workflow might decompose into a retrieval subgraph, a synthesis subgraph, and a review subgraph — each with its own schema, unit tests, and (critically) its own checkpointer.
You do not want a subgraph when the section is just a few nodes that share state with the parent. That's a function, not a subgraph.
Namespace isolation — the trap nobody warns you about
If you call subgraphs from inside a node, LangGraph assigns checkpoint namespaces by call order. If two of your subgraph instances accidentally share a namespace, their checkpoints overwrite each other. This is the root cause of the dreaded MultipleSubgraphsError in GitHub discussion #2095.
The fix: when you have multiple instances of the same subgraph (e.g., one per tenant, one per call, one per user), each one needs its own storage namespace so checkpoints don't collide.
Two patterns work:
A) Pass a unique `thread_id` per invocation:

```python
config = {"configurable": {"thread_id": f"tenant-{tenant_id}-call-{call_id}"}}
result = parent_graph.invoke(input, config=config)
```
B) Compile each subgraph instance with a dedicated checkpointer:
```python
research_graph = StateGraph(ResearchState).compile(checkpointer=research_saver)
synthesis_graph = StateGraph(SynthesisState).compile(checkpointer=synthesis_saver)
```
Pattern (A) is the right default. Pattern (B) is for when the subgraph has fundamentally different durability requirements (e.g., synthesis needs Postgres, retrieval is happy with SQLite).
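To make pattern (A) mechanical rather than ad hoc, it helps to build the config in one place so a collision is impossible by construction. A minimal sketch — the `tenant_id`/`call_id` naming is illustrative, not a LangGraph requirement:

```python
def subgraph_config(tenant_id: str, call_id: str) -> dict:
    """Build a per-invocation config so checkpoint namespaces never collide.

    Any two distinct (tenant_id, call_id) pairs yield distinct thread_ids.
    """
    return {"configurable": {"thread_id": f"tenant-{tenant_id}-call-{call_id}"}}

# two calls for the same tenant get separate checkpoint namespaces
a = subgraph_config("acme", "001")
b = subgraph_config("acme", "002")
assert a["configurable"]["thread_id"] != b["configurable"]["thread_id"]
```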
Shared state vs isolated state
Shared state: subgraph reuses the parent's state schema. Simpler, automatic communication, but cross-contamination risk — the subgraph can stomp parent fields.
Isolated state: subgraph has its own schema, independent from the parent. You write explicit transformations at the boundaries. More boilerplate but proper encapsulation.
For production, default to isolated state. The boilerplate is a 5-line input/output transformer per subgraph; the safety is permanent. Shared state is a prototyping shortcut, not a production pattern.
Checkpointers per subgraph
Each subgraph can have its own checkpointer — useful when:
- Retrieval is stateless and shouldn't pollute the parent's history.
- Synthesis is long-running and needs Postgres-grade durability.
- Review is short-lived and can run with an in-memory saver.
In a multi-agent system this also means each agent can keep its own internal scratchpad without leaking into the supervisor's state. That's a security and a clarity win.
How CallSphere structures it
CallSphere's Real Estate OneRoof deployment is the canonical example. The supervisor agent is a parent LangGraph that handles routing, escalation, and human-in-the-loop. It calls 10 specialist subgraphs — Buyer, Seller, Renter, Investor, Commercial, Land, Mortgage, Inspection, Listing, Showing — each with isolated state, its own checkpointer, and its own observability project in LangSmith.
When a buyer subgraph fails (say, an MLS API outage), the supervisor sees a clean failure boundary and can re-route or retry without dragging the rest of the conversation into a half-state. We learned the hard way that without isolation a single API failure could corrupt the entire call's checkpoint and force a hard restart.
This same pattern runs in our healthcare deployment (14 specialist subgraphs for intake, eligibility, scheduling, refills, prior auth) and our after-hours product (7 agents with explicit escalation).
Build steps — extract a subgraph
- Identify a node cluster that has clear inputs, clear outputs, and an internal state vocabulary the rest of the graph doesn't need.
- Move it to its own `StateGraph` with its own `TypedDict` state schema.
- Add an input transformer (parent state → subgraph input) and an output transformer (subgraph output → parent state update).
- Compile with a dedicated checkpointer if it has different durability needs.
- In the parent, add the subgraph as a node: `builder.add_node("research", research_graph)`.
- Test the subgraph independently with its own pytest suite — this is the whole point.
- Wire LangSmith with a dedicated project name per subgraph for clean tracing.
Code: parent + isolated subgraph
```python
from typing import TypedDict

from langgraph.graph import StateGraph
from langgraph.checkpoint.memory import MemorySaver

class ResearchState(TypedDict):
    query: str
    docs: list[str]
    summary: str

research = StateGraph(ResearchState)
# ...add the retrieval nodes and edges here before compiling
research_saver = MemorySaver()  # swap for a durable saver (e.g. Postgres) in production
research_graph = research.compile(checkpointer=research_saver)

class ParentState(TypedDict):
    user_input: str
    research_query: str
    research_summary: str

def call_research(state: ParentState) -> dict:
    # explicit boundary: input transform -> invoke subgraph -> output transform
    sub = research_graph.invoke(
        {"query": state["research_query"], "docs": [], "summary": ""}
    )
    return {"research_summary": sub["summary"]}

parent = StateGraph(ParentState)
parent.add_node("research", call_research)
```
Streaming and human-in-the-loop across subgraphs
Two production patterns that subgraphs make clean:
Streaming. When you call parent_graph.astream(..., subgraphs=True), you get a unified stream of events from the parent and all nested subgraphs. Each event includes its namespace, so the UI can render which subgraph is currently working. This is how OneRoof's UI shows "Buyer agent is checking listings..." mid-call.
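With `subgraphs=True`, recent LangGraph versions emit each event as a `(namespace, data)` pair, where the namespace is a tuple of `"node_name:task_id"` strings (empty for the parent) — verify the exact shape against your version. Turning that into a UI label is a pure function, sketched here:

```python
def subgraph_label(namespace: tuple[str, ...]) -> str:
    """Map a checkpoint namespace tuple to a display label for the UI."""
    if not namespace:
        return "supervisor"  # events from the parent graph itself
    # entries look like "node_name:task_id"; keep the innermost node name
    return namespace[-1].split(":", 1)[0]
```

In the streaming loop this becomes `label = subgraph_label(ns)` for each `(ns, event)` pair, which is all the UI needs to render "Buyer agent is checking listings...".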
Human-in-the-loop. Each subgraph can independently use LangGraph's interrupt primitive to pause for human approval. The parent doesn't need to know — when the subgraph resumes, control flows back. We use this for high-stakes writes (sending a quote, scheduling a property tour) where the rep must approve before the agent commits.
Observability — why per-subgraph LangSmith projects help
We tag each subgraph with its own LangSmith project name. Trace traversal becomes much faster: instead of scrolling through a 200-span supervisor trace looking for the buyer subgraph's behavior, you open the buyer-agent project and see only buyer spans. When something fails in production, the right team's dashboard lights up.
The cost of this discipline is small (a single env var per subgraph). The benefit compounds with every incident.
When subgraphs are the wrong answer
Two anti-patterns we've watched teams fall into:
- Premature decomposition. Splitting a 4-node section into a subgraph "for cleanliness" before you have evidence it's actually independently testable. The boundary state-transformer adds complexity that pays off only when you get the second use case.
- Fan-out without clear merging. If your parent calls 5 subgraphs in parallel and you don't have a deterministic way to merge their outputs, you'll fight non-determinism forever. Either pick one subgraph's output as canonical, or write an explicit merge node.
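The explicit merge node from the second point is usually a small deterministic reducer. A sketch, assuming each parallel subgraph returns a scored candidate (the `agent`/`confidence` fields are hypothetical):

```python
def merge_candidates(outputs: list[dict]) -> dict:
    """Deterministic merge: highest confidence wins; ties broken by agent name.

    The tie-break matters — without it, two equal-confidence answers
    reintroduce the non-determinism the merge node exists to remove.
    """
    return max(outputs, key=lambda o: (o["confidence"], o["agent"]))

picked = merge_candidates([
    {"agent": "buyer", "confidence": 0.72, "answer": "3-bed listings"},
    {"agent": "mortgage", "confidence": 0.91, "answer": "pre-approval first"},
])
assert picked["agent"] == "mortgage"
```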
FAQ
Can I stream from a subgraph? Yes — stream_mode="values" on the parent surfaces subgraph state updates if you set subgraphs=True.
How do I avoid MultipleSubgraphsError? Always pass a unique thread_id per invocation. If you must call the same subgraph multiple times in one parent run, give each call a distinct config.
Should every node be a subgraph? No. Use subgraphs for cohesive, independently testable units. Otherwise you're just reinventing function calls with extra ceremony.
Where do I see this on CallSphere? Run a demo of OneRoof and ask to see the subgraph trace in LangSmith — happy to walk through it.
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.