Vapi Squads vs CallSphere Hierarchical Agents (Deep Dive)
Side-by-side architecture deep dive: Vapi Squads chained handoffs vs CallSphere OpenAI Agents SDK hierarchical orchestrator with return-to-parent flow.
TL;DR
Vapi Squads wire multiple assistants into a chain inside a single call — one assistant transfers to the next, and the new assistant runs to completion. CallSphere uses the OpenAI Agents SDK with a hierarchical orchestrator pattern: a parent agent dispatches to specialists and the specialist returns control when done. The first model is great for predictable linear flows. The second is required when you need branching, fallbacks, and "go back to triage if scheduling fails" semantics.
If your call flow is "always greet, then qualify, then book," Vapi Squads will work. If your flow has loops, retries, and policy-driven routing, you want hierarchy.
Why This Difference Matters
Voice calls are messy. People interrupt themselves, change topics, hang up mid-sentence, restart with a new question. A clean chain (A → B → C) breaks the moment a caller in C says "actually, can you check the records again?" — you need to hop back to A or sideways to a Records specialist.
Vapi Squads model agents as a directed graph of transfers. Once you transfer, you live in the new agent's frame. You can transfer again, but you cannot "return to parent." In practice the call topology is one-way: you walk down the graph, and you do not walk back up.
CallSphere's hierarchy explicitly supports return-to-parent. Specialists complete, the orchestrator regains control with the specialist's structured result merged into context, and the orchestrator decides what is next.
Vapi Squads Approach
A simplified Squad config looks like this (field names are illustrative; consult Vapi's Squads API reference for the exact schema):
{
  "name": "Real Estate Squad",
  "members": [
    { "assistantId": "asst_greeter", "first": true },
    { "assistantId": "asst_qualifier" },
    { "assistantId": "asst_scheduler" }
  ],
  "membersOverrides": {
    "asst_greeter": {
      "transferList": [
        { "destination": "asst_qualifier", "message": "Transferring you to qualification" }
      ]
    },
    "asst_qualifier": {
      "transferList": [
        { "destination": "asst_scheduler", "message": "Booking specialist now" }
      ]
    }
  }
}
The flow is greeter → qualifier → scheduler. Each assistant has its own system prompt, tools, and voice. Transitions are explicit and unidirectional unless you wire a back edge manually, and even then the receiving assistant has no way to know it is resuming an earlier step.
Strengths:
- Conceptually simple
- Easy to debug per-assistant
- Per-assistant voice changes are crisp
Weaknesses:
- No true return-to-parent
- State sharing across squad members is shallow
- The current assistant has no awareness of the wider call plan
- Branching ("if user asks about pricing, jump to pricing assistant") requires manual transfer rules in every member
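The one-way nature of the chain can be checked mechanically. The sketch below is a minimal illustration, assuming the config shape used in this article (`members`, `membersOverrides`, `transferList`); the helper names `transfer_edges` and `reachable` are hypothetical and not part of any Vapi SDK. It builds the transfer graph and lists which assistants each member can ever reach.

```python
import json
from collections import deque

# Same shape as the Squad config in this article (messages omitted for brevity).
SQUAD_JSON = """
{
  "members": [
    {"assistantId": "asst_greeter", "first": true},
    {"assistantId": "asst_qualifier"},
    {"assistantId": "asst_scheduler"}
  ],
  "membersOverrides": {
    "asst_greeter": {"transferList": [{"destination": "asst_qualifier"}]},
    "asst_qualifier": {"transferList": [{"destination": "asst_scheduler"}]}
  }
}
"""


def transfer_edges(squad: dict) -> dict:
    """Map each member to the list of assistants it can transfer to."""
    overrides = squad.get("membersOverrides", {})
    return {
        member["assistantId"]: [
            rule["destination"]
            for rule in overrides.get(member["assistantId"], {}).get("transferList", [])
        ]
        for member in squad["members"]
    }


def reachable(edges: dict, start: str) -> set:
    """Breadth-first search over transfer rules, excluding the start node itself
    unless a cycle leads back to it."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


edges = transfer_edges(json.loads(SQUAD_JSON))
for assistant in edges:
    print(assistant, "->", sorted(reachable(edges, assistant)))
```

With this config the scheduler reaches nothing and no member can reach the greeter again, which is the "no return-to-parent" weakness expressed as a graph property.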
CallSphere Hierarchical Approach
CallSphere builds on the OpenAI Agents SDK. The basic shape is a parent orchestrator plus N specialist agents that the orchestrator can hand off to and that return when done.
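from agents import Agent, Runner, handoff

# The tool functions (search_providers, schedule_appointment, send_sms,
# lookup_patient, fetch_visits) and `transcript` are defined elsewhere.

# Specialist for appointment work
scheduling_agent = Agent(
    name="SchedulingAgent",
    instructions="...",
    tools=[search_providers, schedule_appointment, send_sms],
)

# Specialist for chart and visit-history questions
records_agent = Agent(
    name="RecordsAgent",
    instructions="...",
    tools=[lookup_patient, fetch_visits],
)

# Parent agent: triages the call and dispatches to the specialists
orchestrator = Agent(
    name="Orchestrator",
    instructions="""You triage healthcare calls. Hand off to scheduling for
appointment work, records for chart questions, and escalate to human for
clinical questions. Always summarize what the specialist returned before
closing the call.""",
    handoffs=[
        # Note: on_complete is not an argument of the stock SDK handoff()
        # helper; treat it as CallSphere-specific.
        handoff(scheduling_agent, on_complete="return_to_parent"),
        handoff(records_agent, on_complete="return_to_parent"),
    ],
)

# Must run inside an async function (for example via asyncio.run)
result = await Runner.run(orchestrator, transcript)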
The key flag is on_complete="return_to_parent". When scheduling_agent finishes its job (or hits a stop condition), control returns to the orchestrator with a structured handoff result attached to context. The orchestrator can then:
- Close the call
- Send to another specialist
- Recover from a specialist failure
- Re-prompt the user with a wider question
This is structurally equivalent to a function call returning, whereas a Vapi Squads transfer behaves more like a goto: jump and never come back.
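The return-to-parent control flow can be sketched without any SDK. The following is a toy model in plain Python, not CallSphere's implementation and not OpenAI Agents SDK API: `SpecialistResult`, `decide_next`, `run_call`, and the hard-coded specialist behavior are all hypothetical stand-ins. It shows the two properties described above: the specialist's result is merged into shared context, and the parent decides what happens next.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class SpecialistResult:
    status: str                       # "ok" or "failed"
    data: dict = field(default_factory=dict)


def property_search(ctx: dict) -> SpecialistResult:
    # Pretend the listing set shrinks once a budget is known.
    listings = ["A", "B", "C"] if "max_loan" not in ctx else ["A"]
    return SpecialistResult("ok", {"listings": listings})


def mortgage_prequal(ctx: dict) -> SpecialistResult:
    return SpecialistResult("ok", {"max_loan": 1_100_000})


SPECIALISTS: dict[str, Callable[[dict], SpecialistResult]] = {
    "search": property_search,
    "affordability": mortgage_prequal,
}


def decide_next(ctx: dict, last: Optional[str]) -> Optional[str]:
    """Orchestrator policy: choose the next intent from the merged context."""
    if "listings" not in ctx:
        return "search"
    if "max_loan" not in ctx:
        return "affordability"
    if last == "affordability":
        return "search"               # re-enter search with the new budget
    return None                       # nothing left to do: close the call


def run_call(max_handoffs: int = 10) -> tuple[dict, list[str]]:
    ctx: dict = {}
    trace: list[str] = []
    intent = decide_next(ctx, None)
    while intent is not None and len(trace) < max_handoffs:
        result = SPECIALISTS[intent](ctx)   # hand off; the specialist runs
        trace.append(intent)
        if result.status != "ok":
            break                           # failure handling is out of scope here
        ctx.update(result.data)             # return: merge result into context
        intent = decide_next(ctx, intent)   # parent decides what happens next
    return ctx, trace


ctx, trace = run_call()
print(trace)   # ['search', 'affordability', 'search']
```

Neither specialist knows the other exists. Re-entry into search is a decision made by the parent loop, which is what makes it behave like a function return instead of a jump.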
Real Production Example: Real Estate
CallSphere's Real Estate voice agent ships with a hierarchy of:
- Lead Triage agent (orchestrator)
- Property Search specialist (with vision tool for buyer-uploaded photos)
- Mortgage Pre-Qual specialist
- Tour Scheduling specialist
- Escalation-to-Human specialist
A typical conversation does not visit each specialist once. A buyer asks about a 3-bedroom in Sunnyvale (Property Search), then asks "can I afford this on $180K?" (Mortgage), then returns to Property Search to refine. Hierarchy makes this trivial; chained Squads would require manual back-edges between every pair.
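To put a number on "back-edges between every pair", here is a small worked count. It assumes every specialist may need to reach every other specialist directly; the function names are illustrative only.

```python
def chain_transfer_rules(n_specialists: int) -> int:
    """Directed transfer rules needed so each specialist can reach every other one."""
    return n_specialists * (n_specialists - 1)


def hierarchy_handoffs(n_specialists: int) -> int:
    """Handoffs registered on a single orchestrator: one per specialist."""
    return n_specialists


# The Real Estate hierarchy above has four specialists under the orchestrator.
for n in (4, 8):
    print(n, chain_transfer_rules(n), hierarchy_handoffs(n))
```

For the four specialists listed above that is 12 transfer rules against 4 handoffs, and the gap grows quadratically as specialists are added.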
Handoff Trace
A real handoff trace from a Real Estate call (sanitized):
[00:00.512] orchestrator: hand_off → property_search
            payload: { intent: "search", filters: { city: "Sunnyvale", beds: 3 } }
[00:08.341] property_search: vision_analyze(photo_id=ph_4k2)
[00:11.119] property_search: search_listings(filters)
[00:13.882] property_search: complete → return_to_parent
            result: { listings: [...3 items...] }
[00:14.110] orchestrator: hand_off → mortgage_prequal
            payload: { income: 180000, target_price: 1.4M }
[00:18.502] mortgage_prequal: prequal_calc(...)
[00:19.881] mortgage_prequal: complete → return_to_parent
            result: { max_loan: 1.1M, monthly: 7200 }
[00:20.044] orchestrator: hand_off → property_search
            payload: { intent: "search", filters: { city: "Sunnyvale", beds: 3, max_price: 1.1M } }
That third hand-off (back into property_search with a tighter budget) is the exact case Vapi Squads cannot model cleanly.
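A trace in this shape is easy to post-process. The sketch below assumes the `[MM:SS.mmm] actor: event` line format seen in the sample trace (the real log schema may differ) and uses hypothetical helper names. It pairs each orchestrator hand-off with the matching `complete` line to measure how long each specialist held the call.

```python
import re

# Timestamped lines copied from the sample trace (payload/result lines omitted).
TRACE = """\
[00:00.512] orchestrator: hand_off → property_search
[00:13.882] property_search: complete → return_to_parent
[00:14.110] orchestrator: hand_off → mortgage_prequal
[00:19.881] mortgage_prequal: complete → return_to_parent
[00:20.044] orchestrator: hand_off → property_search
"""

LINE = re.compile(r"\[(\d+):(\d+)\.(\d+)\] (\w+): (.+)")


def to_ms(minutes: str, seconds: str, millis: str) -> int:
    return (int(minutes) * 60 + int(seconds)) * 1000 + int(millis)


def specialist_spans(trace: str) -> list:
    """Return (specialist, duration_ms) for every hand-off that has completed."""
    spans, open_at = [], {}
    for line in trace.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue
        ts = to_ms(*match.group(1, 2, 3))
        actor, event = match.group(4), match.group(5)
        if actor == "orchestrator" and event.startswith("hand_off"):
            target = event.split("→")[1].strip()
            open_at[target] = ts                      # specialist takes over
        elif event.startswith("complete") and actor in open_at:
            spans.append((actor, ts - open_at.pop(actor)))  # control returns
    return spans


print(specialist_spans(TRACE))
# [('property_search', 13370), ('mortgage_prequal', 5771)]
```

The final hand-off has no `complete` line yet, so it is left open and not reported.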
Vapi vs CallSphere Multi-Agent Comparison
| Dimension | Vapi Squads | CallSphere Hierarchy |
|---|---|---|
| Topology | DAG of assistants | Tree with return-to-parent |
| Return semantics | Explicit transfer back, manual | First-class return_to_parent |
| State sharing | Shallow (transfer message) | Full conversation + structured payload |
| Voice per agent | Easy (per-assistant) | Easy (per-specialist config) |
| Tool isolation | Per-assistant | Per-specialist |
| Reentry into prior agent | Manual transfer rule | Native handoff |
| Failure recovery | Hangup or transfer to fallback | Orchestrator catches and reroutes |
| Debugging | Vapi dashboard per call | Full handoff trace + Postgres logs |
| Practical agent count | 3-5 | 5-10 specialists comfortable |
Hierarchical Handoff State Machine
stateDiagram-v2
[*] --> Orchestrator
Orchestrator --> PropertySearch: intent=search
Orchestrator --> Mortgage: intent=affordability
Orchestrator --> TourSchedule: intent=tour
Orchestrator --> Human: complex/legal
PropertySearch --> Orchestrator: complete (return)
Mortgage --> Orchestrator: complete (return)
TourSchedule --> Orchestrator: complete (return)
PropertySearch --> Mortgage: cross-handoff via parent
Mortgage --> PropertySearch: refine search
Human --> [*]: warm transfer
Orchestrator --> [*]: call ends
Code: Failure Recovery Pattern
Hierarchy makes "specialist failed, fall back to human" trivial:
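# handoff_returns, Ctx, SchedulingResult, and human_specialist are
# application-level names assumed to be defined elsewhere; they are not
# part of the OpenAI Agents SDK.
@handoff_returns
async def scheduling_complete(result: SchedulingResult, ctx: Ctx):
    if result.status == "failed":
        # Orchestrator picks this up and routes to human
        return ctx.handoff_to(human_specialist, reason=result.error)
    # Success path: close the call with the specialist's summary
    return ctx.close_call(summary=result.summary)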
The Squads equivalent requires every member to know about the human-fallback assistant and emit a transfer rule for it.
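The point about centralization can be shown in a self-contained form. This is a plain-Python sketch with hypothetical names (`Result`, `dispatch`, `human_escalation`), not the CallSphere code path: one dispatch function owns the fallback rule, so no specialist has to know that a human-escalation route exists.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    status: str          # "ok" or "failed"
    summary: str = ""
    error: str = ""


def scheduling(request: dict) -> Result:
    # Toy specialist: fails when no slot was supplied.
    if not request.get("slot"):
        return Result("failed", error="no open slot")
    return Result("ok", summary=f"booked {request['slot']}")


def human_escalation(reason: str) -> Result:
    return Result("ok", summary=f"warm transfer to human ({reason})")


def dispatch(specialist: Callable[[dict], Result], request: dict) -> Result:
    """Single place where any specialist failure becomes a human fallback."""
    try:
        result = specialist(request)
    except Exception as exc:          # a crash is treated like a failed result
        result = Result("failed", error=str(exc))
    if result.status == "failed":
        return human_escalation(result.error)
    return result


print(dispatch(scheduling, {"slot": "Tue 10:00"}).summary)
print(dispatch(scheduling, {}).summary)
```

Adding another specialist means adding one function; the fallback rule in `dispatch` does not change.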
FAQ
Can Vapi Squads do return-to-parent at all?
Not as a first-class primitive. You can simulate it with a transfer rule that points back to the prior assistant, but the inbound assistant has no awareness it is "resuming" — it starts fresh.
Does CallSphere ever use a flat (non-hierarchical) agent?
Yes — the After-Hours voicemail agent is a single specialist with no orchestrator, because the flow is genuinely linear (capture, classify, route).
How deep can the hierarchy go?
Three levels in production today: orchestrator → specialist → sub-specialist (e.g., Real Estate Tour Schedule has a sub-specialist for virtual vs in-person). Anything deeper tends to be a design smell.
Is this OpenAI-specific?
Not strictly. The OpenAI Agents SDK defaults to OpenAI models but can also be pointed at other providers through its model interface, and the hierarchy pattern itself is portable. The CallSphere implementation can be ported to Anthropic or open-weight models with minor adapter work.
What is the latency cost of hierarchy?
A hand-off costs roughly 80-150ms of additional roundtrip vs a flat agent because of the structured result merge. On its own, that is generally small enough that a caller does not notice it in a voice conversation.
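As a quick worked estimate, the per-hand-off range can be multiplied out for a whole call. The figures are the 80-150ms range quoted above; the function name is illustrative.

```python
HANDOFF_OVERHEAD_MS = (80, 150)  # per hand-off range quoted in this FAQ


def added_latency_ms(handoffs: int) -> tuple:
    """Total extra latency (low, high) in milliseconds across a call."""
    low, high = HANDOFF_OVERHEAD_MS
    return handoffs * low, handoffs * high


# Three hand-offs, as in the Real Estate trace earlier in this article.
print(added_latency_ms(3))  # (240, 450)
```

That total is spread across separate turns of the call, so it is not experienced as a single pause.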
See It In Action
The demo at /demo walks through a hierarchical Real Estate call with live trace, and /features documents the orchestrator + specialist patterns shipped per vertical.