By Sagar Shankaran, Founder of CallSphere
Build a triage RealtimeAgent that hands off to specialist agents (billing, scheduling) in a single voice session — TypeScript code, real handoff events, and CallSphere patterns.
Key takeaways
TL;DR: `RealtimeAgent` from `@openai/agents/realtime` lets one voice agent hand off to another mid-call by triggering a `session.update` under the hood. You define the graph; the model picks the path.
A voice triage system with three agents: Triage (greets and routes), Billing (handles payments), and Scheduling (books slots). The user calls in, the triage agent classifies intent, then control transfers to the right specialist with new instructions and tools — all in one continuous audio session.
npm install @openai/agents @openai/agents-realtime zod
zod for tool schemas.
@openai/agents-extensions for tracing.
flowchart TD
T[Triage Agent] -->|"need billing"| B[Billing Agent]
T -->|"need booking"| S[Scheduling Agent]
B -.->|escalate| T
S -.->|escalate| T
T --> END[End call]
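The flowchart above can also be written down as data, which makes it easy to check that a proposed transfer is one the graph allows. The following is a minimal sketch in plain TypeScript that does not use the SDK; `handoffGraph`, `canTransfer`, and `replay` are illustrative names, and the edges mirror only what the diagram shows.

```typescript
// Agent names come from the flowchart; the map is an illustrative model, not an SDK type.
type AgentName = "Triage" | "Billing" | "Scheduling";

// Each agent lists the agents it may transfer control to.
const handoffGraph: Record<AgentName, AgentName[]> = {
  Triage: ["Billing", "Scheduling"],
  Billing: ["Triage"],
  Scheduling: ["Triage"],
};

// Returns true when the graph contains an edge from -> to.
function canTransfer(from: AgentName, to: AgentName): boolean {
  return handoffGraph[from].includes(to);
}

// Replays a sequence of requested transfers and returns the agent that ends up active.
// Throws on the first transfer the graph does not allow.
function replay(start: AgentName, path: AgentName[]): AgentName {
  let active = start;
  for (const next of path) {
    if (!canTransfer(active, next)) {
      throw new Error(`Illegal handoff: ${active} -> ${next}`);
    }
    active = next;
  }
  return active;
}
```

For example, `replay("Triage", ["Billing", "Triage", "Scheduling"])` ends on Scheduling, while a direct Billing to Scheduling hop throws until that edge is added to the map.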
```ts
import { RealtimeAgent } from "@openai/agents/realtime";
import { tool } from "@openai/agents";
import { z } from "zod";

// `db` is the application's own data client; it is not defined in this snippet.
const billingAgent = new RealtimeAgent({
  name: "Billing",
  instructions: `You handle billing questions only. Use lookup_invoice for invoice lookups. If the caller asks something else, hand off back to Triage.`,
  tools: [
    tool({
      name: "lookup_invoice",
      description: "Look up an invoice by id",
      parameters: z.object({ invoice_id: z.string() }),
      execute: async ({ invoice_id }) => {
        const inv = await db.invoice.findUnique({ where: { id: invoice_id } });
        return JSON.stringify(inv);
      },
    }),
  ],
});

const schedulingAgent = new RealtimeAgent({
  name: "Scheduling",
  instructions: `You schedule appointments. Use list_slots and book_slot.`,
  tools: [
    tool({
      name: "list_slots",
      description: "List available slots for a date",
      parameters: z.object({ date: z.string() }),
      execute: async ({ date }) =>
        JSON.stringify(await db.slot.findMany({ where: { date } })),
    }),
    tool({
      name: "book_slot",
      description: "Book a slot",
      parameters: z.object({ slot_id: z.string(), name: z.string() }),
      execute: async (a) => {
        await db.booking.create({ data: a });
        // Confirmation code: GB-<YYYYMMDD>-<random 3 digits>
        return `Booked: GB-${new Date().toISOString().slice(0,10).replaceAll("-","")}-${Math.floor(Math.random()*900+100)}`;
      },
    }),
  ],
});
```
```ts
const triageAgent = new RealtimeAgent({
  name: "Triage",
  instructions: `You greet callers and route them. If they mention invoices, refunds, or charges, hand off to Billing. If they want appointments, hand off to Scheduling. Keep replies to one short sentence before handing off.`,
  handoffs: [billingAgent, schedulingAgent],
});

// Add return-handoffs so specialists can bounce back if user changes intent
billingAgent.handoffs = [triageAgent, schedulingAgent];
schedulingAgent.handoffs = [triageAgent, billingAgent];
```
The SDK auto-generates a transfer_to_<agent> tool on each agent for every entry in handoffs. The model picks which to call.
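To make the naming concrete, here is a rough sketch of how one transfer tool name per handoff target could be derived. `transferToolName` and `transferToolsFor` are hypothetical helpers written for this illustration, and the normalization rule is an assumption; the SDK's actual rule may differ, so inspect the tool names on your own session before relying on them.

```typescript
// Hypothetical helper: approximates deriving a transfer tool name from an agent name.
// Assumption: runs of non-alphanumeric characters collapse to a single underscore.
function transferToolName(agentName: string): string {
  const normalized = agentName.trim().replace(/[^a-zA-Z0-9]+/g, "_");
  return `transfer_to_${normalized}`;
}

// Lists the transfer tools an agent would expose, one per handoff target.
function transferToolsFor(handoffTargets: string[]): string[] {
  return handoffTargets.map(transferToolName);
}

// Triage hands off to Billing and Scheduling, so it would expose two transfer tools.
const triageTransfers = transferToolsFor(["Billing", "Scheduling"]);
console.log(triageTransfers);
```

Because the model chooses among these tools by name and description, distinct and descriptive agent names help it pick the right path.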
```ts
import { RealtimeSession } from "@openai/agents/realtime";

const session = new RealtimeSession(triageAgent, {
  model: "gpt-4o-realtime-preview-2025-06-03",
  config: {
    voice: "alloy",
    inputAudioFormat: "pcm16",
    outputAudioFormat: "pcm16",
    turnDetection: { type: "server_vad", threshold: 0.55 },
  },
});

await session.connect({
  apiKey: process.env.OPENAI_API_KEY!,
  // Or pass an ephemeral key for browser transports
});
```
```ts
session.on("agent_handoff", ({ from, to }) => {
  console.log(`Handoff: ${from.name} -> ${to.name}`);
  // Persist for analytics: handoff trail per call.
  // `metrics` is the application's own metrics client, not defined in this snippet.
  metrics.increment("handoff", { from: from.name, to: to.name });
});

session.on("tool_call", ({ name, arguments: args }) => {
  console.log("tool:", name, args);
});
```
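The "handoff trail per call" mentioned in the listener can be kept in a small in-memory structure before it is shipped to a metrics backend. This is a sketch under stated assumptions: `HandoffTrail`, `HandoffRecord`, and the `callId` field are illustrative and not part of the SDK's event payload.

```typescript
// One recorded transfer; field names are illustrative, not the SDK's event payload.
interface HandoffRecord {
  callId: string;
  from: string;
  to: string;
  at: number; // epoch milliseconds
}

class HandoffTrail {
  private records: HandoffRecord[] = [];

  // Call this from the handoff listener with the two agent names.
  record(callId: string, from: string, to: string, at: number = Date.now()): void {
    this.records.push({ callId, from, to, at });
  }

  // Ordered path for one call, e.g. ["Triage", "Billing"]; empty if the call never handed off.
  pathFor(callId: string): string[] {
    const hops = this.records.filter((r) => r.callId === callId);
    const first = hops[0];
    if (!first) return [];
    return [first.from, ...hops.map((h) => h.to)];
  }

  // Mean number of handoffs per call, counting only calls with at least one handoff.
  averagePerCall(): number {
    const calls = new Set(this.records.map((r) => r.callId));
    return calls.size === 0 ? 0 : this.records.length / calls.size;
  }
}
```

Note that `averagePerCall` ignores calls that never transferred; to include them, track the total call count separately and divide by that instead.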
For the browser, use the WebRTC transport (see post 4). For phone calls via Twilio, use the SDK's Twilio adapter from `@openai/agents-extensions`:
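```ts
import { TwilioRealtimeTransportLayer } from "@openai/agents-extensions";

// `twilioWebSocket` is the WebSocket from your Twilio Media Streams handler;
// `apiKey` is your OpenAI API key. Neither is defined in this snippet.
const transport = new TwilioRealtimeTransportLayer({ twilioWebSocket });
const session = new RealtimeSession(triageAgent, { transport });
await session.connect({ apiKey });
```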
… session.update, but tool history is preserved within the session.
… handoffs: the model uses them to choose.

CallSphere Salon runs 4 ElevenLabs agents (Reception → Booking → Reschedule → Retention) using exactly this handoff pattern. Real Estate OneRoof uses OpenAI Agents SDK with a triage agent and 5 specialists (listings, mortgage, valuation, scheduling, escalation). Across all 6 verticals, average per-call handoff count is 1.4.
Can specialists call other specialists' tools? Only the active agent's tools are exposed. To share, register them on each agent or build a shared tool module.
How is handoff different from a sub-agent call? Handoff transfers control (one active agent at a time). agent.asTool() lets you keep one agent active and call another as a function.
Do user audio buffers persist across handoff? Yes — within one Realtime session, audio history is kept; only instructions/tools change.
How do I gate handoffs? Return early from a tool with a custom error, or add a canHandoff middleware in your transport layer.
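The "return early" approach to gating can be sketched as a policy check that runs before any transfer logic. Everything here is hypothetical and independent of the SDK: `CallState`, `checkHandoff`, `guarded`, the verification rule for Billing, and the default cap of three transfers are assumptions chosen for illustration.

```typescript
// Hypothetical per-call state; not an SDK type.
interface CallState {
  verified: boolean;    // has the caller passed identity verification?
  handoffCount: number; // transfers already made on this call
}

type GateResult = { allowed: true } | { allowed: false; reason: string };

// Decides whether a transfer to `target` is permitted for the current call.
function checkHandoff(target: string, state: CallState, maxHandoffs = 3): GateResult {
  if (state.handoffCount >= maxHandoffs) {
    return { allowed: false, reason: "too many transfers on this call" };
  }
  if (target === "Billing" && !state.verified) {
    return { allowed: false, reason: "caller must be verified before Billing" };
  }
  return { allowed: true };
}

// Runs the gate first; when blocked, returns the reason as text instead of running the tool body.
async function guarded<T>(
  target: string,
  state: CallState,
  run: () => Promise<T>,
): Promise<T | string> {
  const gate = checkHandoff(target, state);
  if (!gate.allowed) {
    return `Handoff blocked: ${gate.reason}`;
  }
  return run();
}
```

Returning the reason as a string rather than throwing gives the model something it can read back to the caller, such as asking for verification before trying again.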
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.