How to Build Multi-Agent Voice Handoffs with OpenAI Agents SDK

Build a triage RealtimeAgent that hands off to specialist agents (billing, scheduling) in a single voice session — TypeScript code, real handoff events, and CallSphere patterns.

TL;DR: RealtimeAgent from @openai/agents/realtime lets one voice agent hand off to another mid-call by triggering a session.update under the hood. You define the graph; the model picks the path.

What you'll build

A voice triage system with three agents: Triage (greets and routes), Billing (handles payments), and Scheduling (books slots). The user calls in, the triage agent classifies intent, then control transfers to the right specialist with new instructions and tools — all in one continuous audio session.

Prerequisites

  1. Node 20+ with npm install @openai/agents @openai/agents-realtime zod.
  2. OpenAI API key with Realtime access.
  3. A WebRTC or WebSocket transport (browser or server).
  4. Familiarity with zod for tool schemas.
  5. Optional: @openai/agents-extensions, which provides the Twilio transport used in Step 5.

Architecture

flowchart TD
  T[Triage Agent] -->|"need billing"| B[Billing Agent]
  T -->|"need booking"| S[Scheduling Agent]
  B -.->|escalate| T
  S -.->|escalate| T
  T --> END[End call]
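
Before wiring real agents, it can help to write the flowchart down as data and check that no agent is stranded. The sketch below is a plain TypeScript model of the graph, not SDK code; `AgentName`, `handoffGraph`, and `unreachableFrom` are hypothetical names introduced only for illustration.

```ts
// Hypothetical model of the flowchart above; none of these names come from the SDK.
type AgentName = "Triage" | "Billing" | "Scheduling";

// Adjacency map: each key lists the agents it is allowed to hand off to.
const handoffGraph: Record<AgentName, AgentName[]> = {
  Triage: ["Billing", "Scheduling"],
  Billing: ["Triage"],
  Scheduling: ["Triage"],
};

// Breadth-first search from `start`; returns the agents that can never become active.
function unreachableFrom(
  graph: Record<AgentName, AgentName[]>,
  start: AgentName,
): AgentName[] {
  const seen = new Set<AgentName>([start]);
  const queue: AgentName[] = [start];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const next of graph[current]) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return (Object.keys(graph) as AgentName[]).filter((name) => !seen.has(name));
}

console.log(unreachableFrom(handoffGraph, "Triage")); // []
```

An empty result means every agent in the map can be reached from Triage. Running a check like this in a unit test catches a specialist that was defined but never added to any `handoffs` list.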

Step 1 — Define the specialist agents

```ts
import { RealtimeAgent } from "@openai/agents/realtime";
import { tool } from "@openai/agents";
import { z } from "zod";

// `db` is assumed to be your own data-access client (the calls below follow a
// Prisma-style API). It is not part of the SDK and must be in scope.

const billingAgent = new RealtimeAgent({
  name: "Billing",
  instructions: `You handle billing questions only. Use lookup_invoice for invoice lookups. If the caller asks something else, hand off back to Triage.`,
  tools: [
    tool({
      name: "lookup_invoice",
      description: "Look up an invoice by id",
      parameters: z.object({ invoice_id: z.string() }),
      // Tool results go back to the model as a string, so serialize the record.
      execute: async ({ invoice_id }) => {
        const inv = await db.invoice.findUnique({ where: { id: invoice_id } });
        return JSON.stringify(inv);
      },
    }),
  ],
});

const schedulingAgent = new RealtimeAgent({
  name: "Scheduling",
  instructions: `You schedule appointments. Use list_slots and book_slot.`,
  tools: [
    tool({
      name: "list_slots",
      description: "List available slots for a date",
      parameters: z.object({ date: z.string() }),
      execute: async ({ date }) =>
        JSON.stringify(await db.slot.findMany({ where: { date } })),
    }),
    tool({
      name: "book_slot",
      description: "Book a slot",
      parameters: z.object({ slot_id: z.string(), name: z.string() }),
      execute: async (a) => {
        await db.booking.create({ data: a });
        // Confirmation code: GB-<YYYYMMDD>-<random 3 digits>
        return `Booked: GB-${new Date().toISOString().slice(0,10).replaceAll("-","")}-${Math.floor(Math.random()*900+100)}`;
      },
    }),
  ],
});
```
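
The confirmation code built inline in book_slot is easier to test when pulled into a small helper with an injectable clock and random source. This is a sketch of that refactor only; `bookingReference` is a hypothetical name, and the format simply mirrors the `GB-<date>-<3 digits>` string used above.

```ts
// Hypothetical helper: same format as the inline expression in book_slot.
function bookingReference(
  now: Date = new Date(),
  random: () => number = Math.random,
): string {
  // toISOString() is always UTC, so the date part is the UTC calendar day.
  const datePart = now.toISOString().slice(0, 10).replace(/-/g, "");
  // random() is in [0, 1), so the suffix is an integer from 100 to 999.
  const suffix = Math.floor(random() * 900 + 100);
  return `GB-${datePart}-${suffix}`;
}

console.log(bookingReference(new Date("2025-06-03T12:00:00Z"), () => 0)); // GB-20250603-100
```

A three-digit suffix allows only 900 distinct values per day, so a code like this is best treated as a display string read back to the caller, with the database's own booking id serving as the unique key.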

Step 2 — Triage agent with handoffs

```ts
const triageAgent = new RealtimeAgent({
  name: "Triage",
  instructions: `You greet callers and route them. If they mention invoices, refunds, or charges, hand off to Billing. If they want appointments, hand off to Scheduling. Keep replies to one short sentence before handing off.`,
  handoffs: [billingAgent, schedulingAgent],
});

// Add return-handoffs so specialists can bounce back if user changes intent
billingAgent.handoffs = [triageAgent, schedulingAgent];
schedulingAgent.handoffs = [triageAgent, billingAgent];
```

The SDK auto-generates a transfer_to_<agent> tool on each agent for every entry in handoffs. The model picks which to call.
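
To make the naming concrete, here is a rough model of how a handoffs list maps to transfer tool names. It is an approximation for illustration, not the SDK's implementation: `AgentNode` and `handoffToolNames` are hypothetical, and the exact sanitizing and casing rules the SDK applies to agent names are an assumption here.

```ts
// Hypothetical stand-in for an agent: just a name and its handoff targets.
interface AgentNode {
  name: string;
  handoffs: AgentNode[];
}

// One transfer tool per handoff target. Replacing non-alphanumerics with "_"
// is an assumed rule; check the names your SDK version actually generates.
function handoffToolNames(agent: AgentNode): string[] {
  return agent.handoffs.map(
    (target) => `transfer_to_${target.name.replace(/[^a-zA-Z0-9]/g, "_")}`,
  );
}

const billing: AgentNode = { name: "Billing", handoffs: [] };
const scheduling: AgentNode = { name: "Scheduling", handoffs: [] };
const triage: AgentNode = { name: "Triage", handoffs: [billing, scheduling] };

// Return handoffs are assigned afterwards because the graph is circular.
billing.handoffs = [triage, scheduling];
scheduling.handoffs = [triage, billing];

console.log(handoffToolNames(triage)); // ["transfer_to_Billing", "transfer_to_Scheduling"]
```

Seeing the names spelled out is useful when reading logs or traces: a transfer call naming an agent you did not expect usually points at instructions that overlap between specialists.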

Step 3 — Open a Realtime session

```ts
import { RealtimeSession } from "@openai/agents/realtime";

const session = new RealtimeSession(triageAgent, {
  model: "gpt-4o-realtime-preview-2025-06-03",
  config: {
    voice: "alloy",
    inputAudioFormat: "pcm16",
    outputAudioFormat: "pcm16",
    turnDetection: { type: "server_vad", threshold: 0.55 },
  },
});

await session.connect({
  apiKey: process.env.OPENAI_API_KEY!,
  // Or pass an ephemeral key for browser transports
});
```

Step 4 — Listen for handoff events

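```ts
// Handoff listeners receive (context, fromAgent, toAgent) as positional arguments.
session.on("agent_handoff", (_context, fromAgent, toAgent) => {
  console.log(`Handoff: ${fromAgent.name} -> ${toAgent.name}`);
  // Persist for analytics: handoff trail per call.
  // `metrics` is assumed to be your own metrics client, not part of the SDK.
  metrics.increment("handoff", { from: fromAgent.name, to: toAgent.name });
});

// Tool activity is reported through agent_tool_start (and agent_tool_end).
session.on("agent_tool_start", (_context, agent, tool) => {
  console.log("tool:", agent.name, tool.name);
});
```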
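
The handoff trail mentioned in the listener can be kept in a small in-memory structure before it is shipped to real analytics. The sketch below is one possible shape, assuming you have some call identifier available; `HandoffTrail` and its methods are hypothetical names, not SDK types.

```ts
// Hypothetical in-memory trail; swap the Map for your own storage in production.
interface HandoffRecord {
  from: string;
  to: string;
  at: number; // epoch milliseconds
}

class HandoffTrail {
  private trails = new Map<string, HandoffRecord[]>();

  // Register a call even if it never hands off, so averages include it.
  startCall(callId: string): void {
    if (!this.trails.has(callId)) this.trails.set(callId, []);
  }

  record(callId: string, from: string, to: string, at: number = Date.now()): void {
    this.startCall(callId);
    this.trails.get(callId)!.push({ from, to, at });
  }

  forCall(callId: string): HandoffRecord[] {
    return this.trails.get(callId) ?? [];
  }

  // Total handoffs divided by the number of calls seen (0 when there are none).
  averagePerCall(): number {
    if (this.trails.size === 0) return 0;
    let total = 0;
    this.trails.forEach((trail) => {
      total += trail.length;
    });
    return total / this.trails.size;
  }
}

const trail = new HandoffTrail();
trail.record("call-1", "Triage", "Billing");
trail.record("call-1", "Billing", "Scheduling");
trail.record("call-2", "Triage", "Scheduling");
trail.startCall("call-3"); // resolved by Triage with no handoff
console.log(trail.averagePerCall()); // 1
```

Calling `record` from inside the handoff listener gives you a per-call path you can replay later. Registering calls with `startCall` matters because calls resolved by the first agent would otherwise be left out of the average.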

Step 5 — Transport: browser WebRTC or Twilio

For the browser, use the WebRTC transport (see post 4). For phone calls via Twilio, use the SDK's Twilio adapter from @openai/agents-extensions:

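```ts
import { TwilioRealtimeTransportLayer } from "@openai/agents-extensions";

// `twilioWebSocket` is the WebSocket connection Twilio opens to your server
// for the call's media stream; it must already be in scope here.
const transport = new TwilioRealtimeTransportLayer({ twilioWebSocket });
const session = new RealtimeSession(triageAgent, { transport });
await session.connect({ apiKey: process.env.OPENAI_API_KEY! });
```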

Common pitfalls

  • Handoff loop: don't let two agents hand off to each other unconditionally. Add an explicit "escalate when X" condition to each agent's instructions.
  • Specialists too broad: keep each agent narrow. Three focused agents beat one agent with a giant prompt.
  • Assuming memory is lost across handoffs: each handoff sends a session.update that swaps instructions and tools, but conversation and tool history are preserved within the session.
  • Model picks the wrong agent: tighten each agent's name and handoffDescription; the model uses them to choose.
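
The handoff-loop pitfall can also be caught at runtime by watching the sequence of active agents. The following is a minimal sketch under the assumption that you keep that sequence yourself (for example from the handoff listener); `isPingPong` and its threshold are hypothetical and not part of the SDK.

```ts
// Returns true when the tail of `path` alternates between the same two agents
// for at least `maxBounces` consecutive bounces (A -> B -> A counts as one).
function isPingPong(path: string[], maxBounces = 2): boolean {
  let bounces = 0;
  // Walk backwards while each agent equals the one two steps earlier.
  for (let i = path.length - 1; i >= 2; i--) {
    if (path[i] === path[i - 2] && path[i] !== path[i - 1]) {
      bounces++;
    } else {
      break;
    }
  }
  return bounces >= maxBounces;
}

console.log(isPingPong(["Triage", "Billing", "Triage", "Billing"])); // true
console.log(isPingPong(["Triage", "Billing", "Triage"])); // false
console.log(isPingPong(["Triage", "Billing", "Scheduling"])); // false
```

A single return to Triage is normal when a caller changes intent, which is why the default threshold is two bounces. When the check fires, a reasonable response is to stop transferring and have the current agent ask a clarifying question or escalate to a human.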

How CallSphere does this in production

CallSphere Salon runs 4 ElevenLabs agents (Reception → Booking → Reschedule → Retention) using the same handoff pattern. Real Estate OneRoof uses the OpenAI Agents SDK with a triage agent and 5 specialists (listings, mortgage, valuation, scheduling, escalation). Across all 6 verticals, the average per-call handoff count is 1.4.

FAQ

Can specialists call other specialists' tools? Only the active agent's tools are exposed. To share, register them on each agent or build a shared tool module.

How is handoff different from a sub-agent call? Handoff transfers control (one active agent at a time). agent.asTool() lets you keep one agent active and call another as a function.

Do user audio buffers persist across handoff? Yes — within one Realtime session, audio history is kept; only instructions/tools change.

How do I gate handoffs? Return early from a tool with a custom error, or add your own canHandoff-style check (a guard you write yourself, not an SDK export) that runs before the transfer is allowed to proceed.
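
One way to read the canHandoff idea is as a plain guard function in your own application code. The sketch below assumes you track a little per-call state; `canHandoff`, `CallState`, and the two rules it enforces are hypothetical examples, not SDK behavior.

```ts
// Hypothetical gate: decide in application code whether a transfer may proceed.
type GateResult = { allowed: true } | { allowed: false; reason: string };

interface CallState {
  verifiedCaller: boolean; // e.g. set after an identity check tool succeeds
  handoffCount: number; // handoffs already performed on this call
}

function canHandoff(to: string, state: CallState, maxHandoffs = 4): GateResult {
  if (state.handoffCount >= maxHandoffs) {
    return { allowed: false, reason: "handoff limit reached" };
  }
  // Example policy: billing conversations require a verified caller.
  if (to === "Billing" && !state.verifiedCaller) {
    return { allowed: false, reason: "caller not verified" };
  }
  return { allowed: true };
}

console.log(canHandoff("Billing", { verifiedCaller: false, handoffCount: 0 }));
// { allowed: false, reason: "caller not verified" }
```

Returning a reason alongside the decision lets the active agent explain the refusal to the caller, for example by asking them to verify their identity before billing details are discussed.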
