Why Voice Agents Are Transforming Customer Support

Traditional IVR systems frustrate customers with rigid menu trees and robotic interactions. Voice agents powered by large language models change the equation entirely: they understand natural language, maintain context across a conversation, and can execute real actions like looking up orders or processing refunds.

In this tutorial, we build a production-grade voice customer support agent from scratch. The system uses four agents (triage, billing, refunds, and FAQ), with the triage agent handing callers off by voice to the right specialist.

Architecture Overview

Our system has three layers:

flowchart LR
    USER(["Customer"])
    CHANNEL{"Channel"}
    CHAT["Chat agent"]
    VOICE["Voice agent"]
    EMAIL["Email agent"]
    TRIAGE["Triage and<br/>intent detection"]
    KB[("Knowledge base<br/>RAG")]
    CRM[("CRM context")]
    AUTORES{"Auto resolvable?"}
    RESOLVE(["Resolved with<br/>cited answer"])
    HUMAN(["Tier 2 agent"])
    USER --> CHANNEL --> CHAT --> TRIAGE
    CHANNEL --> VOICE --> TRIAGE
    CHANNEL --> EMAIL --> TRIAGE
    TRIAGE --> KB
    TRIAGE --> CRM
    TRIAGE --> AUTORES
    AUTORES -->|Yes| RESOLVE
    AUTORES -->|No| HUMAN
    style TRIAGE fill:#4f46e5,stroke:#4338ca,color:#fff
    style AUTORES fill:#f59e0b,stroke:#d97706,color:#1f2937
    style RESOLVE fill:#059669,stroke:#047857,color:#fff
    style HUMAN fill:#0ea5e9,stroke:#0369a1,color:#fff

Voice Transport Layer — WebSocket connection to OpenAI Realtime API for speech-to-speech
Agent Orchestration Layer — OpenAI Agents SDK managing triage, routing, and department-specific agents
Backend Integration Layer — FastAPI server with tools for order lookup, refund processing, and knowledge base queries

┌──────────────┐     WebSocket      ┌───────────────────┐
│   Customer   │◄──────────────────►│  OpenAI Realtime  │
│   (Phone)    │                    │        API        │
└──────────────┘                    └─────────┬─────────┘
                                              │
                                    ┌─────────▼─────────┐
                                    │Agent Orchestration│
                                    │   ┌───────────┐   │
                                    │   │  Triage   │   │
                                    │   │   Agent   │   │
                                    │   └─────┬─────┘   │
                                    │    ┌────┼────┐    │
                                    │  ┌─▼─┐┌─▼─┐┌─▼─┐  │
                                    │  │Bil││Ref││FAQ│  │
                                    │  └───┘└───┘└───┘  │
                                    └─────────┬─────────┘
                                              │
                                    ┌─────────▼─────────┐
                                    │  FastAPI Backend  │
                                    │   (Tools + DB)    │
                                    └───────────────────┘

Step 1: Define the Tools

Every department needs access to backend systems. We define tools that the agents can call to look up orders, check billing, and process refunds.

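# tools.py
import os

import httpx
from agents import function_tool

# Backend API key used in the Authorization header below.
# The environment variable name is an assumption; use whatever your deployment provides.
API_KEY = os.environ.get("SUPPORT_API_KEY", "")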

@function_tool
async def lookup_order(order_id: str) -> str:
    """Look up an order by its ID. Returns order status, items, and shipping info."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            f"http://localhost:8000/api/orders/{order_id}",
            headers={"Authorization": f"Bearer {API_KEY}"},
        )
        if resp.status_code == 404:
            return f"No order found with ID {order_id}. Ask the customer to verify."
        data = resp.json()
    return (
        f"Order {order_id}: status={data['status']}, "
        f"items={data['items']}, total=${data['total']:.2f}, "
        f"shipped={data.get('shipped_date', 'not yet')}"
    )

@function_tool
async def check_billing(customer_id: str) -> str:
    """Retrieve billing history and current balance for a customer."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            f"http://localhost:8000/api/billing/{customer_id}"
        )
        data = resp.json()
    invoices = data.get("invoices", [])
    summary = "; ".join(
        f"Invoice {inv['id']}: ${inv['amount']:.2f} ({inv['status']})"
        for inv in invoices[:5]
    )
    return f"Balance: ${data['balance']:.2f}. Recent invoices: {summary}"

@function_tool
async def process_refund(order_id: str, reason: str) -> str:
    """Process a refund for the given order. Requires a reason."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "http://localhost:8000/api/refunds",
            json={"order_id": order_id, "reason": reason},
        )
        if resp.status_code == 400:
            return f"Refund denied: {resp.json()['detail']}"
        data = resp.json()
    return f"Refund approved. Refund ID: {data['refund_id']}. Amount: ${data['amount']:.2f}. Expect 5-7 business days."

@function_tool
async def search_faq(query: str) -> str:
    """Search the FAQ knowledge base for answers to common questions."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "http://localhost:8000/api/faq/search",
            json={"query": query, "top_k": 3},
        )
        results = resp.json()["results"]
    if not results:
        return "No FAQ results found. Escalate to a human agent."
    return "\n\n".join(
        f"Q: {r['question']}\nA: {r['answer']}" for r in results
    )

@function_tool
async def escalate_to_human(reason: str, department: str) -> str:
    """Escalate the call to a human agent when the AI cannot resolve the issue."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "http://localhost:8000/api/escalate",
            json={"reason": reason, "department": department},
        )
        data = resp.json()
    return f"Transferring to human agent. Queue position: {data['position']}. Estimated wait: {data['wait_minutes']} minutes."
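
The tools above translate some backend failures into instructions for the agent (a 404 on order lookup, a 400 on refunds) but let other status codes fall through to `resp.json()`. One way to make this consistent is a small shared helper that turns a status code into a short message the agent can act on. The sketch below is only an illustration: `describe_http_failure` is a hypothetical name, and the wording and status groupings are assumptions to adapt to your backend.

```python
from typing import Optional


def describe_http_failure(status_code: int, resource: str) -> Optional[str]:
    """Return a short instruction for the agent if the call failed, else None."""
    if 200 <= status_code < 300:
        return None  # success: the caller can go on to parse the body
    if status_code == 404:
        return f"I could not find that {resource}. Ask the customer to verify it."
    if status_code in (401, 403):
        return f"I am not authorized to access that {resource}. Escalate to a human agent."
    if status_code == 429 or status_code >= 500:
        return "The backend is temporarily unavailable. Apologize and offer to retry or escalate."
    return f"The request for that {resource} was rejected (status {status_code})."
```

A tool could call this right after the request, for example `failure = describe_http_failure(resp.status_code, "order")`, and return `failure` when it is not `None`.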

Step 2: Build Department Agents

Each department is a specialized agent with its own instructions and tools. The triage agent routes callers to the correct department using handoffs.

# agents_config.py
from agents import Agent
from tools import (
    lookup_order, check_billing, process_refund,
    search_faq, escalate_to_human,
)

billing_agent = Agent(
    name="Billing Agent",
    instructions="""You are a billing specialist. Help customers with:
- Viewing their current balance and invoice history
- Explaining charges on their account
- Setting up payment plans

Always verify the customer ID before accessing billing information.
If you cannot resolve the issue, escalate to a human agent.
Be empathetic and professional. Keep responses concise for voice delivery.""",
    tools=[check_billing, lookup_order, escalate_to_human],
)

refund_agent = Agent(
    name="Refund Agent",
    instructions="""You are a refund specialist. Help customers with:
- Processing refunds for eligible orders
- Explaining the refund policy (30-day window, original payment method)
- Checking refund status

Before processing a refund:
1. Look up the order to verify it exists and is eligible
2. Confirm the reason with the customer
3. Process the refund and provide the refund ID

Orders older than 30 days or already refunded are not eligible.
If the customer disputes eligibility, escalate to a human agent.""",
    tools=[lookup_order, process_refund, escalate_to_human],
)

faq_agent = Agent(
    name="FAQ Agent",
    instructions="""You are a general support agent. Help customers with:
- Answering common questions about products and services
- Providing shipping and return information
- Explaining company policies

Search the FAQ database first. If no relevant answer is found,
try to help based on your training. If the issue requires account
access or actions you cannot perform, route back to triage.""",
    tools=[search_faq, escalate_to_human],
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="""You are the front-line customer support triage agent.
Your job is to:
1. Greet the customer warmly
2. Understand their issue
3. Route them to the correct department

Routing rules:
- Billing questions, charges, payment issues → Billing Agent
- Refund requests, return issues → Refund Agent
- General questions, shipping, product info → FAQ Agent

Ask clarifying questions if the intent is unclear.
Do NOT try to resolve issues yourself — route to the specialist.""",
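    handoffs=[billing_agent, refund_agent, faq_agent],
    tools=[lookup_order],
)

# The FAQ agent's instructions tell it to route back to triage, so it needs
# that handoff; add it here, now that triage_agent is defined.
faq_agent.handoffs.append(triage_agent)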

Step 3: Voice Transport with OpenAI Realtime API

We connect the agent orchestration to OpenAI's Realtime API for speech-to-speech interaction. This uses WebSockets for low-latency bidirectional audio streaming.

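# voice_session.py
import asyncio
import json

import numpy as np
from fastapi import WebSocket
from agents.voice import (
    SingleAgentVoiceWorkflow,
    STTModelSettings,
    StreamedAudioInput,
    TTSModelSettings,
    VoicePipeline,
    VoicePipelineConfig,
)
from agents_config import triage_agent

class CustomerSupportVoicePipeline:
    """Manages a voice session for customer support."""

    def __init__(self, session_id: str):
        self.session_id = session_id
        # VoicePipeline expects a VoicePipelineConfig, not a plain dict.
        # Without stt_model/tts_model arguments it uses the SDK's default
        # speech models; check field names against your installed SDK version.
        self.pipeline = VoicePipeline(
            workflow=SingleAgentVoiceWorkflow(triage_agent),
            config=VoicePipelineConfig(
                tts_settings=TTSModelSettings(voice="nova"),
                stt_settings=STTModelSettings(
                    turn_detection={
                        "type": "server_vad",
                        "threshold": 0.5,
                        "silence_duration_ms": 800,
                    },
                ),
            ),
        )
        self.context = {}

    async def run_with_audio(self, audio_input: StreamedAudioInput):
        """Process streaming audio input and yield audio output as raw bytes."""
        result = await self.pipeline.run(audio_input)

        async for event in result.stream():
            if event.type == "voice_stream_event_audio":
                if event.data is not None:
                    # Audio events carry a NumPy array; the socket needs bytes
                    yield event.data.tobytes()
            elif event.type == "voice_stream_event_lifecycle":
                # Lifecycle events expose the event name on `.event`
                if event.event == "turn_ended":
                    turns = self.context.get("turns_completed", 0)
                    self.context["turns_completed"] = turns + 1

    async def handle_websocket(self, websocket: WebSocket):
        """Handle a FastAPI WebSocket connection from a client."""
        audio_input = StreamedAudioInput()

        async def receive_audio():
            while True:
                message = await websocket.receive()
                if message["type"] == "websocket.disconnect":
                    return
                if message.get("bytes") is not None:
                    # Assumes the client sends 16-bit PCM; add_audio is a coroutine
                    samples = np.frombuffer(message["bytes"], dtype=np.int16)
                    await audio_input.add_audio(samples)
                elif message.get("text") is not None:
                    data = json.loads(message["text"])
                    if data.get("type") == "end":
                        return

        async def send_audio():
            async for audio_chunk in self.run_with_audio(audio_input):
                await websocket.send_bytes(audio_chunk)

        # Stream replies in the background and stop once the caller
        # hangs up or sends an explicit "end" message.
        send_task = asyncio.create_task(send_audio())
        try:
            await receive_audio()
        finally:
            send_task.cancel()
            await asyncio.gather(send_task, return_exceptions=True)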
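
WebSocket messages can arrive in arbitrary sizes, while audio processing is usually easier to reason about with fixed-length frames. As a rough sketch, the helper below re-chunks incoming bytes into 20 ms frames and keeps any remainder for the next message. The 24 kHz, 16-bit mono format and the names `frame_size_bytes` and `iter_frames` are assumptions for illustration; match them to whatever your client actually sends.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 24_000  # assumed sample rate
BYTES_PER_SAMPLE = 2     # 16-bit mono PCM (assumed)
FRAME_MS = 20            # frame length in milliseconds


def frame_size_bytes(sample_rate_hz: int = SAMPLE_RATE_HZ, frame_ms: int = FRAME_MS) -> int:
    """Number of bytes in one frame of 16-bit mono audio."""
    return sample_rate_hz * frame_ms // 1000 * BYTES_PER_SAMPLE


def iter_frames(buffer: bytearray, frame_bytes: int) -> Iterator[bytes]:
    """Yield complete frames from buffer, leaving any partial frame in place."""
    while len(buffer) >= frame_bytes:
        frame = bytes(buffer[:frame_bytes])
        del buffer[:frame_bytes]  # consume the bytes that were just emitted
        yield frame
```

In a receive loop, you would append each incoming message to one `bytearray` per connection and forward every frame that `iter_frames` yields.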

Step 4: Session Persistence Across Calls

Customers may call back about the same issue. We persist session state in Redis so the agent remembers previous interactions.

# session_store.py
import json
import redis.asyncio as redis
from datetime import timedelta

class SessionStore:
    def __init__(self, redis_url: str = "redis://localhost:6379/0"):
        self.redis = redis.from_url(redis_url)
        self.ttl = timedelta(hours=24)

    async def save_session(self, phone_number: str, session_data: dict):
        key = f"support:session:{phone_number}"
        await self.redis.setex(key, self.ttl, json.dumps(session_data))

    async def get_session(self, phone_number: str) -> dict | None:
        key = f"support:session:{phone_number}"
        data = await self.redis.get(key)
        if data:
            return json.loads(data)
        return None

    async def append_interaction(self, phone_number: str, interaction: dict):
        session = await self.get_session(phone_number) or {
            "phone": phone_number,
            "interactions": [],
        }
        session["interactions"].append(interaction)
        # Keep only last 10 interactions to manage context size
        session["interactions"] = session["interactions"][-10:]
        await self.save_session(phone_number, session)
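
Stored interactions only help if they reach the agent in a compact form. One option is to fold the most recent calls into a few lines of text that can be appended to the agent's instructions. This is a minimal sketch under assumptions: `summarize_history` is a hypothetical helper, and the `summary` field is not something the session store above writes, so you would need to record it at the end of each call.

```python
from typing import Any, Dict, List


def summarize_history(interactions: List[Dict[str, Any]], max_items: int = 3) -> str:
    """Build a short text block describing the caller's most recent sessions."""
    if not interactions:
        return ""
    lines = []
    for item in interactions[-max_items:]:
        session_id = item.get("session_id", "unknown")
        # "summary" is a hypothetical field; fall back when it is missing
        summary = item.get("summary", "no summary recorded")
        lines.append(f"- Session {session_id}: {summary}")
    return "Previous calls from this number:\n" + "\n".join(lines)
```

Capping the number of items keeps the prompt short, which matters for voice, where every extra token adds latency before the first spoken word.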

Step 5: FastAPI Server Tying It Together

# main.py
import uuid

from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from session_store import SessionStore
from voice_session import CustomerSupportVoicePipeline

app = FastAPI(title="Voice Customer Support Agent")
session_store = SessionStore()
active_sessions: dict[str, CustomerSupportVoicePipeline] = {}

@app.websocket("/ws/voice/{phone_number}")
async def voice_endpoint(websocket: WebSocket, phone_number: str):
    await websocket.accept()
    session_id = str(uuid.uuid4())

    # Load previous context if returning caller
    previous = await session_store.get_session(phone_number)

    pipeline = CustomerSupportVoicePipeline(session_id)
    if previous:
        pipeline.context["history"] = previous["interactions"]

    active_sessions[session_id] = pipeline

    try:
        await pipeline.handle_websocket(websocket)
    except WebSocketDisconnect:
        pass
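    finally:
        # Persist this call only. Skip the loaded history so each stored
        # record does not nest every earlier call inside it.
        call_context = {k: v for k, v in pipeline.context.items() if k != "history"}
        await session_store.append_interaction(phone_number, {
            "session_id": session_id,
            "context": call_context,
        })
        active_sessions.pop(session_id, None)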

@app.get("/health")
async def health():
    return {"status": "ok", "active_sessions": len(active_sessions)}

Step 6: Testing the Full Pipeline

# test_support_agent.py
import pytest
from agents import Runner
from agents_config import triage_agent

@pytest.mark.asyncio
async def test_triage_routes_to_billing():
    result = await Runner.run(
        triage_agent,
        input="I have a question about a charge on my account",
    )
    # The triage agent should hand off to the billing agent
    assert result.last_agent.name == "Billing Agent"

@pytest.mark.asyncio
async def test_triage_routes_to_refund():
    result = await Runner.run(
        triage_agent,
        input="I want to return an item and get my money back",
    )
    assert result.last_agent.name == "Refund Agent"

@pytest.mark.asyncio
async def test_refund_agent_looks_up_order():
    result = await Runner.run(
        triage_agent,
        input="I need a refund for order ORD-12345",
    )
    assert "refund" in result.final_output.lower()

Production Deployment Considerations

Health Monitoring: Track active sessions, average call duration, and handoff rates per department.
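
As a starting point, those three numbers can be tracked in process with a small counter object. The sketch below is an assumption about how you might structure it: `CallMetrics` and its method names are hypothetical, and a real deployment would more likely export these values to a metrics system than keep them in memory.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CallMetrics:
    """In-memory counters for active sessions, call duration, and handoffs."""
    active_sessions: int = 0
    completed_calls: int = 0
    total_duration_s: float = 0.0
    handoffs: Counter = field(default_factory=Counter)

    def call_started(self) -> None:
        self.active_sessions += 1

    def handoff(self, department: str) -> None:
        self.handoffs[department] += 1

    def call_ended(self, duration_s: float) -> None:
        self.active_sessions = max(0, self.active_sessions - 1)
        self.completed_calls += 1
        self.total_duration_s += duration_s

    @property
    def average_duration_s(self) -> float:
        if self.completed_calls == 0:
            return 0.0
        return self.total_duration_s / self.completed_calls

    def handoff_rate(self, department: str) -> float:
        """Handoffs to a department per completed call."""
        if self.completed_calls == 0:
            return 0.0
        return self.handoffs[department] / self.completed_calls
```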

Graceful Shutdown: When deploying new versions, drain active WebSocket connections before terminating pods.
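
A minimal sketch of draining, assuming you can hook session start and end in the WebSocket endpoint: stop accepting new sessions, then wait (up to a timeout) for the active count to reach zero. `SessionDrainer` is a hypothetical helper, not part of FastAPI or the Agents SDK, and the timeout should stay below your pod's termination grace period.

```python
import asyncio


class SessionDrainer:
    """Tracks active sessions and lets shutdown wait for them to finish."""

    def __init__(self) -> None:
        self._active = 0
        self._idle = asyncio.Event()
        self._idle.set()  # no sessions yet, so the server is idle
        self.accepting = True

    def session_started(self) -> bool:
        """Register a new session; returns False once draining has begun."""
        if not self.accepting:
            return False
        self._active += 1
        self._idle.clear()
        return True

    def session_ended(self) -> None:
        self._active -= 1
        if self._active == 0:
            self._idle.set()

    async def drain(self, timeout_s: float) -> bool:
        """Refuse new sessions and wait for active ones; True if all finished."""
        self.accepting = False
        try:
            await asyncio.wait_for(self._idle.wait(), timeout=timeout_s)
            return True
        except asyncio.TimeoutError:
            return False
```

The endpoint would call `session_started()` before accepting a call and `session_ended()` in its `finally` block, while a shutdown hook awaits `drain(...)`.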

Rate Limiting: Limit concurrent voice sessions per phone number to prevent abuse.
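
The simplest form of this check is a per-number counter consulted before a call is accepted. The sketch below assumes a single process; `SessionLimiter` is a hypothetical name, and with several pods the counts would need to live in a shared store such as the Redis instance already used for sessions.

```python
from collections import defaultdict


class SessionLimiter:
    """Caps the number of concurrent voice sessions per phone number."""

    def __init__(self, max_per_number: int = 1) -> None:
        self.max_per_number = max_per_number
        self._counts = defaultdict(int)

    def try_acquire(self, phone_number: str) -> bool:
        """Reserve a slot for this number; False if it is already at the cap."""
        if self._counts[phone_number] >= self.max_per_number:
            return False
        self._counts[phone_number] += 1
        return True

    def release(self, phone_number: str) -> None:
        """Free a slot when the call ends, dropping empty entries."""
        if self._counts.get(phone_number, 0) <= 1:
            self._counts.pop(phone_number, None)
        else:
            self._counts[phone_number] -= 1
```

Call `try_acquire` before accepting the WebSocket and `release` in the endpoint's `finally` block so that dropped calls do not leave a number locked out.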

Fallback: If the Realtime API is unavailable, fall back to a text-based chat agent with a TTS overlay.

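# Kubernetes readiness probe: report not-ready once this pod is at capacity
from fastapi.responses import JSONResponse

MAX_CONCURRENT_SESSIONS = 100  # assumed per-pod limit; tune for your hardware

@app.get("/ready")
async def readiness():
    if len(active_sessions) >= MAX_CONCURRENT_SESSIONS:
        return JSONResponse(status_code=503, content={"ready": False})
    return {"ready": True}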

Key Takeaways

Building a voice customer support agent requires coordinating three concerns: voice transport, agent orchestration, and backend integration. The OpenAI Agents SDK handles the orchestration layer with its handoff mechanism, letting you define specialized department agents that the triage agent routes to naturally. Session persistence ensures returning callers get continuity. The most critical production concern is latency — keep tool calls fast and use streaming audio throughout the pipeline.
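
One way to keep tool calls from stalling a conversation is to put a deadline on each one and give the agent something to say when the backend is slow. This is a sketch rather than a prescription: `with_deadline` is a hypothetical helper, and the right timeout and fallback wording depend on your backend.

```python
import asyncio
from typing import Awaitable


async def with_deadline(call: Awaitable[str], timeout_s: float, fallback: str) -> str:
    """Await a tool call, returning the fallback text if it exceeds the deadline."""
    try:
        return await asyncio.wait_for(call, timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the slow call before raising, so nothing is left running
        return fallback
```

Inside a tool, the HTTP request could be wrapped as `await with_deadline(fetch(), 2.0, "The order system is slow right now. Offer to retry.")`, where `fetch` stands in for your own coroutine.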
