Building a Voice-Powered Customer Support Agent: End-to-End Tutorial
Build a complete voice-powered customer support agent with triage, billing, refund, and FAQ handling using OpenAI Realtime API, tool integration, and session persistence.
Why Voice Agents Are Transforming Customer Support
Traditional IVR systems frustrate customers with rigid menu trees and robotic interactions. Voice agents powered by large language models change the equation entirely: they understand natural language, maintain context across a conversation, and can execute real actions like looking up orders or processing refunds.
In this tutorial, we build a production-grade voice customer support agent from scratch. The agent handles four departments — triage, billing, refunds, and FAQ — with seamless voice handoffs between them.
Architecture Overview
Our system has three layers:
flowchart TD
START["Building a Voice-Powered Customer Support Agent: …"] --> A
A["Why Voice Agents Are Transforming Custo…"]
A --> B
B["Architecture Overview"]
B --> C
C["Step 1: Define the Tools"]
C --> D
D["Step 2: Build Department Agents"]
D --> E
E["Step 3: Voice Transport with OpenAI Rea…"]
E --> F
F["Step 4: Session Persistence Across Calls"]
F --> G
G["Step 5: FastAPI Server Tying It Together"]
G --> H
H["Step 6: Testing the Full Pipeline"]
H --> DONE["Key Takeaways"]
style START fill:#4f46e5,stroke:#4338ca,color:#fff
style DONE fill:#059669,stroke:#047857,color:#fff
- Voice Transport Layer — WebSocket connection to OpenAI Realtime API for speech-to-speech
- Agent Orchestration Layer — OpenAI Agents SDK managing triage, routing, and department-specific agents
- Backend Integration Layer — FastAPI server with tools for order lookup, refund processing, and knowledge base queries
┌─────────────┐ WebSocket ┌──────────────────┐
│ Customer │◄──────────────────►│ OpenAI Realtime │
│ (Phone) │ │ API │
└─────────────┘ └────────┬─────────┘
│
┌────────▼─────────┐
│ Agent Orchestra │
│ ┌─────────────┐ │
│ │ Triage │ │
│ │ Agent │ │
│ └──────┬──────┘ │
│ ┌────┼────┐ │
│ ┌─▼─┐┌─▼─┐┌─▼─┐ │
│ │Bil││Ref││FAQ│ │
│ └───┘└───┘└───┘ │
└────────┬─────────┘
│
┌────────▼─────────┐
│ FastAPI Backend │
│ (Tools + DB) │
└──────────────────┘
Step 1: Define the Tools
Every department needs access to backend systems. We define tools that the agents can call to look up orders, check billing, and process refunds.
# tools.py
import httpx
from agents import function_tool
@function_tool
async def lookup_order(order_id: str) -> str:
"""Look up an order by its ID. Returns order status, items, and shipping info."""
async with httpx.AsyncClient() as client:
resp = await client.get(
f"http://localhost:8000/api/orders/{order_id}",
headers={"Authorization": f"Bearer {API_KEY}"},
)
if resp.status_code == 404:
return f"No order found with ID {order_id}. Ask the customer to verify."
data = resp.json()
return (
f"Order {order_id}: status={data['status']}, "
f"items={data['items']}, total=${data['total']:.2f}, "
f"shipped={data.get('shipped_date', 'not yet')}"
)
@function_tool
async def check_billing(customer_id: str) -> str:
"""Retrieve billing history and current balance for a customer."""
async with httpx.AsyncClient() as client:
resp = await client.get(
f"http://localhost:8000/api/billing/{customer_id}"
)
data = resp.json()
invoices = data.get("invoices", [])
summary = "; ".join(
f"Invoice {inv['id']}: ${inv['amount']:.2f} ({inv['status']})"
for inv in invoices[:5]
)
return f"Balance: ${data['balance']:.2f}. Recent invoices: {summary}"
@function_tool
async def process_refund(order_id: str, reason: str) -> str:
"""Process a refund for the given order. Requires a reason."""
async with httpx.AsyncClient() as client:
resp = await client.post(
"http://localhost:8000/api/refunds",
json={"order_id": order_id, "reason": reason},
)
if resp.status_code == 400:
return f"Refund denied: {resp.json()['detail']}"
data = resp.json()
return f"Refund approved. Refund ID: {data['refund_id']}. Amount: ${data['amount']:.2f}. Expect 5-7 business days."
@function_tool
async def search_faq(query: str) -> str:
"""Search the FAQ knowledge base for answers to common questions."""
async with httpx.AsyncClient() as client:
resp = await client.post(
"http://localhost:8000/api/faq/search",
json={"query": query, "top_k": 3},
)
results = resp.json()["results"]
if not results:
return "No FAQ results found. Escalate to a human agent."
return "\n\n".join(
f"Q: {r['question']}\nA: {r['answer']}" for r in results
)
@function_tool
async def escalate_to_human(reason: str, department: str) -> str:
"""Escalate the call to a human agent when the AI cannot resolve the issue."""
async with httpx.AsyncClient() as client:
resp = await client.post(
"http://localhost:8000/api/escalate",
json={"reason": reason, "department": department},
)
data = resp.json()
return f"Transferring to human agent. Queue position: {data['position']}. Estimated wait: {data['wait_minutes']} minutes."
Step 2: Build Department Agents
Each department is a specialized agent with its own instructions and tools. The triage agent routes callers to the correct department using handoffs.
# agents_config.py
from agents import Agent
from tools import (
lookup_order, check_billing, process_refund,
search_faq, escalate_to_human,
)
billing_agent = Agent(
name="Billing Agent",
instructions="""You are a billing specialist. Help customers with:
- Viewing their current balance and invoice history
- Explaining charges on their account
- Setting up payment plans
Always verify the customer ID before accessing billing information.
If you cannot resolve the issue, escalate to a human agent.
Be empathetic and professional. Keep responses concise for voice delivery.""",
tools=[check_billing, lookup_order, escalate_to_human],
)
refund_agent = Agent(
name="Refund Agent",
instructions="""You are a refund specialist. Help customers with:
- Processing refunds for eligible orders
- Explaining the refund policy (30-day window, original payment method)
- Checking refund status
Before processing a refund:
1. Look up the order to verify it exists and is eligible
2. Confirm the reason with the customer
3. Process the refund and provide the refund ID
Orders older than 30 days or already refunded are not eligible.
If the customer disputes eligibility, escalate to a human agent.""",
tools=[lookup_order, process_refund, escalate_to_human],
)
faq_agent = Agent(
name="FAQ Agent",
instructions="""You are a general support agent. Help customers with:
- Answering common questions about products and services
- Providing shipping and return information
- Explaining company policies
Search the FAQ database first. If no relevant answer is found,
try to help based on your training. If the issue requires account
access or actions you cannot perform, route back to triage.""",
tools=[search_faq, escalate_to_human],
)
triage_agent = Agent(
name="Triage Agent",
instructions="""You are the front-line customer support triage agent.
Your job is to:
1. Greet the customer warmly
2. Understand their issue
3. Route them to the correct department
Routing rules:
- Billing questions, charges, payment issues → Billing Agent
- Refund requests, return issues → Refund Agent
- General questions, shipping, product info → FAQ Agent
Ask clarifying questions if the intent is unclear.
Do NOT try to resolve issues yourself — route to the specialist.""",
handoffs=[billing_agent, refund_agent, faq_agent],
tools=[lookup_order],
)
Step 3: Voice Transport with OpenAI Realtime API
We connect the agent orchestration to OpenAI's Realtime API for speech-to-speech interaction. This uses WebSockets for low-latency bidirectional audio streaming.
See AI Voice Agents Handle Real Calls
Book a free demo or calculate how much you can save with AI voice automation.
flowchart LR
S0["Step 1: Define the Tools"]
S0 --> S1
S1["Step 2: Build Department Agents"]
S1 --> S2
S2["Step 3: Voice Transport with OpenAI Rea…"]
S2 --> S3
S3["Step 4: Session Persistence Across Calls"]
S3 --> S4
S4["Step 5: FastAPI Server Tying It Together"]
S4 --> S5
S5["Step 6: Testing the Full Pipeline"]
style S0 fill:#4f46e5,stroke:#4338ca,color:#fff
style S5 fill:#059669,stroke:#047857,color:#fff
# voice_session.py
import asyncio
import json
import websockets
from agents import Runner
from agents.voice import (
AudioInput,
StreamedAudioInput,
VoicePipeline,
SingleAgentVoiceWorkflow,
)
from agents_config import triage_agent
class CustomerSupportVoicePipeline:
"""Manages a voice session for customer support."""
def __init__(self, session_id: str):
self.session_id = session_id
self.pipeline = VoicePipeline(
workflow=SingleAgentVoiceWorkflow(triage_agent),
config={
"model": "gpt-4o-realtime",
"voice": "nova",
"turn_detection": {
"type": "server_vad",
"threshold": 0.5,
"silence_duration_ms": 800,
},
},
)
self.context = {}
async def run_with_audio(self, audio_input: StreamedAudioInput):
"""Process streaming audio input and yield audio output."""
result = await self.pipeline.run(audio_input)
async for event in result.stream():
if event.type == "voice_stream_event_audio":
yield event.data
elif event.type == "voice_stream_event_lifecycle":
if event.data.get("event") == "turn_ended":
self.context["last_turn"] = event.data
async def handle_websocket(self, websocket):
"""Handle a WebSocket connection from a client."""
audio_input = StreamedAudioInput()
async def receive_audio():
async for message in websocket:
if isinstance(message, bytes):
audio_input.add_audio(message)
elif isinstance(message, str):
data = json.loads(message)
if data.get("type") == "end":
audio_input.close()
return
async def send_audio():
async for audio_chunk in self.run_with_audio(audio_input):
await websocket.send(audio_chunk)
await asyncio.gather(receive_audio(), send_audio())
Step 4: Session Persistence Across Calls
Customers may call back about the same issue. We persist session state in Redis so the agent remembers previous interactions.
# session_store.py
import json
import redis.asyncio as redis
from datetime import timedelta
class SessionStore:
def __init__(self, redis_url: str = "redis://localhost:6379/0"):
self.redis = redis.from_url(redis_url)
self.ttl = timedelta(hours=24)
async def save_session(self, phone_number: str, session_data: dict):
key = f"support:session:{phone_number}"
await self.redis.setex(key, self.ttl, json.dumps(session_data))
async def get_session(self, phone_number: str) -> dict | None:
key = f"support:session:{phone_number}"
data = await self.redis.get(key)
if data:
return json.loads(data)
return None
async def append_interaction(self, phone_number: str, interaction: dict):
session = await self.get_session(phone_number) or {
"phone": phone_number,
"interactions": [],
}
session["interactions"].append(interaction)
# Keep only last 10 interactions to manage context size
session["interactions"] = session["interactions"][-10:]
await self.save_session(phone_number, session)
Step 5: FastAPI Server Tying It Together
# main.py
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from voice_session import CustomerSupportVoicePipeline
from session_store import SessionStore
from agents.voice import StreamedAudioInput
import uuid
app = FastAPI(title="Voice Customer Support Agent")
session_store = SessionStore()
active_sessions: dict[str, CustomerSupportVoicePipeline] = {}
@app.websocket("/ws/voice/{phone_number}")
async def voice_endpoint(websocket: WebSocket, phone_number: str):
await websocket.accept()
session_id = str(uuid.uuid4())
# Load previous context if returning caller
previous = await session_store.get_session(phone_number)
pipeline = CustomerSupportVoicePipeline(session_id)
if previous:
pipeline.context["history"] = previous["interactions"]
active_sessions[session_id] = pipeline
try:
await pipeline.handle_websocket(websocket)
except WebSocketDisconnect:
pass
finally:
# Persist session after call ends
await session_store.append_interaction(phone_number, {
"session_id": session_id,
"context": pipeline.context,
})
del active_sessions[session_id]
@app.get("/health")
async def health():
return {"status": "ok", "active_sessions": len(active_sessions)}
Step 6: Testing the Full Pipeline
# test_support_agent.py
import pytest
from agents import Runner
from agents_config import triage_agent
@pytest.mark.asyncio
async def test_triage_routes_to_billing():
result = await Runner.run(
triage_agent,
input="I have a question about a charge on my account",
)
# The triage agent should hand off to the billing agent
assert result.last_agent.name == "Billing Agent"
@pytest.mark.asyncio
async def test_triage_routes_to_refund():
result = await Runner.run(
triage_agent,
input="I want to return an item and get my money back",
)
assert result.last_agent.name == "Refund Agent"
@pytest.mark.asyncio
async def test_refund_agent_looks_up_order():
result = await Runner.run(
triage_agent,
input="I need a refund for order ORD-12345",
)
assert "refund" in result.final_output.lower()
Production Deployment Considerations
Health Monitoring: Track active sessions, average call duration, and handoff rates per department.
Graceful Shutdown: When deploying new versions, drain active WebSocket connections before terminating pods.
Rate Limiting: Limit concurrent voice sessions per phone number to prevent abuse.
Fallback: If the Realtime API is unavailable, fall back to a text-based chat agent with a TTS overlay.
# Kubernetes readiness probe that checks voice pipeline health
@app.get("/ready")
async def readiness():
if len(active_sessions) > MAX_CONCURRENT_SESSIONS:
return JSONResponse(status_code=503, content={"ready": False})
return {"ready": True}
Key Takeaways
Building a voice customer support agent requires coordinating three concerns: voice transport, agent orchestration, and backend integration. The OpenAI Agents SDK handles the orchestration layer with its handoff mechanism, letting you define specialized department agents that the triage agent routes to naturally. Session persistence ensures returning callers get continuity. The most critical production concern is latency — keep tool calls fast and use streaming audio throughout the pipeline.
Sources:
Written by
CallSphere Team
Expert insights on AI voice agents and customer communication automation.
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.