Voice Agent Tools: Booking, Search, and Real-Time Actions
Add function tools to voice agents for booking appointments, searching databases, processing payments, and executing real-time actions with audio feedback during tool execution.
From Conversation to Action
A voice agent that can only talk is not very useful. The real power comes when the agent can do things — book appointments, search inventory, look up account details, process payments, and trigger workflows. These capabilities are implemented as function tools that the agent calls mid-conversation.
The challenge with voice is timing. When a user asks "Book me a dentist appointment for tomorrow at 2pm," they expect a reply almost immediately. If the tool takes 5 seconds to execute and nothing fills the gap, the caller hears silence and wonders whether the call dropped. Managing audio feedback during tool execution is critical for voice UX.
Defining Voice Agent Tools
Tools for voice agents follow the same pattern as chat agent tools in the OpenAI Agents SDK. The difference is in how you handle the user experience around tool execution:
```python
from typing import Optional

import httpx
from agents import Agent, function_tool


@function_tool
async def check_availability(
    provider_name: str,
    date: str,
    service_type: str,
) -> str:
    """Check appointment availability for a specific provider and date.

    Returns available time slots."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            "https://api.scheduling.internal/v1/availability",
            params={
                "provider": provider_name,
                "date": date,
                "service": service_type,
            },
            timeout=5.0,
        )
        # Raise on HTTP errors rather than parsing an error body as data.
        resp.raise_for_status()
        data = resp.json()
        slots = data.get("available_slots", [])
        if not slots:
            return f"No availability for {provider_name} on {date} for {service_type}."
        # Read back at most five slots; longer lists are hard to follow by ear.
        slot_list = ", ".join(slots[:5])
        return f"Available times for {provider_name} on {date}: {slot_list}"


@function_tool
async def book_appointment(
    provider_name: str,
    date: str,
    time: str,
    patient_name: str,
    patient_phone: str,
    service_type: str,
    notes: Optional[str] = None,
) -> str:
    """Book an appointment with a provider. Requires patient details."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://api.scheduling.internal/v1/appointments",
            json={
                "provider": provider_name,
                "date": date,
                "time": time,
                "patient_name": patient_name,
                "patient_phone": patient_phone,
                "service_type": service_type,
                "notes": notes or "",
            },
            # Stay within the 5-8 second budget for voice.
            timeout=8.0,
        )
        if resp.status_code == 201:
            confirmation = resp.json()
            return (
                f"Appointment booked successfully. "
                f"Confirmation number: {confirmation['id']}. "
                f"{patient_name} with {provider_name} on {date} at {time} "
                f"for {service_type}."
            )
        elif resp.status_code == 409:
            # 409 Conflict: someone else took the slot in the meantime.
            return "That time slot is no longer available. Please choose another time."
        else:
            return "Unable to book the appointment. Please try again or call the office directly."


@function_tool
async def search_providers(
    specialty: str,
    location: Optional[str] = None,
    insurance: Optional[str] = None,
) -> str:
    """Search for providers by specialty, location, and insurance acceptance."""
    async with httpx.AsyncClient() as client:
        # Only send the optional filters the caller actually mentioned.
        params = {"specialty": specialty}
        if location:
            params["location"] = location
        if insurance:
            params["insurance"] = insurance
        resp = await client.get(
            "https://api.scheduling.internal/v1/providers",
            params=params,
            timeout=5.0,
        )
        resp.raise_for_status()
        providers = resp.json().get("providers", [])
        if not providers:
            return f"No providers found for {specialty} in your area."
        # Three results is about as many as a caller can keep in mind.
        results = []
        for p in providers[:3]:
            results.append(
                f"{p['name']} at {p['address']}, "
                f"next available: {p['next_available']}"
            )
        return "Here are the top providers:\n" + "\n".join(results)
```
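The tools above take `date` and `time` as free-form strings, so whatever the speech-to-text layer heard goes straight to the scheduling API. A minimal sketch of checking them first, assuming the model is instructed to pass ISO dates (`YYYY-MM-DD`) and 24-hour times (`HH:MM`). The helper name `validate_slot` and both formats are illustrative assumptions, not part of the SDK or the scheduling API:

```python
from datetime import date, datetime
from typing import Optional


def validate_slot(date_str: str, time_str: str, today: Optional[date] = None) -> Optional[str]:
    """Return a speakable error message, or None if the slot looks valid.

    Assumes ISO dates (YYYY-MM-DD) and 24-hour times (HH:MM); both formats
    are assumptions made for this sketch.
    """
    today = today or date.today()
    try:
        slot = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M")
    except ValueError:
        # Covers impossible dates such as 2025-03-50 as well as bad formats.
        return "I did not catch a valid date and time. Could you repeat them?"
    if slot.date() < today:
        return "That date is in the past. Which upcoming date would you like?"
    return None
```

A tool could call this before contacting the API and return the message as its result, so the agent asks the caller to repeat the detail instead of sending a bad request.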
Audio Feedback During Tool Execution
When a tool takes more than a second to execute, the caller hears dead air unless something fills the gap. This is the single biggest UX problem with voice agent tools. Two strategies fill this gap: one prompt-based, one in code.
Strategy 1: Filler Phrases
The simplest approach is to have the agent say something before calling the tool:
```python
voice_agent = Agent(
    name="BookingAgent",
    instructions="""You are a medical appointment booking assistant.

IMPORTANT: Before calling any tool, always say a brief filler phrase so the
caller does not hear silence. Examples:
- "Let me check that for you."
- "One moment while I look that up."
- "I am searching for available times now."
- "Let me book that appointment for you right now."

After the tool returns, relay the results conversationally.""",
    tools=[check_availability, book_appointment, search_providers],
)
```
This approach is simple but relies on the model following instructions consistently. For more reliable behavior, handle it in code.
Strategy 2: Programmatic Hold Audio
Intercept the tool call event and inject audio feedback before executing the tool:
```python
import asyncio
import json

FILLER_MESSAGES = {
    "check_availability": "Let me check those available times for you.",
    "book_appointment": "I am booking that appointment now. Just a moment.",
    "search_providers": "Searching for providers in your area.",
}


async def handle_function_call(ws, function_name: str, call_id: str, arguments: str):
    """Handle a function call with audio feedback."""
    # Step 1: Send filler message immediately.
    # Assistant-role items take an output content type ("text" in the beta
    # Realtime API, "output_text" in later versions); "input_text" is for
    # user input. Check which one your API version expects.
    # On its own this adds the filler to the conversation history; whether
    # the caller hears it depends on how your pipeline turns it into audio.
    filler = FILLER_MESSAGES.get(function_name, "One moment please.")
    await ws.send(json.dumps({
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "text", "text": filler}],
        },
    }))

    # Step 2: Execute the tool.
    # execute_tool is assumed to dispatch by name and return a string.
    result = await execute_tool(function_name, arguments)

    # Step 3: Return the tool result to the conversation
    await ws.send(json.dumps({
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": call_id,
            "output": result,
        },
    }))

    # Step 4: Trigger the agent to respond with the result
    await ws.send(json.dumps({"type": "response.create"}))
```
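A fixed filler before every call can sound robotic when the tool returns quickly. One variation, sketched here under assumptions, speaks the filler only if the tool is still running after a short delay. `speak` is a hypothetical async callback that plays a phrase to the caller, and the 0.7-second threshold is an arbitrary starting point to tune:

```python
import asyncio
from typing import Awaitable, Callable


async def run_with_delayed_filler(
    tool_call: Awaitable[str],
    speak: Callable[[str], Awaitable[None]],
    filler: str = "One moment please.",
    delay: float = 0.7,
) -> str:
    """Await tool_call; speak the filler only if it outlasts `delay` seconds."""
    task = asyncio.ensure_future(tool_call)
    # asyncio.wait returns when the task finishes or the timeout elapses;
    # unlike asyncio.wait_for, it does not cancel the task on timeout.
    done, _pending = await asyncio.wait({task}, timeout=delay)
    if not done:
        await speak(filler)
    return await task
```

Fast lookups then answer directly, while slow ones get a filler phrase without any change to the tool itself.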
Handling Tool Errors Gracefully
Tools fail. APIs time out. Databases go down. In a chat agent, you can show an error message. In a voice agent, you need to speak the error naturally:
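```python
import asyncio
import json
import logging

logger = logging.getLogger(__name__)


async def execute_tool_with_fallback(
    function_name: str,
    arguments: str,
    max_retries: int = 1,
) -> str:
    """Execute a tool with retry and graceful error handling."""
    # Map names to plain async functions. Objects produced by @function_tool
    # are FunctionTool instances, not directly callable, so keep an
    # undecorated version of each function for direct dispatch like this.
    tool_map = {
        "check_availability": check_availability,
        "book_appointment": book_appointment,
        "search_providers": search_providers,
    }
    tool_fn = tool_map.get(function_name)
    if not tool_fn:
        # Avoid speaking the raw function name to the caller.
        return "I am not able to do that at the moment."

    try:
        parsed_args = json.loads(arguments)
    except json.JSONDecodeError:
        logger.warning("Malformed arguments for %s: %r", function_name, arguments)
        return "I did not get all the details I need. Could you repeat your request?"

    for attempt in range(max_retries + 1):
        try:
            return await asyncio.wait_for(
                tool_fn(**parsed_args),
                timeout=8.0,
            )
        except asyncio.TimeoutError:
            if attempt < max_retries:
                continue
            return (
                "I am sorry, the system is taking longer than expected. "
                "Let me try a different approach, or I can transfer you "
                "to someone who can help directly."
            )
        except Exception:
            # Log the details for debugging; speak only a generic message.
            logger.exception("Tool %s failed", function_name)
            if attempt < max_retries:
                continue
            return (
                "I encountered an issue while processing your request. "
                "Would you like me to try again, or would you prefer "
                "to speak with a human agent?"
            )
```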
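Retrying is safe for read-only lookups but risky for writes: if `book_appointment` times out after the server has already created the booking, a second attempt could create a duplicate. A minimal sketch of choosing the retry budget per tool follows. The `READ_ONLY_TOOLS` set and `retries_for` helper are illustrative names, and which tools count as read-only is an assumption about the scheduling backend:

```python
# Tools assumed to have no side effects, so repeating them is harmless.
READ_ONLY_TOOLS = frozenset({"check_availability", "search_providers"})


def retries_for(function_name: str, default_retries: int = 1) -> int:
    """Return how many retries a tool gets; tools that write get none."""
    return default_retries if function_name in READ_ONLY_TOOLS else 0
```

The result can be passed as `max_retries` to `execute_tool_with_fallback`, so a failed booking is reported to the caller rather than silently repeated.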
Confirmation Before Destructive Actions
Voice is inherently error-prone: the speech-to-text (STT) layer might mishear a name, date, or number. Always confirm before executing actions that are hard to undo:
```python
confirmation_agent = Agent(
    name="BookingAgent",
    instructions="""You are a medical appointment booking assistant.

CRITICAL RULES:
1. Before calling book_appointment, ALWAYS read back ALL details to the caller
   and ask for explicit confirmation:
   "Just to confirm, I will book an appointment with Dr. Smith on March 15th
   at 2:00 PM for a dental cleaning. The name on the appointment will be
   John Doe, and we will send a confirmation to 555-0123. Is all of that correct?"
2. Only proceed with booking after the caller says "yes", "correct",
   "that is right", or similar affirmative.
3. If the caller corrects any detail, update it and read back the full
   details again before booking.
4. After booking, always read back the confirmation number slowly and clearly.
   Spell out any letters.""",
    tools=[check_availability, book_appointment, search_providers],
)
```
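Prompt rules can be backed by a check in code, so a booking cannot go through unless the exact details were read back and confirmed. The sketch below is one possible shape: `ConfirmationGate` and its methods are hypothetical, and it assumes your call handler can tell when the caller has clearly said yes.

```python
import hashlib
import json
from typing import Dict, Optional


def _fingerprint(details: Dict[str, str]) -> str:
    """Stable hash of the booking details, independent of key order."""
    canonical = json.dumps(details, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ConfirmationGate:
    """Tracks which exact set of details the caller has confirmed."""

    def __init__(self) -> None:
        self._confirmed: Optional[str] = None

    def confirm(self, details: Dict[str, str]) -> None:
        # Call this after reading the details back and hearing a clear yes.
        self._confirmed = _fingerprint(details)

    def allows(self, details: Dict[str, str]) -> bool:
        # Any changed field yields a different fingerprint, forcing a new read-back.
        return self._confirmed == _fingerprint(details)
```

A booking tool could check `allows(...)` first and, when it returns `False`, reply with a request to read the details back instead of calling the API.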
Production Checklist
Before deploying voice agent tools to production:
- Set timeouts on every external call — 5-8 seconds maximum for voice
- Always provide audio feedback before tool execution
- Confirm destructive actions by reading back details and waiting for affirmation
- Handle partial failures — if SMS fails after booking succeeds, still confirm the booking
- Log every tool call with arguments, duration, and result for debugging
- Rate-limit tool calls per session to prevent abuse or infinite loops
- Test with real speech input — STT errors in tool arguments (like mishearing "March 15" as "March 50") need graceful handling
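Two of the checklist items, logging and per-session rate limiting, can share one wrapper around tool dispatch. A minimal sketch using only the standard library follows. `SessionToolGuard`, the limit of 20 calls, and the log format are all illustrative assumptions:

```python
import logging
import time
from typing import Awaitable, Callable

logger = logging.getLogger("voice_tools")


class SessionToolGuard:
    """Counts tool calls for one call session and logs each invocation."""

    def __init__(self, max_calls: int = 20) -> None:
        self.max_calls = max_calls
        self.calls = 0

    async def run(
        self,
        name: str,
        arguments: str,
        execute: Callable[[str, str], Awaitable[str]],
    ) -> str:
        if self.calls >= self.max_calls:
            logger.warning("tool=%s blocked: session limit reached", name)
            return "I have reached the limit of what I can do on this call."
        self.calls += 1
        start = time.monotonic()
        result = await execute(name, arguments)
        duration = time.monotonic() - start
        # Arguments may contain personal data; redact as needed before logging.
        logger.info(
            "tool=%s args=%s duration=%.2fs result=%s",
            name, arguments, duration, result,
        )
        return result
```

One guard per call session keeps a looping agent from hammering the backend, and `execute` can be any dispatcher that takes a tool name and a JSON argument string.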
Voice agent tools transform passive conversations into active service delivery. The key is managing the timing and feedback so that tool execution feels seamless rather than interruptive.
Written by
CallSphere Team