By Sagar Shankaran, Founder of CallSphere
Vocode-core wires phone, browser, and Zoom voice into one Python agent class. Build a Twilio inbound bot with GPT-4o and Azure TTS — code + pitfalls.
Key takeaways
TL;DR — Vocode-core is the original (2023) open-source voice-LLM framework. It still ships clean adapters for Twilio, Vonage, Telnyx, Zoom, and browser, with pluggable Transcriber + Agent + Synthesizer classes. Best fit when you want a single Python class that runs on phone AND browser.
A FastAPI server that answers a Twilio number, transcribes with Deepgram, reasons with GPT-4o, and speaks back through Azure Neural TTS — under 200 lines of Python.
flowchart LR
PSTN[Caller PSTN] --> TW[Twilio Voice]
TW -- Media Streams WS --> VC[Vocode FastAPI]
VC --> TRX[Deepgram Transcriber]
TRX --> AG[ChatGPTAgent]
AG --> SY[AzureSynthesizer]
SY --> TW --> PSTN
```bash pip install "vocode==0.1.x" fastapi uvicorn ```
```python import os from fastapi import FastAPI from vocode.streaming.telephony.server.base import TelephonyServer from vocode.streaming.telephony.config_manager.redis_config_manager import ( RedisConfigManager, ) from vocode.streaming.models.telephony import TwilioConfig from vocode.streaming.telephony.server.inbound_call_server import InboundCallServer
app = FastAPI() ```
```python from vocode.streaming.models.agent import ChatGPTAgentConfig from vocode.streaming.models.transcriber import DeepgramTranscriberConfig from vocode.streaming.models.synthesizer import AzureSynthesizerConfig from vocode.streaming.models.message import BaseMessage
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
config_manager = RedisConfigManager()
inbound = InboundCallServer( agent_config=ChatGPTAgentConfig( model_name="gpt-4o", initial_message=BaseMessage(text="Thanks for calling — how can I help?"), prompt_preamble="You are a friendly clinic concierge.", ), transcriber_config=DeepgramTranscriberConfig.from_telephone_input_device( endpointing_config={"type": "punctuation_based"}, ), synthesizer_config=AzureSynthesizerConfig.from_telephone_output_device( voice_name="en-US-JennyNeural", ), twilio_config=TwilioConfig( account_sid=os.environ["TWILIO_ACCOUNT_SID"], auth_token=os.environ["TWILIO_AUTH_TOKEN"], ), config_manager=config_manager, ) ```
```python app.include_router(inbound.get_router())
```
```python from vocode.streaming.action.base_action import BaseAction from vocode.streaming.models.actions import ActionConfig, ActionInput, ActionOutput
class BookSlot(BaseAction[ActionConfig, dict, dict]): description = "Book an appointment for the given ISO time." parameters_type = dict response_type = dict async def run(self, action_input: ActionInput) -> ActionOutput[dict]: return ActionOutput(action_type="book_slot", response={"ok": True, "iso": action_input.params["iso"]})
agent_config = ChatGPTAgentConfig(model_name="gpt-4o", actions=[BookSlot.get_config()]) ```
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
```python from vocode.streaming.telephony.conversation.outbound_call import OutboundCall
call = OutboundCall( base_url="yourhost.com", to_phone="+15551234567", from_phone="+15557654321", config_manager=config_manager, agent_config=agent_config, twilio_config=twilio_config, ) await call.start() ```
```bash uvicorn main:app --host 0.0.0.0 --port 3000 \ --ssl-keyfile key.pem --ssl-certfile cert.pem ```
RedisConfigManager is the default; switch to InMemoryConfigManager only for dev.time_based to avoid premature cuts.CallSphere ships 37 agents across 6 verticals with 90+ tools and 115+ DB tables. Older outbound campaigns still run on Vocode for the Twilio<>OpenAI path while net-new lines sit on LiveKit/Pipecat. $149/$499/$1,499 · 14-day trial · 22% affiliate.
Vocode vs LiveKit? Vocode is simpler for a single phone-first use case; LiveKit scales better to thousands of concurrent rooms.
Zoom support? Yes — ZoomDialIn adapter joins meetings via SIP and behaves identically.
Open-source license? MIT — no royalties even at scale.
Is it still maintained? Yes, but at a lower velocity than 2023 — community forks (e.g. gitduck, niveshi) are common.
Written by
Sagar Shankaran· Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
See how AI voice agents work for your industry. Live demo available -- no signup required.
A VoIP telephone number is a phone number that routes calls over the internet instead of copper lines. Learn what a VoIP number is, how to get one, what it costs, and how to pair it with an AI voice agent in 2026.
How we built a fault-tolerant HVAC emergency triage and tech-dispatch platform on Kubernetes — three-tier CQRS, 11 micro-agents on the OpenAI Agents SDK + LangGraph, NATS JetStream, DTMF/SMS/WebSocket acceptance, circuit breakers, and an evaluation pipeline that catches regressions before they wake a tech at 3 AM.
The voice AI market hits $47.5B by 2034. For gyms and PT studios, voice agents now make economic sense for member intake, upsells, and reactivation campaigns.
With the voice AI market at $47.5B by 2034 and OpenAI's realtime release this week, every dealership and service shop should be evaluating voice agents. Here's how.
Spring 2026 AC season starts now. With the voice AI market at $47.5B by 2034, HVAC shops without after-hours voice agents will lose to those that have them.
OpenAI's GPT-Realtime-Translate handles 70 input languages live at $0.034/min. Here is what that means for multilingual restaurant takeout — and how CallSphere ships it.
© 2026 CallSphere LLC. All rights reserved.
Watch how CallSphere handles real customer calls, schedules appointments, and processes payments — live.