Skip to content
AI Voice Agents
AI Voice Agents11 min read0 views

Build a Voice Agent with Vocode Open-Source (Telephony, 2026)

Vocode-core is the modular open-source voice framework with first-class Twilio + Vonage telephony. Here's a phone-ready Vocode agent talking to Ollama with a Deepgram fallback.

TL;DR — Vocode-core is the framework you reach for when you want a phone number on day one. Twilio and Vonage are first-class transports; STT/TTS/LLM are pluggable. The OSS package has feature parity with the hosted Vocode API for self-hosters.

What you'll build

A Twilio-connected Vocode StreamingConversation that uses Deepgram STT, an Ollama-backed OpenAI shim for the LLM, and ElevenLabs (or Coqui) TTS. Inbound calls hit a FastAPI webhook; the agent answers and chats.

Prerequisites

  1. Python 3.11, pip install "vocode[all]" fastapi uvicorn.
  2. Twilio account + a phone number with a Voice webhook.
  3. Deepgram API key (free tier works).
  4. Ollama with llama3.1:8b.
  5. An ngrok tunnel (or stable HTTPS URL) for Twilio's webhook.

Architecture

flowchart LR
  PSTN[PSTN Caller] --> TW[Twilio Programmable Voice]
  TW -->|Media Streams WSS| VOC[Vocode StreamingConversation]
  VOC --> DG[Deepgram STT]
  VOC --> OLL[Ollama OpenAI shim]
  VOC --> EL[ElevenLabs TTS]
  EL --> TW

Step 1 — Vocode TelephonyServer skeleton

```python

server.py

from fastapi import FastAPI from vocode.streaming.telephony.server.base import TelephonyServer, TwilioInboundCallConfig from vocode.streaming.models.telephony import TwilioConfig from vocode.streaming.agent.openai_chat_agent_config import OpenAIChatAgentConfig from vocode.streaming.models.agent import ChatGPTAgentConfig from vocode.streaming.models.message import BaseMessage from vocode.streaming.transcriber.deepgram_transcriber import DeepgramTranscriberConfig from vocode.streaming.synthesizer.eleven_labs_synthesizer import ElevenLabsSynthesizerConfig import os

app = FastAPI()

agent_config = ChatGPTAgentConfig( initial_message=BaseMessage(text="Hi, this is your AI assistant. How can I help?"), prompt_preamble="You are a polite, concise phone assistant. Reply in 1-2 sentences.", model_name="llama3.1:8b", openai_api_base="http://127.0.0.1:11434/v1", openai_api_key="ollama", end_conversation_on_goodbye=True)

config_manager = ... # see Step 4

server = TelephonyServer( base_url=os.environ["BASE_URL"].lstrip("https://"), config_manager=config_manager, inbound_call_configs=[TwilioInboundCallConfig( url="/inbound_call", agent_config=agent_config, twilio_config=TwilioConfig( account_sid=os.environ["TWILIO_ACCOUNT_SID"], auth_token=os.environ["TWILIO_AUTH_TOKEN"]), transcriber_config=DeepgramTranscriberConfig.from_telephone_input_device( api_key=os.environ["DEEPGRAM_API_KEY"]), synthesizer_config=ElevenLabsSynthesizerConfig.from_telephone_output_device( api_key=os.environ["ELEVEN_LABS_API_KEY"], voice_id="EXAVITQu4vr4xnSDxMaL"))])

Hear it before you finish reading

Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.

Try Live Demo →

app.include_router(server.get_router()) ```

Step 2 — Plug Ollama in via the OpenAI shim

Vocode's ChatGPTAgentConfig accepts openai_api_base. Ollama exposes /v1/chat/completions, so it's plug-and-play. Add openai_api_key="ollama" (any non-empty string passes the SDK validator).

Step 3 — Use Coqui XTTS instead of ElevenLabs (optional)

```python from vocode.streaming.synthesizer.coqui_synthesizer import CoquiSynthesizerConfig synth = CoquiSynthesizerConfig.from_telephone_output_device( voice_id="...", voice_name="amy") ```

This avoids per-character TTS spend at the cost of latency and clone-licence headache.

Step 4 — In-memory config manager

```python from vocode.streaming.telephony.config_manager.in_memory_config_manager import InMemoryConfigManager config_manager = InMemoryConfigManager() ```

For production, switch to RedisConfigManager so call state survives restarts.

Step 5 — Run, tunnel, and wire Twilio

```bash uvicorn server:app --host 0.0.0.0 --port 3000 & ngrok http 3000

Set Twilio number's Voice webhook to https:///inbound_call

```

Call your Twilio number — Vocode answers and you're talking to a fully OSS pipeline.

Still reading? Stop comparing — try CallSphere live.

CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.

Step 6 — Add an action (tool call)

```python from vocode.streaming.action.base_action import BaseAction from pydantic import BaseModel

class BookSlotParams(BaseModel): iso: str

class BookSlotAction(BaseAction[BookSlotParams, dict]): description = "Book a slot at the given ISO time." parameters_type = BookSlotParams async def run(self, action_input): # write to your DB / CRM here return {"booked": True, "iso": action_input.params.iso}

agent_config.actions = [BookSlotAction()] ```

Vocode actions are how you give the agent real-world side effects.

Common pitfalls

  • base_url strips scheme. Don't include https:// — Vocode adds it.
  • Twilio media format. Vocode handles mulaw 8 kHz end-to-end; don't transcode.
  • Long greetings. Twilio will hang up a stalled call after 5s of silence; keep initial_message short.

How CallSphere does this in production

CallSphere serves 37 specialist agents across 6 verticals — Healthcare's 14 tools on FastAPI :8084 with OpenAI Realtime, OneRoof's 10 specialists on Pion WebRTC, plus Salon, Dental, F&B, Behavioral — backed by 90+ tools and 115+ Postgres tables. Pricing is flat $149 / $499 / $1499 with a 14-day trial, a 22% affiliate program and full SOC 2 controls. See /pricing and /demo.

FAQ

Vocode vs Pipecat? Vocode is more telephony-focused; Pipecat is more pipeline-flexible.

Vocode hosted API still alive? Yes — but the OSS core has parity for self-hosters.

Is Twilio cheap enough? ~$0.0085/min inbound + ~$0.013/min Media Streams in the US.

Can I use Vonage instead? Yes — vocode.streaming.telephony.server.vonage_*.

Tools / actions? First-class via BaseAction; works with both OpenAI and Ollama.

Sources

Share

Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available -- no signup required.