---
title: "Build a Voice Agent with Vocode Open-Source (Telephony, 2026)"
description: "Vocode-core is the modular open-source voice framework with first-class Twilio + Vonage telephony. Here's a phone-ready Vocode agent talking to Ollama with a Deepgram fallback."
canonical: https://callsphere.ai/blog/vw4h-build-voice-agent-vocode-open-source
category: "AI Voice Agents"
tags: ["Vocode", "Open Source", "Telephony", "Twilio", "Tutorial"]
author: "CallSphere Team"
published: 2026-04-13T00:00:00.000Z
updated: 2026-05-07T16:13:46.237Z
---

# Build a Voice Agent with Vocode Open-Source (Telephony, 2026)

> Vocode-core is the modular open-source voice framework with first-class Twilio + Vonage telephony. Here's a phone-ready Vocode agent talking to Ollama with a Deepgram fallback.

> **TL;DR** — Vocode-core is the framework you reach for when you want a phone number on day one. Twilio and Vonage are first-class transports; STT/TTS/LLM are pluggable. The OSS package has feature parity with the hosted Vocode API for self-hosters.

## What you'll build

A Twilio-connected Vocode `StreamingConversation` that uses Deepgram STT, an Ollama-backed OpenAI shim for the LLM, and ElevenLabs (or Coqui) TTS. Inbound calls hit a FastAPI webhook; the agent answers and chats.

## Prerequisites

1. Python 3.11, `pip install "vocode[all]" fastapi uvicorn`.
2. Twilio account + a phone number with a Voice webhook.
3. Deepgram API key (free tier works).
4. Ollama with `llama3.1:8b`.
5. An ngrok tunnel (or stable HTTPS URL) for Twilio's webhook.
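
The server in Step 1 reads everything from environment variables. A minimal setup sketch — every value here is a placeholder you must replace with your own credentials:

```shell
# Replace all placeholder values with your own credentials
export TWILIO_ACCOUNT_SID="ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export TWILIO_AUTH_TOKEN="your-twilio-auth-token"
export DEEPGRAM_API_KEY="your-deepgram-api-key"
export ELEVEN_LABS_API_KEY="your-elevenlabs-api-key"
export BASE_URL="https://<your-ngrok-host>"  # set this after ngrok starts
```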

## Architecture

```mermaid
flowchart LR
  PSTN[PSTN Caller] --> TW[Twilio Programmable Voice]
  TW -->|Media Streams WSS| VOC[Vocode StreamingConversation]
  VOC --> DG[Deepgram STT]
  VOC --> OLL[Ollama OpenAI shim]
  VOC --> EL[ElevenLabs TTS]
  EL --> TW
```

## Step 1 — Vocode TelephonyServer skeleton

```python
# server.py
import os

from fastapi import FastAPI
from vocode.streaming.models.agent import ChatGPTAgentConfig
from vocode.streaming.models.message import BaseMessage
from vocode.streaming.models.synthesizer import ElevenLabsSynthesizerConfig
from vocode.streaming.models.telephony import TwilioConfig
from vocode.streaming.models.transcriber import DeepgramTranscriberConfig
from vocode.streaming.telephony.server.base import TelephonyServer, TwilioInboundCallConfig

app = FastAPI()

agent_config = ChatGPTAgentConfig(
    initial_message=BaseMessage(text="Hi, this is your AI assistant. How can I help?"),
    prompt_preamble="You are a polite, concise phone assistant. Reply in 1-2 sentences.",
    model_name="llama3.1:8b",
    openai_api_base="http://127.0.0.1:11434/v1",
    openai_api_key="ollama",
    end_conversation_on_goodbye=True)

config_manager = ...  # see Step 4

server = TelephonyServer(
    base_url=os.environ["BASE_URL"].removeprefix("https://"),
    config_manager=config_manager,
    inbound_call_configs=[TwilioInboundCallConfig(
        url="/inbound_call",
        agent_config=agent_config,
        twilio_config=TwilioConfig(
            account_sid=os.environ["TWILIO_ACCOUNT_SID"],
            auth_token=os.environ["TWILIO_AUTH_TOKEN"]),
        transcriber_config=DeepgramTranscriberConfig.from_telephone_input_device(
            api_key=os.environ["DEEPGRAM_API_KEY"]),
        synthesizer_config=ElevenLabsSynthesizerConfig.from_telephone_output_device(
            api_key=os.environ["ELEVEN_LABS_API_KEY"],
            voice_id="EXAVITQu4vr4xnSDxMaL"))])

app.include_router(server.get_router())
```

## Step 2 — Plug Ollama in via the OpenAI shim

Vocode's `ChatGPTAgentConfig` accepts `openai_api_base`. Ollama exposes `/v1/chat/completions`, so it's plug-and-play. Add `openai_api_key="ollama"` (any non-empty string passes the SDK validator).
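
Before putting Vocode in front of it, you can sanity-check the shim by hitting `/v1/chat/completions` directly. A minimal probe using only the standard library — the model name and port match the setup above, and `probe_ollama` assumes Ollama is already running locally:

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "llama3.1:8b") -> dict:
    """OpenAI-compatible chat payload, as served by Ollama's /v1 shim."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def probe_ollama(prompt: str, base: str = "http://127.0.0.1:11434/v1") -> str:
    """POST one chat turn to the shim and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{base}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer ollama",  # any non-empty key passes
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

If `probe_ollama("Say hi in one word.")` returns text, the same base URL and key will work inside `ChatGPTAgentConfig`.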

## Step 3 — Use Coqui XTTS instead of ElevenLabs (optional)

```python
from vocode.streaming.synthesizer.coqui_synthesizer import CoquiSynthesizerConfig
synth = CoquiSynthesizerConfig.from_telephone_output_device(
    voice_id="...", voice_name="amy")
```

This avoids per-character TTS spend, at the cost of higher latency and the licensing headaches that come with voice cloning.

## Step 4 — In-memory config manager

```python
from vocode.streaming.telephony.config_manager.in_memory_config_manager import InMemoryConfigManager
config_manager = InMemoryConfigManager()
```

For production, switch to `RedisConfigManager` so call state survives restarts.

## Step 5 — Run, tunnel, and wire Twilio

```bash
uvicorn server:app --host 0.0.0.0 --port 3000 &
ngrok http 3000
# Point the Twilio number's Voice webhook at https://<your-ngrok-host>/inbound_call
```

Call your Twilio number — Vocode answers and you're talking to a fully OSS pipeline.

## Step 6 — Add an action (tool call)

```python
from vocode.streaming.action.base_action import BaseAction
from pydantic import BaseModel

class BookSlotParams(BaseModel):
    iso: str

class BookSlotAction(BaseAction[BookSlotParams, dict]):
    description = "Book a slot at the given ISO time."
    parameters_type = BookSlotParams

    async def run(self, action_input):
        # write to your DB / CRM here
        return {"booked": True, "iso": action_input.params.iso}

agent_config.actions = [BookSlotAction()]
```

Vocode actions are how you give the agent real-world side effects.
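
Because the action body is plain async Python, you can unit-test the side effect without placing a call. A sketch with a stdlib dataclass standing in for the pydantic params (illustrative — the exact `BaseAction` generics and `run` signature vary across vocode-core releases):

```python
import asyncio
from dataclasses import dataclass

@dataclass
class BookSlotParams:
    iso: str

async def book_slot(params: BookSlotParams) -> dict:
    # In BookSlotAction.run, this is where the DB/CRM write would go.
    return {"booked": True, "iso": params.iso}

result = asyncio.run(book_slot(BookSlotParams(iso="2026-04-13T10:00:00Z")))
print(result)  # → {'booked': True, 'iso': '2026-04-13T10:00:00Z'}
```

Keeping the side effect in a standalone coroutine like this means the vocode wrapper stays a thin adapter you swap as the API evolves.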

## Common pitfalls

- **`base_url` strips scheme.** Don't include `https://` — Vocode adds it.
- **Twilio media format.** Vocode handles `mulaw 8 kHz` end-to-end; don't transcode.
- **Long greetings.** Twilio can drop a call that stays silent too long before audio starts streaming; keep `initial_message` short so the agent speaks within the first few seconds.
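
On the first pitfall: when you strip the scheme yourself, avoid `str.lstrip("https://")` — `lstrip` treats its argument as a character *set* ({h, t, p, s, :, /}), not a literal prefix, and silently mangles hostnames that start with those characters. A quick demonstration:

```python
url = "https://stash.example.com"

# lstrip strips any leading character in the set, so the 'st' of 'stash' goes too:
stripped = url.lstrip("https://")
print(stripped)  # → ash.example.com

# removeprefix (Python 3.9+) removes the literal prefix only:
clean = url.removeprefix("https://")
print(clean)  # → stash.example.com
```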

## How CallSphere does this in production

CallSphere serves 37 specialist agents across 6 verticals — Healthcare's 14 tools on FastAPI :8084 with OpenAI Realtime, OneRoof's 10 specialists on Pion WebRTC, plus Salon, Dental, F&B, Behavioral — backed by 90+ tools and 115+ Postgres tables. Pricing is flat $149 / $499 / $1499 with a [14-day trial](/trial), a [22% affiliate program](/affiliate) and full SOC 2 controls. See [/pricing](/pricing) and [/demo](/demo).

## FAQ

**Vocode vs Pipecat?** Vocode is more telephony-focused; Pipecat is more pipeline-flexible.

**Vocode hosted API still alive?** Yes — but the OSS core has parity for self-hosters.

**Is Twilio cheap enough?** ~$0.0085/min inbound + ~$0.013/min Media Streams in the US.

**Can I use Vonage instead?** Yes — `vocode.streaming.telephony.server.vonage_*`.

**Tools / actions?** First-class via `BaseAction`; works with both OpenAI and Ollama.

## Sources

- [vocode-core on GitHub](https://github.com/vocodedev/vocode-core)
- [Vocode docs](https://docs.vocode.dev/welcome)
- [Vocode local conversation guide](https://docs.vocode.dev/open-source/local-conversation)
- [vocode on PyPI](https://pypi.org/project/vocode/)

---

Source: https://callsphere.ai/blog/vw4h-build-voice-agent-vocode-open-source
