---
title: "Build a Voice Agent with Bolna (Open-Source Production Stack)"
description: "Bolna 0.10 wires LiteLLM, Deepgram, ElevenLabs, Twilio and Plivo into one OSS orchestrator. Deploy a full conversational voice agent in under 200 lines of YAML + Python."
canonical: https://callsphere.ai/blog/vw4h-build-voice-agent-bolna-open-source
category: "AI Voice Agents"
tags: ["Bolna", "Open Source", "LiteLLM", "Voice Agent", "Tutorial"]
author: "CallSphere Team"
published: 2026-04-16T00:00:00.000Z
updated: 2026-05-07T16:13:46.492Z
---

# Build a Voice Agent with Bolna (Open-Source Production Stack)

> Bolna 0.10 wires LiteLLM, Deepgram, ElevenLabs, Twilio and Plivo into one OSS orchestrator. Deploy a full conversational voice agent in under 200 lines of YAML + Python.

> **TL;DR** — Bolna is an end-to-end OSS framework specifically for voice-driven LLM agents. Where Vocode and Pipecat give you primitives, Bolna gives you a YAML-driven assistant that wires STT, LLM (via LiteLLM — OpenAI/DeepSeek/Llama/Cohere/Mistral), TTS and telephony in one config.

## What you'll build

A Bolna assistant that answers an inbound Twilio call, qualifies a real-estate lead via a structured prompt, and writes the result to Postgres via a webhook tool.

## Prerequisites

1. Python 3.11, `pip install bolna fastapi uvicorn psycopg2-binary`.
2. Redis running (Bolna uses it for state).
3. Twilio number with Voice + Media Streams.
4. API keys for Deepgram and ElevenLabs (or LiteLLM-compatible alternatives).
5. Ollama running with `llama3.1:8b` (we'll point LiteLLM at it).
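Before booting anything, it helps to confirm Redis and Ollama are actually listening. This preflight script is our own convenience, not part of Bolna; the host/port pairs match the defaults used in this tutorial:

```python
# preflight.py — check that the local services from the prerequisites
# list are reachable before starting the Bolna server.
import socket

def check_tcp(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

SERVICES = {
    "redis": ("127.0.0.1", 6379),
    "ollama": ("127.0.0.1", 11434),
}

if __name__ == "__main__":
    for name, (host, port) in SERVICES.items():
        state = "up" if check_tcp(host, port) else "DOWN"
        print(f"{name:8s} {host}:{port}  {state}")
```

If either service reports `DOWN`, fix that first — Bolna's failure modes without Redis are confusing (see the pitfalls section).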

## Architecture

```mermaid
flowchart LR
  PSTN[Caller] --> TW[Twilio]
  TW -->|WSS| BOL[Bolna Orchestrator]
  BOL --> DG[Deepgram STT]
  BOL --> LL[LiteLLM -> Ollama]
  BOL --> EL[ElevenLabs TTS]
  BOL --> RD[(Redis state)]
  BOL -->|webhook| API[Your API]
```

## Step 1 — `.env` configuration

```bash
# .env
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...
DEEPGRAM_AUTH_TOKEN=...
ELEVENLABS_API_KEY=...
REDIS_URL=redis://localhost:6379/0

# LiteLLM points at Ollama
OPENAI_API_BASE=http://127.0.0.1:11434/v1
OPENAI_API_KEY=ollama
```
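A half-filled `.env` is the most common cause of silent startup failures. This checker is our own sketch, not a Bolna utility; the required-key list simply mirrors the file above:

```python
# check_env.py — parse the .env file and flag any missing keys
# before booting the server.
REQUIRED_KEYS = [
    "TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "DEEPGRAM_AUTH_TOKEN",
    "ELEVENLABS_API_KEY", "REDIS_URL", "OPENAI_API_BASE", "OPENAI_API_KEY",
]

def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def missing_keys(env: dict) -> list:
    """Return required keys that are absent or empty."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

if __name__ == "__main__":
    import os
    if os.path.exists(".env"):
        with open(".env") as f:
            print("missing:", missing_keys(parse_env(f.read())) or "none")
```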

## Step 2 — Define the assistant

```python
# create_agent.py
import requests
agent = {
  "agent_config": {
    "agent_name": "RealEstate Qualifier",
    "agent_type": "other",
    "agent_welcome_message": "Hi, this is the property concierge. Are you looking to buy, sell, or rent today?",
    "tasks": [{
      "task_type": "conversation",
      "tools_config": {
        "input": {"format": "wav", "provider": "twilio"},
        "output": {"format": "wav", "provider": "twilio"},
        "transcriber": {"provider": "deepgram", "model": "nova-2", "language": "en", "stream": True, "endpointing": 500},
        "synthesizer": {"provider": "elevenlabs", "model": "eleven_turbo_v2", "stream": True,
                         "voice_id": "EXAVITQu4vr4xnSDxMaL"},
        "llm_agent": {"provider": "openai", "model": "llama3.1:8b",
                       "max_tokens": 200, "temperature": 0.4,
                       "extra_config": {"base_url": "http://127.0.0.1:11434/v1"}}
      },
      "task_config": {"hangup_after_silence": 12, "ambient_noise": "office"}
    }],
    "agent_prompts": {"system_prompt":
      "Qualify the caller in 4 questions: intent, budget, timeline, contact. "
      "When done, call the webhook tool 'save_lead' with the JSON payload, then politely end the call."}
  }
}
r = requests.post("http://127.0.0.1:5001/agent", json=agent)
print(r.json())  # note the returned agent_id; you'll need it in Step 5
```

## Step 3 — Add a webhook tool

```python
agent["agent_config"]["tasks"][0]["tools_config"]["api_tools"] = [{
  "name": "save_lead",
  "description": "Save the qualified lead to CRM.",
  "url": "https://your.api/leads",
  "method": "POST",
  "param_schema": {"type":"object","required":["intent","budget","timeline","contact"],
    "properties":{"intent":{"type":"string"},"budget":{"type":"string"},
                   "timeline":{"type":"string"},"contact":{"type":"string"}}}}]
```

Bolna will call this URL with the agent's structured output as the JSON body whenever the LLM invokes the tool.
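To see the receiving side, here is a framework-free sketch of the endpoint the `save_lead` tool posts to. In the stack above you'd use FastAPI and a real psycopg2 `INSERT` (both are in the prerequisites); the `/leads` path, port, and stubbed storage step here are assumptions for illustration:

```python
# lead_server.py — a minimal stdlib receiver for Bolna's save_lead webhook.
import json
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUIRED = ("intent", "budget", "timeline", "contact")

def handle_lead(payload: dict) -> dict:
    """Validate the tool payload; swap the print for a psycopg2 INSERT."""
    missing = [k for k in REQUIRED if not payload.get(k)]
    if missing:
        return {"status": "error", "missing": missing}
    lead = {k: str(payload[k]).strip() for k in REQUIRED}
    print("saving lead:", lead)  # replace with a Postgres write
    return {"status": "ok", "lead": lead}

class LeadHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/leads":
            self.send_error(404)
            return
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        result = handle_lead(json.loads(body or b"{}"))
        data = json.dumps(result).encode()
        self.send_response(200 if result["status"] == "ok" else 422)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__" and "--serve" in sys.argv:
    HTTPServer(("0.0.0.0", 8000), LeadHandler).serve_forever()
```

Run it with `python lead_server.py --serve` and point the tool's `url` at the public address of this service.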

## Step 4 — Run the orchestrator

```bash
docker compose up -d  # bolna server, redis
```

`docker-compose.yml` from the repo wires the Python server, Twilio bridge, and Redis. Hit `POST /agent` to register your config from Step 2.
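If you prefer to see the moving parts, a minimal compose file looks roughly like this. Service names, image tags, and the build context are illustrative, not the repo's exact file; only the ports match what this tutorial uses:

```yaml
services:
  bolna-server:
    build: .
    env_file: .env
    ports:
      - "5001:5001"
    depends_on:
      - redis
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
```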

## Step 5 — Trigger a call

```python
import requests
r = requests.post("http://127.0.0.1:5001/call", json={
  "agent_id": "",  # paste the agent_id returned in Step 2
  "recipient_phone_number": "+15551234567",
  "from_number": "+18885550000"  # Your Twilio DID
})
```

The recipient phone rings; Bolna handles the rest.

## Step 6 — Inspect transcripts

```python
import requests

execution_id = "..."  # the id returned when the call was created
r = requests.get(f"http://127.0.0.1:5001/executions/{execution_id}").json()
for turn in r["transcript"]:
    print(turn["role"], "→", turn["content"])
```
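For more than a raw dump, a tiny helper (ours, not part of Bolna's API) can summarize turns per role, assuming the `{"role": ..., "content": ...}` transcript shape shown above:

```python
# transcript_stats.py — summarize a Bolna execution transcript.
def transcript_stats(transcript: list) -> dict:
    """Count turns and words per role from a list of {'role', 'content'} dicts."""
    stats = {}
    for turn in transcript:
        role = stats.setdefault(turn["role"], {"turns": 0, "words": 0})
        role["turns"] += 1
        role["words"] += len(turn["content"].split())
    return stats
```

Useful as a quick sanity check that the agent isn't monologuing: if the assistant's word count dwarfs the caller's, tighten `max_tokens` or the system prompt.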

## Common pitfalls

- **Redis is required.** Without Redis, Bolna can't track multi-turn state; calls reset on each utterance.
- **LiteLLM model naming.** `llama3.1:8b` works only if you've set `OPENAI_API_BASE` to Ollama; otherwise LiteLLM tries OpenAI's catalog.
- **Twilio Media Streams ingress.** Twilio can't reach localhost, so make sure your Bolna server is exposed on a public WSS URL (e.g. via a tunnel like ngrok) before wiring up the number.

## How CallSphere does this in production

CallSphere runs 37 specialist agents in 6 verticals on a tighter-coupled stack (OpenAI Realtime + ElevenLabs + Pion WebRTC + Postgres). Bolna is a great open alternative for teams that want the YAML-config experience and a self-hostable LiteLLM gateway. Healthcare uses 14 HIPAA tools on FastAPI :8084; OneRoof's 10 property specialists are a perfect parallel to the qualifier agent above. Flat $149/$499/$1499 · [14-day trial](/trial) · [22% affiliate](/affiliate) · [/industries/real-estate](/industries/real-estate).

## FAQ

**Bolna vs Vocode?** Bolna is config-driven; Vocode is code-driven.

**Plivo support?** Yes — swap `twilio` for `plivo` under `tools_config.input.provider`.

**Local TTS?** Set `synthesizer.provider` to `coqui` or `piper` (community plugins).

**Multi-language?** Deepgram nova-2-multi + ElevenLabs multilingual.

**Latency?** ~700–900 ms in our tests with Ollama on the same box.

## Sources

- [bolna on PyPI](https://pypi.org/project/bolna/)
- [bolna-ai/bolna on GitHub](https://github.com/bolna-ai/bolna)
- [voxos-ai/bolna fork](https://github.com/voxos-ai/bolna)
- [Bolna OpenAI integration docs](https://www.bolna.ai/docs/providers/llm-model/openai)

---

Source: https://callsphere.ai/blog/vw4h-build-voice-agent-bolna-open-source
