Build a Voice Agent with Bolna (Open-Source Production Stack)
Bolna 0.10 wires LiteLLM, Deepgram, ElevenLabs, Twilio and Plivo into one OSS orchestrator. Deploy a full conversational voice agent in under 200 lines of YAML + Python.
TL;DR — Bolna is an end-to-end OSS framework specifically for voice-driven LLM agents. Where Vocode and Pipecat give you primitives, Bolna gives you a YAML-driven assistant that wires STT, LLM (via LiteLLM — OpenAI/DeepSeek/Llama/Cohere/Mistral), TTS and telephony in one config.
What you'll build
A Bolna assistant that answers an inbound Twilio call, qualifies a real-estate lead via a structured prompt, and writes the result to Postgres via a webhook tool.
Prerequisites
- Python 3.11, then `pip install bolna fastapi uvicorn psycopg2-binary`.
- Redis running (Bolna uses it for state).
- Twilio number with Voice + Media Streams.
- API keys for Deepgram and ElevenLabs (or LiteLLM-compatible alternatives).
- Ollama running with `llama3.1:8b` (we'll point LiteLLM at it).
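Before wiring anything else up, it saves a debugging round-trip to confirm Ollama's OpenAI-compatible endpoint is actually answering. A minimal stdlib sketch (the `/v1/models` path is Ollama's OpenAI-compat model listing; adjust the base URL if you proxy through LiteLLM):

```python
import json
import urllib.request
import urllib.error

def ollama_ready(base_url: str = "http://127.0.0.1:11434/v1", timeout: float = 2.0) -> bool:
    """Return True if the OpenAI-compatible /models endpoint answers with a model list."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            data = json.load(resp)
            # Ollama mirrors OpenAI's shape: {"object": "list", "data": [...]}
            return isinstance(data.get("data"), list)
    except (urllib.error.URLError, OSError, ValueError):
        return False

if __name__ == "__main__":
    print("Ollama reachable:", ollama_ready())
```

If this prints `False`, fix Ollama before blaming Bolna: a dead upstream shows up later as opaque LLM timeouts mid-call.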
Architecture
```mermaid
flowchart LR
    PSTN[Caller] --> TW[Twilio]
    TW -->|WSS| BOL[Bolna Orchestrator]
    BOL --> DG[Deepgram STT]
    BOL --> LL[LiteLLM -> Ollama]
    BOL --> EL[ElevenLabs TTS]
    BOL --> RD[(Redis state)]
    BOL -->|webhook| API[Your API]
```
Step 1 — .env configuration
```bash
# .env
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...
DEEPGRAM_AUTH_TOKEN=...
ELEVENLABS_API_KEY=...
REDIS_URL=redis://localhost:6379/0

# LiteLLM points at Ollama
OPENAI_API_BASE=http://127.0.0.1:11434/v1
OPENAI_API_KEY=ollama
```
Step 2 — Define the assistant
```python
# create_agent.py
import requests

agent = {
    "agent_config": {
        "agent_name": "RealEstate Qualifier",
        "agent_type": "other",
        "agent_welcome_message": "Hi, this is the property concierge. Are you looking to buy, sell, or rent today?",
        "tasks": [{
            "task_type": "conversation",
            "tools_config": {
                "input": {"format": "wav", "provider": "twilio"},
                "output": {"format": "wav", "provider": "twilio"},
                "transcriber": {"provider": "deepgram", "model": "nova-2", "language": "en",
                                "stream": True, "endpointing": 500},
                "synthesizer": {"provider": "elevenlabs", "model": "eleven_turbo_v2",
                                "stream": True, "voice_id": "EXAVITQu4vr4xnSDxMaL"},
                "llm_agent": {"provider": "openai", "model": "llama3.1:8b",
                              "max_tokens": 200, "temperature": 0.4,
                              "extra_config": {"base_url": "http://127.0.0.1:11434/v1"}}
            },
            "task_config": {"hangup_after_silence": 12, "ambient_noise": "office"}
        }],
        "agent_prompts": {
            "system_prompt": (
                "Qualify the caller in 4 questions: intent, budget, timeline, contact. "
                "When done, call the webhook tool 'save_lead' with the JSON payload, then politely end the call."
            )
        }
    }
}

r = requests.post("http://127.0.0.1:5001/agent", json=agent)
print(r.json())
```
Step 3 — Add a webhook tool
```python
agent["agent_config"]["tasks"][0]["tools_config"]["api_tools"] = [{
    "name": "save_lead",
    "description": "Save the qualified lead to CRM.",
    "url": "https://your.api/leads",
    "method": "POST",
    "param_schema": {
        "type": "object",
        "required": ["intent", "budget", "timeline", "contact"],
        "properties": {
            "intent": {"type": "string"},
            "budget": {"type": "string"},
            "timeline": {"type": "string"},
            "contact": {"type": "string"}
        }
    }
}]
```
Bolna will call this URL with the agent's structured output as the JSON body when the LLM emits the tool.
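On the receiving side you need an endpoint that accepts that JSON body. Below is a minimal stdlib sketch of such a receiver; in the FastAPI + psycopg2 setup from the prerequisites you would replace the handler body with an actual `INSERT`, and the `leads` table and route are illustrative, not prescribed by Bolna:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUIRED = ("intent", "budget", "timeline", "contact")

def validate_lead(payload: dict) -> list[str]:
    """Return the list of required fields that are missing or empty."""
    return [f for f in REQUIRED if not payload.get(f)]

class LeadHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        try:
            payload = json.loads(body)
        except ValueError:
            self.send_response(400); self.end_headers(); return
        missing = validate_lead(payload)
        if missing:
            self.send_response(422); self.end_headers()
            self.wfile.write(json.dumps({"missing": missing}).encode())
            return
        # Production: INSERT INTO leads (intent, budget, timeline, contact) VALUES (...)
        self.send_response(200); self.end_headers()
        self.wfile.write(b'{"status": "saved"}')

def serve(port: int = 8000) -> None:
    """Blocking: run the webhook receiver."""
    HTTPServer(("0.0.0.0", port), LeadHandler).serve_forever()
```

Call `serve()` to run it; returning a non-2xx status on a malformed payload makes tool-call failures visible in Bolna's logs instead of silently dropping leads.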
Step 4 — Run the orchestrator
```bash
docker compose up -d   # bolna server, redis
```
docker-compose.yml from the repo wires the Python server, Twilio bridge, and Redis. Hit POST /agent to register your config from Step 2.
Step 5 — Trigger a call
```python
import requests

r = requests.post("http://127.0.0.1:5001/call", json={
    "agent_id": "<id from step 2>",
    "recipient_phone_number": "+15551234567",
    "from_number": "+18885550000"  # your Twilio DID
})
```
The recipient phone rings; Bolna handles the rest.
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
Step 6 — Inspect transcripts
```python
import requests

execution_id = "<execution id from the /call response>"
r = requests.get(f"http://127.0.0.1:5001/executions/{execution_id}").json()
for turn in r["transcript"]:
    print(turn["role"], "→", turn["content"])
```
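If you log or diff many calls, it helps to flatten a transcript into plain text first. A small helper, assuming the `{role, content}` turn shape shown above:

```python
def render_transcript(turns: list[dict]) -> str:
    """Join {role, content} transcript turns into one readable text block."""
    return "\n".join(f"[{t['role']}] {t['content']}" for t in turns)

sample = [
    {"role": "assistant", "content": "Are you looking to buy, sell, or rent today?"},
    {"role": "user", "content": "Buy, under 500k."},
]
print(render_transcript(sample))
```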
Common pitfalls
- Redis is required. Without Redis, Bolna can't track multi-turn state; calls reset on each utterance.
- LiteLLM model naming. `llama3.1:8b` works only if you've set `OPENAI_API_BASE` to Ollama; otherwise LiteLLM tries OpenAI's catalog.
- Twilio Media Streams ingress. Make sure your Bolna server is reachable on a public WSS URL.
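Most of these pitfalls trace back to a missing environment variable, so a pre-flight check catches them before a call fails mid-stream. A sketch whose variable list mirrors the `.env` from Step 1:

```python
import os

REQUIRED_ENV = [
    "TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN",
    "DEEPGRAM_AUTH_TOKEN", "ELEVENLABS_API_KEY",
    "REDIS_URL", "OPENAI_API_BASE", "OPENAI_API_KEY",
]

def missing_env(env=os.environ, required=REQUIRED_ENV) -> list[str]:
    """Return the required variables that are unset or empty."""
    return [k for k in required if not env.get(k)]

if missing := missing_env():
    print("Set these before starting Bolna:", ", ".join(missing))
```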
How CallSphere does this in production
CallSphere runs 37 specialist agents in 6 verticals on a tighter-coupled stack (OpenAI Realtime + ElevenLabs + Pion WebRTC + Postgres). Bolna is a great open alternative for teams that want the YAML-config experience and a self-hostable LiteLLM gateway. Healthcare uses 14 HIPAA tools on FastAPI :8084; OneRoof's 10 property specialists are a perfect parallel to the qualifier agent above. Flat $149/$499/$1499 · 14-day trial · 22% affiliate · /industries/real-estate.
FAQ
Bolna vs Vocode? Bolna is config-driven; Vocode is code-driven.
Plivo support? Yes — swap twilio for plivo under tools_config.input.provider.
Local TTS? Set synthesizer.provider to coqui or piper (community plugins).
Multi-language? Deepgram nova-2-multi + ElevenLabs multilingual.
Latency? ~700–900 ms in our tests with Ollama on the same box.
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.