---
title: "Build a Voice Agent with LiveKit Agents in Python (2026 Tutorial)"
description: "Wire LiveKit Agents 1.x, Deepgram STT, GPT-4o, and ElevenLabs TTS into a sub-700ms WebRTC voice agent. Real Python code, room dispatch, and prod pitfalls."
canonical: https://callsphere.ai/blog/vw9h-build-voice-agent-livekit-agents-python-2026
category: "AI Voice Agents"
tags: ["LiveKit", "Voice Agent", "Python", "WebRTC", "Deepgram"]
author: "CallSphere Team"
published: 2026-03-15T00:00:00.000Z
updated: 2026-05-08T03:13:52.230Z
---

# Build a Voice Agent with LiveKit Agents in Python (2026 Tutorial)

> Wire LiveKit Agents 1.x, Deepgram STT, GPT-4o, and ElevenLabs TTS into a sub-700ms WebRTC voice agent. Real Python code, room dispatch, and prod pitfalls.

> **TL;DR:** LiveKit Agents 1.x is one of the most widely adopted open-source voice runtimes of 2026. Two files (a Python entrypoint and a Dockerfile) give you a WebRTC voice agent with STT-LLM-TTS pipelining, server-side VAD, interruption handling, and one-command deploy to LiveKit Cloud or a self-hosted SFU.

## What you'll build

A LiveKit room participant that auto-joins any room named `support-*`, transcribes the caller with Deepgram Nova-3, reasons with GPT-4o, and speaks back through ElevenLabs Turbo v2.5, with a target of under 700 ms voice-to-voice on the default plan.

## Architecture

```mermaid
flowchart LR
  CL[Caller browser/SIP] -- WebRTC --> SFU[LiveKit SFU]
  SFU -- audio track --> AG[Python agents worker]
  AG -- STT --> DG[Deepgram Nova-3]
  AG -- LLM --> OA[OpenAI GPT-4o]
  AG -- TTS --> EL[ElevenLabs Turbo 2.5]
  AG -- audio track --> SFU --> CL
```

## Step 1 — Install + bootstrap

```bash
pip install "livekit-agents[deepgram,openai,elevenlabs,silero]~=1.0"
lk app create --template agent-starter-python my-agent
cd my-agent && cp .env.example .env  # add LIVEKIT_URL, OPENAI_API_KEY, etc.
```
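
Before starting the worker, it can help to confirm that the credentials are actually present in the process environment. The sketch below is one way to do that with the standard library. `LIVEKIT_URL` and `OPENAI_API_KEY` come from the step above; the other variable names are assumptions based on what each plugin conventionally reads, so adjust the list to match your `.env.example`.

```python
import os

# LIVEKIT_URL and OPENAI_API_KEY are named in Step 1. The remaining names are
# assumptions about what the LiveKit, Deepgram, and ElevenLabs plugins read.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "ELEVEN_API_KEY",
)


def missing_env(environ=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_env()` at the top of `agent.py` and exiting when the list is non-empty fails fast instead of surfacing an authentication error mid-call. The check only sees variables already loaded into the process, so load the `.env` file first.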

## Step 2 — Define the agent class

```python
from livekit import agents
from livekit.agents import Agent, AgentSession, RoomInputOptions
from livekit.plugins import openai, deepgram, elevenlabs, silero

class Concierge(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You are a friendly clinic concierge. "
                         "Confirm the appointment, then ask if anything else.",
        )
```

## Step 3 — Wire the session

```python
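# Separate package: pip install livekit-plugins-noise-cancellation
from livekit.plugins import noise_cancellation

async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()
    session = AgentSession(
        stt=deepgram.STT(model="nova-3", language="en-US"),
        llm=openai.LLM(model="gpt-4o"),
        # The 1.x plugin takes an ElevenLabs voice ID, not a display name.
        # This ID is assumed to be the stock "Rachel" voice; substitute your own.
        tts=elevenlabs.TTS(voice_id="21m00Tcm4TlvDq8ikWAM", model="eleven_turbo_v2_5"),
        vad=silero.VAD.load(),
    )
    await session.start(
        room=ctx.room,
        agent=Concierge(),
        # BVC is LiveKit Cloud's background voice cancellation filter.
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )
    await session.generate_reply(instructions="Greet the caller warmly.")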
```

## Step 4 — Add a tool with the function decorator

```python
from livekit.agents.llm import function_tool

class Concierge(Agent):
    # Same class as Step 2: keep its __init__ and add this method below it.
    @function_tool
    async def book_slot(self, iso_time: str) -> str:
        """Book the requested ISO-8601 slot in the clinic calendar."""
        # call your real backend here
        return f"Booked {iso_time}"
```
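
The LLM fills in `iso_time` from the conversation, so the string may be malformed or point at a slot that has already passed. A minimal validation helper, sketched below under stated assumptions, could run before the backend call. `parse_slot` is a hypothetical name, and treating naive timestamps as UTC is an assumption you would replace with the clinic's real time zone.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_slot(iso_time: str, now: Optional[datetime] = None) -> datetime:
    """Parse an ISO-8601 string and reject malformed or past slots."""
    try:
        slot = datetime.fromisoformat(iso_time)
    except ValueError as exc:
        raise ValueError(f"Not a valid ISO-8601 time: {iso_time!r}") from exc
    if slot.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        slot = slot.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    if slot <= now:
        raise ValueError(f"Slot {iso_time!r} is not in the future")
    return slot
```

Inside `book_slot`, catching the `ValueError` and returning its message as the tool result gives the model a chance to ask the caller for a corrected time.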

## Step 5 — Run + auto-dispatch

```python
if __name__ == "__main__":
    agents.cli.run_app(
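        # Setting agent_name opts the worker into explicit dispatch;
        # omit it to have LiveKit dispatch the agent to every new room.
        agents.WorkerOptions(entrypoint_fnc=entrypoint,
                             agent_name="concierge"),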
    )
```

```bash
python agent.py dev    # local hot-reload
python agent.py start  # production worker
```

## Step 6 — Deploy

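LiveKit Cloud: run `lk agent create` from the project directory for the first deployment, then `lk agent deploy` to roll out new versions. Self-hosted: run the worker on any container platform. Workers open an outbound WebSocket to your LiveKit server to receive jobs, so no inbound port is needed.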

## Step 7 — Wire SIP/PSTN

Create an inbound trunk with `lk sip inbound create`, then add a `SIPDispatchRule` (`lk sip dispatch create`) that routes inbound DIDs to the same room pattern; the worker auto-attaches.

## Pitfalls

- **VAD model size**: Silero VAD is fine, but on >40 concurrent rooms per worker switch to LiveKit's hosted turn-detector for lower CPU.
- **Plugin version drift**: Pin `livekit-agents~=1.0` and refresh plugins together — STT and TTS plugins share the worker contract.
- **Egress to PSTN**: SIP works, but Twilio Elastic SIP requires `secure: false` or a TLS cert chain on your trunk.
- **Cold start**: Each Python worker takes ~3s to load Silero — pre-warm with `num_idle_processes=2`.

## How CallSphere does this

CallSphere ships **37 production agents** across **6 verticals** with **90+ tools** and **115+ Postgres tables**. Healthcare, OneRoof real-estate, Salon, Sales, Behavioral Health, and Trades stacks all run a LiveKit Agents fleet handling 1.2M voice minutes/month at ~720ms p95 voice-to-voice. Pricing is **$149/$499/$1,499** with a **14-day no-card trial** and a **22% recurring affiliate**.

## FAQ

**Can I bring my own LLM?** Yes — `openai.LLM(base_url=...)` works with Groq, Together, vLLM, Ollama, and Anthropic via gateways.
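
A small lookup table keeps the provider switch in one place. The sketch below builds the keyword arguments for `openai.LLM(...)`; the base URLs are each provider's OpenAI-compatible endpoint, while the environment variable names, the default localhost ports, and the `llm_kwargs` helper itself are assumptions for illustration.

```python
import os

# (base_url, api-key environment variable). The localhost entries assume a
# server running on its default port; the key variable names are assumptions.
OPENAI_COMPATIBLE = {
    "groq": ("https://api.groq.com/openai/v1", "GROQ_API_KEY"),
    "together": ("https://api.together.xyz/v1", "TOGETHER_API_KEY"),
    "vllm": ("http://localhost:8000/v1", None),
    "ollama": ("http://localhost:11434/v1", None),
}


def llm_kwargs(provider, model, environ=None):
    """Return model, base_url, and api_key for an OpenAI-compatible provider."""
    env = os.environ if environ is None else environ
    base_url, key_var = OPENAI_COMPATIBLE[provider]
    # Local servers typically ignore the key, but clients want a non-empty one.
    api_key = env.get(key_var, "") if key_var else "not-needed"
    if not api_key:
        raise KeyError(f"{key_var} is not set")
    return {"model": model, "base_url": base_url, "api_key": api_key}

# Usage sketch: llm=openai.LLM(**llm_kwargs("groq", "<model-name>"))
```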

**Does it support Realtime API?** `openai.realtime.RealtimeModel()` replaces STT+LLM+TTS with one model — drop the three plugins and pass `llm=...`.

**Latency tuning?** Use turn-detector + VAD pre-emption + ElevenLabs streaming WebSocket; expect 600-750ms p50.
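
To check your own deployment against that range, compute percentiles from measured samples rather than relying on averages. The helper below is a minimal sketch; it assumes you already collect voice-to-voice latencies in milliseconds (for example, from end of caller speech to first agent audio), and how you capture them is up to your instrumentation.

```python
from statistics import quantiles


def latency_percentiles(samples_ms):
    """Return (p50, p95) for a list of latencies in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # n=100 yields 99 cut points: index 49 is p50 and index 94 is p95.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return cuts[49], cuts[94]
```

Tracking p95 alongside p50 matters here because a handful of slow turns is what callers tend to notice.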

**Multi-tenant isolation?** The worker runs each job (one per room) in its own process, so sessions do not share memory; use job metadata to route by tenant.
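
Routing by tenant could look like the sketch below, which reads a tenant ID out of the job's metadata string. It assumes the dispatcher stored JSON such as `{"tenant": "clinic-a"}`; the `tenant` key, the fallback name, and the `tenant_from_metadata` helper are illustrations, not LiveKit conventions.

```python
import json

DEFAULT_TENANT = "default"  # hypothetical fallback name


def tenant_from_metadata(raw):
    """Extract a tenant id from a job metadata string, with a safe fallback."""
    if not raw:
        return DEFAULT_TENANT
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return DEFAULT_TENANT
    if not isinstance(data, dict):
        return DEFAULT_TENANT
    tenant = data.get("tenant")
    return tenant if isinstance(tenant, str) and tenant else DEFAULT_TENANT

# Inside the entrypoint: tenant = tenant_from_metadata(ctx.job.metadata)
```

The returned ID can then select per-tenant instructions, voices, or backend credentials before the session starts.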

## Sources

- LiveKit Docs - Voice AI Quickstart - [https://docs.livekit.io/agents/quickstarts/voice-agent/](https://docs.livekit.io/agents/quickstarts/voice-agent/)
- LiveKit Blog - Build Your First AI Voice Agent in Python - [https://livekit.com/blog/build-your-first-ai-voice-agent-python](https://livekit.com/blog/build-your-first-ai-voice-agent-python)
- GitHub - livekit/agents - [https://github.com/livekit/agents](https://github.com/livekit/agents)
- ForaSoft - LiveKit AI Agents 2026 Playbook - [https://www.forasoft.com/blog/article/livekit-ai-agents-guide](https://www.forasoft.com/blog/article/livekit-ai-agents-guide)

