
Build a Voice Agent with LiveKit Agents Python SDK 1.5 (2026)

LiveKit Agents 1.5 (April 2026) added an audio-based interruption model and native MCP tools. Here's a full self-hosted LiveKit voice agent with adaptive turn detection.

TL;DR — LiveKit Agents 1.5 ships an ML model that distinguishes real interruptions from "mm-hmm", coughs, and background noise. Combine that with native MCP tool support and the framework is the most polished OSS voice-agent stack in May 2026.

What you'll build

A self-hosted LiveKit server + an Agents 1.5 Python worker that joins rooms, runs Deepgram STT + OpenAI LLM + ElevenLabs TTS (all swappable), and exposes a calculator MCP tool. Browser test page included.

Prerequisites

  1. Docker (for the LiveKit server) + Python 3.11.
  2. pip install "livekit-agents[deepgram,openai,elevenlabs,silero,turn-detector]" python-dotenv.
  3. API keys for Deepgram, OpenAI, ElevenLabs (or swap to local providers).
  4. livekit-server Docker image.

Architecture

```mermaid
flowchart LR
  BR[Browser] -->|WebRTC| LK[LiveKit Server]
  LK <-->|Room| AG[Agents Worker 1.5]
  AG --> STT[Deepgram]
  AG --> LLM[OpenAI gpt-4o-mini]
  AG --> TTS[ElevenLabs]
  AG --> MCP[MCP Server -> tools]
```

Step 1 — Run LiveKit server

```bash
docker run --rm -p 7880:7880 -p 7881:7881 -p 7882:7882/udp \
  -e LIVEKIT_KEYS="devkey: secret" \
  livekit/livekit-server --dev
```

This is fine for local dev; for production use Helm + a dedicated SFU.

Step 2 — Define the agent

```python
# agent.py
from dotenv import load_dotenv
from livekit import agents
from livekit.agents import (
    Agent,
    AgentSession,
    JobContext,
    RoomInputOptions,
    WorkerOptions,
)
from livekit.plugins import deepgram, elevenlabs, openai, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

load_dotenv()


class Assistant(Agent):
    def __init__(self):
        super().__init__(instructions=(
            "You are a polite, concise voice assistant. Keep replies under 2 sentences. "
            "If asked to compute, call the calculator tool."))


async def entrypoint(ctx: JobContext):
    session = AgentSession(
        stt=deepgram.STT(model="nova-2", language="en"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=elevenlabs.TTS(voice="EXAVITQu4vr4xnSDxMaL"),
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel(),  # 1.5 ML interruption model
    )
    await session.start(
        agent=Assistant(),
        room=ctx.room,
        room_input_options=RoomInputOptions(),
    )
    await ctx.connect()
    await session.generate_reply(instructions="Greet the user and ask what they need.")


if __name__ == "__main__":
    agents.cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```

Step 3 — .env

```bash
LIVEKIT_URL=ws://127.0.0.1:7880
LIVEKIT_API_KEY=devkey
LIVEKIT_API_SECRET=secret
OPENAI_API_KEY=sk-...
DEEPGRAM_API_KEY=...
ELEVEN_API_KEY=...
```

Step 4 — Run the agent worker

```bash
python agent.py dev
```

The worker registers with LiveKit and waits for rooms.
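Before wiring up a browser, you can sanity-check the STT → LLM → TTS pipeline straight from the terminal. Assuming the standard Agents 1.x CLI (the same entrypoint that provides `dev`), a console mode runs the agent locally:

```shell
# Talk to the agent in the terminal, no browser client needed
python agent.py console
```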

Step 5 — Add an MCP tool (1.5 native)

```python
from livekit.agents import RunContext, function_tool


@function_tool
async def calculator(ctx: RunContext, expression: str) -> str:
    """Evaluate a basic arithmetic expression."""
    import ast
    import operator as op

    OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

    def ev(n):
        # Allow only numeric literals and the four binary operators above
        if isinstance(n, ast.Constant):
            return n.value
        if isinstance(n, ast.BinOp):
            return OPS[type(n.op)](ev(n.left), ev(n.right))
        raise ValueError("bad expression")

    return str(ev(ast.parse(expression, mode="eval").body))


class Assistant(Agent):
    def __init__(self):
        super().__init__(instructions="...", tools=[calculator])
```
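The AST walk is worth understanding on its own: parsing with `mode="eval"` and whitelisting node types is what keeps the tool safe against arbitrary code. Here is the same logic as a plain function you can run outside the agent (names are illustrative):

```python
import ast
import operator as op

# Whitelist: only these four binary operators are evaluated
OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}


def safe_eval(expression: str):
    """Evaluate +, -, *, / over numeric literals; reject everything else."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        # Function calls, names, attribute access, etc. all land here
        raise ValueError(f"unsupported expression: {expression!r}")
    return ev(ast.parse(expression, mode="eval").body)


print(safe_eval("2 + 3 * 4"))  # 14
```

Anything outside the whitelist, such as `__import__('os')`, raises `ValueError` instead of executing.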

For production, point AgentSession(mcp_servers=[...]) at a remote MCP server (Anthropic, GitHub, your own).
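As a wiring fragment (assuming the `livekit.agents.mcp` module shipped with 1.x, and a hypothetical `https://example.com/mcp` endpoint), the remote-server variant looks roughly like this; the tools the server advertises become callable by the LLM without any `@function_tool` code on your side:

```python
from livekit.agents import mcp

session = AgentSession(
    # ...stt/llm/tts/vad configured as in Step 2...
    mcp_servers=[
        mcp.MCPServerHTTP(url="https://example.com/mcp"),  # hypothetical endpoint
    ],
)
```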

Step 6 — Browser test client

Use the agent-starter-react repo or a quick livekit-client snippet to join the same room. The agent answers automatically.
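Whichever client you use, it needs an access token to join the room; in real code you generate one with the `livekit-api` package's `AccessToken` helper. As a dependency-free illustration of what that token actually is (an HS256-signed JWT whose `video` claim carries the room grant), here is a stdlib-only sketch. The exact claim layout is an assumption based on LiveKit's documented token format, so treat it as explanatory, not production code:

```python
import base64
import hashlib
import hmac
import json
import time


def lk_token(api_key: str, api_secret: str, identity: str, room: str, ttl: int = 3600) -> str:
    """Sketch of a LiveKit-style access token: an HS256 JWT with a video grant."""
    def b64(data: bytes) -> str:
        return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,    # which API key signed this token
        "sub": identity,   # participant identity in the room
        "nbf": now,
        "exp": now + ttl,
        "video": {"roomJoin": True, "room": room},  # the grant the server checks
    }
    signing_input = f"{b64(json.dumps(header).encode())}.{b64(json.dumps(claims).encode())}"
    sig = hmac.new(api_secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64(sig)}"


token = lk_token("devkey", "secret", "tester", "demo")
```

With the dev server's `devkey: secret` pair, a token like this is what the browser passes to `room.connect()`.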

Common pitfalls

  • Worker not registering. Check LIVEKIT_URL, and make sure the server address the browser uses is reachable from its network (a phone on your LAN needs your host's LAN IP, not 127.0.0.1 or localhost).
  • Turn detector model download. First run downloads ~120 MB of weights; pre-cache in CI.
  • Interruption model on slow CPUs. It runs fine on M-series and modern x86; on Pi-class hardware, disable it.
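The cold-start download can be avoided by fetching the weights ahead of time. Assuming the standard Agents CLI, the `download-files` subcommand pre-caches plugin models, which is what you want in a Docker build or CI step:

```shell
# Pre-fetch plugin model weights (turn detector, Silero VAD) at build time
python agent.py download-files
```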

How CallSphere does this in production

CallSphere runs OneRoof Property's 10 specialists on a Pion-based WebRTC mesh — same architecture pattern as LiveKit, custom-tuned for property workflows. Healthcare uses 14 HIPAA-aware tools on FastAPI :8084 with OpenAI Realtime; Salon, Dental, F&B and Behavioral round out 6 verticals. Total: 37 agents · 90+ tools · 115+ DB tables. Flat pricing $149/$499/$1499 — 14-day trial · 22% affiliate · /industries/real-estate · /demo.

FAQ

Cloud vs self-hosted LiveKit? Cloud is faster to try; self-hosted is cheaper at scale.

Sub-agent / multi-agent patterns? Yes — session.update_agent(NewAgent()) mid-call.

Realtime API instead of STT/LLM/TTS? Yes — openai.realtime.RealtimeModel().

Mobile? Native iOS/Android SDKs.

Phone numbers? LiveKit Cloud Build plan includes one US DID.

