By Sagar Shankaran, Founder of CallSphere
LiveKit Agents 1.5 (April 2026) added an audio-based interruption model and native MCP tools. Here's a full self-hosted LiveKit voice agent with adaptive turn detection.
Key takeaways
TL;DR — LiveKit Agents 1.5 ships an ML model that distinguishes real interruptions from "mm-hmm", coughs, and background noise. Combine that with native MCP tool support and the framework is the most polished OSS voice-agent stack in May 2026.
This walkthrough builds a self-hosted LiveKit server and an Agents 1.5 Python worker that joins rooms, runs Deepgram STT, an OpenAI LLM, and ElevenLabs TTS (all swappable), and exposes a calculator MCP tool. A browser test page is included.
pip install "livekit-agents[deepgram,openai,elevenlabs,silero,turn-detector]" python-dotenv
The livekit-server Docker image.
flowchart LR
BR[Browser] -->|WebRTC| LK[LiveKit Server]
LK <-->|Room| AG[Agents Worker 1.5]
AG --> STT[Deepgram]
AG --> LLM[OpenAI gpt-4o-mini]
AG --> TTS[ElevenLabs]
AG --> MCP[MCP Server -> tools]
```bash
docker run --rm -p 7880:7880 -p 7881:7881 -p 7882:7882/udp \
  -e LIVEKIT_KEYS="devkey: secret" \
  livekit/livekit-server --dev
```
This is fine for local dev; for production use Helm + a dedicated SFU.
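Before starting the worker, it can help to confirm that the container is actually accepting connections. The following is a minimal sketch, assuming the dev server from the command above is publishing its WebSocket port on TCP 7880; `port_open` is a hypothetical helper written for this check, not part of LiveKit.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 7880 is the port published by the docker run command above.
    print("LiveKit port reachable:", port_open("127.0.0.1", 7880))
```

This only checks TCP reachability; it says nothing about whether the UDP media port (7882) is open through a firewall.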
```python
import os
from dotenv import load_dotenv
from livekit import agents
from livekit.agents import Agent, AgentSession, JobContext, RoomInputOptions, WorkerOptions
from livekit.plugins import openai, deepgram, elevenlabs, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

load_dotenv()  # read LIVEKIT_* and provider API keys from .env


class Assistant(Agent):
    def __init__(self):
        super().__init__(instructions=(
            "You are a polite, concise voice assistant. Keep replies under 2 sentences. "
            "If asked to compute, call the calculator tool."))


async def entrypoint(ctx: JobContext):
    session = AgentSession(
        stt=deepgram.STT(model="nova-2", language="en"),
        llm=openai.LLM(model="gpt-4o-mini"),
        # The 1.x ElevenLabs plugin takes the voice as voice_id.
        tts=elevenlabs.TTS(voice_id="EXAVITQu4vr4xnSDxMaL"),
        vad=silero.VAD.load(),
        turn_detection=MultilingualModel())  # 1.5 ML interruption model
    await session.start(agent=Assistant(), room=ctx.room,
                        room_input_options=RoomInputOptions())
    await ctx.connect()
    await session.generate_reply(instructions="Greet the user and ask what they need.")


if __name__ == "__main__":
    agents.cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```
.env
```bash
LIVEKIT_URL=ws://127.0.0.1:7880
LIVEKIT_API_KEY=devkey
LIVEKIT_API_SECRET=secret
OPENAI_API_KEY=sk-...
DEEPGRAM_API_KEY=...
ELEVEN_API_KEY=...
```
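A missing key usually surfaces only when the first call reaches the affected provider, so a small preflight check can save a debugging round. This is a sketch, assuming the six variable names from the `.env` file above; `missing_env` is a hypothetical helper, and it reads the process environment, so call `load_dotenv()` first if the values live in `.env`.

```python
import os

# Variable names taken from the .env file above.
REQUIRED = [
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "ELEVEN_API_KEY",
]


def missing_env(env=None, required=REQUIRED):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required variables are set.")
```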
```bash
python agent.py dev
```
The worker registers with LiveKit and waits for rooms.
```python
from livekit.agents import function_tool, RunContext


@function_tool
async def calculator(ctx: RunContext, expression: str) -> str:
    """Evaluate a basic arithmetic expression."""
    import ast, operator as op
    # Only these four binary operators are allowed; anything else is rejected.
    OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

    def ev(n):
        if isinstance(n, ast.Constant):
            return n.value
        if isinstance(n, ast.BinOp):
            return OPS[type(n.op)](ev(n.left), ev(n.right))
        raise ValueError("bad")

    return str(ev(ast.parse(expression, mode="eval").body))


class Assistant(Agent):
    def __init__(self):
        super().__init__(instructions="...", tools=[calculator])
```
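The evaluator inside the tool can be exercised without LiveKit, which makes it easy to check what the model can and cannot ask it to compute. Below is a standalone sketch of the same AST walk; `safe_eval` is a hypothetical name, and the extra type checks (numeric constants only, known operators only) are a small tightening added here, not something the original tool does.

```python
import ast
import operator as op

# The same four operators the calculator tool supports.
OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}


def safe_eval(expression: str):
    """Evaluate +, -, *, / over numeric literals by walking the parsed AST."""

    def ev(node):
        # Accept numeric literals only (rejects strings, None, etc.).
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        # Accept binary operations whose operator is in the allow-list.
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        # Names, calls, attribute access, unary minus, ** and so on land here.
        raise ValueError("unsupported expression")

    return ev(ast.parse(expression, mode="eval").body)


if __name__ == "__main__":
    print(safe_eval("2 + 3 * 4"))
    try:
        safe_eval("__import__('os').getcwd()")
    except ValueError as exc:
        print("rejected:", exc)
```

Because nothing is ever passed to `eval`, a function call or attribute access in the expression raises `ValueError` instead of executing. Note that unary minus (`-3`) is also rejected, as in the original tool.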
For production, point AgentSession(mcp_servers=[...]) at a remote MCP server (Anthropic, GitHub, your own).
Use the agent-starter-react repo or a quick livekit-client snippet to join the same room. The agent answers automatically.
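Whichever client you use, it needs an access token signed with the server's API key and secret before it can join a room. The sketch below builds one with the standard library only, assuming LiveKit's documented token shape (an HS256 JWT whose `iss` is the API key, `sub` is the participant identity, and `video` claim carries the room grant). `make_join_token`, the identity `browser-user`, and the room name `demo-room` are hypothetical; in a real project the official LiveKit server SDK is the safer choice.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_join_token(api_key, api_secret, identity, room, ttl_seconds=3600, now=None):
    """Build an HS256-signed JWT granting `identity` permission to join `room`."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,            # API key that signs the token
        "sub": identity,           # participant identity
        "nbf": now,
        "exp": now + ttl_seconds,
        "video": {"room": room, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        api_secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return signing_input + "." + _b64url(signature)


if __name__ == "__main__":
    # devkey / secret match the LIVEKIT_KEYS value used for the dev server.
    print(make_join_token("devkey", "secret", identity="browser-user", room="demo-room"))
```

Generate the token on a server you control; the API secret should never ship to the browser.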
ws://127.0.0.1:7880, not localhost if your client is on a phone).
CallSphere runs OneRoof Property's 10 specialists on a Pion-based WebRTC mesh, the same architecture pattern as LiveKit, custom-tuned for property workflows. Healthcare uses 14 HIPAA-aware tools on FastAPI :8084 with OpenAI Realtime; Salon, Dental, F&B and Behavioral round out 6 verticals. Total: 37 agents, 90+ tools, 115+ DB tables. Flat pricing is $149/$499/$1499, with a 14-day trial and 22% affiliate.
Cloud vs self-hosted LiveKit? Cloud is faster to try; self-hosted is cheaper at scale.
Sub-agent / multi-agent patterns? Yes — session.update_agent(NewAgent()) mid-call.
Realtime API instead of STT/LLM/TTS? Yes — openai.realtime.RealtimeModel().
Mobile? Native iOS/Android SDKs.
Phone numbers? LiveKit Cloud Build plan includes one US DID.