By Sagar Shankaran, Founder of CallSphere
Wire LiveKit Agents 1.x, Deepgram STT, GPT-4o, and ElevenLabs TTS into a sub-700ms WebRTC voice agent. Real Python code, room dispatch, and prod pitfalls.
Key takeaways
TL;DR: LiveKit Agents 1.x is the most-adopted open-source voice runtime of 2026. Two files (a Python entrypoint and a Dockerfile) give you a WebRTC voice agent with STT-LLM-TTS pipelining, server-side VAD, interruption handling, and one-command deploy to LiveKit Cloud or a self-hosted SFU.
A LiveKit room participant that auto-joins any room called support-*, transcribes the caller with Deepgram Nova-3, reasons with GPT-4o, and speaks back through ElevenLabs Turbo v2.5 — all under 700ms voice-to-voice on the default plan.
```mermaid
flowchart LR
    CL[Caller browser/SIP] -- WebRTC --> SFU[LiveKit SFU]
    SFU -- audio track --> AG[Python agents worker]
    AG -- STT --> DG[Deepgram Nova-3]
    AG -- LLM --> OA[OpenAI GPT-4o]
    AG -- TTS --> EL[ElevenLabs Turbo 2.5]
    AG -- audio track --> SFU --> CL
```
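The support-* room naming rule can be illustrated with a plain glob match. This is a minimal sketch, assuming the pattern behaves like a shell-style glob; `should_handle_room` and `ROOM_PATTERN` are hypothetical names for illustration, not part of the LiveKit API, and how the check is wired into dispatch depends on your setup.

```python
from fnmatch import fnmatchcase

# Hypothetical constant mirroring the "support-*" room naming rule.
ROOM_PATTERN = "support-*"


def should_handle_room(room_name: str, pattern: str = ROOM_PATTERN) -> bool:
    """Return True when the room name matches the glob pattern (case-sensitive)."""
    return fnmatchcase(room_name, pattern)


if __name__ == "__main__":
    for name in ("support-1234", "sales-1234"):
        print(name, should_handle_room(name))
```

A case-sensitive match is used here so that room names behave the same on every platform; relax it only if your room names are not normalized.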
```bash
pip install "livekit-agents[deepgram,openai,elevenlabs,silero]~=1.0"
lk app create --template agent-starter-python my-agent
cd my-agent && cp .env.example .env  # add LIVEKIT_URL, OPENAI_API_KEY, etc.
```
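Before starting the worker, it can help to fail fast when a credential is missing from .env. The sketch below is one way to do that in plain Python. The variable names beyond LIVEKIT_URL and OPENAI_API_KEY are assumptions about what the plugins read, so confirm them against your own .env.example; `missing_env` is a hypothetical helper.

```python
from __future__ import annotations

import os
from typing import Mapping, Sequence

# Assumed variable names; confirm them against the .env.example in your project.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "ELEVEN_API_KEY",
)


def missing_env(
    required: Sequence[str] = REQUIRED_VARS,
    env: Mapping[str, str] | None = None,
) -> list[str]:
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_env()` at the top of the entrypoint script and exiting with the returned names gives a clearer error than a failed provider call mid-conversation.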
```python
from livekit import agents
from livekit.agents import Agent, AgentSession, RoomInputOptions
from livekit.plugins import openai, deepgram, elevenlabs, silero


class Concierge(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You are a friendly clinic concierge. "
            "Confirm the appointment, then ask if anything else.",
        )
```
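```python
async def entrypoint(ctx: agents.JobContext):
    # Connect the worker to the room it was dispatched to.
    await ctx.connect()

    session = AgentSession(
        stt=deepgram.STT(model="nova-3", language="en-US"),
        llm=openai.LLM(model="gpt-4o"),
        # Check your installed plugin version: some releases expect
        # voice_id=<ElevenLabs voice ID> instead of a voice name.
        tts=elevenlabs.TTS(voice="rachel", model="eleven_turbo_v2_5"),
        vad=silero.VAD.load(),
    )

    await session.start(
        room=ctx.room,
        agent=Concierge(),
        # Check your installed version: some releases expect a
        # noise-cancellation plugin object here rather than True.
        room_input_options=RoomInputOptions(noise_cancellation=True),
    )

    # Have the agent speak first.
    await session.generate_reply(instructions="Greet the caller warmly.")
```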
```python
from livekit.agents.llm import function_tool


class Concierge(Agent):
    @function_tool
    async def book_slot(self, iso_time: str) -> str:
        """Book the requested ISO-8601 slot in the clinic calendar."""
        # call your real backend here
        return f"Booked {iso_time}"
```
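Because the model supplies `iso_time` as free text, it is worth validating the value before it reaches a real calendar backend. The following is a minimal sketch using only the standard library; `parse_slot` is a hypothetical helper, and the rules it enforces (an explicit UTC offset, a time in the future) are assumptions about what a booking system would want.

```python
from __future__ import annotations

from datetime import datetime, timezone


def parse_slot(iso_time: str, now: datetime | None = None) -> datetime:
    """Parse an ISO-8601 string and reject naive or past timestamps."""
    # fromisoformat raises ValueError on malformed input. A trailing "Z" is
    # only accepted on Python 3.11+, so "+00:00" is the portable spelling.
    slot = datetime.fromisoformat(iso_time)
    if slot.tzinfo is None:
        raise ValueError("iso_time must include a UTC offset")
    now = now or datetime.now(timezone.utc)
    if slot <= now:
        raise ValueError("iso_time must be in the future")
    return slot
```

Inside `book_slot`, catching `ValueError` and returning a short explanation string lets the LLM ask the caller for a corrected time instead of booking a bad slot.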
```python
if __name__ == "__main__":
    agents.cli.run_app(
        agents.WorkerOptions(entrypoint_fnc=entrypoint, agent_name="concierge"),
    )
```
```bash
python agent.py dev    # local hot-reload
python agent.py start  # production worker
```
LiveKit Cloud: lk cloud agents deploy --agent concierge. Self-hosted: any container platform — agents pull jobs over WebSocket from your LiveKit server, so no inbound port is needed.
Use lk sip create-trunk + a SIPDispatchRule that routes inbound DIDs to the same room pattern; the worker auto-attaches.
livekit-agents~=1.0 and refresh plugins together; STT and TTS plugins share the worker contract.
secure: false or a TLS cert chain on your trunk.
num_idle_processes=2.
CallSphere ships 37 production agents across 6 verticals with 90+ tools and 115+ Postgres tables. Healthcare, OneRoof real-estate, Salon, Sales, Behavioral Health, and Trades stacks all run a LiveKit Agents fleet handling 1.2M voice minutes/month at ~720ms p95 voice-to-voice. Pricing is $149/$499/$1,499 with a 14-day no-card trial and a 22% recurring affiliate.
Can I bring my own LLM? Yes — openai.LLM(base_url=...) works with Groq, Together, vLLM, Ollama, and Anthropic via gateways.
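As a rough illustration of the `base_url` approach, the sketch below builds the keyword arguments for an OpenAI-compatible endpoint. The `llm_kwargs` helper and `BASE_URLS` table are hypothetical; the hosted URLs are the providers' commonly documented OpenAI-compatible endpoints, and the localhost ports are the usual defaults for vLLM and Ollama, so verify all of them for your deployment.

```python
from __future__ import annotations

# Assumed OpenAI-compatible endpoints; verify each against the provider's docs.
BASE_URLS = {
    "groq": "https://api.groq.com/openai/v1",
    "together": "https://api.together.xyz/v1",
    "vllm": "http://localhost:8000/v1",
    "ollama": "http://localhost:11434/v1",
}


def llm_kwargs(provider: str, model: str, api_key: str) -> dict[str, str]:
    """Build keyword arguments for an OpenAI-compatible LLM client."""
    try:
        base_url = BASE_URLS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return {"model": model, "base_url": base_url, "api_key": api_key}
```

The result could then be passed along as `openai.LLM(**llm_kwargs("groq", model_name, key))`, where the model name is whatever your provider exposes.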
Does it support Realtime API? openai.realtime.RealtimeModel() replaces STT+LLM+TTS with one model — drop the three plugins and pass llm=....
Latency tuning? Use turn-detector + VAD pre-emption + ElevenLabs streaming WebSocket; expect 600-750ms p50.
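To check a p50 target against your own traffic, you need percentiles from measured voice-to-voice samples rather than averages. Here is a small sketch using the standard library; `latency_summary` is a hypothetical helper, and the samples are assumed to be per-turn latencies in milliseconds that you have logged yourself.

```python
import statistics


def latency_summary(samples_ms: list[float]) -> dict[str, float]:
    """Return the p50 and p95 of voice-to-voice latency samples, in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns the 99 cut points p1..p99, so p50 is at
    # index 49 and p95 at index 94.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94]}
```

Tracking p95 alongside p50 matters because a handful of slow turns is what callers actually notice.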
Multi-tenant isolation? Each job (one per room) runs in its own process; use job metadata to route by tenant.
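Routing by job metadata could look roughly like the sketch below, assuming the metadata arrives as a JSON string with a `tenant_id` key. The key name, the `TenantConfig` fields, and the `TENANTS` table are all hypothetical; in the entrypoint the string would presumably come from the job attached to the context (for example `ctx.job.metadata`), which you should confirm for your SDK version.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantConfig:
    instructions: str
    voice: str


# Hypothetical tenant table; in practice this would come from your database.
TENANTS = {
    "clinic-a": TenantConfig("You are the concierge for Clinic A.", "rachel"),
    "salon-b": TenantConfig("You are the receptionist for Salon B.", "rachel"),
}


def tenant_from_metadata(metadata: str) -> TenantConfig:
    """Resolve a tenant from a JSON string such as '{"tenant_id": "clinic-a"}'."""
    try:
        payload = json.loads(metadata or "{}")
    except json.JSONDecodeError as exc:
        raise ValueError("job metadata is not valid JSON") from exc
    tenant_id = payload.get("tenant_id") if isinstance(payload, dict) else None
    if not isinstance(tenant_id, str) or tenant_id not in TENANTS:
        raise ValueError(f"unknown tenant: {tenant_id!r}")
    return TENANTS[tenant_id]
```

Raising on an unknown tenant, rather than falling back to a default, keeps one customer's calls from ever being answered with another customer's prompt.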
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.