By Sagar Shankaran, Founder of CallSphere
Expo Go cannot run WebRTC. EAS Dev Client can. Here is the 2026 config-plugin recipe for shipping AI voice agents on Expo without abandoning the managed workflow.
Key takeaways
Expo Go is the cleanest mobile dev loop on Earth, but it cannot run WebRTC. The trade-off in 2026 is to swap Expo Go for an EAS Dev Client and use the @config-plugins/react-native-webrtc plugin. You keep the managed workflow, you lose 30 seconds at the install step, and you gain real audio.
react-native-webrtc requires custom native code (libwebrtc.so/.framework). Expo Go ships a fixed set of native modules and cannot dynamically load new ones — so any RN app that needs WebRTC must move to a Dev Client built with EAS. This has been true since Expo SDK 43, when the @config-plugins/react-native-webrtc plugin was first released. In 2026 the plugin works with Expo SDK 50+ and react-native-webrtc 124.x, with two caveats: it disables Bitcode on iOS (required) and bumps Android minSdkVersion to 24 (which can break older library deps).
For AI voice agent apps, "Dev Client + config plugin" is the cleanest way to keep Expo's update OTA story (EAS Update), the managed AndroidManifest/Info.plist generation, and the React Native WebRTC stack on the same project.
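Because Expo Go cannot load the native half of react-native-webrtc, it can help to detect the environment before touching the module. The sketch below is one way to do that, not the article's own code. It assumes you pass in the value of `Constants.executionEnvironment` from `expo-constants`; the helper names `canLoadWebRTC` and `webrtcAvailabilityMessage` are hypothetical.

```typescript
// String values mirror the ExecutionEnvironment enum in expo-constants.
type ExecutionEnvironment = "bare" | "standalone" | "storeClient";

// Hypothetical helper: true when the running binary can contain custom
// native modules such as react-native-webrtc.
function canLoadWebRTC(env: ExecutionEnvironment): boolean {
  // "storeClient" means the JS bundle is running inside Expo Go.
  return env !== "storeClient";
}

// Hypothetical helper: a message to show instead of crashing on import.
function webrtcAvailabilityMessage(env: ExecutionEnvironment): string {
  return canLoadWebRTC(env)
    ? "WebRTC available"
    : "WebRTC needs an EAS Dev Client build; Expo Go cannot load it";
}
```

In an app, the result would typically gate a lazy `require("react-native-webrtc")`, so that opening the project in Expo Go shows the message rather than a native-module error.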
```mermaid
flowchart LR
    Dev[Developer] -- expo install --> Project[Expo Project]
    Project -- "@config-plugins/react-native-webrtc" --> EAS[EAS Build]
    EAS -- "Dev Client IPA/APK" --> Device[Physical Device]
    Device -- WebRTC --> Gateway[Pion Go gateway 1.23]
    Gateway -- NATS --> Pod[6-container agent pod]
```
CallSphere uses Expo for prototyping vertical-specific clients before promoting them to bare RN or native:
37 agents · 90+ tools · 115+ DB tables · 6 verticals · HIPAA + SOC 2 · $149/$499/$1499 · 14-day /trial · 22% affiliate at /affiliate.
```bash
npx create-expo-app@latest agent-app
cd agent-app
npx expo install expo-dev-client
npx expo install react-native-webrtc @config-plugins/react-native-webrtc
```
app.json:

```json
{
  "expo": {
    "ios": {
      "infoPlist": {
        "NSMicrophoneUsageDescription": "Voice agent needs mic"
      }
    },
    "android": {
      "permissions": ["RECORD_AUDIO", "MODIFY_AUDIO_SETTINGS"]
    },
    "plugins": [
      ["@config-plugins/react-native-webrtc"]
    ]
  }
}
```
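Since an EAS build takes minutes, a small preflight check on the config can catch a missing permission or plugin entry before you queue one. The following is a minimal sketch under stated assumptions: the `AppConfig` type is a local, partial stand-in covering only the keys used in this article's app.json, and `findWebRTCConfigGaps` is a hypothetical helper, not part of Expo or the config plugin.

```typescript
// Local, partial type: only the app.json fields this article uses.
interface AppConfig {
  expo: {
    ios?: { infoPlist?: Record<string, string> };
    android?: { permissions?: string[] };
    // A plugin entry is either a bare name or a [name, options] tuple.
    plugins?: Array<string | [string, ...unknown[]]>;
  };
}

const WEBRTC_PLUGIN = "@config-plugins/react-native-webrtc";
const REQUIRED_ANDROID_PERMISSIONS = ["RECORD_AUDIO", "MODIFY_AUDIO_SETTINGS"];

// Hypothetical preflight: lists what is missing for a microphone-only WebRTC app.
function findWebRTCConfigGaps(config: AppConfig): string[] {
  const gaps: string[] = [];
  const ios = config.expo.ios;
  const permissions = config.expo.android?.permissions ?? [];
  const plugins = config.expo.plugins ?? [];

  if (!ios?.infoPlist?.NSMicrophoneUsageDescription) {
    gaps.push("ios.infoPlist.NSMicrophoneUsageDescription");
  }
  for (const permission of REQUIRED_ANDROID_PERMISSIONS) {
    if (permissions.indexOf(permission) === -1) {
      gaps.push("android.permissions: " + permission);
    }
  }
  const pluginNames = plugins.map((p) => (typeof p === "string" ? p : p[0]));
  if (pluginNames.indexOf(WEBRTC_PLUGIN) === -1) {
    gaps.push("plugins: " + WEBRTC_PLUGIN);
  }
  return gaps;
}
```

An empty array means the three pieces from the config above are present; it says nothing about whether the native build itself will succeed.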
```bash
eas build --profile development --platform ios
eas build --profile development --platform android
```
After installing the Dev Client, run `npx expo start --dev-client` and the JS hot-reload still works exactly like Expo Go — only the bundled native modules differ.
Can I still ship EAS Update? Yes — JS bundle updates work fine; native plugin updates require a new Dev Client build.
Does Expo support CallKit/Telecom? Yes, via react-native-callkeep, which has its own config plugin that you add in app.json.
Can I avoid Dev Client entirely? No — there is no path to WebRTC inside Expo Go.
How long is an EAS build? First iOS build is 12-25 minutes; subsequent builds with cache are 4-8 minutes.
Does it work with the New Architecture? Yes, since react-native-webrtc 118 and Expo SDK 51.
One layer below what "Expo + WebRTC for AI Voice Apps (2026): What Works, What Doesn't" covers, the practical question every team hits is how to hand off between specialist agents across multiple turns without losing slot state, sentiment, or escalation context. Treat this as a voice-first system from the first prompt: the agent's persona, its tool surface, and its escalation rules all flow from that single decision. Teams that ship fast tend to instrument the loop end-to-end before they tune any single component, because the bottleneck is rarely where intuition puts it.
A production-grade voice stack at CallSphere stitches Twilio Programmable Voice (PSTN ingress, TwiML, bidirectional Media Streams) to a realtime reasoning layer — typically OpenAI Realtime or ElevenLabs Conversational AI — with sub-second response as a hard SLO. Anything north of one second of perceived silence and callers either repeat themselves or hang up; that single number drives the whole architecture. Server-side VAD with proper barge-in support is non-negotiable, otherwise the agent talks over the caller and the conversation collapses. Streaming TTS with phoneme-aligned interruption keeps the cadence natural even when the user changes their mind mid-sentence. Post-call, every transcript is run through a structured pipeline: sentiment, intent classification, lead score, escalation flag, and a normalized slot extraction (name, callback number, reason, urgency). For healthcare workloads, the BAA-covered storage path, audit logs, encryption-at-rest, and PHI-safe transcript redaction are wired in from day one, not bolted on at compliance review. The end state is a system where every call produces a row of structured data, not just a recording.
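To make "every call produces a row of structured data" concrete, here is one possible shape for that row. This is an illustrative sketch only: the type and field names are hypothetical (they follow the list in the paragraph above), and the 0 to 100 lead-score scale is an assumption, not something the article specifies.

```typescript
type Sentiment = "positive" | "neutral" | "negative";

// Slots named in the paragraph above; null means extraction found nothing.
interface CallSlots {
  name: string | null;
  callbackNumber: string | null;
  reason: string | null;
  urgency: "low" | "medium" | "high" | null;
}

// Hypothetical row written once per call by the post-call pipeline.
interface CallRecord {
  callId: string;
  sentiment: Sentiment;
  intent: string;
  leadScore: number; // assumed 0-100 scale
  escalate: boolean;
  slots: CallSlots;
}

// Returns the names of slots the extraction step left empty, in declaration order.
function missingSlots(record: CallRecord): string[] {
  const keys = Object.keys(record.slots) as Array<keyof CallSlots>;
  return keys.filter((key) => record.slots[key] === null);
}
```

A check like `missingSlots` is useful mostly as a metric: a rising share of calls with an empty callback number points at the extraction step rather than at the agent's conversation design.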
What is the fastest path to a voice agent like the one described in "Expo + WebRTC for AI Voice Apps (2026): What Works, What Doesn't"?
Treat the architecture in this post as a starting point and instrument it before you tune it. The metrics that matter most early on are end-to-end latency (target < 1s for voice, < 3s for chat), barge-in correctness, tool-call success rate, and post-conversation lead score distribution. Optimize whatever the data flags as the bottleneck, not whatever feels slowest in your head.
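The latency targets above can be checked mechanically once you log per-turn timings. The sketch below uses the 1 s voice and 3 s chat targets from the paragraph; judging them at the 95th percentile with the nearest-rank method is an assumption made here for illustration, and the function names are hypothetical.

```typescript
type Channel = "voice" | "chat";

// Targets from the text: under 1 s for voice, under 3 s for chat.
const LATENCY_TARGET_MS: Record<Channel, number> = { voice: 1000, chat: 3000 };

// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no latency samples");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Assumption: the target is evaluated at p95 rather than on the mean.
function meetsLatencyTarget(channel: Channel, samplesMs: number[], p = 95): boolean {
  return percentile(samplesMs, p) < LATENCY_TARGET_MS[channel];
}
```

Using a high percentile instead of the average keeps a handful of slow turns from hiding behind many fast ones, which matters when a single long silence is enough to lose a caller.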
What are the gotchas around voice agent deployments at scale?
The two failure modes that bite hardest are silent context loss across multi-turn handoffs and tool calls that succeed in dev but get rate-limited in production. Both are solvable with a proper agent backplane that pins state to a session ID, retries with backoff, and writes every tool invocation to an audit log you can replay.
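The backplane described above can be approximated in a few lines. This is a minimal sketch, assuming an in-memory audit log and exponential backoff without jitter; `callToolWithRetry`, `AuditEntry`, and the default attempt and delay values are hypothetical choices for illustration, not CallSphere's implementation.

```typescript
// One audit row per attempt, keyed by session so a call can be replayed.
interface AuditEntry {
  sessionId: string;
  tool: string;
  attempt: number;
  ok: boolean;
  error?: string;
}

type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Invokes a tool, retrying failures with exponential backoff and
// recording every attempt (success or failure) in the audit log.
async function callToolWithRetry<T>(
  sessionId: string,
  tool: string,
  invoke: () => Promise<T>,
  auditLog: AuditEntry[],
  options: { maxAttempts?: number; baseDelayMs?: number; sleep?: Sleep } = {}
): Promise<T> {
  const { maxAttempts = 3, baseDelayMs = 200, sleep = realSleep } = options;
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const result = await invoke();
      auditLog.push({ sessionId, tool, attempt, ok: true });
      return result;
    } catch (err) {
      lastError = err;
      auditLog.push({ sessionId, tool, attempt, ok: false, error: String(err) });
      if (attempt < maxAttempts) {
        // Delays grow as base, 2x base, 4x base, and so on.
        await sleep(baseDelayMs * Math.pow(2, attempt - 1));
      }
    }
  }
  throw lastError;
}
```

The `sleep` option is injectable so the retry schedule can be tested without real waiting. In production you would likely add jitter and write the log to durable storage rather than an array.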
What does the CallSphere outbound sales calling product do that a regular dialer does not?
It uses the ElevenLabs "Sarah" voice, runs up to 5 concurrent outbound calls per operator, and ships with a browser-based dialer that transfers warm calls back to a human in one click. Dispositions, transcripts, and lead scores write back to the CRM automatically.
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.