By Sagar Shankaran, Founder of CallSphere
AVAudioSession is the make-or-break component for AI voice agents on iOS. Here is the 2026 production playbook for category, mode, options, and interruption recovery.
Key takeaways
Most "WebRTC works in dev but not in production" tickets on iOS are AVAudioSession misconfiguration. The category, mode, options, and interruption-handling chain has to match Apple's intent for VoIP — not the defaults.
`AVAudioSession` is iOS's process-wide arbiter for audio. It decides which app gets the mic, whether playback ducks Spotify, whether AirPods route correctly, and what happens when an actual phone call comes in. WebRTC ships a wrapper class `RTCAudioSession` that owns the configuration on its behalf. For AI voice agents in 2026, the canonical setup is `playAndRecord` category + `voiceChat` mode + `allowBluetooth` and `duckOthers` options. Diverge from that at your peril.
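Because other code in the same process can change the session after you configure it, it can help to verify the live values instead of assuming them. The following is a minimal sketch of such a check, not part of WebRTC or AVFoundation: `audioConfigMismatches` and `logLiveSessionMismatches` are hypothetical names, and the expected values are taken from the setup described above.

```swift
import AVFoundation

/// Hypothetical helper: lists how a session's settings differ from the
/// playAndRecord + voiceChat + allowBluetooth/duckOthers setup.
/// An empty result means everything matches.
func audioConfigMismatches(
    category: AVAudioSession.Category,
    mode: AVAudioSession.Mode,
    options: AVAudioSession.CategoryOptions
) -> [String] {
    var problems: [String] = []
    if category != .playAndRecord {
        problems.append("category is \(category.rawValue), expected playAndRecord")
    }
    if mode != .voiceChat {
        problems.append("mode is \(mode.rawValue), expected voiceChat")
    }
    // Extra options (such as allowBluetoothA2DP) are tolerated; only the
    // two named in the text are required here.
    let required: AVAudioSession.CategoryOptions = [.allowBluetooth, .duckOthers]
    if !options.isSuperset(of: required) {
        problems.append("options are missing allowBluetooth and/or duckOthers")
    }
    return problems
}

/// Reads the live shared session and prints any mismatch.
/// Assumption: you call this yourself, for example right after joining a call.
func logLiveSessionMismatches() {
    let live = AVAudioSession.sharedInstance()
    let problems = audioConfigMismatches(
        category: live.category,
        mode: live.mode,
        options: live.categoryOptions
    )
    for problem in problems {
        print("AVAudioSession mismatch: \(problem)")
    }
}
```

Keeping the comparison in a pure function makes it easy to unit test without a device; only `logLiveSessionMismatches` touches the real session.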
The hardest part is interruption handling. A real phone call is a higher-priority audio session than your VoIP app. iOS will pause your session, post `AVAudioSession.interruptionNotification` (`AVAudioSessionInterruptionNotification` in Objective-C), and expect you to resume gracefully, or hang up, when the interruption ends.
```mermaid
flowchart LR
    App[App] --> RTCAudio[RTCAudioSession]
    RTCAudio --> AVAudio[AVAudioSession]
    AVAudio --> CoreAudio[CoreAudio]
    AVAudio -. interruption .-> Handler[InterruptionHandler]
    Handler --> WebRTC[WebRTC PeerConnection]
    WebRTC --> Gateway[Pion Go gateway 1.23]
```
The iOS clients across CallSphere's six verticals (real estate, healthcare, behavioral health, legal, salon, insurance) share a single AVAudioSession config module:
```swift
import WebRTC
import AVFoundation

/// Applies the VoIP configuration through WebRTC's session wrapper.
func configureAudioSession() {
    let session = RTCAudioSession.sharedInstance()
    // RTCAudioSession expects the configuration lock around changes.
    session.lockForConfiguration()
    do {
        try session.setCategory(
            AVAudioSession.Category.playAndRecord.rawValue,
            with: [.allowBluetooth, .allowBluetoothA2DP, .duckOthers]
        )
        try session.setMode(AVAudioSession.Mode.voiceChat.rawValue)
    } catch {
        print("Audio session config failed: \(error)")
    }
    session.unlockForConfiguration()
}

class InterruptionHandler: NSObject {
    override init() {
        super.init()
        NotificationCenter.default.addObserver(
            self,
            selector: #selector(handleInterruption(_:)),
            name: AVAudioSession.interruptionNotification,
            object: nil)
    }

    @objc func handleInterruption(_ notification: Notification) {
        guard let info = notification.userInfo,
              let typeValue = info[AVAudioSessionInterruptionTypeKey] as? UInt,
              let type = AVAudioSession.InterruptionType(rawValue: typeValue)
        else { return }

        switch type {
        case .began:
            // Another session (for example a phone call) took over audio.
            WebRTCManager.shared.pause()
        case .ended:
            let opts = AVAudioSession.InterruptionOptions(
                rawValue: info[AVAudioSessionInterruptionOptionKey] as? UInt ?? 0)
            // Resume only when the system signals that it is appropriate.
            if opts.contains(.shouldResume) {
                WebRTCManager.shared.resume()
            }
        @unknown default:
            break
        }
    }
}
```
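The handler above resumes when `.shouldResume` is set and otherwise leaves the call paused. One way to make the resume-or-hang-up choice explicit is a small pure function, sketched below. `InterruptionOutcome`, `outcomeAfterInterruption`, and the 30-second `maxGap` default are hypothetical and chosen for illustration; they are not Apple or WebRTC APIs, and the right threshold is a product decision.

```swift
import AVFoundation

/// Hypothetical outcome type; not part of AVFoundation or WebRTC.
enum InterruptionOutcome: Equatable {
    case resume
    case hangUp
}

/// Decides what to do when an interruption ends.
/// - userInfo: the notification's userInfo dictionary.
/// - gap: how long the call was interrupted, measured by the caller.
/// - maxGap: assumed product threshold, not an Apple-defined value.
func outcomeAfterInterruption(
    userInfo: [AnyHashable: Any],
    interruptedFor gap: TimeInterval,
    maxGap: TimeInterval = 30
) -> InterruptionOutcome {
    let raw = userInfo[AVAudioSessionInterruptionOptionKey] as? UInt ?? 0
    let options = AVAudioSession.InterruptionOptions(rawValue: raw)
    // Resume only if the system says so and the caller has not waited too long.
    guard options.contains(.shouldResume), gap <= maxGap else {
        return .hangUp
    }
    return .resume
}
```

In the `.ended` branch you would record a timestamp at `.began`, pass the elapsed time in as `gap`, and then either resume or end the call based on the returned value.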
Why .voiceChat and not .videoChat? `.videoChat` raises sample rate to 48 kHz and prefers full-duplex but disables some echo-cancellation tuning. For audio-only AI voice, `.voiceChat` is correct.
Should I use mixWithOthers? Only if you want music apps to keep playing under your voice; for AI voice agents, no.
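What about the .duckOthers option? Use it: it lowers the volume of other audio, such as Spotify, so the user can hear the AI clearly. Apple's headers note that setting it also makes the session mixable, so other audio keeps playing at reduced volume rather than stopping.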
How do I detect AirPods? Listen to `AVAudioSession.routeChangeNotification` and inspect `currentRoute.outputs`.
Does it work in the simulator? No — the simulator's audio is unreliable; always test on device.
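For the AirPods question, here is a minimal sketch of the route-change approach. `hasBluetoothOutput` and `RouteObserver` are hypothetical names. Note the limitation: the port type tells you the output is Bluetooth, not that it is specifically AirPods, so treat this as "a Bluetooth headset is connected".

```swift
import AVFoundation

/// Returns true when any output port type is a Bluetooth route
/// (HFP, A2DP, or LE). AirPods appear as one of these.
func hasBluetoothOutput(_ outputs: [AVAudioSession.Port]) -> Bool {
    let bluetoothPorts: Set<AVAudioSession.Port> = [
        .bluetoothHFP, .bluetoothA2DP, .bluetoothLE
    ]
    return outputs.contains { bluetoothPorts.contains($0) }
}

/// Hypothetical observer that reports Bluetooth output on each route change.
final class RouteObserver {
    private var token: NSObjectProtocol?

    init(onChange: @escaping (Bool) -> Void) {
        token = NotificationCenter.default.addObserver(
            forName: AVAudioSession.routeChangeNotification,
            object: nil,
            queue: .main
        ) { _ in
            // Inspect the outputs of the current route, as the answer suggests.
            let ports = AVAudioSession.sharedInstance()
                .currentRoute.outputs.map { $0.portType }
            onChange(hasBluetoothOutput(ports))
        }
    }

    deinit {
        if let token = token {
            NotificationCenter.default.removeObserver(token)
        }
    }
}
```

Keep a strong reference to the `RouteObserver` for as long as the call is active; the observation is removed when the object is deallocated.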
Try the iOS WebRTC stack at /demo, see /pricing, or start a /trial.
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.