iOS Audio Session Config for AI Voice (2026): Interruption Handling Done Right
AVAudioSession is the make-or-break for AI voice agents on iOS. Here is the 2026 production playbook for category, mode, options, and interruption recovery.
Most "WebRTC works in dev but not in production" tickets on iOS are AVAudioSession misconfiguration. The category, mode, options, and interruption-handling chain has to match Apple's intent for VoIP — not the defaults.
Background
`AVAudioSession` is iOS's process-wide arbiter for audio. It decides which app gets the mic, whether playback ducks Spotify, whether AirPods route correctly, and what happens when an actual phone call comes in. WebRTC ships a wrapper class `RTCAudioSession` that owns the configuration on its behalf. For AI voice agents in 2026, the canonical setup is `playAndRecord` category + `voiceChat` mode + `allowBluetooth` and `duckOthers` options. Diverge from that at your peril.
The hardest part is interruption handling. A real phone call is a higher-priority audio session than your VoIP app. iOS will pause your session, fire `AVAudioSessionInterruptionNotification`, and expect you to gracefully resume — or hang up — when it ends.
Architecture
```mermaid
flowchart LR
    App[App] --> RTCAudio[RTCAudioSession]
    RTCAudio --> AVAudio[AVAudioSession]
    AVAudio --> CoreAudio[CoreAudio]
    AVAudio -. interruption .-> Handler[InterruptionHandler]
    Handler --> WebRTC[WebRTC PeerConnection]
    WebRTC --> Gateway[Pion Go gateway 1.23]
```
CallSphere implementation
The iOS clients across CallSphere's six verticals (real estate, healthcare, behavioral health, legal, salon, insurance) share a single AVAudioSession config module:
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
- Real Estate (OneRoof) — Field-rep iPhones often have a real call interrupt the AI handoff; the interruption handler re-negotiates DTLS-SRTP with the Pion Go gateway 1.23 and resumes through NATS to the 6-container pod (CRM, MLS, calendar, SMS, audit, transcript). See /industries/real-estate.
- Healthcare — HIPAA mandates that interruption recovery cannot leak audio to the speaker route by accident; we explicitly clear the buffer on interruption end. See /industries/healthcare.
- /demo browser path — Browser audio session is handled by the browser itself; less control, fewer edge cases. See /demo.
37 agents · 90+ tools · 115+ DB tables · 6 verticals · HIPAA + SOC 2 · $149/$499/$1499 · 14-day /trial · 22% affiliate at /affiliate.
Build steps with code
```swift
import WebRTC
import AVFoundation

func configureAudioSession() {
    let session = RTCAudioSession.sharedInstance()
    session.lockForConfiguration()
    do {
        // playAndRecord + voiceChat is the canonical VoIP configuration.
        try session.setCategory(
            AVAudioSession.Category.playAndRecord.rawValue,
            with: [.allowBluetooth, .allowBluetoothA2DP, .duckOthers]
        )
        try session.setMode(AVAudioSession.Mode.voiceChat.rawValue)
    } catch {
        print("Audio session config failed: \(error)")
    }
    session.unlockForConfiguration()
}

class InterruptionHandler: NSObject {
    override init() {
        super.init()
        NotificationCenter.default.addObserver(
            self,
            selector: #selector(handleInterruption(_:)),
            name: AVAudioSession.interruptionNotification,
            object: nil)
    }

    @objc func handleInterruption(_ notification: Notification) {
        guard let info = notification.userInfo,
              let typeValue = info[AVAudioSessionInterruptionTypeKey] as? UInt,
              let type = AVAudioSession.InterruptionType(rawValue: typeValue)
        else { return }

        switch type {
        case .began:
            // iOS has already paused our session; stop pushing audio.
            WebRTCManager.shared.pause()
        case .ended:
            let opts = AVAudioSession.InterruptionOptions(
                rawValue: info[AVAudioSessionInterruptionOptionKey] as? UInt ?? 0)
            // Only resume when the system says we should (e.g. the phone call ended).
            if opts.contains(.shouldResume) {
                WebRTCManager.shared.resume()
            }
        @unknown default:
            break
        }
    }
}
```
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
Pitfalls
- Calling `AVAudioSession.setActive(true)` directly when CallKit is in play — let CallKit own activation; configure category/mode up front and hand the activated session to WebRTC in `provider(_:didActivate:)`.
- Forgetting allowBluetoothA2DP — Speakers paired over A2DP-only (cars, some BT speakers) will route to the iPhone speaker instead.
- Using `.default` mode — That mode applies AGC tuned for music, not voice. Always set `.voiceChat` for VoIP.
- Ignoring shouldResume option — Some interruptions (a Siri invocation) explicitly say "do not resume"; honor it.
- Resuming without re-checking ICE — A long interruption can let UDP NAT bindings expire; you may need ICE restart.
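The CallKit pitfall above deserves a concrete shape. A minimal sketch of a `CXProviderDelegate` that defers activation to CallKit, assuming WebRTC's manual-audio mode (`RTCAudioSession.useManualAudio = true` set at startup) — the delegate wiring and any call-teardown logic are illustrative, not a complete CallKit integration:

```swift
import AVFoundation
import CallKit
import WebRTC

final class CallProvider: NSObject, CXProviderDelegate {
    func providerDidReset(_ provider: CXProvider) {
        // Tear down any in-flight calls here.
    }

    // CallKit activates the session for us; we never call setActive(true).
    // We just tell RTCAudioSession the system session is live.
    func provider(_ provider: CXProvider, didActivate audioSession: AVAudioSession) {
        let rtcSession = RTCAudioSession.sharedInstance()
        rtcSession.audioSessionDidActivate(audioSession)
        rtcSession.isAudioEnabled = true
    }

    func provider(_ provider: CXProvider, didDeactivate audioSession: AVAudioSession) {
        let rtcSession = RTCAudioSession.sharedInstance()
        rtcSession.isAudioEnabled = false
        rtcSession.audioSessionDidDeactivate(audioSession)
    }
}
```

With this split, CallKit owns activation/deactivation and WebRTC only starts capturing once `didActivate` fires, which is what avoids the race the pitfall describes.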
FAQ
Why .voiceChat and not .videoChat? `.videoChat` raises sample rate to 48 kHz and prefers full-duplex but disables some echo-cancellation tuning. For audio-only AI voice, `.voiceChat` is correct.
Should I use mixWithOthers? Only if you want music apps to keep playing under your voice; for AI voice agents, no.
What about the .duckOthers option? Yes — it ducks Spotify so the user can hear the AI clearly.
How do I detect AirPods? Listen to `AVAudioSession.routeChangeNotification` and inspect `currentRoute.outputs`.
Does it work in the simulator? No — the simulator's audio is unreliable; always test on device.
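The AirPods answer above can be sketched in a few lines. This is a minimal observer, assuming you only need to know whether the active output is a Bluetooth port; the `RouteObserver` class name and the logging are illustrative:

```swift
import AVFoundation

final class RouteObserver {
    init() {
        NotificationCenter.default.addObserver(
            forName: AVAudioSession.routeChangeNotification,
            object: nil, queue: .main) { note in
            guard let raw = note.userInfo?[AVAudioSessionRouteChangeReasonKey] as? UInt,
                  let reason = AVAudioSession.RouteChangeReason(rawValue: raw)
            else { return }

            // Inspect the outputs of the route that is active *after* the change.
            let outputs = AVAudioSession.sharedInstance().currentRoute.outputs
            let onBluetooth = outputs.contains {
                $0.portType == .bluetoothHFP || $0.portType == .bluetoothA2DP
            }
            print("Route change (reason \(reason.rawValue)); Bluetooth output: \(onBluetooth)")
        }
    }
}
```

Checking both `.bluetoothHFP` and `.bluetoothA2DP` matters: AirPods on a `playAndRecord` session usually route over HFP, while playback-only accessories show up as A2DP.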
Sources
- https://medium.com/@tsivilko/mastering-voip-audio-with-callkit-and-webrtc-on-ios-0f2092402331
- https://developer.apple.com/library/archive/documentation/Audio/Conceptual/AudioSessionProgrammingGuide/HandlingAudioInterruptions/HandlingAudioInterruptions.html
- https://developer.apple.com/library/archive/documentation/Audio/Conceptual/AudioSessionProgrammingGuide/AudioSessionBasics/AudioSessionBasics.html
- https://groups.google.com/g/discuss-webrtc/c/UJnJTZL2oPg
- https://github.com/afeng0007/webrtc/blob/master/webrtc/modules/audio_device/ios/objc/RTCAudioSession.mm
Try the iOS WebRTC stack at /demo, see /pricing, or start a /trial.
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.