iOS Audio Session Config for AI Voice (2026): Interruption Handling Done Right
AVAudioSession is the make-or-break for AI voice agents on iOS. Here is the 2026 production playbook for category, mode, options, and interruption recovery.
Most "WebRTC works in dev but not in production" tickets on iOS are AVAudioSession misconfiguration. The category, mode, options, and interruption-handling chain has to match Apple's intent for VoIP — not the defaults.
Background
`AVAudioSession` is iOS's process-wide arbiter for audio. It decides which app gets the mic, whether playback ducks Spotify, whether AirPods route correctly, and what happens when an actual phone call comes in. WebRTC ships a wrapper class `RTCAudioSession` that owns the configuration on its behalf. For AI voice agents in 2026, the canonical setup is `playAndRecord` category + `voiceChat` mode + `allowBluetooth` and `duckOthers` options. Diverge from that at your peril.
The hardest part is interruption handling. A real phone call is a higher-priority audio session than your VoIP app. iOS will pause your session, fire `AVAudioSessionInterruptionNotification`, and expect you to gracefully resume — or hang up — when it ends.
Architecture
```mermaid
flowchart LR
    App[App] --> RTCAudio[RTCAudioSession]
    RTCAudio --> AVAudio[AVAudioSession]
    AVAudio --> CoreAudio[CoreAudio]
    AVAudio -. interruption .-> Handler[InterruptionHandler]
    Handler --> WebRTC[WebRTC PeerConnection]
    WebRTC --> Gateway[Pion Go gateway 1.23]
```
CallSphere implementation
The iOS clients across CallSphere's six verticals (real estate, healthcare, behavioral health, legal, salon, insurance) share a single AVAudioSession config module:
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
- Real Estate (OneRoof) — Field-rep iPhones often have a real call interrupt the AI handoff; the interruption handler re-negotiates DTLS-SRTP with the Pion Go gateway 1.23 and resumes through NATS to the 6-container pod (CRM, MLS, calendar, SMS, audit, transcript). See /industries/real-estate.
- Healthcare — HIPAA mandates that interruption recovery cannot leak audio to the speaker route by accident; we explicitly clear the buffer on interruption end. See /industries/healthcare.
- /demo browser path — Browser audio session is handled by the browser itself; less control, fewer edge cases. See /demo.
37 agents · 90+ tools · 115+ DB tables · 6 verticals · HIPAA + SOC 2 · $149/$499/$1499 · 14-day /trial · 22% affiliate at /affiliate.
Build steps with code
```swift
import WebRTC
import AVFoundation

func configureAudioSession() {
    let session = RTCAudioSession.sharedInstance()
    session.lockForConfiguration()
    do {
        // playAndRecord + voiceChat is the canonical VoIP configuration.
        try session.setCategory(
            AVAudioSession.Category.playAndRecord.rawValue,
            with: [.allowBluetooth, .allowBluetoothA2DP, .duckOthers]
        )
        try session.setMode(AVAudioSession.Mode.voiceChat.rawValue)
    } catch {
        print("Audio session config failed: \(error)")
    }
    session.unlockForConfiguration()
}

class InterruptionHandler: NSObject {
    override init() {
        super.init()
        NotificationCenter.default.addObserver(
            self,
            selector: #selector(handleInterruption(_:)),
            name: AVAudioSession.interruptionNotification,
            object: nil)
    }

    @objc func handleInterruption(_ notification: Notification) {
        guard let info = notification.userInfo,
              let typeValue = info[AVAudioSessionInterruptionTypeKey] as? UInt,
              let type = AVAudioSession.InterruptionType(rawValue: typeValue)
        else { return }

        switch type {
        case .began:
            // iOS has already paused our session; stop pushing audio.
            WebRTCManager.shared.pause()
        case .ended:
            let opts = AVAudioSession.InterruptionOptions(
                rawValue: info[AVAudioSessionInterruptionOptionKey] as? UInt ?? 0)
            // Only resume when the system says we should (e.g. the phone call ended).
            if opts.contains(.shouldResume) {
                WebRTCManager.shared.resume()
            }
        @unknown default:
            break
        }
    }
}
```
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
Pitfalls
- Calling `AVAudioSession.setActive(true)` directly when CallKit is in play — let CallKit own activation; configure category/mode up front and hand the activated session to WebRTC in `provider(_:didActivate:)`.
- Forgetting allowBluetoothA2DP — Speakers paired over A2DP-only (cars, some BT speakers) will route to the iPhone speaker instead.
- Using `.default` mode — That mode applies AGC tuned for music, not voice. Always set `.voiceChat` for VoIP.
- Ignoring shouldResume option — Some interruptions (a Siri invocation) explicitly say "do not resume"; honor it.
- Resuming without re-checking ICE — A long interruption can let UDP NAT bindings expire; you may need ICE restart.
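The CallKit pitfall above deserves a concrete shape. A minimal sketch of a `CXProviderDelegate` that defers activation to CallKit, assuming WebRTC's manual-audio mode (`RTCAudioSession.useManualAudio = true` set at startup) — the delegate wiring and any call-teardown logic are illustrative, not a complete CallKit integration:

```swift
import AVFoundation
import CallKit
import WebRTC

final class CallProvider: NSObject, CXProviderDelegate {
    func providerDidReset(_ provider: CXProvider) {
        // Tear down any in-flight calls here.
    }

    // CallKit activates the session for us; we never call setActive(true).
    // We just tell RTCAudioSession the system session is live.
    func provider(_ provider: CXProvider, didActivate audioSession: AVAudioSession) {
        let rtcSession = RTCAudioSession.sharedInstance()
        rtcSession.audioSessionDidActivate(audioSession)
        rtcSession.isAudioEnabled = true
    }

    func provider(_ provider: CXProvider, didDeactivate audioSession: AVAudioSession) {
        let rtcSession = RTCAudioSession.sharedInstance()
        rtcSession.isAudioEnabled = false
        rtcSession.audioSessionDidDeactivate(audioSession)
    }
}
```

With this split, CallKit owns activation/deactivation and WebRTC only starts capturing once `didActivate` fires, which is what avoids the race the pitfall describes.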
FAQ
Why .voiceChat and not .videoChat? `.videoChat` raises sample rate to 48 kHz and prefers full-duplex but disables some echo-cancellation tuning. For audio-only AI voice, `.voiceChat` is correct.
Should I use mixWithOthers? Only if you want music apps to keep playing under your voice; for AI voice agents, no.
What about the .duckOthers option? Yes — it ducks Spotify so the user can hear the AI clearly.
How do I detect AirPods? Listen to `AVAudioSession.routeChangeNotification` and inspect `currentRoute.outputs`.
Does it work in the simulator? No — the simulator's audio is unreliable; always test on device.
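The AirPods answer above can be sketched in a few lines. This is a minimal observer, assuming you only need to know whether the active output is a Bluetooth port; the `RouteObserver` class name and the logging are illustrative:

```swift
import AVFoundation

final class RouteObserver {
    init() {
        NotificationCenter.default.addObserver(
            forName: AVAudioSession.routeChangeNotification,
            object: nil, queue: .main) { note in
            guard let raw = note.userInfo?[AVAudioSessionRouteChangeReasonKey] as? UInt,
                  let reason = AVAudioSession.RouteChangeReason(rawValue: raw)
            else { return }

            // Inspect the outputs of the route that is active *after* the change.
            let outputs = AVAudioSession.sharedInstance().currentRoute.outputs
            let onBluetooth = outputs.contains {
                $0.portType == .bluetoothHFP || $0.portType == .bluetoothA2DP
            }
            print("Route change (reason \(reason.rawValue)); Bluetooth output: \(onBluetooth)")
        }
    }
}
```

Checking both `.bluetoothHFP` and `.bluetoothA2DP` matters: AirPods on a `playAndRecord` session usually route over HFP, while playback-only accessories show up as A2DP.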
Sources
- https://medium.com/@tsivilko/mastering-voip-audio-with-callkit-and-webrtc-on-ios-0f2092402331
- https://developer.apple.com/library/archive/documentation/Audio/Conceptual/AudioSessionProgrammingGuide/HandlingAudioInterruptions/HandlingAudioInterruptions.html
- https://developer.apple.com/library/archive/documentation/Audio/Conceptual/AudioSessionProgrammingGuide/AudioSessionBasics/AudioSessionBasics.html
- https://groups.google.com/g/discuss-webrtc/c/UJnJTZL2oPg
- https://github.com/afeng0007/webrtc/blob/master/webrtc/modules/audio_device/ios/objc/RTCAudioSession.mm
Try the iOS WebRTC stack at /demo, see /pricing, or start a /trial.
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.