WebRTC SDP Munging: When AI Voice Agents Actually Need It (2026)
SDP munging is a footgun. But for AI voice agents it is sometimes the only way to control codecs, strip private candidates, or pin Opus DTX. Here is the 2026 playbook.
SDP munging — editing the SDP string between createOffer and setLocalDescription — is officially discouraged. In practice every serious AI voice deployment does it for at least one specific reason. Knowing which reasons are legitimate is the difference between a stable agent and a broken one.
Why SDP munging still exists
The WebRTC Working Group's official position is that anything you can do by editing SDP you should do via `RTCRtpSender.setParameters()`, transceiver direction, or codec preferences. That has been true since 2022 — except for a small list of behaviors that the API still does not expose:
- Forcing Opus options like `stereo=0`, `useinbandfec=1`, `usedtx=1`, or `maxaveragebitrate=24000`.
- Stripping non-relay ICE candidates from a server-side offer to keep VPC IPs out of the answer.
- Removing `a=extmap` lines for unsupported header extensions when bridging legacy SIP.
- Preferring a non-default codec order on Firefox, where `setCodecPreferences` is partial.
- Pinning a specific payload-type number when bridging to an SBC that expects PT 0/8 for PCMU/PCMA.
For AI voice in 2026 these tweaks really matter. Capping the Opus bitrate and enabling DTX cuts bandwidth by roughly 30% with no measurable quality cost for speech (in-band FEC is there for loss resilience, not for savings); stripping VPC candidates shaves 4–6 seconds off failed connection times because the browser stops trying to reach 10.x.x.x addresses it can never see.
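Before reaching for a munge, the API-first route mentioned above is worth a quick check. The following is a minimal sketch of building a codec list for `setCodecPreferences`; the helper name `preferCodec` and the locally declared `CodecCapability` shape are illustrative assumptions (the shape mirrors the codec entries returned by `RTCRtpReceiver.getCapabilities`, declared here so the sketch also runs outside a browser).

```ts
// Structural stand-in for the browser's codec capability entries
// (assumption: declared locally so the sketch runs outside a browser).
interface CodecCapability {
  mimeType: string;
  clockRate: number;
  channels?: number;
  sdpFmtpLine?: string;
}

// Hypothetical helper: move codecs with the given MIME type to the front,
// keeping the relative order of everything else.
function preferCodec(codecs: CodecCapability[], mimeType: string): CodecCapability[] {
  const wanted = mimeType.toLowerCase();
  const preferred = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const rest = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return preferred.concat(rest);
}

// Browser usage (not executed here):
//   const caps = RTCRtpReceiver.getCapabilities("audio")?.codecs ?? [];
//   transceiver.setCodecPreferences(preferCodec(caps, "audio/opus"));
```

If this covers the requirement, no munge is needed for codec order; the fmtp-level Opus options in the list above are the part the API does not reach.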
Architecture pattern
```mermaid
flowchart LR
  Offer[createOffer] --> Edit[Munger - regex on SDP]
  Edit --> SetLocal[setLocalDescription]
  SetLocal --> Send[send to peer]
```
The munger sits between offer creation and local description set. Always edit before `setLocalDescription`; editing after is a guaranteed bug because DTLS fingerprints and BUNDLE groups are already locked.
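One way to keep that ordering explicit is to treat each munge as a pure string transform and run them as a pipeline before the description is applied. This is a sketch under assumptions: `Munge`, `SessionDescription`, and `applyMunges` are hypothetical names, and `SessionDescription` only mirrors the `type` and `sdp` fields of the browser's session description object.

```ts
// A munge is a pure string-to-string transform.
type Munge = (sdp: string) => string;

// Minimal shape of an offer (assumption: mirrors the `type` and `sdp`
// fields of the browser's session description object).
interface SessionDescription {
  type: string;
  sdp?: string;
}

// Hypothetical helper: run each munge in order and return a new description,
// leaving the input untouched so the original SDP stays available for logging.
function applyMunges(desc: SessionDescription, munges: Munge[]): SessionDescription {
  const sdp = munges.reduce((acc, munge) => munge(acc), desc.sdp ?? "");
  return { type: desc.type, sdp };
}

// Browser usage (not executed here):
//   const offer = await pc.createOffer();
//   await pc.setLocalDescription(applyMunges(offer, [/* your munges */]));
//   // then send pc.localDescription over your signaling channel
```

Returning a new object instead of mutating the offer makes it harder to edit a description that has already been applied.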
CallSphere implementation
CallSphere uses two specific munges:
- Opus tuning — Every browser-side peer sets `a=fmtp:111 minptime=10;useinbandfec=1;usedtx=1;stereo=0;maxaveragebitrate=24000`. The bitrate and DTX settings cut bandwidth by ~30% on cellular without measurable MOS impact for English speech, which is the regime our 37 agents serve.
- Candidate stripping — Our Pion Go gateway 1.23 inside the VPC strips host and srflx candidates from server-side offers, keeping only relay candidates. This avoids the browser spending 6 s trying to connect to 10.x.x.x addresses it can never reach. The 6-container pod (CRM, MLS, calendar, SMS, audit, transcript) never sees the SDP.
We run the same munges in Real Estate (OneRoof, /industries/real-estate), healthcare, behavioral health, legal, salon, and insurance. Across 90+ tools and 115+ DB tables, the munge code is one shared library audited under SOC 2 + HIPAA controls. Pricing $149/$499/$1499 with the 14-day trial; affiliates 22% — see /affiliate.
Code snippet
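```ts
// Rewrite the Opus fmtp line. The payload type is dynamic (often 111, but not
// always), so look it up from the rtpmap line instead of hardcoding it.
function tuneOpus(sdp: string): string {
  const pt = sdp.match(/^a=rtpmap:(\d+) opus\/48000/im)?.[1];
  if (!pt) return sdp; // no Opus in this SDP; leave it untouched
  const params = [
    "minptime=10",
    "useinbandfec=1",
    "usedtx=1",
    "stereo=0",
    "maxaveragebitrate=24000",
  ].join(";");
  return sdp.replace(new RegExp(`^a=fmtp:${pt} .*`, "m"), `a=fmtp:${pt} ${params}`);
}

// Keep only relay candidates; every non-candidate line passes through unchanged.
function stripNonRelay(sdp: string): string {
  return sdp
    .split("\n")
    .filter((l) => !l.startsWith("a=candidate:") || l.includes("typ relay"))
    .join("\n");
}

// Browser side: munge between createOffer and setLocalDescription.
const offer = await pc.createOffer();
offer.sdp = tuneOpus(offer.sdp!);
await pc.setLocalDescription(offer);
```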
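The snippet above replaces the whole fmtp line. A variant that merges overrides into whatever the browser already put there can be sketched as follows; `findOpusPayloadType` and `mergeOpusFmtp` are hypothetical helper names, and the sketch assumes Opus is advertised as `opus/48000` in its rtpmap line.

```ts
// Find the Opus payload type from its rtpmap line; it is dynamic, not always 111.
function findOpusPayloadType(sdp: string): string | null {
  const m = /^a=rtpmap:(\d+) opus\/48000/im.exec(sdp);
  return m ? m[1] : null;
}

// Hypothetical helper: merge `overrides` into the existing Opus fmtp line,
// keeping any parameter the browser set that we do not override.
// If the SDP has no Opus rtpmap or no matching fmtp line, it is returned unchanged.
function mergeOpusFmtp(sdp: string, overrides: Record<string, string>): string {
  const pt = findOpusPayloadType(sdp);
  if (pt === null) return sdp;
  const fmtpLine = new RegExp("^a=fmtp:" + pt + " (.*)$", "m");
  return sdp.replace(fmtpLine, (_line: string, existing: string) => {
    // Map of parameter name -> full "name=value" text, in first-seen order.
    const params = new Map<string, string>();
    existing.split(";").forEach((raw) => {
      const pair = raw.trim();
      if (pair !== "") params.set(pair.split("=")[0], pair);
    });
    Object.keys(overrides).forEach((k) => params.set(k, k + "=" + overrides[k]));
    const parts: string[] = [];
    params.forEach((pair) => parts.push(pair));
    return "a=fmtp:" + pt + " " + parts.join(";");
  });
}
```

Merging is a little more forgiving across browser upgrades, since a parameter added by a future release is carried through instead of being silently dropped.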
Build steps
- Audit your need: can you achieve this with `setCodecPreferences`, `transceiver.direction`, or `sender.setParameters`? If yes, do that.
- If munging is the only path, do it before `setLocalDescription` — never after.
- Never edit the `a=fingerprint` or `a=group:BUNDLE` lines. A changed fingerprint fails the DTLS handshake, and a changed BUNDLE group breaks transport multiplexing.
- Test the munge on Chromium, Firefox, and Safari; Safari rejects more aggressive edits.
- Snapshot before/after SDP into your audit log; SOC 2 reviewers expect to see what you changed.
- Re-run the munge on every renegotiation; the offer is regenerated each time.
- Add a unit test that diffs SDP before/after — regressions slip in via library upgrades.
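The snapshot and unit-test steps above can share one small helper. The following is a minimal sketch, assuming a line-level, order-insensitive comparison is enough for your audit log; `SdpDiff` and `diffSdp` are hypothetical names. Because it compares by line text, it will not notice one copy of a duplicated line (for example `a=rtcp-mux` in two m= sections) being removed.

```ts
interface SdpDiff {
  removed: string[]; // lines present before the munge but not after
  added: string[];   // lines present after the munge but not before
}

// Hypothetical helper: line-level diff of two SDP blobs. It ignores ordering,
// so it reports what changed, not where.
function diffSdp(before: string, after: string): SdpDiff {
  const lines = (sdp: string) => sdp.split(/\r?\n/).filter((l) => l.length > 0);
  const b = lines(before);
  const a = lines(after);
  return {
    removed: b.filter((l) => a.indexOf(l) === -1),
    added: a.filter((l) => b.indexOf(l) === -1),
  };
}
```

In a unit test, compare the result against a checked-in expected diff; in production, write `JSON.stringify(diffSdp(before, after))` to the audit log next to the library version.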
Common pitfalls
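- Editing after `setLocalDescription`: irrecoverable without a fresh negotiation. The browser has already applied the unedited description, so later changes to the string reach only the remote peer and the two sides disagree.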
- Reordering m= sections — breaks BUNDLE. Never touch order; only edit fmtp/candidates within sections.
- Using a substring match for codec PT — Opus PT can be 111, 109, or anything else dynamic. Match by codec name first, then PT.
- Munging the answer instead of the offer — also legal but harder to reason about; pick one side.
- Ignoring `a=rtcp-fb` — removing feedback messages can disable NACK and FEC silently.
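Several of these pitfalls can be caught mechanically by comparing the structural lines of the SDP before and after a munge. This is a sketch, not a complete validator; `structuralSkeleton` and `assertStructureUnchanged` are hypothetical names, and the set of protected prefixes is an assumption drawn from the pitfalls listed above.

```ts
// Lines a munge should leave alone: mids, the BUNDLE group, DTLS fingerprints.
const PROTECTED_PREFIXES = ["a=mid:", "a=group:BUNDLE", "a=fingerprint:"];

// Reduce an SDP to its structural lines. For m= lines only the media kind is
// kept ("m=audio"), because payload-type lists may legitimately change.
function structuralSkeleton(sdp: string): string[] {
  const out: string[] = [];
  sdp.split(/\r?\n/).forEach((line) => {
    if (line.indexOf("m=") === 0) {
      out.push(line.split(" ")[0]);
    } else if (PROTECTED_PREFIXES.some((p) => line.indexOf(p) === 0)) {
      out.push(line);
    }
  });
  return out;
}

// Hypothetical guard: throws if a munge changed, reordered, added, or removed
// any structural line.
function assertStructureUnchanged(before: string, after: string): void {
  const b = structuralSkeleton(before).join("\n");
  const a = structuralSkeleton(after).join("\n");
  if (a !== b) throw new Error("munge altered m= order, mid, BUNDLE, or fingerprint lines");
}
```

Calling the guard right after the munge and before `setLocalDescription` turns a silent negotiation failure into an immediate, loggable error.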
FAQ
Is munging unsafe? Done before setLocalDescription, no. Done after, you will break DTLS and ICE state.
Will the spec ever expose Opus options? Probably not — the WG's stance is that `setParameters` is the right place. Munging will continue to be needed for a while.
Does it break with E2EE? No — munging touches the signaling layer, SFrame touches the media layer; they are independent.
What about `a=mid`? Never edit. Editing mids breaks BUNDLE.
Should I write a parser? For complex munges, yes — sdp-transform is the canonical npm library.
Does `setCodecPreferences` work in Pion? Yes, partially — Pion exposes APIs for many of these without munging. Use them when you can.
Will my munge survive Chromium's next refactor? Maybe. Test on Canary continuously; Chromium has reorganized SDP attribute ordering twice in the last two years.
Does the SFU need to know I munged? Generally no — Opus tuning is browser-side; SDP is only between the two peers.
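On the parser question above: the main thing a parser buys you over regex is scoping, so an edit touches only the m= section it is meant for. The following is a deliberately tiny sketch of that parse, edit, serialize shape, written without any library; `ParsedSdp`, `parseSections`, `writeSections`, and `stripNonRelayAudio` are hypothetical names, and a real parser such as sdp-transform covers far more of the SDP grammar than this does.

```ts
interface ParsedSdp {
  session: string[]; // lines before the first m= line
  media: string[][]; // one array of lines per m= section, in original order
}

function parseSections(sdp: string): ParsedSdp {
  const parsed: ParsedSdp = { session: [], media: [] };
  sdp.split(/\r?\n/).filter((l) => l.length > 0).forEach((line) => {
    if (line.indexOf("m=") === 0) parsed.media.push([line]);
    else if (parsed.media.length === 0) parsed.session.push(line);
    else parsed.media[parsed.media.length - 1].push(line);
  });
  return parsed;
}

// Serialize with CRLF line endings, which is what SDP uses on the wire.
function writeSections(parsed: ParsedSdp): string {
  const lines = parsed.media.reduce(
    (acc, section) => acc.concat(section),
    parsed.session.slice()
  );
  return lines.join("\r\n") + "\r\n";
}

// Example edit scoped to audio sections only: drop non-relay candidates.
function stripNonRelayAudio(sdp: string): string {
  const parsed = parseSections(sdp);
  parsed.media = parsed.media.map((section) =>
    section[0].indexOf("m=audio") !== 0
      ? section
      : section.filter(
          (l) => l.indexOf("a=candidate:") !== 0 || l.indexOf(" typ relay") !== -1
        )
  );
  return writeSections(parsed);
}
```

Section order is preserved by construction here, which is the property the BUNDLE and `a=mid` answers above depend on.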
Production playbook for AI voice teams in 2026
Three rules from shipping munges in production:
- Centralize and version. One munging library, semver-tagged, audit-logged. Random regex sprinkled through the app makes regressions invisible.
- Always pair with metrics. A munge for "lower bandwidth" should land with a getStats dashboard for bandwidth before/after. Otherwise you have no idea if it worked.
- Keep a "kill switch". A feature flag that disables munging entirely. When OpenAI or LiveKit changes something upstream, you flip the flag while you fix the regex.
OpenAI's split-relay refactor in 2025 was a great test of this discipline: teams who had centralized munges flipped the kill switch in five minutes. Teams who had not lost a weekend.
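The three rules fit in one small wrapper. This is a sketch under assumptions, not a description of any particular production library: `MungerOptions` and `createMunger` are hypothetical names, `isEnabled` stands in for whatever feature-flag client you use, and falling back to the original SDP when a munge throws is a design choice you may not want.

```ts
type Munge = (sdp: string) => string;

interface MungerOptions {
  version: string;          // semver tag of the munging library, for audit logs
  isEnabled: () => boolean; // feature-flag lookup; this is the kill switch
  munges: Munge[];
  log?: (entry: { version: string; before: string; after: string }) => void;
}

// Hypothetical factory: one centralized entry point for every munge.
function createMunger(opts: MungerOptions): Munge {
  return (sdp) => {
    if (!opts.isEnabled()) return sdp; // kill switch: pass the SDP through untouched
    let out = sdp;
    try {
      out = opts.munges.reduce((acc, m) => m(acc), sdp);
    } catch (_err) {
      return sdp; // a throwing munge falls back to the original SDP
    }
    if (opts.log) opts.log({ version: opts.version, before: sdp, after: out });
    return out;
  };
}
```

Because the flag is read on every call, flipping it takes effect on the next negotiation without a redeploy.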
Watch list 2026
- `setCodecPreferences` parity across browsers improves slowly. Track it; many munges become unnecessary as parity arrives.
- Opus 1.6 is in pre-release and may add more controls to fmtp, removing some current munges.
- WebRTC Statistics API adds finer-grained codec selection signals so you can detect when your munge actually shipped.
- AV1 audio extensions (still experimental) might require new SDP munges within the next 18 months.
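Until finer-grained signals arrive, one way to check whether a munge actually took effect is to read the `sdpFmtpLine` field that codec entries in `getStats()` already carry. A minimal sketch, assuming that field is populated by the browser you target; `CodecStatsLike` and `opusParamsNegotiated` are hypothetical names, and the interface lists only the fields this check reads.

```ts
// Subset of the fields on codec entries (type === "codec") in a stats report.
interface CodecStatsLike {
  type: string;
  mimeType?: string;
  sdpFmtpLine?: string;
}

// Hypothetical check: true if some negotiated Opus codec entry carries every
// expected fmtp parameter, for example "usedtx=1".
function opusParamsNegotiated(stats: CodecStatsLike[], expected: string[]): boolean {
  return stats.some((s) => {
    if (s.type !== "codec" || (s.mimeType || "").toLowerCase() !== "audio/opus") return false;
    const params = (s.sdpFmtpLine || "").split(";").map((p) => p.trim());
    return expected.every((e) => params.indexOf(e) !== -1);
  });
}

// Browser usage (not executed here):
//   const entries: CodecStatsLike[] = [];
//   (await pc.getStats()).forEach((s) => entries.push(s));
//   const shipped = opusParamsNegotiated(entries, ["usedtx=1", "useinbandfec=1"]);
```

Reporting the result as a metric alongside the munging library version makes it visible when a browser upgrade starts ignoring the munge.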
A final caution: SDP munging is a trailing indicator that the WebRTC API surface is incomplete for your use case. Every munge you keep is technical debt against a future browser change. We tag each one with an issue link to the relevant W3C feature request so we know when to delete it. The Opus DTX/FEC munge has been on our list for three years; the candidate-stripping munge will probably outlive most of our infrastructure.
Sources
- https://getstream.io/resources/projects/webrtc/advanced/sdp-munging/
- https://hamming.ai/resources/debug-webrtc-voice-agents-troubleshooting-guide
- https://datatracker.ietf.org/doc/html/rfc7587
- https://openai.com/index/delivering-low-latency-voice-ai-at-scale/
- https://reference-server.pipecat.ai/en/latest/api/pipecat.transports.smallwebrtc.request_handler.html
- https://medium.com/@viplav.fauzdar/%EF%B8%8F-building-a-gpt-realtime-voice-assistant-with-webrtc-fe6dd4c8f488
See the optimized voice path on /demo, pricing in /pricing, or start a /trial.