Opus 24kHz vs PCM16: Bandwidth and Cost Tradeoff in 2026

PCM16 at 24kHz burns 384 kbps. Opus at 32 kbps delivers indistinguishable quality. The bandwidth math says use Opus. The vendor API math sometimes says use PCM16. Here is when each wins.

The cost problem

flowchart LR
  Twilio["Twilio Media Streams"] -- "WS · μlaw 8kHz" --> Bridge["FastAPI Bridge :8084"]
  Bridge -- "PCM16 24kHz" --> OAI["OpenAI Realtime"]
  OAI --> Bridge
  Bridge --> Twilio
  Bridge --> Logs[(structured logs · OTel)]

CallSphere reference architecture

In 2026, voice agent infrastructure has two dominant audio formats: PCM16 (uncompressed linear PCM, 16-bit) and Opus (the modern WebRTC codec). Vendors handle them inconsistently — OpenAI Realtime accepts both but bills the same per-token rate; ElevenLabs prefers Opus for streaming TTS; Twilio Media Streams sends mu-law 8kHz by default.

The bandwidth gap is enormous (12× between PCM16 at 384 kbps and Opus at 32 kbps), but the cost picture is more complicated than bytes on the wire. Egress, codec CPU, latency, and quality all interact.

How each one prices it

PCM16 at 24kHz (typical for OpenAI Realtime):

Bitrate: 24,000 samples/sec × 16 bits = 384 kbps (48 kB/s)
5-minute call: 14.4 MB of audio one-way
No codec CPU on the wire side

Opus at 24kHz (super-wideband in Opus terminology), 32 kbps:

Bitrate: 32 kbps (4 kB/s)
5-minute call: 1.2 MB of audio one-way
Opus encode/decode CPU cost: minimal (~1% of one core)

mu-law 8kHz (PSTN/Twilio default):

Bitrate: 64 kbps (8 kB/s)
5-minute call: 2.4 MB of audio
Quality cap: phone-quality, no high frequencies
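
The per-format figures above follow directly from sample rate and bit depth. A minimal sketch of that arithmetic, assuming mono audio, a constant bitrate, and decimal units (1 kB = 1,000 bytes); the function and dictionary names are illustrative, not part of any vendor API:

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> float:
    """Bitrate of uncompressed audio in kbps."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def stream_stats(bitrate_kbps: float, call_seconds: int = 300) -> dict:
    """Bytes per second and one-way size of a constant-bitrate stream."""
    bytes_per_sec = bitrate_kbps * 1000 / 8
    return {
        "kB_per_s": bytes_per_sec / 1000,
        "call_MB": bytes_per_sec * call_seconds / 1_000_000,
    }


# Bitrates in kbps. Opus is a configured target, not derived from sample rate.
FORMATS = {
    "pcm16_24k": pcm_bitrate_kbps(24_000, 16),
    "opus_32k": 32.0,
    "mulaw_8k": pcm_bitrate_kbps(8_000, 8),
}

if __name__ == "__main__":
    for name, kbps in FORMATS.items():
        s = stream_stats(kbps)
        print(f"{name}: {kbps:.0f} kbps, {s['kB_per_s']:.0f} kB/s, "
              f"{s['call_MB']:.1f} MB per 5-minute call")
```

These numbers cover audio payload only; packet headers and signaling add overhead that is not modeled here.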

Honest math: bandwidth cost at 10k concurrent calls

Assume 10,000 concurrent voice agents, an average 5-minute call, and a 60/40 caller-agent talk split.

PCM16 24kHz both directions:

10k × 5 min × 60 s × 48 kB/s × 2 directions = 288 GB per wave of 10,000 calls; held at that peak around the clock, it would be roughly 83 TB/day
Realistic bandwidth: 200–600 GB/day depending on duty cycle (about 7,000–21,000 such calls per day)
AWS egress at $0.09/GB: $540–$1,620/month
Cloudflare egress: free for traffic via CF

Opus 24kHz both directions:

Bandwidth: 1/12 of PCM16 = 17–50 GB/day
AWS egress: $45–$135/month

mu-law 8kHz:

Bandwidth: 1/6 of PCM16 = 33–100 GB/day
AWS egress: $90–$270/month
BUT: phone-quality cap, hurts STT accuracy

So at these volumes, Opus saves roughly $500–$1,500/month on egress vs PCM16 ($540–$1,620 down to $45–$135), with no audible quality penalty for speech.
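
The egress figures in this section can be reproduced with a few lines of arithmetic. The sketch below assumes a 30-day month, the $0.09/GB rate quoted above, decimal gigabytes, and that every byte in the 200–600 GB/day range is billed as egress; the helper names are hypothetical.

```python
AWS_EGRESS_USD_PER_GB = 0.09  # rate quoted in the text
DAYS_PER_MONTH = 30           # assumption: 30-day month

# Bitrates in kbps, taken from the per-format figures earlier in the article.
BITRATE_KBPS = {"pcm16_24k": 384, "opus_32k": 32, "mulaw_8k": 64}


def calls_volume_gb(n_calls, codec, call_seconds=300, directions=2):
    """Total audio volume in GB for n_calls calls of equal length."""
    bytes_per_sec = BITRATE_KBPS[codec] * 1000 / 8
    return n_calls * call_seconds * bytes_per_sec * directions / 1e9


def monthly_egress_usd(gb_per_day, rate=AWS_EGRESS_USD_PER_GB, days=DAYS_PER_MONTH):
    """Monthly egress bill for a steady daily volume."""
    return gb_per_day * days * rate


def scaled_cost_range(codec, pcm16_gb_per_day=(200, 600)):
    """Scale the PCM16 daily range by the codec's bitrate ratio, then price it."""
    ratio = BITRATE_KBPS[codec] / BITRATE_KBPS["pcm16_24k"]
    return tuple(round(monthly_egress_usd(gb * ratio), 2) for gb in pcm16_gb_per_day)


if __name__ == "__main__":
    print("GB per 10k five-minute PCM16 calls:", calls_volume_gb(10_000, "pcm16_24k"))
    for codec in BITRATE_KBPS:
        low, high = scaled_cost_range(codec)
        print(f"{codec}: ${low:,.0f} to ${high:,.0f} per month")
```

Swapping in your own daily volume and egress rate is the quickest way to see whether the codec choice is a rounding error or a real line item for your traffic.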

When PCM16 still wins

OpenAI Realtime input/output. OpenAI's API accepts PCM16 24kHz natively; sending Opus requires the API to decode internally (which is fine, but no token-cost savings).
Inside a data center. If audio never leaves the VPC, egress is free; PCM16 simplicity wins on CPU.
GPU-side processing. Some inference paths prefer pre-decoded PCM for direct tensor input; Opus decode adds a CPU hop.
Regulatory-grade recording and archival. PCM16 is lossless; some compliance regimes want originals.

When Opus wins

Across the open internet. WebRTC media flows: always Opus.
Mobile clients on 4G/5G. PCM16 chews battery and data; Opus does not.
High-concurrency egress. Above ~5k concurrent calls leaving the cloud, Opus saves real money.
Lossy networks. Opus has built-in PLC (packet loss concealment) and FEC (forward error correction); PCM16 over UDP is rough on lossy links.

How CallSphere optimizes

CallSphere uses a per-segment codec policy across the 6 verticals — 37 agents, 90+ tools, 115+ DB tables:

Caller ↔ edge transceiver: Opus 24kHz, 32 kbps, WebRTC. This is the open-internet hop where bandwidth matters most.
Edge transceiver ↔ inference plane: PCM16 24kHz over a private VPC link. No egress cost, simplest tensor input.
Healthcare Voice Agent on OpenAI Realtime: PCM16 24kHz end-to-end because OpenAI's API takes PCM16 natively.
Salon GlamBook on ElevenLabs: Opus 24kHz for the TTS stream because ElevenLabs streams Opus efficiently.
Twilio inbound: mu-law 8kHz from carriers, transcoded once to PCM16 24kHz at the bridge — never round-tripped.
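
One way to express a per-segment policy like the one above is a small lookup function. This is a sketch under stated assumptions, not CallSphere's actual implementation: the `Hop` fields, the `Codec` enum, and the precedence order (carrier format first, then vendor-native format, then public-internet versus private link) are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Codec(Enum):
    OPUS_24K = "opus/24000"
    PCM16_24K = "pcm16/24000"
    MULAW_8K = "mulaw/8000"


@dataclass(frozen=True)
class Hop:
    """One network segment in the audio path (hypothetical model)."""
    crosses_public_internet: bool
    pstn_carrier: bool = False
    vendor_native: Optional[Codec] = None  # format the vendor handles best, if any


def pick_codec(hop: Hop) -> Codec:
    # Carrier legs arrive as mu-law 8kHz; there is nothing to choose.
    if hop.pstn_carrier:
        return Codec.MULAW_8K
    # A vendor's native format avoids an extra transcode at the boundary.
    if hop.vendor_native is not None:
        return hop.vendor_native
    # Otherwise: compress on the open internet, stay raw on private links.
    return Codec.OPUS_24K if hop.crosses_public_internet else Codec.PCM16_24K


if __name__ == "__main__":
    hops = {
        "caller <-> edge": Hop(crosses_public_internet=True),
        "edge <-> inference": Hop(crosses_public_internet=False),
        "OpenAI Realtime": Hop(True, vendor_native=Codec.PCM16_24K),
        "ElevenLabs TTS": Hop(True, vendor_native=Codec.OPUS_24K),
        "Twilio inbound": Hop(True, pstn_carrier=True),
    }
    for name, hop in hops.items():
        print(f"{name}: {pick_codec(hop).value}")
```

Keeping the decision in one function makes it easy to audit that each stream is transcoded at most once on its way through the system.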

The egress savings on the public-internet hops fund the lower pricing tiers ($149 / $499 / $1499); bandwidth is a real line item at our scale. The 14-day no-card trial lets you A/B Opus vs PCM16 on the demo cards.

Optimization checklist

Use Opus 24kHz for any audio crossing the public internet.
Use PCM16 24kHz inside your VPC where egress is free.
Transcode once at the boundary — never round-trip.
Pick 32 kbps Opus for voice-only; 48 kbps for high-quality stereo.
Enable Opus FEC on lossy networks (mobile, public WiFi).
Use mu-law 8kHz only for legacy PSTN bridges; never for AI input.
Transcode mu-law → PCM16 24kHz before STT for accuracy.
Monitor per-call bandwidth and CPU — codecs are cheap but not free.
For OpenAI Realtime, native PCM16 saves a decode hop.
For WebRTC edges, native Opus saves egress and battery.
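
The "transcode mu-law to PCM16 24kHz once at the boundary" item can be sketched in pure Python. The decoder follows the standard G.711 mu-law expansion. The 3× upsampler uses plain linear interpolation, which is an illustrative simplification (a production bridge would more likely use a filtered resampler), and the function names are hypothetical. The usage lines assume 20 ms frames, i.e. 160 bytes of mu-law at 8kHz.

```python
import struct

BIAS = 0x84  # G.711 mu-law bias


def mulaw_decode_byte(b: int) -> int:
    """Expand one 8-bit mu-law code to a signed 16-bit linear sample."""
    b = ~b & 0xFF
    magnitude = (((b & 0x0F) << 3) + BIAS) << ((b >> 4) & 0x07)
    return BIAS - magnitude if b & 0x80 else magnitude - BIAS


# Precompute all 256 codes once; decoding is then a table lookup.
MULAW_TABLE = [mulaw_decode_byte(i) for i in range(256)]


def mulaw_to_pcm16_24k(payload: bytes) -> bytes:
    """Decode mu-law 8kHz and upsample 3x to little-endian PCM16 at 24kHz.

    Linear interpolation between neighbouring samples; the last sample is held.
    """
    samples = [MULAW_TABLE[b] for b in payload]
    out = []
    for i, cur in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else cur
        for k in range(3):
            out.append(cur + (nxt - cur) * k // 3)
    return struct.pack(f"<{len(out)}h", *out)


if __name__ == "__main__":
    frame = b"\xff" * 160  # 20 ms of mu-law silence at 8kHz
    pcm = mulaw_to_pcm16_24k(frame)
    print(len(frame), "bytes in,", len(pcm), "bytes out")
```

Interpolating per frame ignores the sample that starts the next frame, so a streaming bridge would want to carry the last sample across frame boundaries rather than hold it.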

FAQ

Is Opus quality really indistinguishable from PCM at 32 kbps? Yes for voice — listening tests show Opus 32 kbps wideband is transparent for speech. Music needs more.

Why does OpenAI Realtime want PCM16 24kHz? Direct tensor input, no decode CPU on the inference path. Simplifies their pipeline.

Is mu-law dead? For AI input, yes. Use it only for PSTN bridges and transcode once.

What about Opus 48kHz or 16kHz? 24kHz is the sweet spot for voice AI. 48kHz is overkill for speech. 16kHz is too narrow for natural-sounding TTS.

Does codec choice affect STT accuracy? Slightly. PCM16 24kHz wins by 0.5–1.5 percentage points WER on hard accents vs Opus 32kbps. The gap closes at 64 kbps Opus.

Sources

Opus Codec official site — https://opus-codec.org/
Opus Recommended Settings — https://wiki.xiph.org/Opus_Recommended_Settings
Telnyx Voice AI HD codecs — https://telnyx.com/resources/voice-ai-hd-codecs
VoIP codec list and bandwidth comparison — https://telnyx.com/resources/voip-codec-list
