By Sagar Shankaran, Founder of CallSphere
SDES + TLS works for SIP trunks; DTLS-SRTP wins for WebRTC. For HIPAA-bound AI voice that crosses both worlds, the SBC is where the translation happens. Here is the 2026 reality.
Key takeaways
"We use SRTP" is half an answer. The real question is how the keys get exchanged: in the SDP body protected by signaling TLS, or on the media path through DTLS handshake. Both can satisfy HIPAA. Picking wrong creates a quiet vulnerability that no auditor will catch but a determined attacker will.
flowchart LR
Phone["PSTN caller"] --> Carrier["Carrier"]
Carrier -- "SIP INVITE" --> SBC["Session Border Controller"]
SBC -- "SIP" --> PBX["Twilio / Asterisk"]
PBX -- "RTP · Opus" --> Bridge["AI Voice Gateway"]
Bridge --> AI["OpenAI Realtime"]
AI --> Bridge
Bridge --> PBX
Secure RTP (RFC 3711) encrypts RTP and RTCP payloads and authenticates the packets. Its baseline transforms are AES in counter mode for encryption and HMAC-SHA1 for authentication; RFC 7714 adds AES-GCM as an authenticated-encryption alternative. The standard does not specify how the master key gets to both endpoints; that is the job of a separate key exchange protocol.
Two main options dominate in 2026. SDES (RFC 4568) carries the SRTP master key inside the SDP body of SIP signaling, base64-encoded in an a=crypto: attribute. It is safe only when the SIP signaling is protected with TLS; otherwise the key travels in cleartext. Because SIP TLS is hop-by-hop, every proxy on the signaling path also sees the key. DTLS-SRTP (RFC 5763 and RFC 5764) runs a DTLS handshake on the media path itself; the SRTP keys are derived from the DTLS master secret. The SDP carries only DTLS fingerprints, not keys.
SDES is dominant on traditional SIP trunks. DTLS-SRTP is mandatory in WebRTC. For HIPAA-bound AI voice products, both are acceptable encryption mechanisms; the trick is making sure the boundary between SIP and WebRTC translates correctly.
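To make the distinction concrete, here is a minimal Python sketch that labels an SDP body by the keying attributes it carries: a=crypto for SDES, a=fingerprint for DTLS-SRTP. The function name classify_keying is hypothetical and the check is deliberately shallow; it looks only at attribute prefixes and does not validate the rest of the SDP.

```python
def classify_keying(sdp: str) -> str:
    """Label an SDP body as 'sdes', 'dtls-srtp', 'both', or 'none'.

    Hypothetical helper for illustration: it only looks for the
    a=crypto (SDES) and a=fingerprint (DTLS-SRTP) attribute prefixes.
    """
    has_crypto = False
    has_fingerprint = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=crypto:"):
            has_crypto = True
        elif line.startswith("a=fingerprint:"):
            has_fingerprint = True
    if has_crypto and has_fingerprint:
        return "both"
    if has_crypto:
        return "sdes"
    if has_fingerprint:
        return "dtls-srtp"
    return "none"
```

A gateway that sits between a SIP trunk and a WebRTC client would typically see "sdes" on one leg and "dtls-srtp" on the other, which is the translation boundary described above.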
The SDES exchange in an INVITE looks like:
m=audio 49170 RTP/SAVP 0 8
a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:DhxqDt+Pw/3qD6FnJ4kEwImGRJ5lTZUjxVrn5Q==
a=crypto:2 AES_256_CM_HMAC_SHA1_80 inline:S2RXxN+wKxIhLcF7HxJzC5pH4QrJzG7lA9XbF...
If the carrying INVITE is over UDP/5060 cleartext, this key is visible in any packet capture. SDES is only safe over SIP/TLS or SIPS.
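To show how directly the key sits in that attribute, the following Python sketch decodes an a=crypto line into its master key and master salt. The names parse_crypto_line, SdesKey and SUITE_LENGTHS are hypothetical. The key and salt sizes follow RFC 4568 and RFC 6188 for the AES counter-mode suites, and only the first key parameter is read. Treat it as an illustration, not a full RFC 4568 parser.

```python
import base64
from dataclasses import dataclass

# (master key bytes, master salt bytes) for the AES counter-mode suites.
SUITE_LENGTHS = {
    "AES_CM_128_HMAC_SHA1_80": (16, 14),
    "AES_CM_128_HMAC_SHA1_32": (16, 14),
    "AES_256_CM_HMAC_SHA1_80": (32, 14),
    "AES_256_CM_HMAC_SHA1_32": (32, 14),
}


@dataclass
class SdesKey:
    tag: int
    suite: str
    master_key: bytes
    master_salt: bytes


def parse_crypto_line(line: str) -> SdesKey:
    """Split one a=crypto line into tag, suite, master key and master salt."""
    prefix = "a=crypto:"
    if not line.startswith(prefix):
        raise ValueError("not an a=crypto line")
    fields = line[len(prefix):].split()
    if len(fields) < 3:
        raise ValueError("expected tag, suite and key parameters")
    tag, suite = fields[0], fields[1]
    key_param = fields[2].split(";")[0]  # only the first key parameter
    if not key_param.startswith("inline:"):
        raise ValueError("only the inline key method is handled here")
    if suite not in SUITE_LENGTHS:
        raise ValueError(f"unsupported suite: {suite}")
    # Optional lifetime and MKI fields follow a '|' separator; drop them.
    b64 = key_param[len("inline:"):].split("|", 1)[0]
    material = base64.b64decode(b64, validate=True)
    key_len, salt_len = SUITE_LENGTHS[suite]
    if len(material) != key_len + salt_len:
        raise ValueError("key material has the wrong length for this suite")
    return SdesKey(int(tag), suite, material[:key_len], material[key_len:])
```

Anyone who can read the SDP can run the same few lines, which is why the transport carrying the INVITE matters so much for SDES.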
DTLS-SRTP is different:
m=audio 49170 UDP/TLS/RTP/SAVPF 111
a=fingerprint:sha-256 D2:9A:5A:7E:...:6F
a=setup:actpass
a=ice-ufrag:F7gI
a=ice-pwd:x9cml/YzichV2+XlhiMu8g
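Only the certificate fingerprint travels in SDP. The DTLS handshake itself runs on the media port; SRTP keys are then derived from the DTLS master secret with the TLS key exporter (RFC 5705), using the label EXTRACTOR-dtls_srtp defined in RFC 5764. Even if signaling is fully cleartext, a passive eavesdropper learns nothing that decrypts the media. That guarantee assumes the fingerprint in the SDP is authentic: an attacker who can rewrite it in transit can still insert themselves into the handshake, so SIP over TLS still helps even though the keys no longer depend on it.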
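The fingerprint check is simple enough to sketch. Assuming the endpoint already holds the peer's DER-encoded certificate from the DTLS handshake, the Python below computes the SHA-256 fingerprint in the colon-separated hex form used by a=fingerprint and compares it with the value from SDP. The function names are hypothetical, and only sha-256 is handled.

```python
import hashlib


def sdp_fingerprint(cert_der: bytes) -> str:
    """Format the SHA-256 digest of a DER certificate as colon-separated hex."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_matches(sdp_line: str, cert_der: bytes) -> bool:
    """Check an a=fingerprint line against the certificate seen in DTLS."""
    prefix = "a=fingerprint:"
    if not sdp_line.startswith(prefix):
        return False
    parts = sdp_line[len(prefix):].split()
    # Expect exactly "<hash-function> <hex-fingerprint>".
    if len(parts) != 2 or parts[0].lower() != "sha-256":
        return False
    return parts[1].upper() == sdp_fingerprint(cert_der)
```

If the comparison fails, the endpoint should abandon the media session, because the certificate presented on the media path is not the one the signaling promised.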
For HIPAA the key control is "encryption in transit for ePHI". Both approaches satisfy the rule. The auditor questions to expect:
CallSphere is HIPAA-aligned and SOC 2-aligned. Every leg uses Twilio Programmable Voice with SIP/TLS and SRTP enforced; SDES with TLS protects the keys. For Healthcare AI on FastAPI :8084 the WebSocket bridge to OpenAI Realtime runs WSS (TLS 1.3), so the RTP-equivalent media frames inside the WebSocket are TLS-protected for that entire hop. Sales Calling AI runs 5 concurrent outbound calls per tenant, all SRTP. After-Hours AI uses Twilio simultaneous call + SMS with a 120-second timeout where every leg is encrypted. Across 37 agents, 90+ tools, 115+ DB tables, $149/$499/$1499 pricing, 14-day trial, and 22% affiliate, the encryption posture is uniform per vertical with quarterly cipher-suite review and audit-log retention.
a=crypto line on TLS-protected SIP.
Is SDES less secure than DTLS-SRTP for HIPAA? Functionally similar when SIP/TLS is enforced. DTLS-SRTP is more robust because it does not depend on signaling-layer encryption.
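Does HIPAA require a specific key exchange? No. The HIPAA Security Rule's transmission security standard, 45 CFR 164.312(e)(1), has an addressable encryption specification at 164.312(e)(2)(ii) that calls for encrypting ePHI "whenever deemed appropriate" without naming a mechanism. SDES + TLS is a reasonable choice.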
What about ZRTP or MIKEY? ZRTP is interesting (peer-to-peer, no PKI) but rare in 2026. MIKEY is mostly IMS/3GPP. SDES and DTLS-SRTP cover the vast majority of real deployments.
Will downgrade attacks affect us? Configure your SBC to refuse non-TLS signaling and non-SRTP media; that closes the obvious downgrade vectors.
Are there forward-secrecy concerns with SDES? Yes. SDES keys appear verbatim in the SDP, so any stored or logged copy of the signaling exposes the media keys for those past sessions. DTLS-SRTP with an ephemeral (EC)DHE key exchange provides forward secrecy.
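The downgrade guidance in the FAQ above (refuse non-TLS signaling and non-SRTP media) can be sketched as an admission check. This is a hypothetical Python illustration, not how any particular SBC is configured; real products express the same policy in their own configuration language. The names admit_call, SECURE_SIGNALING and SECURE_MEDIA_PROTOS are assumptions made for the example.

```python
# Signaling transports treated as encrypted in this sketch.
SECURE_SIGNALING = {"tls", "wss"}

# SDP media profiles that imply SRTP (SDES-keyed or DTLS-keyed).
SECURE_MEDIA_PROTOS = {
    "RTP/SAVP",
    "RTP/SAVPF",
    "UDP/TLS/RTP/SAVP",
    "UDP/TLS/RTP/SAVPF",
}


def admit_call(signaling_transport: str, sdp: str) -> tuple[bool, str]:
    """Return (allowed, reason) for an offer under a no-downgrade policy."""
    if signaling_transport.lower() not in SECURE_SIGNALING:
        return False, "signaling is not TLS"
    media_lines = [
        line.strip() for line in sdp.splitlines() if line.strip().startswith("m=")
    ]
    if not media_lines:
        return False, "no media sections"
    for line in media_lines:
        # m=<media> <port> <proto> <fmt> ...
        fields = line.split()
        if len(fields) < 4 or fields[2] not in SECURE_MEDIA_PROTOS:
            return False, f"insecure media profile in: {line}"
    return True, "ok"
```

Rejecting the offer outright, rather than silently falling back to plain RTP, is what closes the downgrade path: an attacker who strips the secure profile from the SDP gets a failed call, not a cleartext one.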
Start a 14-day trial on a HIPAA-aligned voice stack, see pricing for $149/$499/$1499 tiers, or contact us about HIPAA compliance for AI voice.
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.