"We use SRTP" is half an answer. The real question is how the keys get exchanged: in the SDP body protected by signaling TLS, or on the media path through DTLS handshake. Both can satisfy HIPAA. Picking wrong creates a quiet vulnerability that no auditor will catch but a determined attacker will.

Background

flowchart LR
  Phone["PSTN caller"] --> Carrier["Carrier"]
  Carrier -- "SIP INVITE" --> SBC["Session Border Controller"]
  SBC -- "SIP" --> PBX["Twilio / Asterisk"]
  PBX -- "RTP · Opus" --> Bridge["AI Voice Gateway"]
  Bridge --> AI["OpenAI Realtime"]
  AI --> Bridge
  Bridge --> PBX

CallSphere reference architecture

Secure RTP (RFC 3711) encrypts and authenticates RTP and RTCP packets. Its default transforms are AES in counter mode (AES-CM) for encryption with HMAC-SHA1 for authentication; RFC 7714 adds AES-GCM, an AEAD cipher that needs no separate HMAC. The standard does not specify how the master key gets to both endpoints; that is the job of a separate key exchange protocol.

Two main options dominate in 2026. SDES (RFC 4568) carries the SRTP master key inside the SDP body of SIP signaling, base64-encoded in an a=crypto: attribute. It is safe only when the SIP signaling is protected with TLS; otherwise the key travels in cleartext. DTLS-SRTP (RFC 5763 and RFC 5764) runs a DTLS handshake on the media path itself, and the SRTP keys are derived from the DTLS master secret. The SDP carries only certificate fingerprints, not keys.
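To make the distinction concrete, here is a minimal sketch that inspects an SDP body and reports which key exchange it advertises. The function name `classify_key_exchange` is hypothetical and the parsing is deliberately simplified (it looks at the whole body rather than per media section); a real SBC would use a full SDP parser.

```python
def classify_key_exchange(sdp: str) -> str:
    """Return 'dtls-srtp', 'sdes', or 'none' for an SDP body (simplified)."""
    lines = [ln.strip() for ln in sdp.splitlines() if ln.strip()]
    # DTLS-SRTP advertises a certificate fingerprint, never a key.
    if any(ln.startswith("a=fingerprint:") for ln in lines):
        return "dtls-srtp"
    # SDES advertises the master key itself in an a=crypto attribute.
    if any(ln.startswith("a=crypto:") for ln in lines):
        return "sdes"
    return "none"


sdes_sdp = "m=audio 49170 RTP/SAVP 0 8\na=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:AAAA\n"
dtls_sdp = "m=audio 49170 UDP/TLS/RTP/SAVPF 111\na=fingerprint:sha-256 D2:9A\na=setup:actpass\n"
plain_sdp = "m=audio 49170 RTP/AVP 0 8\n"

print(classify_key_exchange(sdes_sdp))
print(classify_key_exchange(dtls_sdp))
print(classify_key_exchange(plain_sdp))
```

The check for a fingerprint comes first because the presence of a key in SDP is the riskier case and should only be reported when no DTLS fingerprint is offered.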

SDES is dominant on traditional SIP trunks. DTLS-SRTP is mandatory in WebRTC. For HIPAA-bound AI voice products, both are acceptable encryption mechanisms; the trick is making sure the boundary between SIP and WebRTC translates correctly.

Technical deep-dive

The SDES exchange in an INVITE looks like:

m=audio 49170 RTP/SAVP 0 8
a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:DhxqDt+Pw/3qD6FnJ4kEwImGRJ5lTZUjxVrn5Q==
a=crypto:2 AES_256_CM_HMAC_SHA1_80 inline:S2RXxN+wKxIhLcF7HxJzC5pH4QrJzG7lA9XbF...

If the carrying INVITE is over UDP/5060 cleartext, this key is visible in any packet capture. SDES is only safe over SIP/TLS or SIPS.
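As an illustration of that rule, the sketch below parses an a=crypto line, decodes the inline key, and accepts the offer only when the signaling transport is TLS and the key has the length its suite requires. The helper names (`parse_crypto`, `sdes_offer_is_safe`) and the `signaling_transport` argument are assumptions for this example; the key used in the demo is made-up test data, and the length table covers only a few common suites.

```python
import base64

# Master key + master salt length in bytes for a few common suites
# (RFC 4568, RFC 6188, RFC 7714).
KEY_SALT_LEN = {
    "AES_CM_128_HMAC_SHA1_80": 30,   # 16-byte key + 14-byte salt
    "AES_CM_128_HMAC_SHA1_32": 30,
    "AES_256_CM_HMAC_SHA1_80": 46,   # 32-byte key + 14-byte salt
    "AEAD_AES_128_GCM": 28,          # 16-byte key + 12-byte salt
    "AEAD_AES_256_GCM": 44,          # 32-byte key + 12-byte salt
}


def parse_crypto(line: str):
    """Parse 'a=crypto:<tag> <suite> inline:<base64>[|...]' into (tag, suite, key)."""
    if not line.startswith("a=crypto:"):
        raise ValueError("not a crypto attribute")
    tag, suite, key_params = line[len("a=crypto:"):].split()[:3]
    if not key_params.startswith("inline:"):
        raise ValueError("unsupported key method")
    # Optional lifetime / MKI fields follow the key after '|'.
    b64 = key_params[len("inline:"):].split("|")[0]
    return int(tag), suite, base64.b64decode(b64, validate=True)


def sdes_offer_is_safe(line: str, signaling_transport: str) -> bool:
    """True only if the key arrived over TLS and has the right length."""
    _, suite, key = parse_crypto(line)
    return (signaling_transport.upper() == "TLS"
            and KEY_SALT_LEN.get(suite) == len(key))


demo_key = bytes(range(30))  # made-up test material, not a real key
demo_line = ("a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:"
             + base64.b64encode(demo_key).decode())

print(sdes_offer_is_safe(demo_line, "TLS"))
print(sdes_offer_is_safe(demo_line, "UDP"))
```

Note that TLS on SIP is hop-by-hop, so this check only covers the hop the SBC can observe; every proxy on the signaling path still sees the key.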

DTLS-SRTP is different:

m=audio 49170 UDP/TLS/RTP/SAVPF 111
a=fingerprint:sha-256 D2:9A:5A:7E:...:6F
a=setup:actpass
a=ice-ufrag:F7gI
a=ice-pwd:x9cml/YzichV2+XlhiMu8g

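Only the certificate fingerprint travels in SDP. The DTLS handshake itself runs on the media path, over the same 5-tuple the RTP will use; the resulting master secret is fed to the TLS exporter (RFC 5705, with the label EXTRACTOR-dtls_srtp defined in RFC 5764) to derive the client and server SRTP keys and salts. Even if signaling is fully cleartext, a passive eavesdropper never sees key material. An active attacker who can rewrite the SDP could still substitute a fingerprint, so the guarantee holds only if the fingerprint is authentic; SIP over TLS is what provides that integrity, even though the confidentiality of the key no longer depends on it.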
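The two mechanical steps on the DTLS-SRTP side can be sketched as follows: checking the peer certificate against the a=fingerprint value, and splitting the exported keying material into client and server keys and salts in the order RFC 5764 specifies. The function and type names are hypothetical, the "certificate" in the demo is placeholder bytes, and the DTLS handshake and exporter call themselves are assumed to be done by a DTLS library that is not shown.

```python
import hashlib
import hmac
from typing import NamedTuple


class SrtpKeys(NamedTuple):
    client_key: bytes
    server_key: bytes
    client_salt: bytes
    server_salt: bytes


def fingerprint_matches(cert_der: bytes, fingerprint_attr: str) -> bool:
    """Compare SHA-256 of the peer's DER certificate with an a=fingerprint line."""
    algo, _, value = fingerprint_attr[len("a=fingerprint:"):].partition(" ")
    if algo.lower() != "sha-256":
        return False  # this sketch only handles sha-256
    expected = bytes.fromhex(value.strip().replace(":", ""))
    return hmac.compare_digest(hashlib.sha256(cert_der).digest(), expected)


def split_exported_keys(material: bytes, key_len: int = 16,
                        salt_len: int = 14) -> SrtpKeys:
    """Split exporter output: client key, server key, client salt, server salt."""
    if len(material) != 2 * (key_len + salt_len):
        raise ValueError("unexpected keying material length")
    k2 = 2 * key_len
    return SrtpKeys(
        client_key=material[:key_len],
        server_key=material[key_len:k2],
        client_salt=material[k2:k2 + salt_len],
        server_salt=material[k2 + salt_len:],
    )


fake_cert = b"placeholder certificate bytes"  # not a real DER certificate
attr = "a=fingerprint:sha-256 " + ":".join(
    f"{b:02X}" for b in hashlib.sha256(fake_cert).digest())
keys = split_exported_keys(bytes(range(60)))

print(fingerprint_matches(fake_cert, attr))
print(len(keys.client_key), len(keys.client_salt))
```

The defaults of 16 and 14 bytes correspond to the AES-128 counter-mode profiles; other SRTP protection profiles use different lengths.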

For HIPAA the key control is "encryption in transit for ePHI". Both approaches satisfy the rule. The auditor questions to expect:

What cipher suites are negotiated? AES-CM is acceptable; AES-GCM preferred.
Is signaling protected? TLS 1.2+ minimum, TLS 1.3 strongly preferred.
Are keys logged anywhere? They should not be. Pen-test the bridge for key leakage in logs.
How is the SBC configured at the SIP/WebRTC boundary? It must terminate one and originate the other; cleartext mid-bridge is a violation.
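Two of these auditor questions translate directly into configuration. The sketch below, under the assumption that the signaling client is written in Python, builds a TLS context that refuses anything below TLS 1.2 and picks the strongest SRTP suite from an offer, with GCM preferred over counter mode. `make_signaling_tls_context`, `pick_suite`, and the preference order are illustrative choices, not a statement of what any particular SBC does.

```python
import ssl

# Preference order: AES-GCM first, AES-CM only as a fallback.
SUITE_PREFERENCE = [
    "AEAD_AES_256_GCM",
    "AEAD_AES_128_GCM",
    "AES_256_CM_HMAC_SHA1_80",
    "AES_CM_128_HMAC_SHA1_80",
]


def make_signaling_tls_context() -> ssl.SSLContext:
    """TLS client context: certificate verification on, TLS 1.2 as the floor."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def pick_suite(offered):
    """Return the most preferred offered suite, or None if none is acceptable."""
    for suite in SUITE_PREFERENCE:
        if suite in offered:
            return suite
    return None


ctx = make_signaling_tls_context()
print(ctx.minimum_version)
print(pick_suite(["AES_CM_128_HMAC_SHA1_80", "AEAD_AES_256_GCM"]))
```

TLS 1.3 is negotiated automatically when both sides support it; raising `minimum_version` to `ssl.TLSVersion.TLSv1_3` would enforce it at the cost of interop with older trunks.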

CallSphere implementation

CallSphere is HIPAA-aligned and SOC 2-aligned. Every leg uses Twilio Programmable Voice with SIP/TLS and SRTP enforced, so SDES keys are only ever exchanged inside TLS-protected signaling. For Healthcare AI on FastAPI :8084, the WebSocket bridge to OpenAI Realtime runs over WSS (TLS 1.3), so the RTP-equivalent media frames inside the WebSocket are TLS-protected between the bridge and OpenAI. Sales Calling AI runs 5 concurrent outbound calls per tenant, all SRTP. After-Hours AI uses Twilio simultaneous call + SMS with a 120-second timeout, and every leg is encrypted. Across 37 agents, 90+ tools, and 115+ DB tables, the encryption posture is uniform per vertical, with quarterly cipher-suite review and audit-log retention. Plans are priced at $149/$499/$1499, with a 14-day trial and a 22% affiliate program.

Implementation steps

Force SIP/TLS on every external trunk; reject UDP/5060 entirely. Use TLS 1.2 minimum, prefer 1.3.
Enforce SRTP on the SDP answer; reject calls without an a=crypto line on TLS-protected SIP.
For browser-to-AI sessions, use DTLS-SRTP exclusively (WebRTC mandate).
At the SBC/bridge between SIP and WebRTC, terminate SDES on the SIP side and originate DTLS-SRTP on the WebRTC side; never run cleartext between them.
Pin AES-256 GCM where supported (RFC 7714); fall back to AES-128 CM only for legacy interop.
Audit your logs: no SDP body containing key material should ever appear in plaintext logs.
Run a packet capture from outside; verify that no SRTP key is visible and that a DTLS handshake actually happens on WebRTC legs.
Schedule quarterly cipher-suite review and annual third-party pen test.
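The log-audit step above can be partly automated. The following is a minimal sketch of a redaction filter that strips inline SDES keys before a SIP message is logged, plus a detector that can be run over existing log files. The regular expression and the function names are assumptions for this example and only cover the a=crypto inline key format.

```python
import re

# Matches 'a=crypto:<tag> <suite> inline:<base64 key>' and captures the prefix.
_INLINE_KEY = re.compile(r"(a=crypto:\d+\s+\S+\s+inline:)[A-Za-z0-9+/=]+")


def redact_sdp_keys(text: str) -> str:
    """Replace every inline SDES key with a placeholder before logging."""
    return _INLINE_KEY.sub(r"\1[REDACTED]", text)


def contains_sdp_key(text: str) -> bool:
    """True if the text still carries an unredacted inline key."""
    return _INLINE_KEY.search(text) is not None


raw = "a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:QUJDREVGR0g=|2^20"
clean = redact_sdp_keys(raw)

print(clean)
print(contains_sdp_key(raw), contains_sdp_key(clean))
```

Redacting at the point of logging is safer than scrubbing afterwards, since a key that reaches disk once may already have been shipped to a log aggregator.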

FAQ

Is SDES less secure than DTLS-SRTP for HIPAA? Functionally similar when SIP/TLS is enforced. DTLS-SRTP is more robust because it does not depend on signaling-layer encryption.

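Does HIPAA require a specific key exchange? No. The HIPAA Security Rule's transmission security standard, 45 CFR 164.312(e)(1), treats encryption as an addressable implementation specification (164.312(e)(2)(ii)): encrypt ePHI in transit wherever that is reasonable and appropriate, with no mechanism prescribed. SDES + TLS is reasonable.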

What about ZRTP or MIKEY? ZRTP is interesting (peer-to-peer, no PKI) but rare in 2026. MIKEY is mostly IMS/3GPP. SDES and DTLS-SRTP cover 99% of real deployments.

Will downgrade attacks affect us? Configure your SBC to refuse non-TLS signaling and non-SRTP media; that closes the obvious downgrade vectors.
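A simple admission check captures both refusals. This sketch assumes the SBC exposes the signaling transport as a string and the offer as raw SDP; `accept_offer` and the set of accepted transports are hypothetical, and a production policy would also validate the a=crypto or a=fingerprint attributes themselves.

```python
# Media profiles that imply SRTP (SDES or DTLS-SRTP).
SECURE_PROFILES = {
    "RTP/SAVP",
    "RTP/SAVPF",
    "UDP/TLS/RTP/SAVP",
    "UDP/TLS/RTP/SAVPF",
}


def accept_offer(signaling_transport: str, sdp: str) -> bool:
    """Refuse non-TLS signaling and any active media line that is not SRTP."""
    if signaling_transport.upper() not in {"TLS", "WSS"}:
        return False
    media = [ln.strip() for ln in sdp.splitlines() if ln.strip().startswith("m=")]
    if not media:
        return False
    for line in media:
        fields = line.split()
        if len(fields) < 4:
            return False          # malformed m= line
        port, proto = fields[1], fields[2]
        if port == "0":
            continue              # stream is disabled, nothing to protect
        if proto not in SECURE_PROFILES:
            return False          # plain RTP/AVP would be a downgrade
    return True


print(accept_offer("TLS", "m=audio 49170 RTP/SAVP 0 8"))
print(accept_offer("TLS", "m=audio 49170 RTP/AVP 0 8"))
print(accept_offer("UDP", "m=audio 49170 RTP/SAVP 0 8"))
```

Rejecting the call outright, rather than silently falling back, is what makes the downgrade visible to monitoring.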

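Are there forward-secrecy concerns with SDES? Yes. The SDES master key is written into the SDP itself, so anyone who obtains a stored or logged SDP body, or who later decrypts captured signaling, can decrypt recorded media from past sessions. DTLS-SRTP with an ephemeral (EC)DHE key exchange provides forward secrecy.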


SRTP Key Exchange for HIPAA in 2026: DTLS-SRTP, SDES, and the Bridge Between Them
