WebRTC won browser voice, but SIP over WebSocket (sipws) is still the right answer for embedding a phone inside a SaaS app where NAT traversal for signaling is a non-issue and you do need a familiar SIP trunk. The choice in 2026 is not WebRTC versus sipws; it is WebRTC alone versus WebRTC media with a sipws control plane.

Background

flowchart TD
  Out[Outbound campaign] --> Twilio[Twilio Voice API]
  Twilio --> STIR[STIR/SHAKEN attestation]
  STIR --> Carrier[Originating carrier]
  Carrier --> Term[Terminating carrier]
  Term --> Recipient[Recipient phone]
  Recipient --> Webhook[/voice webhook/]
  Webhook --> Agent[AI sales agent]

CallSphere reference architecture

SIP over WebSocket (RFC 7118) was specified in 2014 to let SIP user agents run inside browsers. The browser opens a WSS connection to a SIP-over-WebSocket-aware SIP server (Kamailio, OpenSIPS, Asterisk PJSIP), and SIP messages flow over that WebSocket as plain SIP text. Media still uses WebRTC (DTLS-SRTP over UDP) - sipws is a signaling transport only.
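
To make the transport layering concrete, here is a minimal sketch of the HTTP upgrade request that opens a sipws connection. In a real browser the WebSocket API produces this handshake for you; the function names and the `sip.example.com` host are hypothetical. The `sip` subprotocol token comes from RFC 7118, and the accept-key computation follows RFC 6455.

```python
import base64
import hashlib
import secrets

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_upgrade_request(host: str, path: str = "/") -> tuple[str, str]:
    """Return (request_text, key) for an upgrade asking for the 'sip' subprotocol."""
    # The key is a random 16-byte nonce, base64-encoded; it is used only
    # for the handshake and plays no part in SIP addressing.
    key = base64.b64encode(secrets.token_bytes(16)).decode("ascii")
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Protocol: sip\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )
    return request, key


def expected_accept(key: str) -> str:
    """Compute the Sec-WebSocket-Accept value the server should echo back."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    request_text, nonce = build_upgrade_request("sip.example.com")  # hypothetical host
    print(request_text)
    print("server should answer with:", expected_accept(nonce))
```

Once the server accepts the `sip` subprotocol, each WebSocket message carries one complete SIP request or response.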

For AI voice in 2026 the canonical browser flow is WebRTC + Realtime API direct: the browser establishes a peer connection to OpenAI's edge, audio flows over DTLS-SRTP, and no SIP is involved. But for products that need PSTN dial-out, transfer to a human, or an existing SIP-trunk billing relationship, sipws is still useful. JsSIP and SIP.js use sipws for signaling; Twilio Voice SDK uses its own WebSocket-based signaling that plays the equivalent role in front of a WebRTC bridge.

Technical deep-dive

A sipws REGISTER from a browser:

REGISTER sip:callsphere.ai SIP/2.0
Via: SIP/2.0/WSS df7jal23ls0d.invalid;branch=z9hG4bK-5
From: <sip:[email protected]>;tag=abc123
To: <sip:[email protected]>
Call-ID: 1234@browser
CSeq: 1 REGISTER
Contact: <sip:[email protected];transport=ws>
Expires: 600

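The "df7jal23ls0d.invalid" hostname is a placeholder. Browsers do not expose their own IP address or port to JavaScript, so RFC 7118 has sipws clients put a randomly generated name under the reserved .invalid top-level domain in the Via and Contact headers as a stand-in for the network address. The server does not try to resolve it; it tracks the WebSocket connection internally and routes responses and incoming requests back over that connection.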
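
A minimal sketch of how a client might assemble such a REGISTER, assuming a randomly generated `.invalid` host as RFC 7118 describes. The function names, the `example.com` domain, and the `alice` user are hypothetical; real clients such as JsSIP or SIP.js do this internally and also handle digest authentication, which is omitted here.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits


def random_invalid_host(length: int = 12) -> str:
    """Random label under the reserved .invalid TLD, used in place of an IP:port."""
    label = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return f"{label}.invalid"


def build_register(domain: str, user: str, host: str, expires: int = 600) -> str:
    """Assemble an unauthenticated REGISTER to send as one WebSocket message."""
    branch = "z9hG4bK" + secrets.token_hex(8)  # RFC 3261 magic-cookie prefix
    lines = [
        f"REGISTER sip:{domain} SIP/2.0",
        f"Via: SIP/2.0/WSS {host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{domain}>;tag={secrets.token_hex(4)}",
        f"To: <sip:{user}@{domain}>",
        f"Call-ID: {secrets.token_hex(8)}",
        "CSeq: 1 REGISTER",
        f"Contact: <sip:{user}@{host};transport=ws>",
        f"Expires: {expires}",
        "Content-Length: 0",
    ]
    # SIP uses CRLF line endings and a blank line to end the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    host = random_invalid_host()
    print(build_register("example.com", "alice", host))  # hypothetical values
```

The same random host should be reused in both Via and Contact for the lifetime of the connection, so the server sees a consistent identity for that WebSocket.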

For AI voice agent dial-out from the browser:

INVITE sip:[email protected] SIP/2.0
Via: SIP/2.0/WSS df7jal23ls0d.invalid;branch=z9hG4bK-7
From: <sip:[email protected]>;tag=abc123
To: <sip:[email protected]>
Contact: <sip:[email protected];transport=ws>
Content-Type: application/sdp

v=0
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=rtcp:9 IN IP4 0.0.0.0
a=fingerprint:sha-256 D2:9A:...:38
a=setup:actpass
a=ice-ufrag:F7gI

Note that the SDP (abbreviated here, as is the INVITE itself) advertises DTLS-SRTP for media even though the SIP signaling runs over WSS. The SBC on the server side bridges both legs: sipws to SIP for signaling, and WebRTC media to SRTP on the trunk.
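
As a rough illustration of what an SBC or a test harness might check before bridging, the sketch below inspects an offer for a DTLS-SRTP audio profile and a certificate fingerprint. The function name is hypothetical and the check is deliberately simplistic; a real SBC parses the full SDP and validates ICE attributes as well.

```python
DTLS_SRTP_PROFILES = ("UDP/TLS/RTP/SAVPF", "UDP/TLS/RTP/SAVP")


def describes_dtls_srtp(sdp: str) -> bool:
    """True if the SDP has an audio m-line with a DTLS-SRTP profile and a fingerprint."""
    has_profile = False
    has_fingerprint = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m=audio "):
            # m=<media> <port> <proto> <fmt> ...
            fields = line.split()
            if len(fields) >= 3 and fields[2] in DTLS_SRTP_PROFILES:
                has_profile = True
        elif line.startswith("a=fingerprint:"):
            has_fingerprint = True
    return has_profile and has_fingerprint


if __name__ == "__main__":
    offer = "\r\n".join([
        "v=0",
        "m=audio 9 UDP/TLS/RTP/SAVPF 111",
        "a=fingerprint:sha-256 D2:9A:...:38",
        "a=setup:actpass",
    ])
    print(describes_dtls_srtp(offer))
```

An offer that uses plain `RTP/AVP` or lacks a fingerprint fails the check, which is the case where the browser leg cannot be WebRTC media.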

CallSphere implementation

CallSphere uses Twilio Programmable Voice across all six verticals. For browser-initiated calls (admin dashboard click-to-call, demo widget) we use Twilio Voice SDK, which handles sipws-equivalent signaling internally and presents a clean JavaScript API. Healthcare AI (FastAPI on :8084) never exposes sipws to end users; calls always arrive through Twilio's PSTN edge or Twilio Voice SDK. Sales Calling AI places its 5 concurrent outbound calls per tenant from the server side, so no browser sipws is needed. After-Hours AI sends a simultaneous call and SMS to on-call staff with a 120-second timeout, also server-originated. Across the platform (37 agents, 90+ tools, 115+ DB tables, HIPAA + SOC 2 alignment, $149/$499/$1499 pricing, a 14-day trial, and a 22% affiliate program), the policy is "Voice SDK for browser, REST + Twilio for server-side, no DIY sipws stack".

Implementation steps

Decide if you actually need SIP-trunk-style billing or interop. If yes, sipws helps; if no, use Twilio Voice SDK or direct WebRTC.
Choose a sipws-aware server: Kamailio with the websocket module, OpenSIPS with its WS/WSS transport modules, Asterisk with res_pjsip and a ws/wss PJSIP transport, or FreeSWITCH with mod_sofia's ws/wss bindings.
Use a vetted client library: JsSIP, SIP.js, or Twilio Voice SDK.
Run sipws over WSS only - never WS - and front it with a TLS-terminating reverse proxy (nginx, HAProxy).
Authenticate the browser before issuing SIP credentials; do not embed REGISTER passwords in JavaScript bundles.
Use short-lived REGISTER tokens (10-60 minutes) tied to the user's logged-in session.
Bridge the call to your AI agent server-side via REFER or a B2BUA pattern - keep the AI prompt and tools out of the browser.
Test with sngrep on the server side; verify that sipws traffic looks like normal SIP once the WebSocket framing is stripped.
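
The credential steps above can be sketched as follows. This is one possible scheme, not a prescribed one: time-limited credentials derived from a shared secret, in the style of the TURN REST API proposal, where the username embeds an expiry timestamp and the password is an HMAC over the username. Function names and parameters are hypothetical. Some SIP servers document comparable mechanisms (Kamailio's auth_ephemeral module is one to look at), but the exact format must be checked against your server's documentation.

```python
import base64
import hashlib
import hmac
import time


def _sign(secret: bytes, username: str) -> str:
    mac = hmac.new(secret, username.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(mac).decode("ascii")


def issue_credentials(secret: bytes, user_id: str, ttl_seconds: int = 900, now=None):
    """Return (username, password) valid for ttl_seconds; call only for logged-in users."""
    issued_at = int(time.time() if now is None else now)
    username = f"{issued_at + ttl_seconds}:{user_id}"  # expiry is part of the username
    return username, _sign(secret, username)


def verify_credentials(secret: bytes, username: str, password: str, now=None) -> bool:
    """Server-side check: the expiry has not passed and the HMAC matches."""
    expiry_text, sep, _user = username.partition(":")
    if not sep or not expiry_text.isdigit():
        return False
    current = int(time.time() if now is None else now)
    if int(expiry_text) < current:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(_sign(secret, username), password)


if __name__ == "__main__":
    shared = b"replace-with-a-real-secret"  # assumption: shared with the SIP server
    user, pwd = issue_credentials(shared, "alice")
    print(user, verify_credentials(shared, user, pwd))
```

Because the secret stays on the server, nothing long-lived ends up in the JavaScript bundle; the browser only ever holds a credential that expires with the session.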

FAQ

Is sipws faster than WebRTC for AI voice? No. The signaling transport affects call setup, not media latency; the WebRTC media path is the same either way.

Why not use sipws + WebRTC media? That is exactly what JsSIP/SIP.js do. The choice is whether you build the JsSIP-style stack yourself or use an SDK like Twilio Voice SDK that abstracts it.

Does sipws support SIP REFER for transfer? Yes. Transfer behaves the same as it does for SIP over UDP, TCP, or TLS; the WebSocket is just a transport.

What about WebTransport? Experimental in 2026. SIP over WebTransport is reportedly at the draft stage in the IETF SIPCORE working group, with no production deployments yet.

Are there NAT traversal issues with sipws? Fewer than with SIP over UDP, because the browser opens an outbound WSS connection to your server, which traverses most NATs cleanly. Media still needs ICE.
