NATS as a WebSocket Bus: Realtime Message Routing for AI Agents

Why NATS 2.14 became a credible alternative to Redis pub/sub for WebSocket fan-out in 2026: native browser support, JetStream durability, and how it wires into AI agents.

Redis pub/sub is fire-and-forget. NATS JetStream is fire-and-replay. For voice agents that need to recover gracefully from a missed event, that distinction is worth a dedicated bus.

Why is NATS a credible WebSocket bus?

flowchart LR
  Twilio["Twilio Media Streams"] -- "WS · μlaw 8kHz" --> Bridge["FastAPI Bridge :8084"]
  Bridge -- "PCM16 24kHz" --> OAI["OpenAI Realtime"]
  OAI --> Bridge
  Bridge --> Twilio
  Bridge --> Logs[(structured logs · OTel)]
CallSphere reference architecture

Because NATS 2.14 (released April 30, 2026) speaks WebSocket natively, supports TLS and Origin checking out of the box, and ships a first-class browser client (nats.ws) that reaches the same subjects as backend services. That means you can have a single message-bus topology where browsers, mobile, and microservices subscribe to the same subjects, with the broker enforcing auth and routing.

Compared to Redis pub/sub, NATS gives you:

  • Durable streams via JetStream — replay missed messages on reconnect.
  • Subject-based routing with hierarchical wildcards (agent.healthcare.*.transcript) instead of flat channels.
  • Built-in clustering and gateways for multi-region without sticky sessions.
  • Per-subject authorization: clients can subscribe only to the subjects their JWT authorizes.

The cost is more operational complexity than Redis. But for a multi-tenant voice agent platform, the JetStream replay alone is worth it.
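
To make the wildcard routing above concrete, here is a minimal sketch of NATS-style subject matching in plain TypeScript. The function name `subjectMatches` is hypothetical and this is not the broker's implementation; it only mirrors the documented rules that `*` matches exactly one token and `>` matches one or more trailing tokens.

```typescript
// Sketch of NATS-style subject matching: tokens are dot-separated,
// "*" matches exactly one token, ">" matches one or more trailing tokens.
function subjectMatches(pattern: string, subject: string): boolean {
  const p = pattern.split(".");
  const s = subject.split(".");
  for (let i = 0; i < p.length; i++) {
    if (p[i] === ">") {
      // ">" must be the last pattern token and needs at least one token to consume.
      return i === p.length - 1 && s.length > i;
    }
    if (i >= s.length) return false;
    if (p[i] !== "*" && p[i] !== s[i]) return false;
  }
  // Without ">", pattern and subject must have the same number of tokens.
  return p.length === s.length;
}

console.log(
  subjectMatches("agent.healthcare.*.transcript", "agent.healthcare.sess42.transcript"),
); // true
```

A flat Redis channel name has no equivalent of the single-token wildcard, which is why one subscription here can cover every session of a vertical without also matching deeper or unrelated subjects.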

How does the architecture look?

A typical NATS-backed WebSocket platform has three layers:

  1. NATS cluster: three nodes minimum, with JetStream enabled and a dedicated stream per tenant or per subject pattern.
  2. WebSocket-facing pods — either NATS native WebSocket (clients connect directly to the broker) or a thin gateway (your service translates application protocol to NATS subjects).
  3. AI agent services — subscribe to agent.<tenant>.<session>.input, publish to agent.<tenant>.<session>.output. Stateless, horizontally scalable.

For multi-region, NATS gateways federate clusters: a publish in us-east-1 propagates to us-west-2 only for subjects with subscribers there, which is dramatically cheaper than full Redis replication.
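
The input/output subject convention from the third layer can be sketched as a pair of small helpers. The names `agentSubject`, `replySubjectFor`, and `assertToken` are hypothetical, and the token validation rule is an assumption about what a safe tenant or session ID looks like, not something the platform described above is known to enforce.

```typescript
type Direction = "input" | "output";

// A subject token must be non-empty and free of ".", "*", ">" and whitespace;
// otherwise a tenant or session ID could change the shape of the subject.
function assertToken(value: string, label: string): string {
  if (!/^[^.*>\s]+$/.test(value)) {
    throw new Error(`invalid ${label} for a subject token: "${value}"`);
  }
  return value;
}

// Builds agent.<tenant>.<session>.<input|output>.
function agentSubject(tenant: string, session: string, dir: Direction): string {
  return ["agent", assertToken(tenant, "tenant"), assertToken(session, "session"), dir].join(".");
}

// Given the input subject a message arrived on, derive where the reply goes.
function replySubjectFor(inputSubject: string): string {
  const parts = inputSubject.split(".");
  const [prefix = "", tenant = "", session = "", dir = ""] = parts;
  if (parts.length !== 4 || prefix !== "agent" || dir !== "input") {
    throw new Error(`not an agent input subject: ${inputSubject}`);
  }
  return agentSubject(tenant, session, "output");
}
```

Because the reply subject is derived entirely from the incoming subject, an agent service needs no per-session state to route its output, which is what keeps it horizontally scalable.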

CallSphere's implementation

CallSphere is migrating one specific workload to NATS: the multi-tenant analytics fan-out for the Sales Calling dashboard. Each tenant subscribes to tenant.<id>.calls.* from the dashboard. JetStream gives us a 24-hour replay window so a manager opening the dashboard at 9 a.m. sees every event from the overnight shift without us hitting Postgres.

The hot voice paths still use Socket.IO with the Redis adapter and a direct OpenAI WebSocket, because their throughput is higher and their latency budget is tighter. NATS owns the durable, replay-friendly, multi-tenant fan-out where audit completeness matters more than absolute throughput.
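
As a rough illustration of the 24-hour replay window, the sketch below builds a per-tenant stream configuration as a plain object. The field names follow the JetStream stream configuration schema, where `max_age` is expressed in nanoseconds. The function name, the `CALLS_<TENANT>` naming scheme, file storage, and three replicas are assumptions for illustration, not CallSphere's actual settings.

```typescript
const NANOS_PER_SECOND = 1_000_000_000;

// Subset of the JetStream stream configuration used by this sketch.
interface TenantStreamConfig {
  name: string;
  subjects: string[];
  retention: "limits";
  storage: "file";
  max_age: number; // nanoseconds
  num_replicas: number;
}

function tenantCallsStream(tenantId: string, replayHours = 24): TenantStreamConfig {
  return {
    name: `CALLS_${tenantId.toUpperCase()}`, // hypothetical naming scheme
    subjects: [`tenant.${tenantId}.calls.*`],
    retention: "limits", // discard by limits such as age, not by consumer interest
    storage: "file",
    max_age: replayHours * 3600 * NANOS_PER_SECOND,
    num_replicas: 3, // assumes the 3-node cluster described earlier
  };
}

console.log(tenantCallsStream("acme").max_age); // 86400000000000
```

An object of this shape could then be handed to the client's stream management call (in the JavaScript clients, typically `jsm.streams.add(...)` on a manager obtained from the connection); check the client version you run before relying on that call.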

Code: NATS WebSocket subscriber in the browser

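import { connect, consumerOpts, createInbox } from "nats.ws";

// shortLivedJwt is assumed to be issued by your backend for this session.
const nc = await connect({
  servers: ["wss://nats.callsphere.ai:9222"],
  token: shortLivedJwt,
});

const js = nc.jetstream();

// Ephemeral push consumer: start from the last message on each matching
// subject and require an explicit ack for every message.
const opts = consumerOpts();
opts.deliverLastPerSubject();
opts.ackExplicit();
opts.deliverTo(createInbox());

const sub = await js.subscribe("tenant.acme.calls.*", opts);

for await (const msg of sub) {
  const evt = msg.json();   // decode the JSON payload
  dashboard.update(evt);
  msg.ack();                // ack only after the dashboard has applied the event
}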

Build steps

  1. Provision a 3-node NATS cluster with JetStream enabled. Use nats-server 2.14+ for native WebSocket and shard support.
  2. Configure WebSocket on port 9222 with TLS, Origin allowlist, and JWT-based authorization.
  3. Define streams per tenant or per subject pattern; set retention policy (typically limits with 24h max age).
  4. Use nats.ws from the browser and the nats package from Node services. The client API and the subjects are the same either way.
  5. Wire JetStream consumers with explicit ACK so retry semantics are correct.
  6. Monitor nats_stream_messages, nats_consumer_pending_messages, and nats_websocket_clients in Prometheus.
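
The explicit-ACK step above can be sketched as a small wrapper that acknowledges a message only after the handler succeeds and negatively acknowledges it otherwise, so the server redelivers. The `Ackable` interface and `handleWithAck` are hypothetical names; the interface lists only the `json()`, `ack()`, and `nak()` calls that JetStream messages in the JavaScript clients expose, which also makes the wrapper testable with a fake message.

```typescript
// Minimal structural view of a JetStream message: only the calls this sketch uses.
interface Ackable<T> {
  json(): T;
  ack(): void;
  nak(): void;
}

// Ack after the handler succeeds; nak on failure so the message is redelivered.
function handleWithAck<T>(msg: Ackable<T>, handler: (payload: T) => void): boolean {
  try {
    handler(msg.json());
    msg.ack();
    return true;
  } catch {
    msg.nak();
    return false;
  }
}

// Usage with a fake message, standing in for a real subscription.
const calls: string[] = [];
const fake: Ackable<{ callId: string }> = {
  json: () => ({ callId: "c-1" }),
  ack: () => calls.push("ack"),
  nak: () => calls.push("nak"),
};
handleWithAck(fake, (evt) => console.log("processed", evt.callId));
```

Acking before the handler runs would turn a crash into a silently lost event, which defeats the reason for choosing JetStream over fire-and-forget pub/sub in the first place.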

FAQ

Can NATS replace Redis entirely? For pub/sub, yes. For caching and rate-limit counters, no — keep Redis for those.

Is NATS slower than Redis pub/sub? Per-message overhead is higher (subject parsing, ACKs), but it holds up better under large fan-out because routing is hierarchical, not flat.

What about message ordering? JetStream preserves per-subject ordering. Redis pub/sub does not guarantee ordering across subscribers.

Can clients connect directly? Yes — NATS WebSocket is designed for direct browser connections with token-based auth.

Does it work cross-region? Yes via NATS gateways. Configure subject filtering so you only replicate the subjects you actually need across regions.

CallSphere connects 115+ database tables and 90+ tools across six verticals — message routing is the connective tissue. Start the 14-day trial at $149/$499/$1499.
