By Sagar Shankaran, Founder of CallSphere
How to fan out WebSocket events across multiple regions with Redis pub/sub: shard-aware topology, federation, and the latency math that decides where to put the broker.
Key takeaways
Stretching one Redis cluster across continents is the most expensive way to learn that pub/sub is not a database. Run a cluster per region and federate the subjects you actually need.
flowchart TD
Client[Client] --> Edge[Cloudflare Worker]
Edge -->|WS upgrade| DO[Durable Object]
DO --> AI[(OpenAI Realtime WS)]
AI --> DO
DO --> Client
DO -.hibernation.-> Storage[(Persisted state)]
Pub/sub assumes the broker is "near" the subscribers. A New York publish that has to round-trip to Sydney before fanning out adds at least 200 ms. Multiply that by every event in a voice conversation and the agent feels broken.
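The arithmetic behind the 200 ms figure can be sketched as a two-hop sum. This is a minimal model, not a measurement: the `Path` type, the `deliveryDelayMs` function, the 100 ms one-way figure (half of the round trip quoted above), and the 1 ms local hop are all illustrative assumptions.

```typescript
// Hypothetical model: one-way hop times in milliseconds, not measurements.
interface Path {
  publisherToBrokerMs: number;
  brokerToSubscriberMs: number;
}

// Delay a single event picks up between publish and delivery.
function deliveryDelayMs(path: Path): number {
  return path.publisherToBrokerMs + path.brokerToSubscriberMs;
}

// Publisher and subscriber both in New York, broker in Sydney:
// 100 ms out plus 100 ms back is the 200 ms round trip.
const stretched: Path = { publisherToBrokerMs: 100, brokerToSubscriberMs: 100 };

// Same clients with a broker in their own region (1 ms hops assumed).
const coLocated: Path = { publisherToBrokerMs: 1, brokerToSubscriberMs: 1 };

console.log(deliveryDelayMs(stretched)); // 200
console.log(deliveryDelayMs(coLocated)); // 2
```

Every event pays this delay separately, which is why where the broker sits matters more than how fast it is.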
The 2026 best practice is co-located brokers per region plus a federation layer that bridges only the subjects that need cross-region delivery. Most subjects (per-session call audio, per-tenant dashboard) never leave their region. A few (global presence, system-wide announcements) are explicitly bridged.
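One way to express "bridge only the subjects that need it" is a small routing policy consulted before every publish. The sketch below is an assumption about how such a policy might look: the channel prefixes, `BRIDGED_PREFIXES`, `scopeFor`, and `targetRegions` are hypothetical names, not part of Redis or of any CallSphere API.

```typescript
type Scope = "regional" | "global";

// Hypothetical naming convention: only these prefixes cross regions.
const BRIDGED_PREFIXES: readonly string[] = ["presence:global", "announce:"];

// Classify a channel by prefix; anything unlisted stays in its region.
function scopeFor(channel: string): Scope {
  return BRIDGED_PREFIXES.some((p) => channel.startsWith(p))
    ? "global"
    : "regional";
}

// Pick the brokers a publish should reach.
function targetRegions(
  channel: string,
  localRegion: string,
  allRegions: readonly string[],
): string[] {
  return scopeFor(channel) === "global" ? [...allRegions] : [localRegion];
}

const knownRegions = ["us-east-1", "us-west-2"];
console.log(targetRegions("session:abc:audio", "us-east-1", knownRegions));
console.log(targetRegions("announce:maintenance", "us-east-1", knownRegions));
```

Defaulting to "regional" means a newly added channel never leaves its region until someone lists it explicitly, which keeps cross-region traffic a deliberate choice.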
A standard pattern looks like:
For voice agents specifically, you almost never want to replicate audio events. You want to replicate session metadata so a user who moves regions mid-call can resume.
CallSphere runs in two regions: us-east-1 (primary) and us-west-2 (failover and west-coast latency). The topology:
The dashboard works whether the manager is on the east or west coast because their session pins to the closer region; the data they see is replicated via federated subjects. We stress-tested across 115+ database tables at peak 37-agent load and the cross-region tail latency settled at 80 ms.
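import { createClient } from "redis";
import { randomUUID } from "crypto";
type RedisClient = ReturnType<typeof createClient>;

// One lazily connected client per region. The REDIS_URL_<REGION> variable
// naming is an assumption; substitute your own endpoint lookup.
const clients = new Map<string, Promise<RedisClient>>();
function getRegionalClient(region: string): Promise<RedisClient> {
  let client = clients.get(region);
  if (!client) {
    const key = `REDIS_URL_${region.toUpperCase().replace(/-/g, "_")}`;
    const c = createClient({ url: process.env[key] });
    client = c.connect().then(() => c);
    clients.set(region, client);
  }
  return client;
}

// Publish one envelope to the same channel on every listed region's broker.
async function federatedPublish(
  channel: string,
  payload: object,
  regions: string[],
): Promise<void> {
  const envelope = JSON.stringify({
    id: randomUUID(),
    ts: Date.now(),
    payload,
  });
  // Promise.all rejects as soon as any single region's publish fails.
  await Promise.all(
    regions.map(async (r) => {
      const client = await getRegionalClient(r);
      await client.publish(channel, envelope);
    }),
  );
}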
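On the subscriber side, the `id` in each envelope gives a natural key for dropping repeats, for example if the same event arrives over more than one bridge. The source does not say how subscribers use `id`, so treat the following as a hedged sketch: `RecentIds` and `handleMessage` are hypothetical helpers, and the cache size is an arbitrary assumption.

```typescript
interface Envelope {
  id: string;
  ts: number;
  payload: unknown;
}

// Bounded memory of recently seen envelope ids (oldest entries evicted first).
class RecentIds {
  private readonly seen = new Map<string, number>();
  private readonly maxEntries: number;

  constructor(maxEntries = 10_000) {
    this.maxEntries = maxEntries;
  }

  // Returns true the first time an id is seen, false for a repeat.
  firstSeen(id: string, ts: number): boolean {
    if (this.seen.has(id)) return false;
    this.seen.set(id, ts);
    if (this.seen.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.seen.keys().next().value as string;
      this.seen.delete(oldest);
    }
    return true;
  }
}

// Parse a raw pub/sub message and invoke the handler only for new ids.
function handleMessage(
  raw: string,
  recent: RecentIds,
  onEvent: (e: Envelope) => void,
): boolean {
  const env = JSON.parse(raw) as Envelope;
  if (!recent.firstSeen(env.id, env.ts)) return false;
  onEvent(env);
  return true;
}
```

With node-redis, `handleMessage` could be called from the listener passed to `subscribe` on a dedicated subscriber connection; the eviction bound keeps memory flat on long-lived sockets.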
sessionId → region in a globally consistent store and route reconnects accordingly.
Can I just use a single AWS ElastiCache global datastore? It works for cache, not for pub/sub. Pub/sub messages are ephemeral and global datastore replication is async, so subscribers in the secondary region miss events.
What is the latency cost of federation? 60–120 ms inter-region for AWS within the US, 200–300 ms US ↔ EU. Plan accordingly.
Should I use Kafka for federation instead? Kafka is overkill for short-lived realtime events but excellent if you also need durable replay. NATS JetStream gives you both at lower ops cost.
How do I handle a region failure? Fail over the WebSocket DNS, let connections drop, and have clients reconnect to the surviving region with their session ID. The session ownership table tells the new region "yes, replay this session."
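The reconnect decision against a session ownership table can be sketched as follows. This is a minimal illustration under stated assumptions: an in-memory `Map` stands in for the globally consistent store, and `SessionOwnership`, `resolveReconnect`, and the four action names are hypothetical. Redirecting to a healthy owner is one possible policy, not necessarily the one CallSphere uses.

```typescript
interface Decision {
  action: "new" | "resume" | "redirect" | "takeover";
  region: string;
}

// In-memory stand-in for a globally consistent sessionId -> region store.
class SessionOwnership {
  private readonly owners = new Map<string, string>();

  // Decide how the arriving region should treat a (re)connecting session.
  resolveReconnect(
    sessionId: string,
    arrivingRegion: string,
    healthyRegions: ReadonlySet<string>,
  ): Decision {
    const owner = this.owners.get(sessionId);
    if (owner === undefined) {
      // Unknown session: the arriving region becomes the owner.
      this.owners.set(sessionId, arrivingRegion);
      return { action: "new", region: arrivingRegion };
    }
    if (owner === arrivingRegion) {
      return { action: "resume", region: owner };
    }
    if (healthyRegions.has(owner)) {
      // Owner is alive elsewhere: send the client back to it.
      return { action: "redirect", region: owner };
    }
    // Owner region is down: the arriving region takes the session over.
    this.owners.set(sessionId, arrivingRegion);
    return { action: "takeover", region: arrivingRegion };
  }
}
```

In a real deployment the read-then-write in `resolveReconnect` would need to be a single atomic operation in the backing store, otherwise two regions could both claim the same session during a failover.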
Can I run hot-hot? Yes — every region accepts connections, federation keeps them in sync. Cost: double the broker capacity.
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.