By Sagar Shankaran, Founder of CallSphere
What it actually takes to get a Socket.IO cluster past 100,000 concurrent connections in 2026: sharded Redis adapter, namespace partitioning, and the bottlenecks nobody warns you about.
Key takeaways
A single Node.js Socket.IO process tops out around 30k–40k connections. Everything past that is architecture, not configuration.
flowchart TD
Client[Client] --> Edge[Cloudflare Worker]
Edge -->|WS upgrade| DO[Durable Object]
DO --> AI[(OpenAI Realtime WS)]
AI --> DO
DO --> Client
DO -.hibernation.-> Storage[(Persisted state)]
Scaling is hard because Socket.IO is not just a WebSocket library: it is an event protocol with rooms, namespaces, and reconnection semantics. The naive scaling story ("just add more pods") breaks because each pod only knows about the clients connected to itself. When agent-7 emits to a room that includes agent-12, but agent-12 lives on a different pod, the message is silently dropped unless your cluster has a shared message bus.
The bus is what the Redis adapter provides. With it, every io.to("call:abc").emit(...) becomes a Redis publish that every Socket.IO process subscribes to and locally fans out. Without it, your room broadcasts work in dev and silently fail in prod.
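To make the failure mode concrete, here is a toy model of two pods sharing one room. It is only a sketch: `Bus` and `Pod` are hypothetical in-memory stand-ins, not Socket.IO or Redis APIs, and the bus delivers synchronously, which a real broker does not.

```typescript
// Toy model of cross-pod room broadcast. Bus and Pod are hypothetical
// stand-ins; nothing here talks to real Socket.IO or Redis.
type Message = { room: string; event: string };
type Handler = (msg: Message) => void;

class Bus {
  private handlers: Handler[] = [];
  subscribe(handler: Handler): void {
    this.handlers.push(handler);
  }
  publish(msg: Message): void {
    this.handlers.forEach((handler) => handler(msg));
  }
}

class Pod {
  // clientId -> rooms that client has joined on this pod
  private rooms = new Map<string, Set<string>>();
  private bus: Bus | undefined;
  readonly delivered: string[] = [];

  constructor(bus?: Bus) {
    this.bus = bus;
    // With a bus, every pod hears every publish and fans out locally.
    if (bus) bus.subscribe((msg) => this.deliverLocally(msg));
  }

  join(clientId: string, room: string): void {
    const joined = this.rooms.get(clientId) ?? new Set<string>();
    joined.add(room);
    this.rooms.set(clientId, joined);
  }

  emitToRoom(room: string, event: string): void {
    const msg: Message = { room, event };
    if (this.bus) this.bus.publish(msg); // reaches every pod, including this one
    else this.deliverLocally(msg); // reaches only clients on this pod
  }

  private deliverLocally(msg: Message): void {
    this.rooms.forEach((joined, clientId) => {
      if (joined.has(msg.room)) this.delivered.push(`${clientId}:${msg.event}`);
    });
  }
}

function simulate(withBus: boolean): string[] {
  const bus = withBus ? new Bus() : undefined;
  const podA = new Pod(bus);
  const podB = new Pod(bus);
  podA.join("agent-7", "call:abc");
  podB.join("agent-12", "call:abc");
  podA.emitToRoom("call:abc", "transcript");
  return [...podA.delivered, ...podB.delivered];
}

const withoutBus = simulate(false); // ["agent-7:transcript"]
const withBus = simulate(true); // ["agent-7:transcript", "agent-12:transcript"]
```

Without the bus, agent-12 never sees the event and nothing reports an error, which is why the bug tends to surface only once there is more than one pod.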
The 2026 best practice is the sharded Redis adapter, which uses the sharded pub/sub feature introduced in Redis 7. In a Redis Cluster, classic pub/sub broadcasts every published message to every node, whether or not that node has subscribers for the channel, which makes the broker the bottleneck at around 60k–80k connections per cluster.
Sharded pub/sub partitions channels by hash slot, so each Redis shard only carries traffic for the rooms it owns. We see linear scaling to 500k+ connections across an 8-shard Redis cluster because the broker no longer fans out globally — it fans out to the shard owners only.
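As a rough illustration of partitioning by hash slot, the sketch below computes a Redis Cluster slot with the CRC16 variant from the Redis Cluster specification, modulo 16384, honoring `{hash tag}` syntax. The channel names are made up and are not the adapter's real channel names. `shardFor` assumes slots are split into equal contiguous ranges, which a real cluster need not do, and the hash assumes ASCII-only names.

```typescript
// CRC16-CCITT (XMODEM variant): polynomial 0x1021, initial value 0.
// Assumption: ASCII-only input, so one UTF-16 code unit equals one byte.
function crc16(input: string): number {
  let crc = 0;
  for (let i = 0; i < input.length; i++) {
    crc ^= (input.charCodeAt(i) & 0xff) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x8000) !== 0 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// Redis Cluster has 16384 slots. If the name contains a non-empty {tag},
// only the tag is hashed, so related channels can be pinned to one slot.
function hashSlot(channel: string): number {
  let hashed = channel;
  const open = channel.indexOf("{");
  if (open !== -1) {
    const close = channel.indexOf("}", open + 1);
    if (close > open + 1) hashed = channel.slice(open + 1, close);
  }
  return crc16(hashed) % 16384;
}

// Hypothetical mapping: assumes the slots are divided into equal contiguous
// ranges across shardCount shards. Real clusters can assign slots arbitrarily.
function shardFor(channel: string, shardCount: number): number {
  return Math.floor((hashSlot(channel) * shardCount) / 16384);
}

const slotA = hashSlot("{call:abc}:events");
const slotB = hashSlot("{call:abc}:presence");
const sameSlot = slotA === slotB; // true: both hash only the "call:abc" tag
const owner = shardFor("{call:abc}:events", 8); // an index from 0 to 7
```

The point of the exercise is that a publish to one room's channel only involves the shard that owns its slot, instead of every node in the cluster.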
Add namespace partitioning on top. A namespace in Socket.IO is an isolated event channel; you can put your dashboard namespace on one fleet of pods and your call-streaming namespace on another, with completely separate Redis clusters. This bounds the blast radius and lets you scale the hot path independently.
CallSphere runs Socket.IO across two surfaces: the Sales Calling dashboard (used by 37 agents and their managers) and the After-hours operator console. We use the sharded Redis adapter on a 3-shard ElastiCache cluster, two Socket.IO namespaces (/calls and /dashboard), and a NestJS gateway behind an AWS NLB with no sticky sessions.
Why no sticky sessions? Because the Redis adapter handles cross-pod broadcast, every connection can land on any pod, and the load balancer stays simple. We provisioned for 50k concurrent dashboard connections per region; the actual peak across six verticals is closer to 12k, which leaves room for the affiliate program announced last quarter.
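The provisioning figure above can be sanity-checked with some quick arithmetic. This is a back-of-the-envelope sketch, not a sizing tool: the 30k per-pod ceiling and 8 KB per connection are the figures quoted in this article, while the 70% target utilization and the `planPods` helper are assumptions introduced here. The memory estimate covers per-connection state only, not the Node.js baseline.

```typescript
interface PodPlan {
  pods: number;
  connectionsPerPod: number;
  memoryPerPodMiB: number; // per-connection state only
}

function planPods(
  totalConnections: number,
  perPodCeiling: number,
  utilizationPercent: number, // assumption: share of the ceiling you are willing to use
  kibPerConnection: number,
): PodPlan {
  const usablePerPod = Math.floor((perPodCeiling * utilizationPercent) / 100);
  const pods = Math.ceil(totalConnections / usablePerPod);
  const connectionsPerPod = Math.ceil(totalConnections / pods);
  const memoryPerPodMiB = Math.round((connectionsPerPod * kibPerConnection) / 1024);
  return { pods, connectionsPerPod, memoryPerPodMiB };
}

// 50k provisioned connections, ~30k ceiling per pod, 70% utilization, 8 KB each.
const plan = planPods(50_000, 30_000, 70, 8);
// plan.pods === 3, plan.connectionsPerPod === 16667, plan.memoryPerPodMiB === 130
```

Running pods well below their ceiling leaves room for the reconnect storm that follows a pod restart, when the surviving pods absorb its connections.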
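import { createServer } from "node:http";
import { Server } from "socket.io";
import { createShardedAdapter } from "@socket.io/redis-adapter";
import { createCluster } from "redis";

// Placeholder endpoints; replace with your own cluster's nodes.
const REDIS_NODES = [{ url: "redis://localhost:7000" }];
const pub = createCluster({ rootNodes: REDIS_NODES });
const sub = pub.duplicate();
await Promise.all([pub.connect(), sub.connect()]);

const httpServer = createServer();
// WebSocket-only transport: no long-polling fallback, so no sticky sessions.
const io = new Server(httpServer, { transports: ["websocket"] });
io.adapter(createShardedAdapter(pub, sub));

// One room per call; the adapter relays room broadcasts across pods.
io.of("/calls").on("connection", (socket) => {
  socket.on("join", ({ callId }) => socket.join(`call:${callId}`));
});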
Use @socket.io/redis-adapter 8.x or newer; older versions do not support sharded pub/sub.
Set transports: ["websocket"] on the server. Long-polling triples your CPU at scale.
Track a socket_io_connected Prometheus gauge and redis_pubsub_channels to catch broker pressure.
Do I still need sticky sessions for upgrade? Only if you allow long-polling fallback. Pure WebSocket transport opens once and stays, so sticky sessions are unnecessary.
What is the per-pod ceiling? With uws enabled, ~50k. With the default engine, ~30k. RAM, not CPU, is usually the bound — 8 KB per connection is realistic.
Can I use a single Redis instance? Up to about 60k connections, yes. Beyond that, switch to clustered/sharded pub/sub.
How do I rate limit per user? Socket.IO exposes connection middleware on the server (io.use); record user_id → connection_count in Redis, reject the handshake above a threshold, and decrement the count when the socket disconnects.
What about cross-region? Use a separate Redis cluster per region with a federation layer (NATS or Kafka) bridging publishes you actually need cross-region. Do not stretch one Redis cluster across regions.
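For the per-user rate limit described in the FAQ above, the counting logic can be sketched as follows. `CounterStore`, `InMemoryStore`, `tryAcquire`, and `release` are hypothetical names introduced for illustration. In production the store would presumably wrap Redis `INCR` and `DECR`; the in-memory version only exists so the sketch runs on its own.

```typescript
// Hypothetical abstraction over an atomic counter (e.g. Redis INCR/DECR).
interface CounterStore {
  incr(key: string): Promise<number>;
  decr(key: string): Promise<number>;
}

// In-memory stand-in so the sketch is self-contained; not shared across pods.
class InMemoryStore implements CounterStore {
  private counts = new Map<string, number>();

  async incr(key: string): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }

  async decr(key: string): Promise<number> {
    const next = Math.max(0, (this.counts.get(key) ?? 0) - 1);
    this.counts.set(key, next);
    return next;
  }
}

// Resolves to true if the connection is admitted. Incrementing first and
// rolling back on overflow avoids a read-then-write race between pods.
async function tryAcquire(store: CounterStore, userId: string, max: number): Promise<boolean> {
  const count = await store.incr(`conn:${userId}`);
  if (count > max) {
    await store.decr(`conn:${userId}`);
    return false;
  }
  return true;
}

// Call when the socket disconnects so the slot is freed.
async function release(store: CounterStore, userId: string): Promise<void> {
  await store.decr(`conn:${userId}`);
}
```

In a Socket.IO server this would sit inside `io.use((socket, next) => ...)`, calling `next(new Error(...))` when `tryAcquire` resolves to false and `release` from the socket's `disconnect` handler. Where the user ID comes from depends on your auth setup. A pod that crashes never runs its disconnect handlers, so a real deployment would also need some expiry or reconciliation for stale counts.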
CallSphere supports 115+ database tables and 90+ tools across six verticals — the dashboard fan-out is one piece of that. Start the 14-day trial at $149/$499/$1499.
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.