---
title: "Cloudflare Workers + Durable Objects at 10k Concurrent: Real Cost"
description: "We modeled 10,000 concurrent voice agent WebSockets on Cloudflare. With hibernation and the 20:1 message ratio, the bill lands surprisingly low. Here is the line-by-line math."
canonical: https://callsphere.ai/blog/vw2c-cloudflare-workers-durable-objects-10k-concurrent-cost
category: "AI Infrastructure"
tags: ["Cloudflare", "Durable Objects", "WebSockets", "Cost", "Edge"]
author: "CallSphere Team"
published: 2026-04-12T00:00:00.000Z
updated: 2026-05-07T09:32:11.123Z
---

# Cloudflare Workers + Durable Objects at 10k Concurrent: Real Cost

> We modeled 10,000 concurrent voice agent WebSockets on Cloudflare. With hibernation and the 20:1 message ratio, the bill lands surprisingly low. Here is the line-by-line math.

## The cost problem

```mermaid
flowchart LR
  Twilio["Twilio Media Streams"] -- "WS · μlaw 8kHz" --> Bridge["FastAPI Bridge :8084"]
  Bridge -- "PCM16 24kHz" --> OAI["OpenAI Realtime"]
  OAI --> Bridge
  Bridge --> Twilio
  Bridge --> Logs[(structured logs · OTel)]
```

CallSphere reference architecture

If you are building a chat or voice agent platform that needs to hold persistent WebSocket connections — for control messages, transcript streaming, or session state — the cheapest place to do that in 2026 is almost always Cloudflare Workers + Durable Objects.

But the pricing has three knobs (requests, GB-seconds, and the WebSocket message ratio), and it is easy to confuse an incoming WebSocket message with a request and end up with a billing surprise. Let us walk through it.

## How Cloudflare prices it

**Workers Paid plan ($5/month minimum) includes:**

- 10M Workers requests/month
- 30M CPU-ms/month

**Durable Objects pricing on top of Workers Paid:**

- 1M DO requests/month included; $0.15 per million after
- 400k GB-seconds/month included; $12.50 per million GB-s after
- WebSocket incoming messages: 20:1 billing ratio (20 messages = 1 billable request)
- Outgoing messages and protocol pings: free
- Each new WebSocket connection counts as 1 request
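
The request rules above can be sketched as a small calculator. This is a minimal sketch using the prices listed above, not Cloudflare's billing code: the function names are hypothetical, and applying the 20:1 ratio to the monthly aggregate (rather than per connection) is a modeling assumption.

```python
# Sketch of Durable Objects request billing, using the prices listed above.
# Function names are illustrative, not part of any Cloudflare SDK.

WS_MESSAGE_RATIO = 20          # 20 incoming WebSocket messages = 1 billable request
INCLUDED_REQUESTS = 1_000_000  # DO requests included per month
PRICE_PER_MILLION = 0.15       # USD per million requests beyond the included tier


def billable_requests(new_connections: int, incoming_messages: int) -> float:
    """Each new connection is 1 request; incoming messages bill at 20:1.

    Outgoing messages and protocol pings are free, so they are not inputs.
    """
    return new_connections + incoming_messages / WS_MESSAGE_RATIO


def request_cost(requests: float) -> float:
    """Monthly USD cost after subtracting the included requests."""
    overage = max(0.0, requests - INCLUDED_REQUESTS)
    return overage * PRICE_PER_MILLION / 1_000_000


# One 8-minute session receiving 5 messages per second:
per_session = billable_requests(new_connections=1, incoming_messages=8 * 60 * 5)
print(per_session)  # 121.0
```

The point of the example: a session that receives 2,400 messages costs 121 billable requests, not 2,401.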

**Storage (SQLite-backed DO, billed January 2026 onward):**

- 25B row reads/month free, then $0.001/M
- 50M row writes/month free, then $1.00/M
- 5 GB-month included, then $0.20/GB-month
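
The storage tiers can be modeled the same way. The sketch below is illustrative only: `overage` and `storage_cost` are hypothetical helper names, and the sample month is made up to exercise each tier.

```python
# Sketch of SQLite-backed Durable Object storage billing, using the tiers above.
# Names are illustrative; this is not Cloudflare's billing code.

MILLION = 1_000_000


def overage(usage: float, included: float) -> float:
    """Usage beyond the included monthly allowance (never negative)."""
    return max(0.0, usage - included)


def storage_cost(rows_read: float, rows_written: float, stored_gb: float) -> dict:
    """Monthly USD cost per storage line item."""
    return {
        "reads": overage(rows_read, 25_000 * MILLION) * 0.001 / MILLION,
        "writes": overage(rows_written, 50 * MILLION) * 1.00 / MILLION,
        "stored": overage(stored_gb, 5) * 0.20,
    }


# Hypothetical month: 1B row reads, 200M row writes, 12 GB stored.
print(storage_cost(rows_read=1e9, rows_written=200e6, stored_gb=12))
```

Beyond the included tier, a row write costs 1,000 times as much as a row read, so writes are the storage number to watch.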

**Hibernation API:**

- Clients stay connected while the DO is hibernated
- GB-second charges do NOT accrue during hibernation

## Honest math: 10,000 concurrent WebSockets

Assume a typical voice agent control plane:

- 10,000 concurrent connections held for an average of 8 minutes each
- 5 control messages per second per connection (transcript chunks, tool events)
- Each connection makes 80 storage row writes (turn-by-turn log)

**Connection count math:**

- 10,000 concurrent × (60 / 8) connections per hour per slot = 75,000 new connections/hour
- 75k × 24 × 30 = **54M connections/month**
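
The turnover step is worth making explicit, since every later line item multiplies by it. A minimal sketch, assuming steady load around the clock as the math above does; the function name and its parameters are ours:

```python
# Sketch: monthly connection count from concurrency and average session length.
# Assumes steady load for 24 hours x 30 days, matching the math above.

def monthly_connections(concurrent: int, avg_minutes: float,
                        hours_per_day: int = 24, days: int = 30) -> float:
    """Each concurrent slot turns over 60 / avg_minutes times per hour."""
    per_hour = concurrent * (60 / avg_minutes)
    return per_hour * hours_per_day * days


print(f"{monthly_connections(10_000, 8):,.0f}")  # 54,000,000
```

If the 10,000-connection peak only holds for part of the day, pass a smaller `hours_per_day`; the 54M figure is the always-at-peak case.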

**Connection cost (each new = 1 request):**

- 54M × $0.15 / 1M = **$8.10**

**Incoming WebSocket message cost:**

- 54M conns × 8 min × 60s × 5 msgs/s = 129.6B incoming messages
- 129.6B × (1 / 20) ratio = 6.48B billable requests
- 6.48B × $0.15 / 1M = **$972**

**GB-seconds (assume 32MB per DO instance, hibernated 50% of the time):**

- Active DO-seconds: 54M × 8 min × 60s × 0.5 = 12.96B DO-seconds active
- Active DO GB-seconds: 12.96B × 0.032 = 415M GB-s
- Cost: (415M − 0.4M free) × $12.50 / 1M = **$5,180**

That is the big line item: GB-seconds. **Hibernation matters enormously here.** If you hibernate 80% of the time instead of 50%, GB-seconds drop to ~$2,070.

**Storage:**

- 54M conns × 80 writes = 4.32B row writes/month
- (4.32B − 50M free) × $1 / 1M = **$4,270**

**Storage reads (assume 5x per write):**

- 21.6B reads, under the 25B/month included → **~$0**

**Egress / Workers requests:**

- 54M × 1 request per connection at $0.15/M, already counted in the connection cost above

**Total at 10k concurrent:** ~$5,180 GB-seconds + ~$4,270 storage writes + ~$980 requests = **~$10,430/month**.

That is roughly **$1.04 per concurrent connection per month** (about $0.19 per 1,000 sessions), which is extraordinary if you are coming from Pusher, Ably, or self-hosted Erlang.

## Optimization wins

1. **Aggressive hibernation.** The 80% hibernated case cuts the GB-second line from ~$5,180 to ~$2,070, a 60% reduction.
2. **Batch row writes.** Going from 80 to 12 writes per call cuts storage from $4,270 to ~$600.
3. **Use Workers WebSockets directly without a DO** when you do not need state — that path bills at flat Workers rates and avoids the 20:1 ratio entirely. Best for fanout-only patterns.
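
To see how the first two items move the bill, here is a small sweep over the two parameters. It is a sketch under the workload assumptions used above (54M sessions/month, 8 minutes each, 32MB per DO); the function names are hypothetical.

```python
# Sketch: how hibernation and write batching move the two largest line items.
# Workload numbers are the article's assumptions; names are illustrative.

CONNECTIONS = 54_000_000   # sessions per month
SESSION_SECONDS = 8 * 60   # average session length
MEMORY_GB = 0.032          # assumed 32MB per DO instance


def duration_cost(hibernated_fraction: float) -> float:
    """USD for GB-seconds; hibernated time accrues no duration charge."""
    active_seconds = CONNECTIONS * SESSION_SECONDS * (1 - hibernated_fraction)
    gb_seconds = active_seconds * MEMORY_GB
    return max(0.0, gb_seconds - 400_000) * 12.50 / 1_000_000


def write_cost(writes_per_call: int) -> float:
    """USD for row writes beyond the 50M included per month."""
    rows = CONNECTIONS * writes_per_call
    return max(0.0, rows - 50_000_000) * 1.00 / 1_000_000


for h in (0.5, 0.8):
    print(f"hibernated {h:.0%}: ${duration_cost(h):,.0f}")
for w in (80, 12):
    print(f"{w} writes/call: ${write_cost(w):,.0f}")
```

Both levers are linear, so it is cheap to plug in your own measured hibernation rate and write count before committing to a design.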

## How CallSphere optimizes

CallSphere uses Cloudflare Workers + Durable Objects for the chat agent control plane on three of the 6 verticals (Sales, Salon GlamBook, OneRoof Real Estate) — voice audio itself flows over OpenAI Realtime or LiveKit, but the session state, transcript streaming, and per-tenant routing live on Cloudflare.

We hit ~85% hibernation rate on idle DOs, batch row writes to 8 per call, and use a single Worker route for all 6 verticals (multi-tenant) with the tenant ID hashed into the DO ID. Net cost across 6 verticals — 37 agents, 90+ tools, 115+ DB tables — is well under $400/mo on Cloudflare for the realtime control plane.

That savings is part of why our [pricing tiers](/pricing) ($149 / $499 / $1499) work for SMB margins and the [affiliate program](/affiliate) is sustainable. Try the [14-day no-card trial](/trial) to see the snappy chat product cards on [/demo](/demo) — that is the Cloudflare-DO pipeline in action.

## Optimization checklist

1. Use Hibernation API everywhere idle WebSocket connections sit.
2. Batch row writes to once per turn instead of per message.
3. Compact transcript snapshots — store deltas, not full state.
4. Use Workers WebSockets without a DO if you do not need state (for fanout).
5. Avoid storing audio in DO storage — push it to R2 or upstream.
6. Pin DO instances per tenant — better cache locality, lower CPU-ms.
7. Use SQLite-backed DO over KV-backed (cheaper, better included tier).
8. Watch the 20:1 ratio: chatty clients eat your request budget.
9. Keep application-level heartbeats infrequent and only on idle paths: every heartbeat message that reaches a handler wakes a hibernated DO.
10. Re-test cost monthly — Cloudflare added storage billing in January 2026.

## FAQ

**What is the 20:1 WebSocket ratio?**
Cloudflare counts 20 incoming WebSocket messages as 1 billable request — making chatty real-time apps cheaper.

**Does hibernation work mid-call?**
Yes — if no JavaScript handler is actively running, the DO can hibernate and the WebSocket stays open. Costs only resume when a handler runs.

**Can I run STT and LLM in a Worker?**
You can call out to OpenAI/Deepgram from Workers, but you should not run inference inside a Worker — use Workers AI or external GPU.

**Is this cheaper than self-hosted Erlang/Phoenix?**
At under 50k concurrent, Cloudflare wins by 5–10× on TCO. Above 250k, self-hosted starts to compete.

**What about R2 for audio storage?**
$0.015/GB-month with zero egress is the cheapest place to keep call recordings. Pair with DO for control plane.

## Sources

- Cloudflare Durable Objects Pricing — [https://developers.cloudflare.com/durable-objects/platform/pricing/](https://developers.cloudflare.com/durable-objects/platform/pricing/)
- Cloudflare Workers Pricing — [https://developers.cloudflare.com/workers/platform/pricing/](https://developers.cloudflare.com/workers/platform/pricing/)
- Cloudflare Hibernation API — [https://developers.cloudflare.com/durable-objects/best-practices/websockets/](https://developers.cloudflare.com/durable-objects/best-practices/websockets/)
- New Workers Pricing announcement — [https://blog.cloudflare.com/workers-pricing-scale-to-zero/](https://blog.cloudflare.com/workers-pricing-scale-to-zero/)

