By Sagar Shankaran, Founder of CallSphere
Each Cloudflare agent runs on a Durable Object with its own SQLite, WebSockets, and scheduling. Agents Week 2026 shipped MCP, Code Mode, and 10GB SQLite per agent.
Key takeaways
TL;DR: Cloudflare's Agents SDK puts each agent on its own Durable Object with a stateful SQLite database, WebSockets, and a scheduler. Agents Week 2026 shipped MCP support, Code Mode, 10GB SQLite per object, and Durable Object Facets for dynamically created tenants. The economic argument, pay per request at the edge with no per-tenant infrastructure to run, is hard to beat until request volume gets very large.
flowchart TD
    Client[MCP client · Claude Desktop] --> MCP[MCP server]
    MCP --> Tool1[Tool: Calendar]
    MCP --> Tool2[Tool: CRM]
    MCP --> Tool3[Tool: KB search]
    Tool1 --> SaaS1[(Calendly)]
    Tool2 --> SaaS2[(Salesforce)]
    Tool3 --> SaaS3[(Notion)]

A Cloudflare Agent is a class that extends Agent. When you instantiate it, Cloudflare assigns it to a Durable Object: a single-instance, globally addressable, stateful micro-server. That Durable Object has its own SQLite database (now 10GB per object), can hold open WebSockets to clients, can schedule itself for future work, and can call any other agent or service. Deploy once and the platform runs your agents across its global network, scaling to tens of millions of instances.
The mental model is closer to actor-based concurrency (Erlang, Akka) than to traditional serverless. Each agent is a long-lived actor with its own state.
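To make the actor comparison concrete, here is a minimal sketch of an actor in plain TypeScript: private state plus a mailbox that is drained one message at a time. The `Actor` class and its method names are hypothetical and exist only for illustration; this is not the Agents SDK or the Durable Objects runtime, just the general shape of the model.

```typescript
// Hypothetical illustration of the actor model; not a Cloudflare API.
type Handler<S, M> = (state: S, message: M) => S;

class Actor<S, M> {
  private mailbox: M[] = [];
  private draining = false;

  constructor(private state: S, private readonly handler: Handler<S, M>) {}

  // Enqueue a message; if no handler is running, drain the mailbox now.
  send(message: M): void {
    this.mailbox.push(message);
    if (this.draining) return; // the running drain loop will pick this up
    this.draining = true;
    try {
      while (this.mailbox.length > 0) {
        const next = this.mailbox.shift() as M;
        this.state = this.handler(this.state, next);
      }
    } finally {
      this.draining = false;
    }
  }

  // Read-only view of the actor's current state.
  snapshot(): S {
    return this.state;
  }
}

// Two actors with the same behaviour but fully separate state.
const countMessages: Handler<{ count: number }, string> = (state) => ({
  count: state.count + 1,
});
const visitorA = new Actor({ count: 0 }, countMessages);
const visitorB = new Actor({ count: 0 }, countMessages);

visitorA.send("hello");
visitorA.send("again");
visitorB.send("hi");
```

Each actor only ever sees its own state, and messages are handled strictly in arrival order, which is the property that makes per-agent state easy to reason about.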
Pick Cloudflare Agents when:
Skip when:
The classic MCP pattern: the model sees 200 tools in its context, picks one, calls it. That eats tokens.
Code Mode flips it: instead of describing every operation as a separate tool, expose a typed SDK to the model. The model writes a few lines of code (executed safely in the Worker isolate) that calls multiple SDK methods at once. One thousand tokens of SDK definitions replaces hundreds of tool definitions, and the model can compose operations naturally.
This is a particularly good fit for the Cloudflare API itself, which has hundreds of endpoints. Code Mode wraps them all in 1,000 tokens.
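A rough sketch of the Code Mode idea, under stated assumptions: the `CrmSdk` interface, the in-memory implementation, and `modelWrittenSnippet` are all hypothetical names invented for this example, and nothing here is the actual Code Mode API. It shows only the shape of the trade: one small typed surface, and a short piece of code that composes several operations in a single round trip. Sandboxed execution of model-generated code is deliberately left out.

```typescript
// Hypothetical typed SDK the model would see instead of one tool per operation.
interface Contact { id: string; name: string; email: string; }
interface Deal { contactId: string; amount: number; open: boolean; }

interface CrmSdk {
  findContact(email: string): Contact | undefined;
  listOpenDeals(contactId: string): Deal[];
  addNote(contactId: string, text: string): void;
}

// In-memory stand-in for a real CRM backend, for illustration only.
const contacts: Contact[] = [{ id: "c1", name: "Ada", email: "ada@example.com" }];
const deals: Deal[] = [
  { contactId: "c1", amount: 1200, open: true },
  { contactId: "c1", amount: 800, open: true },
  { contactId: "c1", amount: 500, open: false },
];
const notes: { contactId: string; text: string }[] = [];

const sdk: CrmSdk = {
  findContact: (email) => contacts.find((c) => c.email === email),
  listOpenDeals: (contactId) =>
    deals.filter((d) => d.contactId === contactId && d.open),
  addNote: (contactId, text) => {
    notes.push({ contactId, text });
  },
};

// The kind of snippet a model might write: three SDK calls composed in one pass,
// instead of three separate tool-call round trips.
function modelWrittenSnippet(crm: CrmSdk): string {
  const contact = crm.findContact("ada@example.com");
  if (!contact) return "no such contact";
  const openDeals = crm.listOpenDeals(contact.id);
  const total = openDeals.reduce((sum, d) => sum + d.amount, 0);
  crm.addNote(contact.id, `Open pipeline: ${total}`);
  return `${contact.name} has ${openDeals.length} open deals worth ${total}`;
}

const summary = modelWrittenSnippet(sdk);
```

The model only needs the three method signatures in context, and intermediate results (the contact id, the deal list) never pass back through the model between calls.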
CallSphere's voice runtime is OpenAI Realtime over WebRTC, hosted on AWS. We don't run our voice agents on Cloudflare today because the model provider is the bottleneck, not the orchestration layer.
But for our public-facing widget agents — the chat widget on every CallSphere landing page, the SEO content sandbox, the per-customer dashboard agent — Cloudflare Agents are the right architecture: a Durable Object per visitor session, WebSocket-native, no cold start, MCP-enabled. We've prototyped this for the chat widget and the latency story is excellent.
For our affiliate program we're considering moving the affiliate-side analytics agent to Cloudflare — each affiliate gets a Durable Object Facet with their own data, the agent answers their questions in real time, and we don't operate any of it.
Pricing: $149 / $499 / $1499. 14-day trial.
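1. Run npm create cloudflare@latest and pick the Agent template.
2. Write a class that extends Agent, with onMessage, onConnect, and onSchedule handlers.
3. Call this.schedule(date, "method") to wake the agent at a future time.
4. Deploy with wrangler deploy.

import { Agent, type Connection, type WSMessage } from "agents";

// Env is the Worker bindings type (for example, as generated by `wrangler types`).
export class TriageAgent extends Agent<Env, { count: number }> {
  initialState = { count: 0 };

  // WebSocket handler: receives the connection plus the raw message.
  async onMessage(connection: Connection, message: WSMessage) {
    const c = this.state.count + 1;
    this.setState({ count: c });
    const content = typeof message === "string" ? message : "[binary message]";
    // Create the table on first use so the INSERT cannot fail on a fresh agent.
    this.sql`CREATE TABLE IF NOT EXISTS turns (content TEXT, at INTEGER)`;
    this.sql`INSERT INTO turns (content, at) VALUES (${content}, ${Date.now()})`;
    // Pass a Date for an absolute time; a bare number is read as a delay in seconds.
    if (c % 10 === 0) await this.schedule(new Date(Date.now() + 86_400_000), "summarize");
    connection.send(`Got message ${c}`);
  }

  // Invoked by the scheduler one day after every tenth message.
  async summarize() {
    const rows = this.sql`SELECT * FROM turns ORDER BY at DESC LIMIT 100`;
    // ...summarize and send to user
  }
}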
Facets let a single Worker instantiate Durable Objects with isolated SQLite databases per facet. The mental model: one parent Worker, many tenant facets, each with its own data namespace and lifecycle.
Why this matters for AI products: every customer can have their own agent with their own memory and history, without you operating per-tenant infrastructure. The platform handles instantiation; you write one agent class and it's automatically multi-tenant.
This is what unlocks "build-your-own-agent" surfaces — let your customer's agent live on a Facet they generate, with their own data isolated from everyone else's. No shared-tenancy security questions; the boundary is in the platform.
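The "one parent, many tenant facets" model can be pictured with a small in-memory sketch. `ParentWorker`, `TenantFacet`, and every method on them are hypothetical names made up for this illustration; this is not the Facets API, and on the real platform the isolation boundary is enforced by Cloudflare rather than by a Map in your own code.

```typescript
// Hypothetical stand-in for one tenant's isolated store; not the Facets API.
class TenantFacet {
  private readonly notes: string[] = [];

  remember(note: string): void {
    this.notes.push(note);
  }

  recall(): readonly string[] {
    return this.notes;
  }
}

// Hypothetical parent that creates facets on demand and can retire them.
class ParentWorker {
  private readonly facets = new Map<string, TenantFacet>();

  // Return the tenant's facet, creating it the first time it is requested.
  facetFor(tenantId: string): TenantFacet {
    let facet = this.facets.get(tenantId);
    if (!facet) {
      facet = new TenantFacet();
      this.facets.set(tenantId, facet);
    }
    return facet;
  }

  // End a tenant's lifecycle; its data goes away with it.
  retire(tenantId: string): boolean {
    return this.facets.delete(tenantId);
  }
}

const parent = new ParentWorker();
parent.facetFor("affiliate-1").remember("prefers weekly reports");
parent.facetFor("affiliate-2").remember("asked about payout dates");

// Each tenant only ever sees its own notes.
const affiliateOneNotes = parent.facetFor("affiliate-1").recall();
```

The agent class is written once; which tenant's data it sees is decided entirely by which facet it is running in.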
Cloudflare Workflows is the durable execution engine that lives next to Agents. When your agent needs to do a multi-step process that survives restarts (research a prospect → draft email → wait for approval → send), you delegate to a Workflow. The Workflow checkpoints at each step; if anything dies it resumes from the last checkpoint.
The 2026 control-plane rearchitecture removed the older concurrency caps. You can now create Workflows at much higher rates and run far more concurrently — useful for any agent that fans out to multiple parallel steps.
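The checkpoint-and-resume behaviour described above can be sketched in a few lines. This is a simplified in-memory model, not the Cloudflare Workflows API: `runStep`, `outreach`, and the `Checkpoints` map are hypothetical names, the steps are synchronous to keep the example short, and the wait-for-approval step is omitted. A real engine would persist checkpoints durably and run asynchronous steps.

```typescript
// In-memory stand-in for durable storage; a real engine would persist this.
type Checkpoints = Map<string, unknown>;

// Run a named step once: replay the saved result if present, else execute and save.
function runStep<T>(checkpoints: Checkpoints, name: string, work: () => T): T {
  if (checkpoints.has(name)) return checkpoints.get(name) as T;
  const result = work();
  checkpoints.set(name, result); // only reached if the step finished
  return result;
}

const executed: string[] = []; // records which step bodies actually ran

// Hypothetical three-step process: research, draft, send.
function outreach(checkpoints: Checkpoints, failAtSend: boolean): string {
  const research = runStep(checkpoints, "research", () => {
    executed.push("research");
    return "Acme is hiring support staff";
  });
  const draft = runStep(checkpoints, "draft", () => {
    executed.push("draft");
    return `Hi Acme, noticed that ${research}.`;
  });
  return runStep(checkpoints, "send", () => {
    executed.push("send");
    if (failAtSend) throw new Error("simulated crash before send completes");
    return `sent: ${draft}`;
  });
}

const store: Checkpoints = new Map();
let crashed = false;
try {
  outreach(store, true); // first attempt dies in the last step
} catch {
  crashed = true;
}
// The retry replays "research" and "draft" from checkpoints and reruns only "send".
const finalResult = outreach(store, false);
```

Because a checkpoint is written only after a step finishes, the failed step is the only one that runs twice, so a step with side effects should be safe to retry.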
Cloudflare's MCP support is bidirectional: agents can act as MCP servers (exposing their tools to other LLMs) and as MCP clients (mounting external MCP servers as tool sources). Auth and transport are built-in, including OAuth flows for end-user-authorized access.
The Code Mode MCP server for the entire Cloudflare API is the canonical example of a "big" MCP surface compressed into 1,000 tokens. Steal the pattern for your own large API surfaces.
At low volumes Cloudflare Agents can be cheaper than running your own Postgres + Node service. At very high volumes (hundreds of millions of requests/month) the per-request pricing crosses over and a self-managed cluster wins. We model this for each new agent before picking the platform.
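A minimal sketch of that crossover model follows. Every number in it is a made-up placeholder chosen so the arithmetic is easy to follow; none of them are Cloudflare's prices or our actual costs, and the `CostInputs` fields are hypothetical. Plug in current list prices and your own infrastructure costs before drawing any conclusion.

```typescript
// All fields are in the same currency; values used below are placeholders only.
interface CostInputs {
  usagePerMillion: number; // usage-based price per 1M requests
  clusterFixedMonthly: number; // flat monthly cost of a self-managed cluster
  clusterPerMillion: number; // marginal cost per 1M requests on that cluster
}

// Monthly cost of each option at a given volume (in millions of requests).
function monthlyCosts(millionsOfRequests: number, inputs: CostInputs) {
  return {
    usageBased: millionsOfRequests * inputs.usagePerMillion,
    selfManaged:
      inputs.clusterFixedMonthly + millionsOfRequests * inputs.clusterPerMillion,
  };
}

// Volume (in millions of requests per month) at which both options cost the same.
function crossoverMillions(inputs: CostInputs): number {
  const marginalGap = inputs.usagePerMillion - inputs.clusterPerMillion;
  if (marginalGap <= 0) return Infinity; // usage-based pricing never loses
  return inputs.clusterFixedMonthly / marginalGap;
}

// Placeholder inputs, NOT real prices.
const placeholder: CostInputs = {
  usagePerMillion: 10,
  clusterFixedMonthly: 2000,
  clusterPerMillion: 2,
};

const crossover = crossoverMillions(placeholder); // 2000 / (10 - 2) = 250
const below = monthlyCosts(100, placeholder);
const above = monthlyCosts(400, placeholder);
```

With these placeholder inputs the lines cross at 250 million requests per month: below that the usage-based option is cheaper, above it the fixed cluster wins. The real crossover depends entirely on the inputs, including the engineering time a self-managed cluster consumes, which this sketch folds into the fixed monthly figure.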
Is the SDK GA or preview? Preview as of Agents Week 2026, with stable surfaces shipping incrementally. Production-ready for greenfield projects; expect API tweaks.
What does it cost? Durable Objects pricing + Workers requests. SQLite storage starts billing in January 2026; 10GB per object included on the standard plan.
Can I run Python here? Limited via Pyodide. JS/TS is the native path.
Does it integrate with the OpenAI Agents SDK? You can use OpenAI's models from within a Cloudflare Agent; the wrapping is in JS.
Where do I see this on CallSphere? Book a demo and we'll show our chat widget Cloudflare Agent prototype.
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.