TL;DR: Cloudflare's Agents SDK puts each agent on its own Durable Object with a SQLite database, WebSockets, and a scheduler. Agents Week 2026 shipped MCP support, Code Mode, 10GB SQLite per object, and Durable Object Facets for dynamically created tenants. The economic argument (pay per request at the edge, nothing for idle) is hard to beat for bursty workloads, though at very high request volumes a self-managed cluster can come out cheaper.

The model in one paragraph

flowchart TD
  Client[MCP client · Claude Desktop] --> MCP[MCP server]
  MCP --> Tool1[Tool: Calendar]
  MCP --> Tool2[Tool: CRM]
  MCP --> Tool3[Tool: KB search]
  Tool1 --> SaaS1[(Calendly)]
  Tool2 --> SaaS2[(Salesforce)]
  Tool3 --> SaaS3[(Notion)]

CallSphere reference architecture

A Cloudflare Agent is a class that extends Agent. When you instantiate it, Cloudflare assigns it to a Durable Object — a single-instance, globally-addressable, stateful micro-server. That Durable Object has its own SQLite database (now 10GB per object), can hold open WebSockets to clients, can schedule itself for future work, and can call any other agent or service. Deploy once and the platform runs your agents across its global network, scaling to tens of millions of instances.

The mental model is closer to actor-based concurrency (Erlang, Akka) than to traditional serverless. Each agent is a long-lived actor with its own state.
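
To make the actor comparison concrete, here is a minimal in-memory sketch in plain TypeScript. It does not use the Agents SDK: CounterActor and ActorRegistry are hypothetical names, and a Map stands in for the platform routing a name to a single Durable Object.

```typescript
// Hypothetical stand-in for one agent instance: private state that only
// changes in response to messages.
class CounterActor {
  readonly name: string;
  private count = 0;

  constructor(name: string) {
    this.name = name;
  }

  // Handle one message and return a reply; state persists between calls.
  receive(message: string): string {
    this.count += 1;
    return `${this.name} has seen ${this.count} message(s); latest: ${message}`;
  }
}

// Hypothetical registry: the same name always resolves to the same instance,
// which is the property the platform provides for Durable Objects.
class ActorRegistry {
  private readonly actors = new Map<string, CounterActor>();

  get(name: string): CounterActor {
    let actor = this.actors.get(name);
    if (!actor) {
      actor = new CounterActor(name);
      this.actors.set(name, actor);
    }
    return actor;
  }
}

const registry = new ActorRegistry();
registry.get("user-42").receive("hello");
const reply = registry.get("user-42").receive("again"); // same instance as above
const other = registry.get("user-7").receive("hi"); // separate instance, separate state
```

The point of the sketch is the addressing rule, not the storage: callers never hold a reference to shared state, they only send messages to a named instance.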

What Agents Week 2026 shipped

Agents SDK preview — the next edition of the SDK, with built-in real-time voice, MCP, scheduling, tools, and persistent state.
MCP support — agents can expose their tools to other agents and LLMs via MCP. Build remote MCP clients with transport and auth built-in.
Code Mode — an MCP server that exposes the entire Cloudflare API in 1,000 tokens. The model writes code against a typed SDK rather than calling 500 individual tools. Big context-budget win.
Durable Object Facets — Dynamic Workers can instantiate Durable Objects with isolated SQLite databases, enabling per-tenant agent infrastructure that's generated on-the-fly.
Storage billing for SQLite-backed Durable Objects — enabled in January 2026 with up to 10GB per Durable Object.
Workflows control plane — higher concurrency and creation rate limits for the durable execution engine.

When Cloudflare Agents wins

Pick Cloudflare Agents when:

Your agents need to persist state per-user or per-conversation without you operating Postgres.
Your workload is bursty and globally distributed — pay per request, not for idle.
You want WebSocket-native real-time baked into the agent, not bolted on.
You're already on the Cloudflare stack (Workers, R2, D1, Queues).

Skip when:

You need GPU inference in the same process as the agent (Workers AI serves hosted models, but it won't run your custom GPU model).
You need deep Python ecosystem support — the platform is JS/TS first.
Your data must stay in a single region with strict residency that Cloudflare doesn't offer (rare).

Code Mode — the token-efficiency trick

The classic MCP pattern: the model sees 200 tool definitions in its context, picks one, and calls it. Those definitions consume context tokens on every request, whether or not the tools are used.

Code Mode flips it: instead of describing every operation as a separate tool, expose a typed SDK to the model. The model writes a few lines of code (executed safely in the Worker isolate) that calls multiple SDK methods at once. One thousand tokens of SDK definitions replaces hundreds of tool definitions, and the model can compose operations naturally.

This is a particularly good fit for the Cloudflare API itself, which has hundreds of endpoints. Code Mode wraps them all in 1,000 tokens.
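
A rough sketch of the idea, in plain TypeScript. The CloudSdk interface, its two methods, and the fake data are all hypothetical and are not the real Cloudflare SDK; the calls are synchronous only to keep the example short. It shows the kind of snippet a model could write against a typed surface, composing several calls in one pass.

```typescript
// Hypothetical typed SDK surface that would be shown to the model
// in place of one tool definition per operation.
interface Zone {
  id: string;
  name: string;
}

interface DnsRecord {
  zoneId: string;
  type: string;
  name: string;
}

interface CloudSdk {
  listZones(): Zone[];
  listRecords(zoneId: string): DnsRecord[];
}

// In-memory fake so the sketch runs without network access or credentials.
const fakeSdk: CloudSdk = {
  listZones: () => [
    { id: "z1", name: "example.com" },
    { id: "z2", name: "example.org" },
  ],
  listRecords: (zoneId) =>
    zoneId === "z1"
      ? [
          { zoneId, type: "A", name: "www" },
          { zoneId, type: "MX", name: "@" },
        ]
      : [{ zoneId, type: "A", name: "www" }],
};

// The kind of snippet a model might write: several SDK calls composed in one
// pass, where the tool-per-operation pattern needs one round trip per call.
function countRecordsByZone(sdk: CloudSdk): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const zone of sdk.listZones()) {
    counts[zone.name] = sdk.listRecords(zone.id).length;
  }
  return counts;
}

const recordCounts = countRecordsByZone(fakeSdk);
```

With the tool-per-operation pattern, the same task would take one listZones call plus one listRecords call per zone, each passing through the model's context.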

How CallSphere thinks about Cloudflare Agents

CallSphere's voice runtime is OpenAI Realtime over WebRTC, hosted on AWS. We don't run our voice agents on Cloudflare today because the model provider is the bottleneck, not the orchestration layer.

But for our public-facing widget agents — the chat widget on every CallSphere landing page, the SEO content sandbox, the per-customer dashboard agent — Cloudflare Agents are the right architecture: a Durable Object per visitor session, WebSocket-native, no cold start, MCP-enabled. We've prototyped this for the chat widget and the latency story is excellent.

For our affiliate program we're considering moving the affiliate-side analytics agent to Cloudflare — each affiliate gets a Durable Object Facet with their own data, the agent answers their questions in real time, and we don't operate any of it.

Build steps — your first Cloudflare Agent

Run npm create cloudflare@latest and pick the Agent template.
Define the agent class extending Agent, with onConnect and onMessage handlers plus a method for each callback you plan to schedule.
Use the this.sql tagged template to query the per-object SQLite database.
Wire MCP servers as tool sources.
Use this.schedule(date, "method") to wake the agent at a future time.
Deploy: wrangler deploy.
Wire to a frontend over WebSocket; the agent address routes to the same Durable Object every time.
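
For the last step, the frontend needs a stable address for each agent instance. The helper below is a sketch that assumes the SDK's default router maps requests of the form /agents/<kebab-case class name>/<instance name>; treat that path shape as an assumption and check it against the current docs. The function names and the host are hypothetical.

```typescript
// Convert a class name such as "TriageAgent" to "triage-agent".
function toKebabCase(className: string): string {
  return className.replace(/([a-z0-9])([A-Z])/g, "$1-$2").toLowerCase();
}

// Build the WebSocket URL for one named instance.
// Assumed routing convention: /agents/<agent>/<instance>.
function agentSocketUrl(host: string, className: string, instanceName: string): string {
  const agent = toKebabCase(className);
  return `wss://${host}/agents/${agent}/${encodeURIComponent(instanceName)}`;
}

const url = agentSocketUrl("example.workers.dev", "TriageAgent", "visitor-123");
```

Passing the same instance name on every visit is what sends a returning visitor back to the same Durable Object; in a browser the result would be handed to new WebSocket(url).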

Code: a tiny stateful agent

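import { Agent, type Connection, type WSMessage } from "agents";

export class TriageAgent extends Agent<Env, { count: number }> {
  initialState = { count: 0 };

  // Create the table once per object, before any message is handled.
  async onStart() {
    this.sql`CREATE TABLE IF NOT EXISTS turns (content TEXT, at INTEGER)`;
  }

  // Called for every WebSocket message sent to this agent instance.
  async onMessage(connection: Connection, message: WSMessage) {
    const content = typeof message === "string" ? message : "[binary]";
    const c = this.state.count + 1;
    this.setState({ count: c });
    this.sql`INSERT INTO turns (content, at) VALUES (${content}, ${Date.now()})`;
    // Every 10th message, schedule a summary 24 hours out. Pass a Date:
    // a bare number is treated as a delay in seconds, not a timestamp.
    if (c % 10 === 0) await this.schedule(new Date(Date.now() + 86_400_000), "summarize");
    connection.send(`Got message ${c}`);
  }

  // Invoked by the scheduler; reads the 100 most recent turns.
  async summarize() {
    const rows = this.sql`SELECT * FROM turns ORDER BY at DESC LIMIT 100`;
    // ...summarize and send to user
  }
}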

Durable Object Facets: multi-tenant agents the right way

Facets let a single Worker instantiate Durable Objects with isolated SQLite databases per facet. The mental model: one parent Worker, many tenant facets, each with its own data namespace and lifecycle.

Why this matters for AI products: every customer can have their own agent with their own memory and history, without you operating per-tenant infrastructure. The platform handles instantiation; you write one agent class and it's automatically multi-tenant.

This is what unlocks "build-your-own-agent" surfaces — let your customer's agent live on a Facet they generate, with their own data isolated from everyone else's. No shared-tenancy security questions; the boundary is in the platform.
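
The isolation and lifecycle properties can be modelled in a few lines of plain TypeScript. This is a conceptual sketch only: it does not use the Facets API, and TenantStore and TenantDirectory are hypothetical names, with an in-memory array standing in for each facet's SQLite database.

```typescript
// Hypothetical stand-in for one facet's isolated database.
class TenantStore {
  private readonly notes: string[] = [];

  add(note: string): void {
    this.notes.push(note);
  }

  list(): string[] {
    return [...this.notes]; // copy, so callers cannot mutate the store
  }
}

// Hypothetical parent: creates a store on first use, one per tenant id,
// and can drop a tenant's data without touching any other tenant.
class TenantDirectory {
  private readonly stores = new Map<string, TenantStore>();

  forTenant(tenantId: string): TenantStore {
    let store = this.stores.get(tenantId);
    if (!store) {
      store = new TenantStore();
      this.stores.set(tenantId, store);
    }
    return store;
  }

  remove(tenantId: string): boolean {
    return this.stores.delete(tenantId);
  }
}

const directory = new TenantDirectory();
directory.forTenant("acme").add("renewal due in March");
directory.forTenant("globex").add("prefers email");

const acmeNotes = directory.forTenant("acme").list();
const removed = directory.remove("globex");
const globexAfterRemoval = directory.forTenant("globex").list(); // fresh, empty store
```

In this model no code path reads across tenants, because the only way to reach a store is through its tenant id. On the platform that boundary is enforced by the runtime, not by application code.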

Workflows for long-running steps

Cloudflare Workflows is the durable execution engine that lives next to Agents. When your agent needs to do a multi-step process that survives restarts (research a prospect → draft email → wait for approval → send), you delegate to a Workflow. The Workflow checkpoints at each step; if anything dies it resumes from the last checkpoint.

The 2026 control-plane rearchitecture removed the older concurrency caps. You can now create Workflows at much higher rates and run far more concurrently — useful for any agent that fans out to multiple parallel steps.
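
The checkpoint-and-resume behaviour can be illustrated without the Workflows API. The sketch below is plain TypeScript with hypothetical names (Step, runWithCheckpoints); steps are synchronous to keep it short, and a Map stands in for the durable checkpoint store that the platform would provide.

```typescript
type Step = { name: string; run: () => string };

// Run steps in order, storing each result in `checkpoints`. A step whose name
// is already present is skipped, which is what makes a rerun act as a resume.
function runWithCheckpoints(steps: Step[], checkpoints: Map<string, string>): string[] {
  const executed: string[] = [];
  for (const step of steps) {
    if (checkpoints.has(step.name)) continue; // finished in an earlier attempt
    const result = step.run(); // throws here if the step fails, before any checkpoint is written
    checkpoints.set(step.name, result);
    executed.push(step.name);
  }
  return executed;
}

let draftAttempts = 0;
const steps: Step[] = [
  { name: "research", run: () => "prospect notes" },
  {
    name: "draft",
    run: () => {
      draftAttempts += 1;
      if (draftAttempts === 1) throw new Error("simulated crash");
      return "email draft";
    },
  },
  { name: "send", run: () => "sent" },
];

const checkpoints = new Map<string, string>();
let firstRunFailed = false;
try {
  runWithCheckpoints(steps, checkpoints);
} catch {
  firstRunFailed = true; // "research" is already checkpointed at this point
}

// Second attempt: "research" is skipped, "draft" and "send" run.
const resumed = runWithCheckpoints(steps, checkpoints);
```

The design choice worth noting is that a checkpoint is written only after its step returns, so a crash mid-step causes that step to run again. Steps therefore need to be safe to retry.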

MCP at the edge

Cloudflare's MCP support is bidirectional: agents can act as MCP servers (exposing their tools to other LLMs) and as MCP clients (mounting external MCP servers as tool sources). Auth and transport are built-in, including OAuth flows for end-user-authorized access.

The Code Mode MCP server for the entire Cloudflare API is the canonical example of a "big" MCP surface compressed into 1,000 tokens. Steal the pattern for your own large API surfaces.
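
For readers who have not seen the server side of MCP, tool discovery and invocation travel as JSON-RPC 2.0 requests with the methods tools/list and tools/call. The dispatcher below is a simplified sketch in plain TypeScript, not the Agents SDK's implementation: it omits input schemas, transport, and auth, and the lookup_customer tool is hypothetical.

```typescript
// Simplified shapes for the two MCP methods used here (JSON-RPC 2.0 framing).
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: { name?: string; arguments?: Record<string, unknown> };
}

type ToolHandler = (args: Record<string, unknown>) => string;

// Hypothetical tool registry for this sketch.
const tools: Record<string, { description: string; handler: ToolHandler }> = {
  lookup_customer: {
    description: "Look up a customer by email",
    handler: (args) => `customer record for ${String(args.email)}`,
  },
};

function handleRpc(req: RpcRequest): Record<string, unknown> {
  if (req.method === "tools/list") {
    const list = Object.entries(tools).map(([name, t]) => ({
      name,
      description: t.description,
    }));
    return { jsonrpc: "2.0", id: req.id, result: { tools: list } };
  }
  if (req.method === "tools/call") {
    const tool = tools[req.params?.name ?? ""];
    if (!tool) {
      return { jsonrpc: "2.0", id: req.id, error: { code: -32602, message: "Unknown tool" } };
    }
    const text = tool.handler(req.params?.arguments ?? {});
    return { jsonrpc: "2.0", id: req.id, result: { content: [{ type: "text", text }] } };
  }
  return { jsonrpc: "2.0", id: req.id, error: { code: -32601, message: "Method not found" } };
}

const callResponse = handleRpc({
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "lookup_customer", arguments: { email: "a@example.com" } },
});
```

An agent acting as an MCP client sends requests of this shape; an agent acting as an MCP server answers them.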

Cost reality check

At low volumes Cloudflare Agents can be cheaper than running your own Postgres + Node service. At very high volumes (hundreds of millions of requests/month) the per-request pricing crosses over and a self-managed cluster wins. We model this for each new agent before picking the platform.
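
The crossover can be estimated with a one-line formula once you have your own prices. The sketch below is plain TypeScript with hypothetical names; every figure in it is a placeholder chosen to make the arithmetic easy, not a real Cloudflare or cloud-provider price.

```typescript
// All prices are inputs. None of the figures used below are real prices.
interface CostInputs {
  perMillionRequests: number; // usage-based platform price per 1M requests
  selfManagedFixedMonthly: number; // servers, database, on-call time
  selfManagedPerMillion: number; // marginal cost per 1M requests on your own cluster
}

// Monthly request volume (in millions) at which both options cost the same.
// Returns Infinity when the usage-based price never exceeds the
// self-managed marginal cost, i.e. there is no crossover.
function breakEvenMillions(c: CostInputs): number {
  const marginalGap = c.perMillionRequests - c.selfManagedPerMillion;
  return marginalGap > 0 ? c.selfManagedFixedMonthly / marginalGap : Infinity;
}

// Placeholder inputs for illustration only.
const placeholder: CostInputs = {
  perMillionRequests: 1.5,
  selfManagedFixedMonthly: 400,
  selfManagedPerMillion: 0.5,
};

const crossover = breakEvenMillions(placeholder); // 400 / (1.5 - 0.5) = 400
```

Below the crossover volume the usage-based option is cheaper; above it the fixed cost of the self-managed cluster is amortized and it wins.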

FAQ

Is the SDK GA or preview? Preview as of Agents Week 2026, with stable surfaces shipping incrementally. Production-ready for greenfield projects; expect API tweaks.

What does it cost? Durable Objects pricing plus Workers requests. Storage billing for SQLite-backed Durable Objects was enabled in January 2026, with up to 10GB per object.

Can I run Python here? Limited via Pyodide. JS/TS is the native path.

Does it integrate with the OpenAI Agents SDK? You can call OpenAI's models from within a Cloudflare Agent; the agent code that wraps those calls is written in JS/TS.

Where do I see this on CallSphere? Book a demo and we'll show our chat widget Cloudflare Agent prototype.

Cloudflare Agents SDK 2026: Durable Objects, MCP, and Code Mode at the Edge
