---
title: "Cloudflare Agents SDK 2026: Durable Objects, MCP, and Code Mode at the Edge"
description: "Each Cloudflare agent runs on a Durable Object with its own SQLite, WebSockets, and scheduling. Agents Week 2026 shipped MCP, Code Mode, and 10GB SQLite per agent."
canonical: https://callsphere.ai/blog/vw3g-cloudflare-agents-sdk-2026-durable-objects-mcp-code-mode
category: "AI Infrastructure"
tags: ["Cloudflare", "Agents SDK", "Durable Objects", "MCP", "Edge"]
author: "CallSphere Team"
published: 2026-04-30T00:00:00.000Z
updated: 2026-05-07T09:59:38.290Z
---

# Cloudflare Agents SDK 2026: Durable Objects, MCP, and Code Mode at the Edge

> Each Cloudflare agent runs on a Durable Object with its own SQLite, WebSockets, and scheduling. Agents Week 2026 shipped MCP, Code Mode, and 10GB SQLite per agent.

> **TL;DR**: Cloudflare's Agents SDK puts each agent on its own Durable Object with a private SQLite database, WebSockets, and a scheduler. Agents Week 2026 shipped MCP support, Code Mode, 10GB of SQLite per object, and Durable Object Facets for dynamically created tenants. The economic argument (pay per request at the edge, nothing for idle) is hard to beat for bursty workloads, although per-request pricing can lose to a self-managed cluster at very high sustained volume.

## The model in one paragraph

```mermaid
flowchart TD
  Client[MCP client · Claude Desktop] --> MCP[MCP server]
  MCP --> Tool1[Tool: Calendar]
  MCP --> Tool2[Tool: CRM]
  MCP --> Tool3[Tool: KB search]
  Tool1 --> SaaS1[(Calendly)]
  Tool2 --> SaaS2[(Salesforce)]
  Tool3 --> SaaS3[(Notion)]
```

CallSphere reference architecture

A Cloudflare Agent is a class that extends `Agent`. When you instantiate it, Cloudflare assigns it to a **Durable Object** — a single-instance, globally-addressable, stateful micro-server. That Durable Object has its own **SQLite database** (now 10GB per object), can hold open **WebSockets** to clients, can **schedule** itself for future work, and can call any other agent or service. Deploy once and the platform runs your agents across its global network, scaling to tens of millions of instances.

The mental model is closer to actor-based concurrency (Erlang, Akka) than to traditional serverless. Each agent is a long-lived actor with its own state.
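To make the actor analogy concrete, here is a minimal in-memory sketch of the addressing behavior: the same name always resolves to the same instance, and each instance keeps its own state. This is an illustration only, not the Cloudflare API; `CounterActor` and `ActorRegistry` are hypothetical names, and a plain `Map` stands in for the platform's global routing.

```typescript
// Illustrative in-memory model only; none of these names are Cloudflare APIs.
class CounterActor {
  private count = 0; // state private to this one instance

  // Handle one message and return the updated per-actor count.
  receive(_message: string): number {
    this.count += 1;
    return this.count;
  }
}

class ActorRegistry {
  private readonly actors = new Map<string, CounterActor>();

  // The same name always resolves to the same instance, created on first use.
  get(actorName: string): CounterActor {
    let actor = this.actors.get(actorName);
    if (!actor) {
      actor = new CounterActor();
      this.actors.set(actorName, actor);
    }
    return actor;
  }
}

const registry = new ActorRegistry();
registry.get("user-42").receive("hello");
const second = registry.get("user-42").receive("again"); // same actor as above
const other = registry.get("user-7").receive("hi"); // a different actor
console.log(second, other); // prints: 2 1
```

The point of the sketch is the isolation: `user-7` never sees the count that belongs to `user-42`, and neither needs a shared database row to keep track of it.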

## What Agents Week 2026 shipped

- **Agents SDK preview** — the next edition of the SDK, with built-in real-time voice, MCP, scheduling, tools, and persistent state.
- **MCP support** — agents can expose their tools to other agents and LLMs via MCP. Build remote MCP clients with transport and auth built-in.
- **Code Mode** — an MCP server that exposes the entire Cloudflare API in 1,000 tokens. The model writes code against a typed SDK rather than calling 500 individual tools. Big context-budget win.
- **Durable Object Facets** — Dynamic Workers can instantiate Durable Objects with isolated SQLite databases, enabling per-tenant agent infrastructure that's generated on-the-fly.
- **Storage billing for SQLite-backed Durable Objects** — enabled in January 2026 with up to 10GB per Durable Object.
- **Workflows control plane** — higher concurrency and creation rate limits for the durable execution engine.

## When Cloudflare Agents wins

Pick Cloudflare Agents when:

- Your agents need to **persist state per-user or per-conversation** without you operating Postgres.
- Your workload is **bursty and globally distributed** — pay per request, not for idle.
- You want **WebSocket-native real-time** baked into the agent, not bolted on.
- You're already on the Cloudflare stack (Workers, R2, D1, Queues).

Skip when:

- You need **GPU inference** in the same process as the agent (Cloudflare's hosted AI models are fine, but you can't run your own custom GPU model alongside the agent).
- You need **deep Python ecosystem support** — the platform is JS/TS first.
- Your data must stay in a **single region with strict residency** that Cloudflare doesn't offer (rare).

## Code Mode — the token-efficiency trick

The classic MCP pattern: the model sees 200 tools in its context, picks one, and calls it. Every one of those tool definitions takes up context on each turn, whether or not the tool is used.

Code Mode flips it: instead of describing every operation as a separate tool, expose a **typed SDK** to the model. The model writes a few lines of code (executed safely in the Worker isolate) that calls multiple SDK methods at once. One thousand tokens of SDK definitions replaces hundreds of tool definitions, and the model can compose operations naturally.

This is a particularly good fit for the Cloudflare API itself, which has hundreds of endpoints. Code Mode wraps them all in 1,000 tokens.
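A rough sketch of the idea, under stated assumptions: the `Sdk` interface, the `fakeSdk` data, and the `runCode` entry point are all hypothetical stand-ins, and the "model-written" function is handwritten here. A real Code Mode setup runs the generated code in a sandbox; this sketch simply calls the function directly.

```typescript
// Hypothetical typed SDK surface; the real Cloudflare SDK types differ.
interface Zone { id: string; zoneName: string }
interface DnsRecord { zoneId: string; type: string; host: string }

interface Sdk {
  listZones(): Zone[];
  listDnsRecords(zoneId: string): DnsRecord[];
}

// In-memory stand-in so the sketch runs without network access.
const fakeSdk: Sdk = {
  listZones: () => [
    { id: "z1", zoneName: "example.com" },
    { id: "z2", zoneName: "example.org" },
  ],
  listDnsRecords: (zoneId) =>
    zoneId === "z1"
      ? [
          { zoneId, type: "A", host: "www" },
          { zoneId, type: "MX", host: "@" },
        ]
      : [{ zoneId, type: "A", host: "@" }],
};

// What the model would write: one snippet composing several SDK calls,
// instead of one tool call (and one model round trip) per operation.
function modelWrittenSnippet(sdk: Sdk): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const zone of sdk.listZones()) {
    counts[zone.zoneName] = sdk
      .listDnsRecords(zone.id)
      .filter((record) => record.type === "A").length;
  }
  return counts;
}

// The host exposes a single "run code" entry point and returns only the result.
function runCode<T>(snippet: (sdk: Sdk) => T, sdk: Sdk): T {
  return snippet(sdk);
}

const aRecordCounts = runCode(modelWrittenSnippet, fakeSdk);
console.log(aRecordCounts); // A-record count per zone
```

With per-operation tools, the same task would take one call to list zones and then one call per zone, with the intermediate results flowing back through the model each time. Here only the final `aRecordCounts` object needs to return to the model's context.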

## How CallSphere thinks about Cloudflare Agents

CallSphere's voice runtime is OpenAI Realtime over WebRTC, hosted on AWS. We don't run our voice agents on Cloudflare today because the model provider is the bottleneck, not the orchestration layer.

But for our **public-facing widget agents** — the chat widget on every CallSphere landing page, the SEO content sandbox, the per-customer dashboard agent — Cloudflare Agents are the right architecture: a Durable Object per visitor session, WebSocket-native, no cold start, MCP-enabled. We've prototyped this for the chat widget and the latency story is excellent.

For our [affiliate program](/affiliate) we're considering moving the affiliate-side analytics agent to Cloudflare — each affiliate gets a Durable Object Facet with their own data, the agent answers their questions in real time, and we don't operate any of it.

Pricing: [$149 / $499 / $1499](/pricing). [14-day trial](/trial).

## Build steps — your first Cloudflare Agent

1. `npm create cloudflare@latest` and pick the Agent template.
2. Define the agent class extending `Agent`, with `onConnect` and `onMessage` handlers and a named method for each scheduled callback.
3. Use the `this.sql` tagged template for the per-object SQLite database.
4. Wire MCP servers as tool sources.
5. Use `this.schedule(date, "method")` to wake the agent at a future time.
6. Deploy: `wrangler deploy`.
7. Wire to a frontend over WebSocket; the agent address routes to the same Durable Object every time.

## Code: a tiny stateful agent

```typescript
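import { Agent, type Connection, type WSMessage } from "agents";

type State = { count: number };

// `Env` is the bindings type generated by `wrangler types`.
export class TriageAgent extends Agent<Env, State> {
  initialState: State = { count: 0 };

  // Create the table up front; the INSERT in onMessage needs it to exist.
  onStart() {
    this.sql`CREATE TABLE IF NOT EXISTS turns (content TEXT, at INTEGER)`;
  }

  async onMessage(connection: Connection, message: WSMessage) {
    const content = typeof message === "string" ? message : "[binary message]";
    const c = this.state.count + 1;
    this.setState({ count: c });
    this.sql`INSERT INTO turns (content, at) VALUES (${content}, ${Date.now()})`;
    // Pass a Date for an absolute time; a bare number is read as a delay in seconds.
    if (c % 10 === 0) {
      await this.schedule(new Date(Date.now() + 86_400_000), "summarize");
    }
    // Reply over the WebSocket connection.
    connection.send(`Got message ${c}`);
  }

  async summarize() {
    // Most recent 100 turns, newest first.
    const rows = this.sql`SELECT * FROM turns ORDER BY at DESC LIMIT 100`;
    // ...summarize and send to user
  }
}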
```
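One detail worth checking when scheduling is how the time argument is interpreted. As far as we understand the SDK, `schedule` treats a `Date` as an absolute time and a plain number as a delay in seconds, so an epoch-milliseconds value passed as a number lands far in the future. The helper below is a hypothetical model of that rule (it is not part of the SDK), written only to show the difference; confirm the behavior against the current docs before relying on it.

```typescript
// Hypothetical helper, not part of the SDK: models how each argument form
// maps to a wake-up time, assuming numbers are delays in seconds.
function resolveWakeTime(when: Date | number, now: Date): Date {
  if (when instanceof Date) return when; // absolute time
  return new Date(now.getTime() + when * 1000); // relative delay in seconds
}

const now = new Date("2026-05-01T00:00:00.000Z");
const oneDayMs = 24 * 60 * 60 * 1000;

// Two equivalent ways to say "24 hours from now".
const viaDate = resolveWakeTime(new Date(now.getTime() + oneDayMs), now);
const viaSeconds = resolveWakeTime(24 * 60 * 60, now);

// Passing epoch milliseconds as a number is read as a huge delay in seconds.
const mistaken = resolveWakeTime(now.getTime() + oneDayMs, now);

console.log(viaDate.toISOString()); // 2026-05-02T00:00:00.000Z
console.log(viaSeconds.toISOString()); // 2026-05-02T00:00:00.000Z
console.log(mistaken.getUTCFullYear() > 10000); // true
```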

## Durable Object Facets — multi-tenant agents the right way

Facets let a single Worker instantiate Durable Objects with **isolated SQLite databases per facet**. The mental model: one parent Worker, many tenant facets, each with its own data namespace and lifecycle.

Why this matters for AI products: every customer can have their own agent with their own memory and history, without you operating per-tenant infrastructure. The platform handles instantiation; you write one agent class and it's automatically multi-tenant.

This is what unlocks "build-your-own-agent" surfaces — let your customer's agent live on a Facet they generate, with their own data isolated from everyone else's. No shared-tenancy security questions; the boundary is in the platform.

## Workflows for long-running steps

Cloudflare Workflows is the durable execution engine that lives next to Agents. When your agent needs to do a multi-step process that survives restarts (research a prospect → draft email → wait for approval → send), you delegate to a Workflow. The Workflow checkpoints at each step; if anything dies it resumes from the last checkpoint.

The 2026 control-plane rearchitecture raised the older concurrency and creation-rate limits. You can now create Workflows at much higher rates and run far more of them concurrently, which is useful for any agent that fans out to multiple parallel steps.
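The checkpoint-and-resume behavior can be sketched in a few lines. This is a simplified, synchronous model and not the Workflows API: `Step`, `runWorkflow`, and the `checkpoints` map are hypothetical, and the map stands in for storage that would survive a restart in a real engine.

```typescript
type Step = { id: string; run: () => string };

// Stand-in for durable storage; in a real engine this would survive restarts.
const checkpoints = new Map<string, string>();

function runWorkflow(steps: Step[]): string[] {
  const results: string[] = [];
  for (const step of steps) {
    const saved = checkpoints.get(step.id);
    if (saved !== undefined) {
      results.push(saved); // completed on an earlier attempt: do not re-run
      continue;
    }
    const output = step.run(); // may throw, which aborts this attempt
    checkpoints.set(step.id, output); // checkpoint before moving on
    results.push(output);
  }
  return results;
}

let researchRuns = 0;
let failDraftOnce = true;

const steps: Step[] = [
  { id: "research", run: () => { researchRuns += 1; return "prospect notes"; } },
  {
    id: "draft",
    run: () => {
      if (failDraftOnce) {
        failDraftOnce = false;
        throw new Error("simulated crash");
      }
      return "email draft";
    },
  },
  { id: "send", run: () => "sent" },
];

try {
  runWorkflow(steps); // first attempt dies at "draft"
} catch {
  // swallow the simulated crash
}
const finalResults = runWorkflow(steps); // resumes after the "research" checkpoint
console.log(finalResults, researchRuns); // research ran exactly once
```

The second call skips `research` because its output was already checkpointed, which is the property that makes long multi-step processes safe to retry.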

## MCP at the edge

Cloudflare's MCP support is bidirectional: agents can act as MCP servers (exposing their tools to other LLMs) and as MCP clients (mounting external MCP servers as tool sources). Auth and transport are built-in, including OAuth flows for end-user-authorized access.

The Code Mode MCP server for the entire Cloudflare API is the canonical example of a "big" MCP surface compressed into 1,000 tokens. Steal the pattern for your own large API surfaces.
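The two roles can be modeled with one small interface. The sketch below is an assumption-heavy illustration, not the MCP protocol or Cloudflare's SDK: `ToolServer`, `BridgingAgent`, and `calendarServer` are hypothetical, and transport and auth are left out entirely.

```typescript
// Illustrative shapes only; the real MCP protocol and SDK types are richer.
interface ToolServer {
  listTools(): string[];
  callTool(toolName: string, input: string): string;
}

// A hypothetical external server the agent mounts as a tool source (client role).
const calendarServer: ToolServer = {
  listTools: () => ["find_slot"],
  callTool: (toolName, input) => `${toolName} -> free slot for ${input}`,
};

// The agent is itself a ToolServer (server role) and forwards mounted tools.
class BridgingAgent implements ToolServer {
  private readonly mounted = new Map<string, ToolServer>();

  mount(prefix: string, server: ToolServer): void {
    this.mounted.set(prefix, server);
  }

  listTools(): string[] {
    const tools = ["summarize"]; // the agent's own tool
    this.mounted.forEach((server, prefix) => {
      for (const tool of server.listTools()) tools.push(`${prefix}.${tool}`);
    });
    return tools;
  }

  callTool(toolName: string, input: string): string {
    if (toolName === "summarize") return `summary of ${input}`;
    const dot = toolName.indexOf(".");
    const server = dot > 0 ? this.mounted.get(toolName.slice(0, dot)) : undefined;
    if (!server) throw new Error(`unknown tool: ${toolName}`);
    return server.callTool(toolName.slice(dot + 1), input);
  }
}

const agent = new BridgingAgent();
agent.mount("calendar", calendarServer);
const tools = agent.listTools();
const reply = agent.callTool("calendar.find_slot", "Tuesday");
console.log(tools, reply);
```

A caller sees one tool list, while the agent decides which calls it answers itself and which it forwards to a mounted server.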

## Cost reality check

At low volumes Cloudflare Agents can be cheaper than running your own Postgres + Node service. At very high volumes (hundreds of millions of requests/month) the per-request pricing crosses over and a self-managed cluster wins. We model this for each new agent before picking the platform.
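The crossover is simple arithmetic once you have two numbers: a usage price and a flat monthly cost for the self-managed option. The sketch below uses placeholder prices chosen only for illustration; they are not Cloudflare's rates, and a real model would also count storage, duration, and engineering time.

```typescript
// All prices below are placeholders for illustration, not Cloudflare's rates.
interface CostModel {
  perMillionRequestsUsd: number; // usage-based platform
  fixedMonthlyUsd: number; // self-managed cluster, flat regardless of volume
}

// Monthly cost of the usage-based option at a given request volume.
function usageCostUsd(requests: number, model: CostModel): number {
  return (requests / 1_000_000) * model.perMillionRequestsUsd;
}

// Monthly request volume at which the two options cost the same.
function breakEvenRequests(model: CostModel): number {
  return (model.fixedMonthlyUsd / model.perMillionRequestsUsd) * 1_000_000;
}

const assumed: CostModel = { perMillionRequestsUsd: 10, fixedMonthlyUsd: 3000 };

const crossover = breakEvenRequests(assumed);
const lowVolumeCost = usageCostUsd(50_000_000, assumed);
const highVolumeCost = usageCostUsd(1_000_000_000, assumed);

console.log(crossover); // 300000000 requests per month
console.log(lowVolumeCost, highVolumeCost); // 500 10000, against a flat 3000
```

Under these assumed numbers, usage pricing wins comfortably at 50 million requests a month and loses at a billion, with the break-even at 300 million.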

## FAQ

**Is the SDK GA or preview?** Preview as of Agents Week 2026, with stable surfaces shipping incrementally. Production-ready for greenfield projects; expect API tweaks.

**What does it cost?** Durable Objects pricing plus Workers requests. SQLite storage billing started in January 2026, with up to 10GB of storage per object.

**Can I run Python here?** Limited via Pyodide. JS/TS is the native path.

**Does it integrate with the OpenAI Agents SDK?** You can use OpenAI's models from within a Cloudflare Agent; the wrapping is in JS.

**Where do I see this on CallSphere?** Book a [demo](/demo) and we'll show our chat widget Cloudflare Agent prototype.

## Sources

- [Cloudflare Agents Week 2026](https://www.cloudflare.com/agents-week/updates/)
- [cloudflare/agents on GitHub](https://github.com/cloudflare/agents)
- [Cloudflare Agents docs](https://developers.cloudflare.com/agents/)
- [Code Mode: agents in 1,000 tokens](https://blog.cloudflare.com/code-mode-mcp/)

