Cloudflare Agents and Durable Objects for AI Apps: A Practical Guide
Author: Youngju Kim (@fjvbn20031)
Why Cloudflare Agents Fit AI Products So Well
According to Cloudflare’s official docs, each Agent runs on a Durable Object. A Durable Object is a stateful micro-server with its own SQL database, WebSocket connections, and scheduling. In practice, that means the basic building blocks an AI agent needs are already part of the platform: memory, realtime connectivity, long-running tasks, and per-agent state.
That is what makes the architecture appealing. Most AI apps still start from a stateless request-response model. But real agent products need more than that. They need to remember context, call tools, pause for approval, resume later, and stay connected while work is in progress. Cloudflare Agents is designed to make those behaviors feel native instead of bolted on.
Cloudflare says Agents run across its global network and can scale to tens of millions of instances. That is not just a scaling claim. It means you can design around one agent per user, one per ticket, or one per workflow without turning the platform into a coordination puzzle.
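To make the "one agent per user, ticket, or workflow" idea concrete, here is a minimal sketch of deriving a stable instance name from that boundary. The `AgentBoundary` type and `agentInstanceName` function are hypothetical helpers, not part of Cloudflare's SDK; the resulting string would be passed to whatever instance lookup the platform provides.

```typescript
// Hypothetical boundary types; the names are illustrative, not part of any SDK.
type AgentBoundary =
  | { kind: "user"; id: string }
  | { kind: "ticket"; id: string }
  | { kind: "workflow"; id: string };

// Build a deterministic instance name so the same user, ticket, or workflow
// always maps to the same agent instance.
function agentInstanceName(boundary: AgentBoundary): string {
  const id = boundary.id.trim().toLowerCase();
  if (id.length === 0) {
    throw new Error(`empty id for ${boundary.kind} agent`);
  }
  return `${boundary.kind}:${id}`;
}

const perTicket = agentInstanceName({ kind: "ticket", id: "T-4812" });
// perTicket === "ticket:t-4812"
```

Normalizing the id before building the name keeps "T-4812" and "t-4812" from silently becoming two separate agents with two separate memories.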
What a Durable Object-Backed Agent Means
Think of this architecture as a "stateful micro-server" rather than a "stateless function." The difference shows up immediately.
| Category | Typical stateless serverless | Cloudflare Agents + Durable Objects |
|---|---|---|
| State | Must be externalized | Lives with the agent |
| Connections | Short-lived by default | WebSocket-friendly and persistent |
| Scheduling | Separate cron or worker needed | Built into the agent model |
| Memory | Reconstructed per request | Kept as agent state |
| Scale | Requires state orchestration | Cloudflare handles global placement |
This matters for products where the unit of work is naturally persistent. Support bots, workflow copilots, approval systems, research assistants, and live collaborative tools all benefit from being able to keep context close to execution.
Cloudflare’s Agents API docs also say that Agents require Durable Objects and that a single Agent class can have millions of instances. That makes the model a strong fit for per-user or per-case isolation.
The Patterns That Matter Most
Cloudflare’s starter patterns line up well with real AI product needs. The docs highlight four especially useful starting points.
- Streaming AI chat
- Server-side and client-side tools
- Human-in-the-loop approval
- Task scheduling
That combination covers the most common agent behaviors. Chat needs streaming, actions need tools, risky steps need approval, and cleanup or follow-up work needs scheduling.
Here is the mental model: a user asks a question, the agent streams a response, it calls a tool to verify data, it pauses if human approval is required, and it schedules a follow-up step if the task cannot finish immediately. All of that can live inside one agent boundary.
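```typescript
import { Agent } from "agents";

export class SupportAgent extends Agent {
  // Handle an HTTP request routed to this agent instance.
  async onRequest(request: Request): Promise<Response> {
    const payload = await request.json();
    // Persist state, call tools, and resume later if the workflow needs approval.
    return new Response(JSON.stringify({ ok: true, payload }), {
      headers: { "Content-Type": "application/json" },
    });
  }
}
```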
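The handler above only echoes the payload. As a rough sketch of the decision that would sit inside it, the function below picks the next step of a turn: call a tool, wait for approval, schedule a follow-up, or respond. The `NextStep` and `TurnContext` types, the field names, and the 300-second default are all assumptions made for illustration; none of them come from the Agents SDK.

```typescript
// Hypothetical step and context types for one agent turn; not part of the Agents SDK.
type NextStep =
  | { type: "respond" }
  | { type: "call_tool"; tool: string }
  | { type: "await_approval"; tool: string }
  | { type: "schedule_follow_up"; delaySeconds: number };

interface TurnContext {
  requestedTool?: string;         // tool the model wants to call, if any
  riskyTools: string[];           // tools that need a human decision first
  approvedTools: string[];        // tools a human has already approved
  waitingOnExternalWork: boolean; // true when the task cannot finish this turn
}

// Decide what the agent should do next, checking approval before any tool call.
function nextStep(ctx: TurnContext, followUpDelaySeconds = 300): NextStep {
  const tool = ctx.requestedTool;
  if (tool !== undefined) {
    const needsApproval =
      ctx.riskyTools.indexOf(tool) !== -1 && ctx.approvedTools.indexOf(tool) === -1;
    return needsApproval ? { type: "await_approval", tool } : { type: "call_tool", tool };
  }
  if (ctx.waitingOnExternalWork) {
    return { type: "schedule_follow_up", delaySeconds: followUpDelaySeconds };
  }
  return { type: "respond" };
}
```

Keeping this decision in a pure function makes it easy to test separately from streaming, storage, and network calls.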
Model Choice Is Flexible
Cloudflare Agents is not locked to a single model provider. The docs say Workers AI is built in, and Agents can also call OpenAI, Anthropic, Google Gemini, or any service that exposes an OpenAI-compatible API.
That flexibility is useful for product design. You can keep the agent state and execution on Cloudflare, while choosing the model that best fits each task. For example:
- Fast replies can use a lightweight model
- Structured extraction can use a model with stronger instruction following
- Long reasoning can use a more capable model
- Cost-sensitive paths can start with Workers AI
This separation makes rollout easier. You can swap model providers without redesigning the entire state and execution layer.
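One way to keep that swap cheap is to route by task type through a small table. The sketch below assumes four task kinds that mirror the list above; the provider labels and model identifiers are placeholders, not real model names, and `pickModel` is a hypothetical helper.

```typescript
type TaskKind = "fast_reply" | "structured_extraction" | "long_reasoning" | "cost_sensitive";

interface ModelChoice {
  provider: string; // illustrative label such as "workers-ai"
  model: string;    // placeholder identifier, not a real model name
}

// Default routes; every value here is an assumption chosen for illustration.
const MODEL_ROUTES: Record<TaskKind, ModelChoice> = {
  fast_reply: { provider: "workers-ai", model: "lightweight-chat-model" },
  structured_extraction: { provider: "external", model: "instruction-following-model" },
  long_reasoning: { provider: "external", model: "high-capability-model" },
  cost_sensitive: { provider: "workers-ai", model: "lightweight-chat-model" },
};

// Resolve the model for a task, letting callers override single routes
// (for example during a staged rollout) without touching agent state.
function pickModel(
  task: TaskKind,
  overrides: Partial<Record<TaskKind, ModelChoice>> = {},
): ModelChoice {
  return overrides[task] ?? MODEL_ROUTES[task];
}
```

Because the agent's state and execution never reference a provider directly, changing a route is a one-line edit to the table or an override passed at call time.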
Memory and State Are First-Class
A lot of AI products fail when they try to treat memory as an afterthought. Conversation history gets large, session glue gets messy, and the system becomes hard to reason about. Cloudflare Agents makes state a first-class part of the agent.
The docs say each individual Agent instance has its own SQL database that runs in the same context as the Agent itself. That suggests a practical structure like this:
- Store user preferences and profile data
- Store recent conversation summaries
- Store task progress and checkpoints
- Store approval flags and expiration timestamps
- Store short-lived results from external tools
The important design win is that memory does not have to live only in prompts. The application can keep structured state separately, which makes it much easier to decide what should be remembered, summarized, or discarded.
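As an illustration of keeping those buckets separate, the sketch below models them as one typed structure and adds a compaction step that decides what to discard. The `AgentMemory` shape and `compactMemory` function are assumptions; in a real agent each bucket would more likely be its own table in the agent's SQL database.

```typescript
// Hypothetical memory layout mirroring the buckets listed above.
interface AgentMemory {
  preferences: Record<string, string>;
  summaries: string[];                 // recent conversation summaries, oldest first
  checkpoints: Record<string, string>; // task id -> last completed step
  approvals: { action: string; expiresAt: number }[];
  toolCache: { key: string; value: string; expiresAt: number }[];
}

// Return a trimmed copy: keep only the newest summaries and drop anything
// whose expiry timestamp (milliseconds) has passed. The input is not mutated.
function compactMemory(memory: AgentMemory, now: number, maxSummaries: number): AgentMemory {
  return {
    ...memory,
    summaries: maxSummaries > 0 ? memory.summaries.slice(-maxSummaries) : [],
    approvals: memory.approvals.filter((a) => a.expiresAt > now),
    toolCache: memory.toolCache.filter((c) => c.expiresAt > now),
  };
}
```

Preferences and checkpoints pass through untouched, which reflects the point of the split: long-lived facts and short-lived results follow different retention rules.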
Long-Running Work and Human Approval
Agent products often do not finish in one turn. Document review, external API checks, approval waits, and retry loops all take time. Cloudflare Agents supports that kind of work more naturally than a plain request handler.
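For retry loops in particular, the agent needs to decide how long to wait before trying again and when to stop. The sketch below uses capped exponential backoff, which is one common choice rather than anything the Cloudflare docs prescribe. The function names, the 5-second base, the 600-second cap, and the limit of five attempts are all assumptions; the resulting delay would be handed to the agent's scheduling mechanism.

```typescript
// Capped exponential backoff: 5s, 10s, 20s, ... up to maxSeconds.
function retryDelaySeconds(attempt: number, baseSeconds = 5, maxSeconds = 600): number {
  const exponent = Math.max(0, Math.floor(attempt) - 1);
  return Math.min(maxSeconds, baseSeconds * Math.pow(2, exponent));
}

type RetryPlan =
  | { action: "retry"; delaySeconds: number }
  | { action: "give_up" };

// Decide what to do after a failure, given how many attempts have failed so far.
function planRetry(failedAttempts: number, maxAttempts = 5): RetryPlan {
  if (failedAttempts >= maxAttempts) {
    return { action: "give_up" };
  }
  return { action: "retry", delaySeconds: retryDelaySeconds(failedAttempts) };
}
```

Storing the failed-attempt count in agent state is what lets the retry survive across separate invocations instead of living in a single in-memory loop.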
Human-in-the-loop approval is especially important when the action is risky or irreversible, for example:
- Refunds, payments, and deletions
- Changes that affect external systems
- Actions involving sensitive data
- Steps that policy requires a human to approve
A good pattern is simple: the agent proposes the action, the human approves the narrow scope, and only then does the agent execute it. That gives you automation without losing control.
Propose action:
- Summarize the request
- Show side effects
- Ask for approval
After approval:
- Execute only the approved step
- Persist the decision in agent state
- Schedule a follow-up if needed
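The propose, approve, execute sequence above can be sketched as a small state machine in which execution is only reachable from the approved state. All type and function names here are hypothetical, and persistence is left out; in an agent, the returned state would be written to agent storage after each transition.

```typescript
// Hypothetical approval states; each transition returns a new state value.
type ApprovalState =
  | { status: "proposed"; action: string; sideEffects: string[] }
  | { status: "approved"; action: string; approvedBy: string }
  | { status: "rejected"; action: string; rejectedBy: string }
  | { status: "executed"; action: string; approvedBy: string };

// Step 1: the agent summarizes the action and its side effects.
function proposeAction(action: string, sideEffects: string[]): ApprovalState {
  return { status: "proposed", action, sideEffects };
}

// Step 2: a human approves or rejects; only a pending proposal can be decided.
function recordDecision(state: ApprovalState, reviewer: string, approved: boolean): ApprovalState {
  if (state.status !== "proposed") {
    throw new Error(`cannot decide on an action that is ${state.status}`);
  }
  return approved
    ? { status: "approved", action: state.action, approvedBy: reviewer }
    : { status: "rejected", action: state.action, rejectedBy: reviewer };
}

// Step 3: run exactly the approved action; anything else is refused.
function executeApproved(state: ApprovalState, run: (action: string) => void): ApprovalState {
  if (state.status !== "approved") {
    throw new Error(`refusing to run "${state.action}" while ${state.status}`);
  }
  run(state.action);
  return { status: "executed", action: state.action, approvedBy: state.approvedBy };
}
```

Because `executeApproved` receives the approved state rather than a free-form action string, the agent cannot widen the scope between approval and execution.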
Observability Comes Built In
Cloudflare’s observability docs say Agents emit structured events for significant operations, and those events are silent by default with zero overhead when nobody is listening.
That is a strong operational default. You do not need to overload the hot path with custom logging just to understand what happened. Instead, you can subscribe to structured events for RPC calls, state changes, schedule execution, workflow transitions, MCP connections, and more.
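To show why "silent by default" can be close to free, here is a simplified event channel that skips building the event payload when nothing is subscribed. This is an illustration of the idea, not Cloudflare's implementation; `EventChannel`, `AgentEvent`, and the event type strings in the usage lines are all hypothetical.

```typescript
interface AgentEvent {
  type: string;
  timestamp: number;
  detail: Record<string, unknown>;
}

type Listener = (event: AgentEvent) => void;

class EventChannel {
  private listeners: Listener[] = [];

  // Register a listener and return a function that removes it again.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // The detail is passed as a callback so it is only computed when
  // at least one listener is attached.
  emit(type: string, buildDetail: () => Record<string, unknown>): void {
    if (this.listeners.length === 0) {
      return;
    }
    const event: AgentEvent = { type, timestamp: Date.now(), detail: buildDetail() };
    for (const listener of this.listeners) {
      listener(event);
    }
  }
}

// Usage with hypothetical event names:
const agentEvents = new EventChannel();
const received: AgentEvent[] = [];
const stopListening = agentEvents.subscribe((e) => { received.push(e); });
agentEvents.emit("schedule:execute", () => ({ id: "job-1" }));
stopListening();
```

With no subscribers, `emit` returns after a single length check, so the instrumentation can stay in the hot path without a custom logging layer.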
For agent products, that is a big deal. Stateful systems are much easier to operate when the platform already exposes meaningful events instead of raw log noise.
MCP Deployment Fits the Same Model
Cloudflare’s MCP client docs say connections persist in the agent’s SQL storage, and when an agent connects to an MCP server, all tools from that server become available automatically.
That matters because MCP is not just a tool call. It is a durable relationship. You need to remember which server is connected, what tools are available, and how to reconnect cleanly. Durable Object state and SQL make that much easier than rebuilding everything from scratch on every request.
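The sketch below shows the kind of record an agent would need to keep in order to reconnect cleanly: which servers it knows, where they live, and which tools each one exposed. It is a simplified stand-in that serializes to JSON instead of writing to the agent's SQL storage; `McpConnectionRegistry`, its methods, and the example URL are all hypothetical.

```typescript
interface McpConnectionRecord {
  serverId: string;
  url: string;
  tools: string[];
}

class McpConnectionRegistry {
  private records: Record<string, McpConnectionRecord> = {};

  // Store (or replace) what is known about one server.
  remember(record: McpConnectionRecord): void {
    this.records[record.serverId] = { ...record, tools: record.tools.slice() };
  }

  forget(serverId: string): void {
    delete this.records[serverId];
  }

  // Tool names are prefixed with the server id so two servers can
  // expose a tool with the same name without colliding.
  availableTools(): string[] {
    const names: string[] = [];
    for (const serverId of Object.keys(this.records)) {
      for (const tool of this.records[serverId].tools) {
        names.push(`${serverId}.${tool}`);
      }
    }
    return names.sort();
  }

  // JSON stands in for durable storage in this sketch.
  serialize(): string {
    return JSON.stringify(this.records);
  }

  static restore(serialized: string): McpConnectionRegistry {
    const registry = new McpConnectionRegistry();
    const parsed = JSON.parse(serialized) as Record<string, McpConnectionRecord>;
    for (const serverId of Object.keys(parsed)) {
      registry.remember(parsed[serverId]);
    }
    return registry;
  }
}

const mcpServers = new McpConnectionRegistry();
mcpServers.remember({ serverId: "docs", url: "https://example.com/mcp", tools: ["search", "fetch"] });
const afterRestart = McpConnectionRegistry.restore(mcpServers.serialize());
// afterRestart.availableTools() lists "docs.fetch" and "docs.search"
```

The round trip through `serialize` and `restore` is the part that matters: after a restart the agent knows what to reconnect to without rediscovering it from scratch.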
For deployment, a single serving path is attractive. That one path receives the request, keeps the state, maintains the MCP connection, and calls tools when needed. That simplicity is one of the main reasons this architecture is practical for agent products.
When This Beats Stateless Serverless
This architecture is a better fit when one or more of these are true:
- The agent needs to remember the user when they come back
- Realtime WebSocket connectivity matters
- Approval is part of the workflow
- Scheduling or retries are required
- Each user or case needs isolated state
- MCP or tool connections must persist
If the work is just a one-off text transformation, a normal Worker or other stateless function may still be enough. The key question is whether state and connection are part of the product’s core behavior.
Rollout Checklist
If I were launching this in production, I would check these items first.
- Define the agent boundary in one sentence. Decide whether it is per user, per ticket, or per workflow.
- Split state into clear buckets. Do not mix memory, progress, approval state, and temporary cache.
- Decide the model path. Pick a default provider, then decide where Workers AI, OpenAI, Anthropic, or Gemini fits.
- Identify actions that need approval before execution.
- List the background work that needs scheduling or retrying.
- Decide where observability events will be consumed.
- If MCP is involved, design persistence and reconnect logic up front.
- Define failure and recovery behavior before launch.
Final Takeaway
Cloudflare Agents and Durable Objects turn AI apps from "a chain of functions" into stateful product units. That is the real appeal. Memory, realtime connectivity, human approval, scheduling, MCP persistence, and structured observability all sit close to the agent instead of being scattered across separate services.
Stateless serverless still has a place, but agent products become much easier to build when state and connection are part of the platform rather than something you constantly rebuild around it. Cloudflare’s model gives you that structure without giving up global scale.