Claude Managed Agents Architecture: Sandboxes and MCP Tunnels
How Claude Managed Agents work end to end: self-hosted sandboxes, MCP tunnels, the control plane, trust boundaries, and where tokens really go.
Most teams adopt Claude Managed Agents because they want an agent that can run real commands, touch real files, and call real internal APIs without someone babysitting a laptop. The moment you give an agent that much reach, the interesting question stops being "can Claude reason about this?" and becomes "where does the code actually execute, and what can it touch?" That is an architecture question, and getting it wrong shows up as data leaks, runaway tool loops, and agents that quietly modify the wrong environment. This post walks the whole stack: the control plane that holds the model, the self-hosted sandbox where work happens, and the MCP tunnel that connects the two without exposing your network.
Key takeaways
- A managed agent splits into three layers: a hosted model/control plane, an execution sandbox you can self-host, and an MCP transport tying them together.
- The sandbox is the trust boundary — it holds credentials and runs commands; the model never gets raw secrets, only tool results.
- An MCP tunnel lets a cloud-hosted Claude reach private MCP servers without opening inbound ports, by reversing the connection direction.
- Context, tool schemas, and conversation state live in the control plane; side effects live in the sandbox. Keeping them separate is the core design principle.
- Token cost is dominated by tool-result payloads flowing back through the tunnel, not by the agent's instructions.
What a managed agent actually is
A Claude Managed Agent is a long-running agent loop where Anthropic hosts the model and orchestration, while you supply the environment it acts in. The agent is the loop: receive a goal, decide on a tool call, execute it, observe the result, and repeat until the goal is met or a stop condition fires. The "managed" part means you do not run the inference, the context assembly, or the tool-dispatch logic yourself — that runs in Anthropic's control plane against models like Opus 4.8 or Sonnet 4.6.
What you do own is the sandbox. A managed-agent sandbox is an isolated execution environment — typically a container or microVM you provision — that exposes a set of tools, often as MCP servers, and runs any command the agent decides to run. This separation is the whole point. The model is a powerful but untrusted planner; the sandbox is the trusted, scoped place where its plans turn into effects. You can reason about blast radius by reasoning about the sandbox alone.
The three layers, end to end
Trace a single step and the structure becomes obvious. The control plane assembles context — system prompt, tool schemas, prior turns — and asks the model what to do. The model emits a tool call. That call travels over the MCP tunnel to your sandbox. The sandbox executes it, captures the result, and ships the structured output back. The control plane appends that result to the running context and asks the model for the next step.
flowchart TD
A["Goal arrives at control plane"] --> B["Assemble context & tool schemas"]
B --> C{"Model picks an action"}
C -->|Tool call| D["Dispatch over MCP tunnel"]
D --> E["Sandbox runs command / hits API"]
E --> F["Structured result back through tunnel"]
F --> G["Append result to context"]
G --> C
C -->|Done| H["Return final answer & artifacts"]
Notice what crosses each boundary. Across the tunnel from control plane to sandbox: a tool name and JSON arguments. Across the tunnel from sandbox to control plane: a tool result, ideally compact and structured. The model's weights, your raw secrets, and your private network never meet directly. Secrets stay in the sandbox; the model only ever sees the redacted, structured consequence of using them.
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
Why the sandbox is the trust boundary
If you internalize one idea, make it this: the sandbox is where trust is enforced, because it is the only place with both credentials and the ability to cause effects. The control plane is smart but powerless on its own — it cannot reach your database, it can only ask the sandbox to. So your security model collapses to a familiar shape: scope what the sandbox can do, and you have scoped the agent.
Concretely, that means the sandbox holds API keys in its environment, mounts only the directories the task needs, and exposes tools whose schemas describe exactly what is callable. A tool that lists open invoices is safe; a tool that runs arbitrary SQL is a different risk class. Because the model only acts through these tools, the tool surface is your policy surface. Auditing an agent becomes auditing a list of tools plus the sandbox's network egress rules.
How the MCP tunnel works without opening ports
The hard networking problem is that the model runs in Anthropic's cloud and your MCP servers run on private infrastructure with no public inbound access. The tunnel solves this by reversing the connection: the sandbox dials out to the control plane and holds a persistent, authenticated session open. Tool calls then flow down that already-established channel. You never open an inbound firewall rule, never expose an MCP server to the internet, and never hand out a public URL for an internal tool.
This is the same shape as other reverse-tunnel systems, and it has the same operational properties. The session is long-lived, so you watch for reconnects and backoff. Calls are multiplexed over it, so one slow tool can head-of-line block others unless the transport supports concurrency. And because the sandbox initiates the connection, identity is established at dial time — the sandbox authenticates itself once, and every subsequent tool call inherits that trusted session rather than re-authenticating per request.
Where the tokens and the latency go
People assume the expensive part of an agent is the instructions. In a managed agent the dominant cost is usually the tool results streaming back through the tunnel and into context. Every observation the agent makes — a file it read, an API response it got, a log it tailed — becomes input tokens on the next turn, and it stays in context for the rest of the run. A chatty tool that returns 40 KB of JSON when the agent needed three fields will quietly multiply your bill across every subsequent step.
The architectural fix lives in the sandbox, not the prompt. Have tools return the minimum useful shape, paginate large results, and summarize logs server-side before they cross the tunnel. Latency follows the same logic: each loop iteration is at least one round trip across the tunnel plus one model call, so reducing the number of tool calls (by making each one more capable) often beats optimizing any single call. Design tools for the agent's decision, not for a human dashboard.
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
Common pitfalls
- Treating the model as the trust boundary. The model can be steered by injected content; only the sandbox's tool surface and egress rules actually constrain effects. Enforce limits there.
- Exposing raw, unscoped tools. An "execute shell" tool with full credentials makes your entire sandbox the blast radius. Prefer narrow, named tools that encode intent.
- Letting tool results balloon. Returning full payloads inflates context on every later turn. Trim and paginate at the source.
- Forgetting the tunnel is stateful. A dropped session mid-run can strand an agent. Add reconnect, idempotent tool design, and run-level checkpointing.
- No per-run isolation. Reusing one long-lived sandbox across runs lets state and secrets leak between tasks. Fresh, ephemeral sandboxes per run are safer.
Stand up a managed agent in 6 steps
- Provision an ephemeral sandbox (container or microVM) with only the mounts and credentials the task needs.
- Inside it, run one or more MCP servers exposing narrow, well-named tools with strict JSON schemas.
- Establish the outbound MCP tunnel from the sandbox to the control plane and authenticate the session once.
- Register the tool schemas and a focused system prompt with the agent in the control plane.
- Send a goal, then watch the tool-call trace — confirm the agent uses scoped tools and that results stay compact.
- Tear the sandbox down at the end of the run so no state or secrets persist into the next task.
Layer responsibilities at a glance
| Concern | Control plane | Sandbox |
|---|---|---|
| Holds model & context | Yes | No |
| Holds secrets/credentials | No | Yes |
| Executes side effects | No | Yes |
| Network reachability | Public, hosted | Private, outbound-only |
| Main cost driver | Model calls | Tool-result size |
Frequently asked questions
What is a Claude Managed Agent in one sentence?
A Claude Managed Agent is a long-running agent loop where Anthropic hosts the model and orchestration while you provide a self-hosted sandbox that holds credentials and executes the tools the agent decides to call.
Does the model ever see my secrets?
No. Secrets live in the sandbox's environment. The model only sees the structured result of a tool that used them, so you can redact or shape that result before it crosses the tunnel into context.
Why use an MCP tunnel instead of hosting MCP servers publicly?
A tunnel lets the sandbox dial out and hold an authenticated session open, so the control plane can reach private MCP servers without you opening any inbound port or exposing internal tools to the internet.
What usually drives cost in a managed agent?
Tool-result payloads. Each result becomes input tokens that persist for the rest of the run, so oversized responses compound. Trimming results at the source is the highest-leverage optimization.
Bringing agentic AI to your phone lines
CallSphere takes these same sandbox-and-tunnel patterns to voice and chat — agents that answer every call, securely call your internal tools mid-conversation, and book real work around the clock. See it live at callsphere.ai.
Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.