---
title: "Prompt & Context Design for Claude Managed Agents"
description: "What to put in a Claude agent's context and what to leave out: the inclusion decision, retrieval over inclusion, durable state, lean handoffs."
canonical: https://callsphere.ai/blog/prompt-context-design-for-claude-managed-agents
category: "Agentic AI"
tags: ["agentic ai", "claude", "context engineering", "managed agents", "prompt engineering", "multi-agent", "anthropic"]
author: "CallSphere Team"
published: 2026-04-05T09:32:44.000Z
updated: 2026-06-07T01:37:56.931Z
---

# Prompt & Context Design for Claude Managed Agents

> What to put in a Claude agent's context and what to leave out: the inclusion decision, retrieval over inclusion, durable state, lean handoffs.

The hardest skill in agent engineering is not writing prompts — it is deciding what to leave out. Context is a finite, expensive resource, and an outcome-driven Claude agent lives or dies on what you choose to put in front of it at each step. Stuff the window with everything "just in case" and the model's attention smears across irrelevant tokens; starve it and it hallucinates the gaps. This post is about that editorial judgment: what belongs in an agent's context, what to leave out, and the principles that tell you which is which.

Context engineering is the discipline of curating exactly the information a model needs for a given step — the instructions, tools, examples, and state — while excluding everything that would dilute its attention or inflate its cost. For Managed Agents, where an orchestrator and several subagents each maintain their own window, this discipline compounds. Every subagent is a fresh decision about what to include.

## Key takeaways

- Treat context as a budget you spend, not a bucket you fill — every token competes for the model's attention.
- Put in: the goal, the acceptance test, the minimal inputs, and the tools relevant to *this* step.
- Leave out: other subagents' reasoning, stale transcript history, and data the step can fetch on demand.
- Prefer **retrieval over inclusion** — let the agent pull large data through a tool rather than pre-loading it.
- Use durable state and structured handoffs so context can stay lean without losing facts.

## Why more context makes agents worse

It is tempting to believe that giving the model more information can only help. In practice, irrelevant context actively hurts. The model spends attention on tokens that do not bear on the task, raising the odds it latches onto a stale detail or a tangent. Long, padded contexts also cost more per call and take longer to process. The useful frame is that context is a budget: every token you add must earn its place by changing what the model should do.

This is why a lean subagent with a tight brief routinely outperforms a "fully informed" one carrying the whole conversation. The lean agent's attention is concentrated on the one thing it must get right. The informed agent's attention is spread thin across history it will never use.

There is a cost dimension layered on top of the quality one. In a multi-agent run, context is re-paid at every turn: each tool result extends the window, and a bloated starting context means every subsequent step carries that weight forward. Because multi-agent runs already spend several times more tokens than a single-agent run, sloppy context turns an expensive pattern into a wasteful one. Disciplined context is not just about accuracy; it is the lever that keeps a fan-out affordable enough to use at all.
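To make the carry-forward effect concrete, here is a rough back-of-envelope sketch. It rests on simplifying assumptions that are mine, not taken from any billing documentation: every turn re-sends the whole window as input, the window grows by a fixed amount per turn, and no prompt caching applies. The function name and the token counts are illustrative.

```python
def cumulative_input_tokens(start_tokens: int, growth_per_turn: int, turns: int) -> int:
    """Total input tokens paid over a run, assuming each turn re-sends the full window.

    Illustrative model only: fixed growth per turn, no caching, no truncation.
    """
    total = 0
    window = start_tokens
    for _ in range(turns):
        total += window            # this turn pays for everything already in the window
        window += growth_per_turn  # a tool result or reply extends the window
    return total


lean = cumulative_input_tokens(start_tokens=2_000, growth_per_turn=500, turns=10)
bloated = cumulative_input_tokens(start_tokens=20_000, growth_per_turn=500, turns=10)
print(lean, bloated, bloated - lean)
```

Under these assumptions the 18,000 extra starting tokens are paid once per turn, so a ten-turn run carries 180,000 more input tokens than the lean one.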

## The inclusion decision, visualized

For each piece of information, the decision is mechanical once you have the right test: does this change what the step should do, and can the step fetch it on demand instead? If it does not change behavior, cut it. If it changes behavior but is large and fetchable, retrieve it through a tool rather than pre-loading it.

```mermaid
flowchart TD
  A["Candidate context item"] --> B{"Changes this step's action?"}
  B -->|No| C["Leave it out"]
  B -->|Yes| D{"Large & fetchable on demand?"}
  D -->|Yes| E["Expose via tool, retrieve when needed"]
  D -->|No| F{"Needed across steps?"}
  F -->|Yes| G["Write to durable state, reference by key"]
  F -->|No| H["Include inline in this context"]
```

Running every candidate through this flow turns context design from guesswork into a checklist. Most items resolve to "leave it out" or "retrieve on demand," which is exactly why disciplined agents stay fast and cheap while sloppy ones bloat.
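The flow above can be written down as a small function. This is a minimal sketch: the `Candidate` fields and the `Placement` names are hypothetical labels for the three questions in the diagram, and in practice a person (not code) answers those questions for each item.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    LEAVE_OUT = "leave it out"
    TOOL = "expose via tool, retrieve when needed"
    DURABLE_STATE = "write to durable state, reference by key"
    INLINE = "include inline in this context"


@dataclass(frozen=True)
class Candidate:
    name: str
    changes_action: bool        # does it change what this step should do?
    large_and_fetchable: bool   # is it big, and retrievable on demand?
    needed_across_steps: bool   # must later steps see it too?


def place(item: Candidate) -> Placement:
    """Ask the flowchart's three questions in order and return the first match."""
    if not item.changes_action:
        return Placement.LEAVE_OUT
    if item.large_and_fetchable:
        return Placement.TOOL
    if item.needed_across_steps:
        return Placement.DURABLE_STATE
    return Placement.INLINE


candidates = [
    Candidate("sibling agent reasoning", False, False, False),
    Candidate("product catalog", True, True, True),
    Candidate("customer id", True, False, True),
    Candidate("ticket text", True, False, False),
]
for candidate in candidates:
    print(f"{candidate.name}: {place(candidate).value}")
```

The order of the checks matters: an item that is both large and needed across steps still goes behind a tool, matching the diagram, because the second question is asked before the third.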

## What to put in context

A well-formed step context has four parts and little else. First, the **goal** — what this step must achieve, stated as an outcome. Second, the **acceptance test** — how the step knows it succeeded, so it can self-check before returning. Third, the **minimal inputs** — only the specific data this step operates on. Fourth, the **relevant tools** — the small set this step might call, each with a description that says when to use it.

That is usually enough. Notice what is absent: the project's full history, sibling subagents' deliberations, and large reference data that the step could fetch if it actually needs it. A context that contains only goal, test, inputs, and tools is one the model can reason about cleanly.
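One way to enforce the four-part shape is to make it the only thing a brief can hold. The sketch below assumes a hypothetical `StepContext` class and a plain-text rendering; the field names, the example ticket, and the `lookup_invoice` tool are illustrative, not part of any Claude API.

```python
from dataclasses import dataclass, field


@dataclass
class StepContext:
    goal: str
    acceptance_test: str
    inputs: dict[str, str]
    tools: dict[str, str] = field(default_factory=dict)  # tool name -> when to use it

    def render(self) -> str:
        """Render the four parts as a plain-text brief, and nothing else."""
        lines = [
            f"Goal: {self.goal}",
            f"Acceptance test: {self.acceptance_test}",
            "Inputs:",
        ]
        lines += [f"- {key}: {value}" for key, value in self.inputs.items()]
        lines.append("Tools:")
        lines += [f"- {name}: {when}" for name, when in self.tools.items()]
        return "\n".join(lines)


brief = StepContext(
    goal="Classify the ticket as billing, technical, or other.",
    acceptance_test="Exactly one label is returned, with a one-sentence reason.",
    inputs={"ticket_text": "I was charged twice this month."},
    tools={"lookup_invoice": "Use only if the ticket mentions a specific charge."},
)
print(brief.render())
```

Because the class has no slot for history or sibling output, adding either requires a deliberate change rather than a casual append.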

## What to leave out — and where it goes instead

The things you cut do not vanish; they move to a better home. Cross-step facts go into **durable state** and are referenced by key, so a later step can read them without every earlier message tagging along. Large data goes behind a **tool** and is retrieved on demand. Another subagent's reasoning is simply dropped — what the next step needs is that subagent's *result*, structured and self-describing, not its internal monologue.
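As a minimal sketch of "referenced by key", the example below uses an in-memory dictionary as a stand-in for a real durable store. The `DurableState` class, the key format, and the order data are all hypothetical; a production system would back this with a database or similar.

```python
import json


class DurableState:
    """In-memory stand-in for a durable key-value store (hypothetical)."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, key: str, value: object) -> str:
        self._data[key] = json.dumps(value)  # serialize so values could cross a process boundary
        return key                           # only the key travels in context

    def get(self, key: str) -> object:
        return json.loads(self._data[key])


state = DurableState()

# An early step records a fact and passes only the key forward.
ref = state.put("order/1042/shipping_address", {"city": "Austin", "zip": "78701"})
next_step_context = f"Shipping address is stored at state key '{ref}'."

# A later step resolves the key only when it actually needs the value.
address = state.get(ref)
print(next_step_context)
print(address["city"])
```

The later step's context holds one short reference instead of the data itself, and the fact remains readable even if the earlier transcript is gone.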

The practical rule I use: if I cannot articulate how a token changes the model's next action, it does not belong in the context. "Background" is not a justification. Either it informs a decision this step makes, or it is noise wearing a helpful-looking label.

Retrieval over inclusion deserves its own emphasis because it inverts a habit most engineers bring from prompt engineering. With a single prompt you front-load everything the model might need. With an agent, you give the model the *ability* to fetch what it needs and trust it to pull only the relevant slice. A product catalog with ten thousand rows does not belong in context; a `search_catalog` tool does. The agent then retrieves the three rows that matter for the current decision, and the other 9,997 never cost you a token. This shift — from pre-loading data to exposing the means to get it — is the single highest-leverage move in keeping agents lean.
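Here is a sketch of what a `search_catalog` tool might look like. The definition follows the name / description / `input_schema` shape used for Claude tool definitions; the specific parameters, the catalog rows, and the keyword-matching handler are assumptions made for illustration.

```python
# Tool definition: the schema fields below are illustrative assumptions.
SEARCH_CATALOG_TOOL = {
    "name": "search_catalog",
    "description": (
        "Search the product catalog by keyword. Use when the task needs product "
        "details; returns at most `limit` matching rows."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Keyword to match against product names."},
            "limit": {"type": "integer", "description": "Maximum rows to return."},
        },
        "required": ["query"],
    },
}

# Stand-in for a ten-thousand-row catalog that never enters the context window.
CATALOG = [{"sku": f"SKU-{i:05d}", "name": f"Widget {i}"} for i in range(10_000)]
CATALOG[42]["name"] = "Noise-cancelling headset"
CATALOG[4242]["name"] = "Wireless headset"


def search_catalog(query: str, limit: int = 3) -> list[dict]:
    """Handler the application would run when the model calls the tool."""
    needle = query.lower()
    matches = [row for row in CATALOG if needle in row["name"].lower()]
    return matches[:limit]


rows = search_catalog("headset")
print(len(CATALOG), len(rows))
```

Only the tool definition and the returned rows reach the model; the other rows stay in the application.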

## Designing handoffs between agents

In a multi-agent run, the handoff is where context discipline pays off or collapses. The orchestrator should hand each subagent a curated brief, and each subagent should hand back a structured result — status, payload, a short note — not a transcript. When handoffs are structured, the orchestrator reconciles by reading fields, and no subagent inherits another's clutter. When handoffs are raw transcripts, context compounds across the run and every agent gets slower and dumber than the last.
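A structured result can be as small as three fields. The sketch below assumes a hypothetical `HandoffResult` type with the status, payload, and note described above, plus an illustrative `reconcile` function; the status values and sample payloads are invented for the example.

```python
import json
from dataclasses import dataclass
from typing import Literal


@dataclass
class HandoffResult:
    status: Literal["ok", "failed", "needs_input"]
    payload: dict
    note: str  # one or two sentences, not a transcript


def reconcile(results: list[HandoffResult]) -> dict:
    """Merge payloads by reading fields; failures are listed, not replayed."""
    merged: dict = {}
    problems: list[str] = []
    for result in results:
        if result.status == "ok":
            merged.update(result.payload)
        else:
            problems.append(result.note)
    return {"merged": merged, "problems": problems}


results = [
    HandoffResult("ok", {"price_usd": 129.0}, "Price taken from the current catalog entry."),
    HandoffResult("ok", {"in_stock": True}, "Stock confirmed for the default warehouse."),
    HandoffResult("failed", {}, "Shipping estimate unavailable: carrier lookup timed out."),
]
summary = reconcile(results)
print(json.dumps(summary, indent=2))
```

The orchestrator never sees how a subagent reached its answer, only the fields it needs to combine results and decide what to retry.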

| Information | Where it belongs | Why |
| --- | --- | --- |
| Step goal & acceptance test | Inline context | Drives the step's action |
| The specific inputs | Inline context | Operated on directly |
| Large reference data | Behind a tool | Retrieve only if needed |
| Cross-step facts | Durable state, by key | Survive without bloating |
| Other agents' reasoning | Left out | Only results matter |

## Common pitfalls

- **"Just in case" context.** Padding the window with maybe-useful data dilutes attention. Include only what changes the action.
- **Transcript handoffs.** Passing full history between agents compounds clutter. Hand off structured results instead.
- **Pre-loading large data.** Stuffing big tables into context is slow and costly. Expose them as a tool and retrieve on demand.
- **Implicit cross-step memory.** Relying on the transcript to carry facts means they are lost whenever a step is retried or restarted without that transcript. Write to durable state and reference by key.
- **One giant system prompt for all agents.** A shared mega-prompt forces every subagent to read irrelevant instructions. Scope prompts per role.

## Tighten your context in 5 steps

1. List every item currently in your agent's context.
2. For each, ask: does it change this step's action? If not, cut it.
3. Move large, fetchable items behind a tool.
4. Move cross-step facts into durable state, referenced by key.
5. Replace transcript handoffs with structured, self-describing results.

## Frequently asked questions

### What is context engineering?

Context engineering is the practice of curating exactly the information a model needs for a step — goal, acceptance test, inputs, and relevant tools — while excluding anything that would dilute its attention or inflate cost. In multi-agent systems it is applied per agent, since each maintains its own window.

### Isn't a 1M-token window big enough to just include everything?

A large window lets you include more, but it does not make irrelevant context free. Off-topic tokens still compete for attention, raise cost, and slow responses. The window size is headroom for genuinely large tasks, not a license to skip the editorial decision of what belongs.

### How do I keep facts available without bloating context?

Put them in durable state and reference them by key, or expose them through a tool the agent can call on demand. Both let a later step retrieve exactly what it needs at the moment it needs it, instead of carrying everything in every context from the start.

## Lean context, live conversations

CallSphere applies the same context discipline to **voice and chat** agents — each turn carries only what the caller's current goal demands, so responses stay fast, grounded, and on-task. Experience tightly scoped agents in production at [callsphere.ai](https://callsphere.ai).

---

*Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.*
