Prompt & Context Design for Claude Multi-Agent Systems

The hardest part of building a multi-agent system isn't getting agents to talk to tools or to each other. It's deciding what each agent should know. Too little context and an agent guesses; too much and it fixates on irrelevant detail, costs more, and slows down. Context design is the quiet discipline that separates systems that feel sharp from ones that feel confused, and it gets almost none of the attention that flashy orchestration diagrams do.

This post is entirely about that discipline: what to put in an agent's context, what to deliberately leave out, and the reasoning behind each call. It applies whether you're prompting a single Claude agent or wiring up a dozen subagents, but the stakes rise in multi-agent systems because every agent's context is a separate decision and the mistakes multiply.

The cost of everything you include

Context is not free, and the cost is not only tokens. Context design is the practice of deciding which information enters an agent's working memory so that it has what it needs and nothing that distracts it. Every extra paragraph you hand an agent is something it has to weigh, and large language models can be pulled toward details that are salient but irrelevant. A subagent told to find slow queries but handed the entire architecture doc may start commenting on the architecture instead.

So the default posture is subtractive. Start from "what is the minimum this agent needs to do its one job?" and add only what's required. This inverts most people's instinct, which is to give the agent everything "just in case." In a multi-agent system the just-in-case approach is especially costly: you're paying to load that context into every subagent, and you're diluting each one's focus. Lean contexts aren't a nice-to-have; they're how the system stays both affordable and accurate.

What belongs in every agent's context

Some things earn their place reliably. The agent's role and boundaries — what it does and explicitly does not do. Its immediate task, stated concretely with a clear definition of done. The specific facts it needs, extracted rather than dumped: the three relevant requirements, not the thirty-page spec. The output contract describing exactly what to return. And the tool guidance for the handful of tools its role uses. That's usually the whole list.
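One way to keep that list honest is to give it a fixed shape. The sketch below is illustrative only: `SubagentBrief`, its field names, and the example task are assumptions made for this post, not part of any Claude API. It shows the five items as fields and renders them into a plain-text prompt.

```python
from dataclasses import dataclass, field


@dataclass
class SubagentBrief:
    """Hypothetical container for the items a subagent's context should hold."""
    role: str                  # what the agent does
    out_of_scope: list[str]    # what it explicitly does not do
    task: str                  # the immediate task, stated concretely
    done_when: str             # the definition of done
    facts: list[str]           # extracted facts, not raw documents
    output_contract: str       # exactly what to return
    tools: dict[str, str] = field(default_factory=dict)  # tool name -> guidance

    def render(self) -> str:
        """Render the brief as a plain-text prompt, one item per line."""
        lines = [
            f"Role: {self.role}",
            "Do not: " + "; ".join(self.out_of_scope),
            f"Task: {self.task}",
            f"Done when: {self.done_when}",
            "Facts:",
            *[f"- {fact}" for fact in self.facts],
            f"Return: {self.output_contract}",
            "Tools:",
            *[f"- {name}: {hint}" for name, hint in self.tools.items()],
        ]
        return "\n".join(lines)


# Example values are invented for illustration.
brief = SubagentBrief(
    role="Slow-query finder",
    out_of_scope=["commenting on architecture", "rewriting queries"],
    task="List queries in the attached log slower than 500 ms",
    done_when="every log entry has been checked once",
    facts=[
        "The log covers the last 24 hours",
        "Latency is recorded in milliseconds",
        "Only the orders database is in scope",
    ],
    output_contract="JSON list of objects with query and latency_ms",
    tools={"read_log": "use to page through the log file"},
)
prompt = brief.render()
print(prompt)
```

Because every field is required except `tools`, a brief that cannot be filled in is a visible gap rather than a silent one.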

Notice what this implies for the orchestrator: it has to do extraction work. Rather than forwarding raw material to subagents, a well-designed orchestrator distills the relevant facts and passes those. This costs the orchestrator some effort but keeps every downstream context clean. It also surfaces understanding — an orchestrator that can't extract the relevant facts probably doesn't understand the task well enough to coordinate it, and that's a useful signal to catch early.
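A rough sketch of that extraction step follows. It assumes, purely for illustration, that the spec has already been split into requirements tagged by topic; `Requirement`, `extract_facts`, and the topic tags are hypothetical names. A real orchestrator would more likely do this with a model call than with tag matching, but the shape is the same: filter first, forward second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """Hypothetical spec entry, tagged by topic when the spec is ingested."""
    text: str
    topics: frozenset[str]


def extract_facts(spec: list[Requirement], task_topics: set[str], limit: int = 3) -> list[str]:
    """Return up to `limit` requirement texts that share a topic with the task."""
    relevant = [req.text for req in spec if req.topics & task_topics]
    if not relevant:
        # Nothing matched: a signal that the task is not yet understood well
        # enough to coordinate, so fail loudly instead of forwarding everything.
        raise ValueError("no relevant facts found; revisit the task definition")
    return relevant[:limit]


# Invented example data.
spec = [
    Requirement("Queries over 500 ms count as slow", frozenset({"database", "performance"})),
    Requirement("Login pages must support SSO", frozenset({"auth"})),
    Requirement("The orders table is the largest table", frozenset({"database"})),
    Requirement("Invoices are emailed as PDF", frozenset({"billing"})),
]
facts = extract_facts(spec, {"database"})
print(facts)
```

Raising an error when nothing matches is a deliberate choice in this sketch: it turns "the orchestrator could not extract anything" into the early signal described above.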

flowchart TD
  R["Raw material & full history"] --> O["Orchestrator extracts essentials"]
  O --> B["Build lean subagent brief"]
  B --> INC["Include: role, task, key facts, contract, tools"]
  B --> EXC["Exclude: raw dumps, other agents' transcripts, stale state"]
  INC --> AG["Subagent works focused"]
  EXC --> AG
  AG --> RET["Returns clean result"]
  RET --> O

What to deliberately leave out

The exclusions are where craft shows. Leave out the other agents' transcripts: a subagent doesn't need to read how a sibling reached its conclusion, only the conclusion if it's relevant. Leave out stale state: facts that were true earlier but have since changed are worse than no facts, because the agent will act on them confidently. Leave out raw material whenever you can pass the extracted salient points instead.
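The first two exclusions can be sketched as a filter over whatever shared state the orchestrator keeps. Everything here is an assumption for illustration: `Fact`, `SiblingResult`, and `build_context` are hypothetical, and "stale" is modeled simply as "superseded by a later entry with the same key".

```python
from dataclasses import dataclass


@dataclass
class Fact:
    key: str
    value: str
    recorded_at: int   # step at which the fact was recorded


@dataclass
class SiblingResult:
    agent: str
    conclusion: str
    transcript: str    # full reasoning trace; never forwarded


def build_context(facts: list[Fact], siblings: list[SiblingResult],
                  relevant_agents: set[str]) -> list[str]:
    """Keep the latest value per fact key and only relevant sibling conclusions."""
    latest: dict[str, Fact] = {}
    for fact in facts:
        seen = latest.get(fact.key)
        if seen is None or fact.recorded_at > seen.recorded_at:
            latest[fact.key] = fact   # a newer entry replaces the stale one
    lines = [f"{f.key}: {f.value}" for f in latest.values()]
    lines += [
        f"{s.agent} concluded: {s.conclusion}"
        for s in siblings
        if s.agent in relevant_agents   # the transcript is deliberately dropped
    ]
    return lines


# Invented example data.
facts = [
    Fact("db_host", "db-old", recorded_at=1),
    Fact("orders_index", "missing", recorded_at=2),
    Fact("db_host", "db-new", recorded_at=3),
]
siblings = [
    SiblingResult("schema-reviewer", "orders lacks an index on customer_id", "step 1 ... step 40"),
    SiblingResult("cost-estimator", "monthly spend is within budget", "step 1 ... step 12"),
]
context = build_context(facts, siblings, relevant_agents={"schema-reviewer"})
print(context)
```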

Also leave out your own hedging. Long, caveat-laden instructions that try to anticipate every edge case often confuse more than they help, because the agent can't tell which clause is load-bearing. A crisp instruction with one clear rule beats a paragraph of qualifications. If you find yourself writing "but if X, unless Y, except when Z" into a subagent prompt, that's usually a sign the task should be split or the decision should move to the orchestrator.

Designing context for the synthesis step

The orchestrator's synthesis context is its own design problem. It receives all the subagent returns, and the temptation is to also keep the full planning history, the original raw material, and every intermediate note. Resist it. The synthesis step needs only the validated returns and the original user goal, which is what it reconciles those returns against. Loading the raw material again invites the orchestrator to re-derive answers the subagents already produced, wasting tokens and risking contradictions.

A clean synthesis context also makes the orchestrator's job honest. Given only the returns and the goal, it can only compose from what the subagents actually found — which is exactly the constraint that prevents hallucinated connective tissue. When you flood the synthesis step with extra context, you give the model room to invent bridges between findings that nobody verified. Constraint, here, is a feature.
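As a minimal sketch of that constraint, the function below builds a synthesis prompt from nothing but the user goal and the returns that passed validation. `SubagentReturn`, the `validated` flag, and `build_synthesis_context` are hypothetical names; how a return gets validated is assumed to happen upstream.

```python
from dataclasses import dataclass


@dataclass
class SubagentReturn:
    agent: str
    result: str
    validated: bool   # assumed to be set by an upstream contract check


def build_synthesis_context(user_goal: str, returns: list[SubagentReturn]) -> str:
    """Compose the synthesis prompt from the goal and validated returns only."""
    lines = [f"User goal: {user_goal}", "Validated findings:"]
    lines += [f"- [{r.agent}] {r.result}" for r in returns if r.validated]
    lines.append("Compose the answer only from the findings above.")
    return "\n".join(lines)


# Invented example data.
returns = [
    SubagentReturn("slow-query-finder", "3 queries exceed 500 ms", validated=True),
    SubagentReturn("schema-reviewer", "orders lacks an index on customer_id", validated=True),
    SubagentReturn("cost-estimator", "unparseable output", validated=False),
]
synthesis_context = build_synthesis_context("Explain why checkout is slow", returns)
print(synthesis_context)
```

Note what the function does not accept: there is no parameter for planning history or raw material, so they cannot leak in by accident.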

Iterating on context with real runs

You won't get context design right on paper. The reliable method is to run real tasks, log every agent's full context and its result, and look for two failure signatures: an agent that lacked something it needed, and an agent that got distracted by something it didn't. The first tells you to add; the second tells you to cut. Most teams over-include at first, so the early edits are usually deletions.
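A small sketch of that loop, under stated assumptions: each run is logged with its full context and result, a human reviewer (or some evaluation step) labels it with one of three verdicts, and a tally per agent suggests whether to add or cut. `RunRecord`, the verdict labels, and `suggest_edits` are all hypothetical.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass
class RunRecord:
    agent: str
    context: str   # the full context the agent saw
    result: str
    verdict: str   # reviewer label: "ok", "lacked_context", or "distracted"


def to_jsonl(records: list[RunRecord]) -> str:
    """Serialize records as JSON Lines so contexts and results stay together."""
    return "\n".join(json.dumps(asdict(r)) for r in records)


def suggest_edits(records: list[RunRecord]) -> dict[str, str]:
    """Map each agent to 'add', 'cut', 'review', or 'keep' from its verdict tally."""
    tallies: dict[str, Counter] = {}
    for r in records:
        tallies.setdefault(r.agent, Counter())[r.verdict] += 1
    suggestions = {}
    for agent, counts in tallies.items():
        lacked, distracted = counts["lacked_context"], counts["distracted"]
        if lacked > distracted:
            suggestions[agent] = "add"      # it needed something it did not have
        elif distracted > lacked:
            suggestions[agent] = "cut"      # it latched onto something it did not need
        elif lacked:
            suggestions[agent] = "review"   # both signatures present; inspect by hand
        else:
            suggestions[agent] = "keep"
    return suggestions


# Invented example data.
records = [
    RunRecord("slow-query-finder", "brief + architecture doc", "comments on architecture", "distracted"),
    RunRecord("slow-query-finder", "brief + architecture doc", "comments on caching", "distracted"),
    RunRecord("slow-query-finder", "brief + architecture doc", "3 slow queries", "ok"),
    RunRecord("schema-reviewer", "brief only", "guessed a table name", "lacked_context"),
]
suggestions = suggest_edits(records)
print(suggestions)
```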

Treat the context for each agent as a tunable artifact, not a fixed prompt. As the system meets real inputs, you'll discover that one subagent consistently needs a fact you weren't passing, or that another keeps latching onto a section you can drop. Make those edits, re-run, and watch both accuracy and token cost. Context design done well shows up as the rare combination of better answers and a smaller bill — the clearest sign you're including what matters and leaving out what doesn't.

Frequently asked questions

Should I give a subagent the conversation history?

Usually not. Hand it a focused brief with the extracted facts it needs and its return contract instead. Full history adds tokens and distraction without adding capability, and in a multi-agent system you'd be paying that cost in every subagent. Forward only what the specific task requires.

How do I know if an agent has too little context?

Watch for guessing: the agent asks for information it should have, invents a plausible-but-wrong fact, or returns a low-confidence result. Those signatures mean the context is missing something the agent actually needed, either because you cut it or because you never passed it. Log the agent's full context alongside its output so you can see exactly what was missing.

Is more context always safer?

No. Beyond what the task requires, more context raises cost and pulls the model toward irrelevant details, which can lower accuracy. The reliable default is subtractive — start from the minimum and add only what real runs prove necessary.

What should the synthesis step actually see?

The validated subagent returns and the original user goal — and little else. Reloading raw material there invites the orchestrator to re-derive or invent. A lean synthesis context is what keeps the final answer grounded in what the subagents genuinely found.

Context discipline on every call

CallSphere applies this same lean-context discipline to voice and chat agents, giving each turn exactly the customer facts it needs to answer accurately and book work — no bloat, no drift. Hear well-designed agent context at work at callsphere.ai.

Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.
