Skip to content
Agentic AI
Agentic AI8 min read0 views

Reusable multi-agent patterns: prompts, tools, context

Code-level multi-agent patterns on Claude: role-scoped prompts, per-agent tool scoping, typed handoff contracts, need-to-know context, and critic wrappers.

After you've built two or three multi-agent systems on Claude, you start noticing the same shapes recurring — the same way of writing a subagent prompt, the same trick for scoping tools, the same handoff contract. Those recurring shapes are the real deliverable of experience: they're what lets you stand up a new system in an afternoon instead of rediscovering every pitfall. This post collects the reusable, code-level patterns I reach for every time, organized around the three things you actually control: the prompts, the tools, and the context each agent sees.

None of these are framework-specific. They're patterns you can drop into a Claude Agent SDK project, a Claude Code subagent definition, or a hand-rolled loop. Think of them as a small library of moves rather than a single architecture.

Pattern 1: the role-scoped subagent prompt

A subagent prompt should do exactly three things and nothing else: define a narrow role, state the one task, and specify the exact output format. The anti-pattern is the "mini-orchestrator" prompt that tells a subagent to plan, decide what to do, and report — that smears responsibility and produces inconsistent returns. Instead, write prompts like "You are a verification agent. Given a claim and its source, return SUPPORTED, CONTRADICTED, or UNCLEAR plus one sentence of justification." The role is a fence; everything outside it is out of scope.

The reusable structure is a four-part template: role line, task line, constraints (what not to do), and an output schema. Keep it short — a long subagent prompt competes for the same context you're trying to preserve for the actual work. A good subagent system prompt is often under 150 words. If yours is growing, that's usually a sign the subagent is doing too many jobs and should be split.

Pattern 2: tool scoping per agent

Give each agent only the tools its role needs. A researcher gets search and fetch; a writer gets file-write; a verifier gets read-only lookups. This isn't just tidiness — narrowing the tool set measurably improves tool-selection accuracy, because Claude isn't choosing among twelve plausible options when only two are relevant. It also contains blast radius: a subagent that can't write a file can't corrupt one.

flowchart TD
  A["Task spec"] --> B["Select pattern: role-scoped prompt"]
  B --> C["Attach minimal tool set"]
  C --> D["Inject only need-to-know context"]
  D --> E["Run subagent"]
  E --> F["Validate against output schema"]
  F --> G{"Schema valid?"}
  G -->|No| H["Re-ask with the error"] --> E
  G -->|Yes| I["Emit structured result"]

A reusable trick: define tools in a registry keyed by role, so spawning a subagent is spawn(role, task, context) and the tool set is looked up, not hand-wired per call site. This keeps tool scoping consistent and makes it a one-line change to tighten or loosen a role's permissions across the whole system.

Hear it before you finish reading

Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.

Try Live Demo →

Pattern 3: the handoff contract as a typed schema

The single most reusable pattern is treating every agent boundary as a typed contract. Define the return shape — a struct with named fields and a status enum — and validate it in code the moment a subagent returns. If validation fails, don't pass the malformed result upstream; re-ask the subagent with the validation error embedded in the prompt ("Your last response was missing the 'sources' field; return valid output"). This self-correction loop, shown as the G branch above, turns flaky structured output into reliable structured output without human intervention.

Why typed contracts beat free text: the orchestrator can sort by confidence, dedupe by source, and route by status without re-parsing prose every round. It also makes the system testable — you can unit-test an orchestrator against synthetic structured findings, completely decoupled from the subagents. Contracts are the seams that let you test and debug each piece in isolation, which is the difference between a system you can maintain and one you can only pray over.

Pattern 4: context shaping with the need-to-know rule

Every token you put in a subagent's context is a token of attention spent and a token of cost paid. The reusable discipline is need-to-know: inject only the context this specific subagent requires to do its one job, and nothing inherited "just in case." A verification subagent needs the claim and the source — not the user's original question, not the other claims, not the orchestrator's plan. Over-injecting context is the most common cause of subagents going off-brief, because extra context invites them to expand their scope.

A practical pattern: build the subagent's context from an explicit allow-list, not by copying the orchestrator's state. Your spawn function takes a context object you assemble deliberately for that role. When you find yourself wanting to pass "everything," that's a design smell — it usually means the subagent's task isn't well-defined enough to know what it actually needs.

Pattern 5: the critic wrapper

A reusable accuracy upgrade for any subagent is to wrap it in a critic. The generator produces output; a second, cheaper or differently-prompted agent reviews it against explicit criteria and returns pass or revise-with-reasons. You apply this selectively — wrap the subagents whose mistakes are costly (code generation, factual claims, irreversible actions) and leave the cheap ones bare. The pattern is composable: critic(generator(task)) is just another spawnable unit, so you can add or remove the review without touching the generator.

The key to making critics worth their token cost is giving them specific criteria, not "is this good?" A code critic should check named properties — does it handle the empty case, are errors caught, does it match the contract. Vague critics rubber-stamp; specific critics catch real defects. Cap the generate-critique loop at two or three rounds so a stubborn disagreement can't spin forever.

Putting the patterns together

These five compose into a small vocabulary. A typical system is: an orchestrator that decomposes, role-scoped subagents each with a minimal tool set and need-to-know context, typed handoff contracts at every boundary with self-correcting re-asks, and critic wrappers on the high-stakes subagents. Once you have the spawn(role, task, context) primitive and a contract validator, building a new system is mostly choosing which roles you need and wiring their handoffs — the hard parts are already solved.

Still reading? Stop comparing — try CallSphere live.

CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.

The meta-pattern underneath all of them is the same: make every boundary explicit. Explicit roles, explicit tools, explicit context, explicit contracts. Multi-agent systems fail in the implicit places — the context a subagent assumed it had, the tool it shouldn't have been able to call, the field the orchestrator expected but didn't get. Patterns that force those things into the open are what make the difference between a demo and a system you'd put in front of users.

Frequently asked questions

What goes in a good subagent prompt?

Four parts: a narrow role line, the single task, explicit constraints on what not to do, and an output schema. Keep it short — often under 150 words — because a long subagent prompt competes for the same context budget you're trying to preserve for the work itself.

Why scope tools per agent instead of sharing one big tool set?

Narrowing tools improves Claude's tool-selection accuracy, since it chooses among two relevant options instead of a dozen plausible ones, and it contains blast radius — a subagent without write access can't corrupt state. A role-keyed tool registry makes this a one-line policy change.

How do I make structured output from subagents reliable?

Define a typed return contract, validate it in code on every return, and when validation fails, re-ask the subagent with the specific error embedded in the prompt. This self-correction loop turns occasionally-malformed output into reliably-valid output without human intervention.

When is a critic wrapper worth the extra tokens?

When the wrapped subagent's mistakes are costly — code generation, factual claims, irreversible actions. Give the critic specific, named criteria rather than "is this good?", and cap the generate-review loop at two or three rounds so disagreements can't spin.

Bringing agentic AI to your phone lines

Role-scoped prompts, typed handoffs, and critic wrappers aren't just coding-agent tricks. CallSphere uses the same reusable patterns to run reliable voice and chat agents that answer every call, use tools mid-conversation, and book work around the clock. See them at callsphere.ai.


Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.

Share

Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available -- no signup required.