Reusable Claude Agent Patterns for Prompts and Tools
Code-level patterns for structuring Claude agents — sectioned prompts, narrow tools, plan-then-act, and subagents — that survive production.
The first Claude agent you build is held together by intuition. The tenth one needs patterns, because by then you have felt the failure modes: agents that call the wrong tool, prompts that drift as you add features, context windows that balloon until reasoning gets soft. The teams driving real AI transformation with Claude are not the ones with the cleverest single prompt — they are the ones who turned hard-won lessons into reusable structures for prompts, tools, and context. This post collects the patterns that survive contact with production, at the level of code you can actually copy.
Key takeaways
- Structure the system prompt into stable sections — role, constraints, tools, output format — so it stays cacheable and editable.
- Design tools to be narrow and self-describing; a good schema prevents whole classes of wrong calls.
- Use the plan-then-act pattern for multi-step tasks and retrieve-then-answer for knowledge tasks.
- Push noisy subtasks into subagents so the main context stays clean.
- Make outputs machine-checkable — structured JSON or tagged sections — so a verifier can gate them.
Pattern 1: Section your system prompt
A sprawling paragraph prompt is impossible to maintain and impossible to cache well. The durable pattern is a sectioned prompt with stable ordering: role and goal first, then hard constraints, then the tools available, then the required output shape. Keep the volatile, per-request material (the user's actual question, retrieved context) at the end. Because Claude supports prompt caching, putting the long, stable prefix first means you only pay full price for it once and cheaply reuse it on subsequent turns.
SYSTEM:
## Role
You are an operations agent for an e-commerce team.
## Constraints
- Never issue a refund without explicit human approval.
- If data is missing, say so; do not guess order details.
## Tools
You may call lookup_order and get_shipping_status (read-only).
## Output
Return a short answer, then a JSON block: {action, confidence}.
This structure is not cosmetic. Each heading is an anchor you can edit without disturbing the rest, and the explicit constraints section becomes the first place you look when the agent misbehaves.
Pattern 2: Narrow tools beat clever prompts
When an agent calls the wrong tool, the instinct is to add more prompt instructions telling it not to. The better fix is usually at the tool layer. A tool whose name, description, and schema precisely match one job leaves the model little room to err. Prefer several specific tools over one overloaded tool with a mode parameter — the model reasons about discrete capabilities far better than about flags.
flowchart TD
A["Task arrives"] --> B{"Single clear action?"}
B -->|Yes| C["Call one narrow tool"]
B -->|No| D["Plan: list sub-steps"]
D --> E["Execute step, observe result"]
E --> F{"More steps?"}
F -->|Yes| E
F -->|No| G["Emit structured output"]
C --> G
The diagram captures the second pattern, plan-then-act: for anything multi-step, prompt the agent to enumerate a short plan before executing, then work the plan one step at a time, observing each result. This dramatically reduces the rate at which agents charge ahead on a wrong assumption.
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
Pattern 3: Retrieve-then-answer, not dump-then-hope
For knowledge tasks, the failure mode is stuffing the window with everything that might be relevant. The pattern that holds up is retrieve-then-answer: a retrieval step (often a tool call) pulls the few most relevant chunks, those go into context, and only then does the model answer. The model's job is reasoning over a tight set of facts, not skimming a haystack. This keeps cost down and accuracy up, because every token in the window is reprocessed on each turn.
A practical refinement is to have the agent cite which retrieved chunk supports each claim. That citation requirement is cheap to add and makes hallucinations visible — an answer that cannot point to a source is a flag for review.
Pattern 4: Isolate noise in subagents
Some subtasks are inherently messy — scanning fifty files, trying several queries, parsing logs. If that mess lands in the main agent's context, it crowds out the actual task. The pattern is to spawn a subagent with a clean window, let it do the noisy work, and have it return only a compact summary. The orchestrator never sees the intermediate sludge. A multi-agent system is one in which an orchestrating agent delegates scoped subtasks to subagents that each work in their own context and report back concise results.
Use this deliberately: multi-agent runs typically consume several times more tokens than a single agent, so the isolation has to be worth it. The classic worthwhile case is parallel, independent work — three subagents researching three sources at once, each summarizing for the parent.
Pattern 5: Make outputs checkable
An agent whose output is free-form prose is hard to gate. The pattern is to require a machine-readable tail — a JSON object or a tagged block — alongside the human-readable answer. A verifier can then assert that confidence exceeds a threshold, that action is one of an allowed set, or that required fields are present, before the result is trusted or acted on.
| Pattern | Use when | Payoff |
|---|---|---|
| Sectioned prompt | Always | Maintainable, cacheable |
| Narrow tools | Wrong-tool errors | Fewer mis-calls |
| Plan-then-act | Multi-step tasks | Fewer wrong assumptions |
| Subagents | Noisy/parallel work | Clean main context |
| Checkable output | Gated actions | Automated verification |
Common pitfalls
- Fixing tool problems in the prompt. If the agent calls the wrong tool, sharpen the schema and description before adding more instructions.
- One mega-prompt for everything. A monolithic prompt becomes unmaintainable. Section it and version it like code.
- Subagents by default. They multiply cost; reach for them only when isolation or parallelism is real.
- Free-form output for actions. If an agent triggers real actions, give it a structured output a verifier can check.
- Never measuring. Without an eval set, you cannot tell whether a prompt change helped or hurt. Capture a handful of representative cases and re-run them on every change.
Pattern 6: A few-shot example beats a long instruction
When you need an agent to produce output in a specific shape or handle a tricky edge case, the temptation is to write another paragraph of instructions. Often a single concrete example does the job better. Models generalize sharply from a well-chosen demonstration, so showing one full input-to-output pair — including how to handle the awkward case — teaches faster than describing it abstractly. Keep examples in the prompt few and representative; two strong examples usually outperform six mediocre ones, and every example you add is reprocessed on every turn.
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
The pattern pairs well with the structured-output rule from before. Demonstrate the exact JSON tail you want once, and the model will mirror it. This is also how you encode taste — the difference between a curt reply and a warm one, or between an over-eager refund and a careful escalation — without trying to enumerate every rule. Show the behavior, then let the model imitate it. When the example and a written instruction conflict, the example tends to win, so keep them aligned.
Adopt these patterns in 5 steps
- Refactor your system prompt into role, constraints, tools, output sections.
- Audit your tools and split any overloaded one into narrow, single-purpose tools.
- Add a plan-then-act instruction for any task with more than two steps.
- Move your noisiest subtask into a subagent that returns a summary.
- Require a structured output tail and write one verifier assertion against it.
Frequently asked questions
How many tools is too many?
There is no hard limit, but past a dozen or so the model spends reasoning budget choosing among them. If a tool list grows large, consider grouping related tools behind skills so the relevant subset loads only when the task calls for it.
Should the agent always plan before acting?
For single, obvious actions, planning is overhead — call the tool. Reserve explicit plan-then-act for genuinely multi-step tasks, where stating the plan first prevents the agent from committing to a wrong path.
How do I keep prompts from drifting as features pile up?
Version the system prompt in source control, keep it sectioned, and maintain a small eval set you re-run on every change. Treat prompt edits exactly like code changes — reviewed and tested. When a new feature needs new behavior, prefer adding a focused section or a single example over rewriting the whole prompt, so each change stays small and reviewable.
Do these patterns differ across Claude models?
The patterns hold across Opus, Sonnet, and Haiku — they are about structure, not raw capability. What changes is headroom: a smaller, faster model benefits even more from narrow tools and tight context because it has less slack to recover from a noisy window. Build the structure once and it pays off whichever model you route a given step to.
Bringing agentic AI to your phone lines
CallSphere encodes these prompt and tool patterns into voice and chat agents that stay reliable call after call — narrow tools, structured outputs, clean context. See the patterns in action at callsphere.ai.
Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.