Claude Agent Patterns: Prompts, Tools, and Context

After you've shipped a few Claude agents, you start noticing the same shapes recurring. The reliable ones are reliable in the same ways; the flaky ones fail in the same ways. This post collects the reusable, code-level patterns that keep production agents predictable — how to structure the prompt, how to design tools the model uses correctly, and how to shape context so the agent has what it needs and nothing it doesn't. These aren't theory; they're the moves you'll reach for on every project.

A reusable agent pattern is a repeatable way of structuring a prompt, tool, or context block that produces predictable model behavior across tasks. The point of naming them is that once you have the pattern, you stop re-deriving it under deadline pressure and start composing from known-good parts.

Pattern: the layered system prompt

Don't write the system prompt as one wall of text. Layer it into named sections the way a well-structured function reads: role (who the agent is), workflow (the ordered steps for the common case), tools (a one-line rule for when to call each), output contract (the exact shape of a good answer), and boundaries (the hard no's). When you need to change behavior later, you edit one labeled section instead of surgically rewriting a paragraph and hoping you didn't break an adjacent instruction.

The layered structure also makes the prompt diff-friendly and reviewable. A teammate can read the boundaries section in isolation and reason about safety. You can A/B a new workflow without touching the output contract. Prompts that are written as structured documents age far better than prompts written as prose, because every future edit is local.
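One way to make those layers concrete is to keep each section as a separate string and assemble the prompt at runtime. The sketch below is a minimal illustration, not a prescribed format: the section texts are invented, and wrapping each section in an XML-style tag is an assumed convention for labeling.

```python
# The five layer names come from the pattern above; the texts are hypothetical.
SECTION_ORDER = ["role", "workflow", "tools", "output_contract", "boundaries"]

SECTIONS = {
    "role": "You are a billing support agent for a SaaS product.",
    "workflow": "1. Confirm the customer's identity.\n2. Fetch open invoices.\n3. Propose a resolution.",
    "tools": "get_account: call first to confirm identity.\nget_open_invoices: call only after identity is confirmed.",
    "output_contract": "Reply with a one-line summary, then a bulleted list of actions taken.",
    "boundaries": "Never reveal another customer's data. Never promise a refund you have not verified.",
}


def build_system_prompt(sections: dict[str, str]) -> str:
    """Join the named sections in a fixed order, each under its own label."""
    missing = [name for name in SECTION_ORDER if name not in sections]
    if missing:
        raise ValueError(f"missing prompt sections: {missing}")
    parts = [f"<{name}>\n{sections[name].strip()}\n</{name}>" for name in SECTION_ORDER]
    return "\n\n".join(parts)


def with_section(sections: dict[str, str], name: str, text: str) -> dict[str, str]:
    """Return a copy with one section swapped, leaving the others untouched."""
    if name not in SECTION_ORDER:
        raise KeyError(name)
    updated = dict(sections)
    updated[name] = text
    return updated


# A/B a new workflow without touching the output contract or boundaries.
variant = with_section(SECTIONS, "workflow", "1. Ask for the order number first.")
prompt_a = build_system_prompt(SECTIONS)
prompt_b = build_system_prompt(variant)
```

Because each variant differs in exactly one key, a diff between prompt_a and prompt_b shows only the workflow change.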

Pattern: tools as a typed API, not a grab bag

Design your tool set the way you'd design a small, coherent API. Each tool does one thing, has a name that reads as a verb-object (create_invoice, get_account, not handle_billing), and exposes a tight schema with required fields and enums. The model picks tools by reading their descriptions, so the description is part of the interface: write it as an instruction — "Call this after confirming the customer's identity to fetch open invoices" — not as a noun phrase.
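As a rough sketch of what a tight schema looks like, here is a hypothetical get_open_invoices tool. The name / description / input_schema layout mirrors the shape used for tool definitions in Anthropic's Messages API, but treat that as an assumption and check the current docs before relying on it. The fields, the enum values, and the check_arguments helper are invented for illustration.

```python
# Hypothetical tool definition: verb-object name, instruction-style description,
# required fields, and an enum instead of a free-form string.
GET_OPEN_INVOICES = {
    "name": "get_open_invoices",
    "description": (
        "Call this after confirming the customer's identity to fetch open invoices. "
        "Do not call it to look up account details; use get_account for that."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "account_id": {"type": "string", "description": "Account identifier."},
            "status": {"type": "string", "enum": ["open", "overdue"]},
        },
        "required": ["account_id", "status"],
    },
}


def check_arguments(tool: dict, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments pass.

    Only checks required fields, unknown fields, and enums. A real handler
    would likely use a full JSON Schema validator instead.
    """
    schema = tool["input_schema"]
    problems = []
    for field in schema["required"]:
        if field not in args:
            problems.append(f"missing required field: {field}")
    for field, value in args.items():
        spec = schema["properties"].get(field)
        if spec is None:
            problems.append(f"unknown field: {field}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{field} must be one of {spec['enum']}")
    return problems
```

Checking arguments in the handler gives you a place to turn a bad call into a precise message the model can act on, rather than an opaque failure.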

flowchart TD
  A["Incoming task"] --> B["Layered system prompt"]
  B --> C{"Which pattern applies?"}
  C -->|Needs data| D["Typed tool call"]
  C -->|Needs judgment| E["Structured reasoning step"]
  D --> F["Validate & return result"]
  E --> F
  F --> G{"Output contract met?"}
  G -->|No| B
  G -->|Yes| H["Emit final answer"]

The validate-and-loop-back edge in the diagram is itself a pattern: never let a tool result flow straight to the user without the agent checking it against the output contract. A tool that returns an empty list should prompt the agent to reconsider, not to confidently report "nothing found" when the real issue was a bad argument.
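The loop-back edge can be approximated with a small gate that runs before anything is emitted. This is a sketch under assumptions: the output contract here is just two required keys (summary and actions), and the "acct_" prefix check stands in for whatever makes an argument well-formed in your system.

```python
REQUIRED_KEYS = {"summary", "actions"}  # hypothetical output contract


def contract_problems(answer: dict, tool_args: dict, tool_results: list) -> list[str]:
    """List reasons the draft answer should not be emitted yet."""
    problems = []
    missing = REQUIRED_KEYS - set(answer)
    if missing:
        problems.append(f"answer is missing keys: {sorted(missing)}")
    # An empty result is only trustworthy if the query itself was well-formed.
    account_id = str(tool_args.get("account_id", ""))
    if not tool_results and not account_id.startswith("acct_"):
        problems.append("empty result from a malformed account_id: fix the argument and call again")
    return problems


def next_step(answer: dict, tool_args: dict, tool_results: list) -> str:
    """Return 'emit' when the contract is met, otherwise 'revise' (loop back)."""
    return "emit" if not contract_problems(answer, tool_args, tool_results) else "revise"
```

When next_step returns "revise", the list from contract_problems can be fed back to the agent as the reason to try again.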

Pattern: the few-shot example inside the tool result

One of the most underused patterns is shaping the agent's behavior through what tools return, not just through the prompt. If your search_docs tool returns results in a clean, consistent structure with the most relevant field first, the model learns to cite it well. If it returns a raw blob, the model improvises. You're effectively giving few-shot guidance every time a tool responds. Curate tool outputs as carefully as you curate the prompt — trim irrelevant fields, label the important ones, and keep the shape stable across calls.

This pattern compounds with context budget. A tool that returns five tight, labeled fields keeps the window clean over a long run; one that dumps a 200-line JSON payload every call fills the context with noise and pushes out the facts that matter. Shaping tool output is simultaneously a behavior pattern and a context-management pattern.
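A minimal sketch of curating a search_docs result before the model sees it. The raw field names (title, body, url, score) and the 200-character snippet cap are assumptions about a hypothetical search backend, not a real API.

```python
def shape_search_hit(raw: dict) -> dict:
    """Keep only the fields worth citing, in a fixed order, most relevant first."""
    return {
        "title": raw.get("title", ""),
        "snippet": raw.get("body", "")[:200],  # cap length to protect the context budget
        "url": raw.get("url", ""),
    }


def shape_search_results(raw_hits: list[dict], limit: int = 5) -> list[dict]:
    """Rank by score, keep the top few, and give every hit the same shape."""
    ranked = sorted(raw_hits, key=lambda hit: hit.get("score", 0.0), reverse=True)
    return [shape_search_hit(hit) for hit in ranked[:limit]]
```

Internal fields such as scores or debug metadata never reach the model, and every call returns the same three keys in the same order, which is what makes the output work as an implicit example.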

Pattern: progressive disclosure with skills

Not every capability belongs in the base prompt. The skills pattern lets you keep a lean default and pull in detailed instructions only when a step needs them. A skill is a folder of instructions, examples, and scripts that the agent loads when the current task matches its trigger. Your refund-handling logic, your tone-of-voice guide, your data-export procedure — each lives as a skill that costs nothing until it's relevant.

The pattern keeps the agent both capable and cheap. Instead of a bloated prompt that tries to anticipate every situation, you have a focused core plus a library of skills the agent reaches into on demand. When you need to add a capability, you write a new skill rather than editing the core prompt and risking the behavior you already trust. This is how a single agent scales to many situations without becoming unmaintainable.
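To illustrate the idea only, here is an in-memory stand-in for a skill library. Real skills live in folders and are typically selected by the agent itself; the keyword triggers, skill names, and instruction texts below are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    triggers: tuple[str, ...]  # keyword matching is a stand-in for real trigger logic
    instructions: str


SKILLS = [
    Skill("refunds", ("refund", "chargeback"), "Verify the order, then apply the refund policy."),
    Skill("data_export", ("export", "download my data"), "Confirm identity, then start the export job."),
]


def select_skills(task: str, skills: list[Skill]) -> list[Skill]:
    """Return the skills whose triggers appear in the task text."""
    text = task.lower()
    return [skill for skill in skills if any(trigger in text for trigger in skill.triggers)]


def build_context(core_prompt: str, task: str, skills: list[Skill]) -> str:
    """Start from the lean core prompt and append only the matching skills."""
    blocks = [core_prompt]
    for skill in select_skills(task, skills):
        blocks.append(f"[skill: {skill.name}]\n{skill.instructions}")
    return "\n\n".join(blocks)
```

A task that matches no trigger gets the core prompt unchanged, which is the "costs nothing until it's relevant" property in miniature.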

Pattern: explicit reasoning checkpoints

For multi-step tasks, build in checkpoints where the agent states its plan or summarizes what it has established before continuing. "Before drafting the resolution, confirm: plan tier, the relevant order, and the policy that applies." This pattern catches the agent's mistakes early — if it has the wrong order, you find out at the checkpoint instead of in the final answer — and it produces a trace that's far easier to debug because each decision is justified inline.

Checkpoints also stabilize long runs. Without them, an agent's reasoning can drift from turn to turn as compaction trims earlier context. A periodic "here's what I know so far" re-anchors the working state in the recent window, where it is least likely to be summarized away. Use checkpoints sparingly on short tasks and liberally on long ones.
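A checkpoint can be as simple as an instruction injected every few steps plus a check that the reply covers the facts you asked for. The sketch below assumes a fixed interval and a naive substring check; both are illustrative choices, and the fact names are taken from the example prompt above.

```python
REQUIRED_FACTS = ("plan tier", "order", "policy")


def checkpoint_message(facts: tuple[str, ...] = REQUIRED_FACTS) -> str:
    """Build the instruction that asks the agent to restate what it has established."""
    return "Before drafting the resolution, confirm: " + ", ".join(facts) + "."


def is_checkpoint_step(step: int, interval: int) -> bool:
    """True on every `interval`-th step, never on step 0."""
    return step > 0 and step % interval == 0


def missing_facts(checkpoint_reply: str, facts: tuple[str, ...] = REQUIRED_FACTS) -> list[str]:
    """Return the facts the reply never mentions (naive substring match)."""
    reply = checkpoint_reply.lower()
    return [fact for fact in facts if fact not in reply]
```

A short interval suits long runs and a large one suits short tasks, matching the "sparingly on short, liberally on long" guidance.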

Pattern: fail closed, recover in-loop

Reliable agents fail closed: when a tool errors or returns something unexpected, the handler returns a structured, descriptive error and the agent is prompted to recover within the loop rather than guessing. An error such as {"error": "account_id missing required prefix"} lets the model fix its argument and retry; an exception that escapes the handler or a silent empty result invites it to confabulate. Pair this with a step budget so recovery attempts can't loop forever. Descriptive errors plus a hard ceiling turn transient failures into self-corrections instead of incidents.
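Here is one possible shape for that combination. Everything in it is a sketch: get_account is a hypothetical tool, the "acct_" prefix is an assumed ID convention, and propose_args stands in for the model deciding its next arguments after seeing the last error.

```python
from typing import Callable, Optional


def get_account(account_id: str) -> dict:
    """Hypothetical tool: returns a structured error instead of raising."""
    if not account_id.startswith("acct_"):  # assumed ID convention
        return {"error": "account_id missing required prefix"}
    return {"account_id": account_id, "plan": "pro"}


def safe_call(tool: Callable[..., dict], **kwargs) -> dict:
    """Fail closed: convert any exception into a descriptive error dict."""
    try:
        return tool(**kwargs)
    except Exception as exc:
        return {"error": f"{type(exc).__name__}: {exc}"}


def run_with_budget(
    propose_args: Callable[[Optional[str]], dict],
    tool: Callable[..., dict],
    max_steps: int = 3,
) -> dict:
    """Retry inside the loop, feeding each error back, until success or the step ceiling."""
    last_error = None
    for step in range(max_steps):
        args = propose_args(last_error)  # stand-in for the model's next attempt
        result = safe_call(tool, **args)
        if "error" not in result:
            return {"status": "ok", "steps": step + 1, "result": result}
        last_error = result["error"]
    return {"status": "gave_up", "steps": max_steps, "last_error": last_error}
```

When the budget runs out, the caller gets an explicit gave_up status with the last error attached, so the failure can be escalated instead of being reported as a confident answer.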

Frequently asked questions

Should I put examples in the prompt or in tool outputs?

Both, for different jobs. Put a couple of canonical examples in the prompt to set the output contract, and shape tool outputs so every result implicitly demonstrates the structure you want cited. Prompt examples teach the overall shape; consistent tool outputs teach the model how to use real data correctly turn after turn.

How many tools is too many for one agent?

There's no hard number, but once the model starts picking the wrong tool, you have too many or your descriptions overlap. Tighten descriptions so each tool's "when to use" is mutually exclusive, and consider splitting into subagents with smaller tool sets if one agent juggles too many domains. Clarity of selection matters more than raw count.

When do I reach for a skill versus a longer prompt?

Use a skill when the capability is detailed, occasionally needed, or independently maintainable — refund procedures, export formats, niche policies. Keep it in the prompt when it applies to nearly every task. The rule of thumb: if it bloats the base prompt but rarely fires, make it a skill so it costs context only when relevant.

How do reasoning checkpoints affect cost?

They add tokens, so use them where the cost of a wrong final answer is high — multi-step tasks, irreversible actions, long runs prone to drift. On short, cheap tasks they're overhead. The trade is real but usually worth it: a checkpoint that catches a wrong order before a refund is far cheaper than the refund.

Bringing agentic AI to your phone lines

CallSphere builds on exactly these patterns — layered prompts, typed tools, skills, and in-loop recovery — for voice and chat agents that answer every call and message, use tools mid-conversation, and book work around the clock. See it live at callsphere.ai.

Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.
