Claude Skill Design Patterns: Prompts, Tools, Context
Reusable code-level patterns for Claude Agent Skills: router descriptions, script-offload, reference-doc tiers, and clean context contracts.
Once you have written a handful of Agent Skills, you start noticing the same shapes recurring — the ones that stay reliable as a team grows and the ones that quietly rot. The difference is rarely the body text; it is the structure. A Skill is a small system with three layers (the trigger, the instructions, the resources), and the patterns that work treat each layer with its own discipline. This post collects the reusable patterns I keep reaching for, the kind you can copy into your own Skills tomorrow.
Pattern 1: the router description
Every Skill lives or dies by whether Claude loads it at the right moment, and that decision is made from the Skill's name and description, the only parts that sit in context before the Skill triggers. The router-description pattern treats the description as a routing contract. It states three things explicitly: the trigger condition ("when the user…"), the input shape ("…provides a Stripe webhook payload…"), and the produced outcome ("…and wants it validated and summarized"). Avoid adjectives; favor nouns a user would actually type. The acid test: read only the description and ask whether you could correctly decide when to fire the Skill. If you can't, neither can the model.
A second discipline here is boundary-setting between Skills. When two descriptions overlap, Claude has to guess, and guesses are where flakiness enters. Add a negative clause when needed — "use for inbound webhooks, not for outbound API calls" — so each Skill claims a clean, non-overlapping slice of intent space.
The reason this matters more as you scale is combinatorial. With three Skills, overlap is easy to spot and fix by eye. With thirty, ambiguity between any pair quietly poisons the routing, and you end up debugging the wrong-Skill-loaded problem far from where it originated. The router-description pattern is really a naming-and-namespacing discipline borrowed from ordinary software engineering: each unit should have one obvious job and a name that makes its boundaries unmistakable. Spend the effort here and the rest of the system stays predictable; skimp on it and you will chase intermittent misfires for weeks.
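One way to make the overlap check mechanical once eyeballing stops scaling is a small lint over all descriptions. The sketch below is an illustration under stated assumptions, not part of any Skills tooling: the stopword list, the Jaccard keyword score, the 0.4 threshold, and the three Skill names are all hypothetical choices, and keyword overlap is only a rough proxy for how a model actually routes.

```python
import re
from itertools import combinations

# Words that carry no intent; an illustrative list, not a standard one.
STOPWORDS = {"a", "an", "and", "the", "to", "for", "of", "it", "or", "not",
             "use", "when", "user", "wants", "provides", "is", "in", "on", "with"}


def keywords(description: str) -> set[str]:
    """Lowercase the description and keep only the intent-bearing words."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared keywords divided by total distinct keywords (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def overlapping_pairs(descriptions: dict[str, str], threshold: float = 0.4):
    """Return (skill_a, skill_b, score) for every pair at or above the threshold."""
    keyed = {name: keywords(text) for name, text in descriptions.items()}
    flagged = []
    for left, right in combinations(sorted(keyed), 2):
        score = jaccard(keyed[left], keyed[right])
        if score >= threshold:
            flagged.append((left, right, round(score, 2)))
    return flagged


if __name__ == "__main__":
    # Hypothetical Skill names and descriptions, for illustration only.
    skills = {
        "stripe-webhook-check": "Use when the user provides a Stripe webhook payload and wants it validated and summarized.",
        "stripe-webhook-audit": "Use when the user provides a Stripe webhook payload and wants it validated against policy.",
        "invoice-pdf": "Use when the user wants an invoice rendered to PDF.",
    }
    for left, right, score in overlapping_pairs(skills):
        print(f"{left} <-> {right}: {score}")
```

A flagged pair is a prompt to add a negative clause or merge the two Skills, not proof of a routing bug; the threshold would need tuning against a real library.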
Pattern 2: script-offload for determinism
The most valuable structural pattern is moving brittle work out of the model and into bundled code. Any step that is pure mechanics (parsing, hashing, schema validation, date math, file generation) should be a script the body tells Claude to run, returning structured output the model then reasons over. The model's job becomes interpretation and judgment, not execution. This cuts token cost, removes a whole class of "the model miscounted" bugs, and makes those steps behave identically on every run.
```mermaid
flowchart TD
    A["Task arrives"] --> B["Router description matches"]
    B --> C["Load SKILL.md body"]
    C --> D{"Step is deterministic?"}
    D -->|Yes| E["Run bundled script, get structured output"]
    D -->|No| F["Model reasons over content"]
    E --> G["Model interprets script result"]
    F --> G
    G --> H["Compose final answer"]
```
A practical refinement: have scripts emit machine-readable output (JSON or a tagged block) rather than free prose. When the script speaks structure, the model parses it cleanly and you avoid the model re-deriving values it should simply read. Keep scripts dependency-light and runnable in isolation so a teammate can debug them outside the agent.
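As a concrete illustration of a script that speaks structure, here is a minimal sketch of a bundled validator. The required field names and the report keys (`valid`, `errors`, `field_count`) are assumptions made up for this example, not a real webhook schema; it uses only the standard library so it can be run and debugged outside the agent.

```python
import json

# Hypothetical required fields; a real Skill would define its own schema.
REQUIRED_FIELDS = ("id", "type", "created")


def validate_payload(payload: dict) -> dict:
    """Check mechanical facts about the payload and return a structured report."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in payload]
    # type(...) is used instead of isinstance so that booleans are rejected too.
    if "created" in payload and type(payload["created"]) is not int:
        errors.append("created must be an integer timestamp")
    return {"valid": not errors, "errors": errors, "field_count": len(payload)}


def report_for(raw_text: str) -> str:
    """Turn raw input into a one-line JSON report the model can read directly."""
    try:
        payload = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return json.dumps({"valid": False, "errors": [f"invalid JSON: {exc.msg}"], "field_count": 0})
    if not isinstance(payload, dict):
        return json.dumps({"valid": False, "errors": ["payload must be a JSON object"], "field_count": 0})
    return json.dumps(validate_payload(payload))


if __name__ == "__main__":
    sample = '{"id": "evt_1", "type": "charge.succeeded"}'
    print(report_for(sample))
    # prints {"valid": false, "errors": ["missing field: created"], "field_count": 2}
```

In a real Skill the entry point would more likely read from a file path or standard input; the fixed sample here just keeps the sketch runnable as-is. Every failure path returns the same report shape, so the model never has to interpret a traceback.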
Pattern 3: the reference-doc tier
Not all knowledge belongs in the body. The body is loaded in full the instant the Skill triggers, so it should stay lean: the procedure and the decision points. Bulky reference material (a full API field glossary, an exhaustive style guide, a table of error codes) belongs in separate files in the Skill folder that the body points to. The body says "if you hit an unfamiliar error code, consult errors.md," and Claude reads that file only when it actually needs it. This is progressive disclosure applied within a single Skill: the description is tier one, the body is tier two, and the reference files are tier three. Keep tier two cheap and push the heavy material to tier three.
This pattern is what lets a Skill carry deep domain knowledge without paying for all of it on every invocation. A Skill can ship a dozen reference docs totaling tens of thousands of words and still add almost nothing to the context window until a specific one is read on demand.
A subtle but important corollary: the body's job is partly to act as an index into the reference tier. Rather than inlining the content, the body should tell Claude exactly when and why to reach for each file — "when validation fails, consult field-rules.md before deciding whether to drop the row." Good pointers are conditional and specific. A body that just says "see the docs for details" forces the model to load everything or nothing; a body that says precisely which file answers which question lets it pull the one relevant page. The reference tier is only as efficient as the body's signposting into it.
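The signposting rule can be checked mechanically, at least in its weakest form: every reference file should be named somewhere in the body. The sketch below assumes a flat Skill folder with a `SKILL.md` body and Markdown reference files beside it; the function names are hypothetical, and a substring match only proves a file is mentioned, not that the pointer is conditional and specific.

```python
from pathlib import Path


def unreferenced_files(body_text: str, reference_names: list[str]) -> list[str]:
    """Return reference files whose names never appear in the body text."""
    return sorted(name for name in reference_names if name not in body_text)


def audit_skill_folder(folder: Path, body_name: str = "SKILL.md") -> list[str]:
    """List top-level Markdown files beside the body that the body never names."""
    body_text = (folder / body_name).read_text(encoding="utf-8")
    references = [p.name for p in folder.glob("*.md") if p.name != body_name]
    return unreferenced_files(body_text, references)


if __name__ == "__main__":
    body = "When validation fails, consult field-rules.md before deciding whether to drop the row."
    print(unreferenced_files(body, ["field-rules.md", "errors.md"]))
    # prints ['errors.md']
```

A file that shows up in this list is one Claude has no stated reason to open, which usually means either the pointer is missing or the file is dead weight.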
Pattern 4: the explicit context contract
A reliable Skill states what it needs up front. Open the body with a short "inputs" section: the files, parameters, or prior context the procedure assumes. When the inputs are missing, the body should instruct Claude to ask for them rather than hallucinate them. This single move — making the Skill demand its inputs instead of guessing — eliminates a surprising share of wrong outputs, because the failure becomes a clarifying question instead of a confident mistake.
Pair this with explicit output formatting. State the exact shape of the result — sections, fields, file names — so downstream steps and other Skills can consume it predictably. Treating each Skill as having a typed input and output, even informally, makes them compose into pipelines instead of one-offs.
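The "typed input and output, even informally" idea can be pictured in plain code. This is an analogy, not how a Skill is implemented: in practice the contract is prose in the body, and Claude does the asking. The `SkillContract` class, its field names, and the sample inputs below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SkillContract:
    """Informal contract: what the Skill needs and what it promises back."""
    required_inputs: dict[str, str]  # input name -> question to ask if it is missing
    output_sections: list[str] = field(default_factory=list)


def clarifying_questions(contract: SkillContract, provided: dict) -> list[str]:
    """Return one question per missing or empty input, in declaration order."""
    return [
        question
        for name, question in contract.required_inputs.items()
        if provided.get(name) in (None, "")
    ]


def missing_sections(contract: SkillContract, output_text: str) -> list[str]:
    """Return promised output sections that do not appear in the result."""
    return [s for s in contract.output_sections if s not in output_text]


if __name__ == "__main__":
    contract = SkillContract(
        required_inputs={
            "payload_file": "Which file contains the webhook payload?",
            "environment": "Is this payload from test or live mode?",
        },
        output_sections=["Summary", "Validation errors"],
    )
    # One input is missing, so the right move is a question, not a guess.
    print(clarifying_questions(contract, {"payload_file": "event.json"}))
    print(missing_sections(contract, "Summary\nAll fields present."))
```

The point of the shape is that a missing input produces a question and a missing section produces a visible gap, so neither failure passes silently to the next step in a pipeline.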
Pattern 5: thin orchestrator, fat workers
When a job spans several Skills, resist the urge to build one mega-Skill. Instead keep a thin coordinating Skill whose body mostly decides which specialized Skill to invoke for each phase, and put the real procedural depth in the focused workers. This mirrors the orchestrator–subagent pattern at the knowledge layer: the coordinator stays readable and the workers stay independently testable. When a worker's logic changes, you edit one small Skill, not a tangled monolith.
The payoff shows up in maintenance. A monolithic Skill that handles intake, validation, enrichment, and reporting becomes a file nobody wants to touch, because any edit risks the other three jobs. Decompose it and each worker has a tight remit you can reason about and test in isolation, while the coordinator stays short enough to read in one sitting. This is the same modularity argument that governs ordinary code, and it applies to Skills for exactly the same reason: small, single-purpose units compose and evolve far better than large multi-purpose ones.
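The same shape in ordinary code makes the maintenance argument easy to see. In this sketch Python functions stand in for worker Skills and `coordinate` stands in for the thin coordinating Skill; the phase names come from the example above, while the row format and the validation rule are invented for illustration.

```python
from typing import Callable

# Each worker takes the running state and returns an updated copy.
Worker = Callable[[dict], dict]


def intake(state: dict) -> dict:
    """Split raw text into non-empty rows."""
    rows = [line.strip() for line in state["raw"].splitlines() if line.strip()]
    return {**state, "rows": rows}


def validation(state: dict) -> dict:
    """Keep rows with exactly two comma-separated fields; record the rest."""
    good = [r for r in state["rows"] if r.count(",") == 1]
    bad = [r for r in state["rows"] if r.count(",") != 1]
    return {**state, "rows": good, "rejected": bad}


def reporting(state: dict) -> dict:
    """Summarize the outcome of the earlier phases."""
    summary = f"{len(state['rows'])} accepted, {len(state['rejected'])} rejected"
    return {**state, "summary": summary}


PIPELINE: list[tuple[str, Worker]] = [
    ("intake", intake),
    ("validation", validation),
    ("reporting", reporting),
]


def coordinate(raw: str, pipeline: list[tuple[str, Worker]] = PIPELINE) -> dict:
    """Thin orchestrator: decides the order and delegates all real work."""
    state: dict = {"raw": raw, "completed": []}
    for name, worker in pipeline:
        state = worker(state)
        state["completed"] = state["completed"] + [name]
    return state


if __name__ == "__main__":
    result = coordinate("a,1\nbad row\n\nb,2\n")
    print(result["summary"])    # prints 2 accepted, 1 rejected
    print(result["completed"])  # prints ['intake', 'validation', 'reporting']
```

Each worker can be tested with a hand-built state and no coordinator, and changing the validation rule touches one small function, which is the property the decomposition is buying.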
Anti-patterns to avoid
Three structures reliably cause pain. First, the kitchen-sink Skill whose description tries to cover five unrelated jobs — it triggers unpredictably and is impossible to scope; split it. Second, the prose-as-parser Skill that asks the model to do arithmetic or strict parsing in its head — wrap that in a script. Third, the fat-body Skill that inlines a 5,000-word reference into the instructions, paying that cost on every trigger — move the bulk to reference files. Each anti-pattern is the inverse of a pattern above, which is a useful way to spot them: if you are not doing one of the good patterns, check whether you have drifted into its opposite.
A fourth, quieter anti-pattern deserves mention because it is so easy to fall into: the implicit-input Skill that assumes context it never states. It works perfectly in the session where you wrote it — because the relevant file happened to be open, or you had just discussed the parameters — and then fails mysteriously when a colleague runs it cold. The fix is the context-contract pattern: make the Skill name its inputs and refuse to proceed without them. Treat every Skill as if it will run in a fresh session with no shared memory, because eventually it will, and the ones that survive that transition are the ones built to demand what they need rather than to assume it.
The throughline across all of these patterns is that a Skill is software, and the same instincts that make software maintainable make Skills maintainable: one job per unit, deterministic work in code, heavy data behind a clean interface, explicit contracts at the boundaries, and small composable pieces over large tangled ones. None of this is exotic. It is the ordinary discipline of good engineering applied to a new kind of artifact, and teams that bring that discipline to their Skills end up with libraries that keep working as they grow rather than collapsing under their own weight.
Frequently asked questions
How long should a SKILL.md body be?
Long enough to be an unambiguous runbook and no longer. Push reference bulk into separate files and keep the body focused on the procedure and decision points, since the body loads in full on every trigger while reference files load only when read.
When should I split one Skill into several?
When the description has to cover more than one distinct intent, or when different parts of the body change for unrelated reasons. Distinct intents want distinct descriptions so they route cleanly; independently changing logic wants independent files so edits stay isolated.
Why have scripts emit JSON instead of prose?
Structured output lets the model read values directly instead of re-deriving them from prose, which removes parsing ambiguity and a class of transcription errors. It also makes the script independently testable outside the agent.
Do these patterns apply outside Claude Code?
Yes. Because Skills are portable folders, the same router-description, script-offload, and reference-tier patterns carry into Cowork plugins and Agent SDK builds. The structure travels with the folder; bundled scripts still need their interpreter and dependencies available wherever the Skill runs.
Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.