Claude Cowork patterns: structuring prompts, tools, context

Reusable Claude Cowork patterns: one skill per responsibility, narrow tools, small structured outputs, know-how in skills, and self-verifying agent loops.

Once you have shipped a few Claude Cowork workflows, you start to notice that the reliable ones share a shape, and the flaky ones share a different shape. The reliable ones treat the agent like a well-managed engineer: clear interfaces, narrow tools, tight feedback. The flaky ones treat it like a wishing well. This post collects the patterns that consistently separate the two — at the level of how you structure skills, how you shape tool schemas, and how you manage what lands in context — so you can reuse them instead of rediscovering them.

None of these are framework-specific tricks. They are the agentic-engineering equivalent of design patterns: named, reusable solutions to problems that recur every time you build with Claude.

Pattern 1: One skill, one responsibility

The most durable structural choice is to keep each skill focused on a single coherent capability rather than letting it sprawl. A skill that "handles all finance tasks" becomes a tangle no one can reason about; three skills — reconcile invoices, generate the monthly report, answer a vendor query — each stay legible, load only when relevant, and can be improved independently. When a skill's instructions start needing the word "meanwhile" or splitting into unrelated sections, that is the signal to fork it. Small, sharp skills compose; big ones rot.

This mirrors the single-responsibility principle from ordinary software, and the payoff is the same: you can change one skill without fear of breaking another, and the model gets a clean, well-scoped set of instructions exactly when it needs them rather than a sprawling document it must wade through.

Pattern 2: Tools as narrow, named verbs

How you define the tools the agent can call shapes its behavior more than any prompt. The pattern that works is narrow, well-named, single-purpose tools with explicit schemas — get_invoice_by_id, list_open_tickets, create_summary_doc — rather than one god-tool like do_finance_thing that takes a free-text instruction. Narrow tools give the model an unambiguous menu and give you precise control over scopes and gating. Each tool's description should say what it does, what it returns, and crucially what it does not do, so the model never reaches for the wrong one.
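As a rough illustration, two of the tools named above might be declared like this. The `name` / `description` / `input_schema` layout follows the JSON Schema style used for Claude tool use; the field names, example id, and the `describes_limits` lint are assumptions made for this sketch, not part of any real finance or ticketing API.

```python
# Hypothetical tool definitions: one narrow verb each, with an explicit schema.
GET_INVOICE_BY_ID = {
    "name": "get_invoice_by_id",
    "description": (
        "Fetch one invoice by its exact id. "
        "Returns id, amount, status, and due_date. "
        "Does not search by vendor or date, and does not modify the invoice."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "invoice_id": {
                "type": "string",
                "description": "Exact invoice id, for example INV-1042",
            },
        },
        "required": ["invoice_id"],
        "additionalProperties": False,
    },
}

LIST_OPEN_TICKETS = {
    "name": "list_open_tickets",
    "description": (
        "List currently open support tickets, newest first. "
        "Returns id, subject, and owner for each ticket. "
        "Does not return closed tickets and does not assign or edit anything."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "limit": {"type": "integer", "minimum": 1, "maximum": 50},
        },
        "required": [],
        "additionalProperties": False,
    },
}

TOOLS = [GET_INVOICE_BY_ID, LIST_OPEN_TICKETS]


def describes_limits(tool: dict) -> bool:
    """Cheap lint: a description should also state what the tool does NOT do."""
    return "does not" in tool["description"].lower()
```

A check like `all(describes_limits(t) for t in TOOLS)` can run in CI so that a new tool cannot ship without stating its limits.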

flowchart TD
  A["Task arrives"] --> B{"Which capability?"}
  B -->|Finance| C["Load reconcile skill"]
  B -->|Support| D["Load triage skill"]
  C --> E["Narrow tools: get_invoice, create_doc"]
  D --> F["Narrow tools: list_tickets, assign_owner"]
  E --> G["Structured small results"]
  F --> G
  G --> H["Model composes deliverable"]

Notice that the diagram routes by capability before any tool is touched. This routing-first pattern keeps the wrong tools out of context entirely on a given task, which both sharpens the model's choices and lowers token cost — the model never reads the schema for a tool it cannot use right now.
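The routing step in the diagram can be sketched as a lookup that runs before any tool schema is put in front of the model. The capability names and tool names below are taken from the diagram; the mapping and the `tools_for` helper are hypothetical.

```python
# Hypothetical capability -> tool-name mapping, mirroring the diagram.
TOOLS_BY_CAPABILITY = {
    "finance": ["get_invoice", "create_doc"],
    "support": ["list_tickets", "assign_owner"],
}


def tools_for(capability: str) -> list[str]:
    """Return only the tool names that belong to the routed capability."""
    try:
        # Copy the list so callers cannot mutate the registry by accident.
        return list(TOOLS_BY_CAPABILITY[capability])
    except KeyError:
        # An unknown capability is reported rather than given every tool.
        raise ValueError(f"No skill registered for capability {capability!r}") from None
```

Only the returned names would have their schemas attached to the request, so a finance task never carries the support tools.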

Pattern 3: Structured, small tool outputs

What a tool returns is as important as what it accepts. The pattern is to return the smallest structured payload that answers the question — an invoice's id, amount, status, and due date — not the entire raw record or document. Oversized outputs are the leading cause of context pollution: they crowd the window, bury the signal, and degrade every subsequent decision. If a tool genuinely must surface a large document, have it return a reference plus the relevant excerpt, and let the model request more only if needed. Treat the context window as the scarce resource it is.

A useful discipline: for every tool, ask "what is the minimum the model needs to make its next decision?" and return exactly that. This single habit improves accuracy and cost simultaneously, which is rare — usually you trade one for the other.
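A minimal sketch of both ideas, assuming the raw invoice record is a dictionary and documents are plain strings. The function names, the `window` parameter, and the field list are illustrative choices, not an existing API.

```python
from typing import Any

# The four fields the text names as sufficient for an invoice lookup.
INVOICE_FIELDS = ("id", "amount", "status", "due_date")


def summarize_invoice(raw: dict[str, Any]) -> dict[str, Any]:
    """Keep only the fields the model needs for its next decision.

    A missing field raises KeyError instead of being silently dropped.
    """
    return {field: raw[field] for field in INVOICE_FIELDS}


def excerpt_document(doc_id: str, text: str, query: str, window: int = 80) -> dict[str, Any]:
    """Return a reference plus a short excerpt around the first match of query."""
    pos = text.lower().find(query.lower())
    if pos == -1:
        # No match: return the reference alone so the model can ask again.
        return {"doc_id": doc_id, "excerpt": None, "total_chars": len(text)}
    start = max(0, pos - window)
    end = min(len(text), pos + len(query) + window)
    return {"doc_id": doc_id, "excerpt": text[start:end], "total_chars": len(text)}
```

Returning `total_chars` alongside the excerpt tells the model how much more exists without putting it in the window.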

Pattern 4: Put know-how in skills, not in the prompt

Teams often try to make an agent smarter by stuffing more into the top-level prompt. The better pattern is to move durable know-how — policies, formats, edge cases, examples — into skills that load on demand. The prompt should carry the immediate task and intent; the skill carries the institutional knowledge. This keeps the always-present context lean while letting each task pull in deep, specific guidance exactly when relevant. An Agent Skill is, by definition, a folder of instructions, scripts, and resources Claude loads dynamically when the task calls for it — which is precisely what makes it the right home for know-how that would otherwise bloat every prompt.
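To make the split concrete, here is a toy model of on-demand loading: a one-line description of every skill is always present, while the long instructions enter the context only for the skill a task selects. The `Skill` class, the registry, and `build_context` are hypothetical stand-ins for illustration, not how Claude loads skills internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Skill:
    name: str
    description: str   # short, always visible
    instructions: str  # long, loaded only when the skill is selected


# Hypothetical registry with two of the finance skills from Pattern 1.
SKILLS = {
    "reconcile-invoices": Skill(
        name="reconcile-invoices",
        description="Match invoices against ledger entries.",
        instructions="Step 1: load both ledgers. Step 2: pair entries by invoice id.",
    ),
    "monthly-report": Skill(
        name="monthly-report",
        description="Produce the monthly finance report.",
        instructions="Use the standard template. Totals go in section 2.",
    ),
}


def build_context(task: str, selected: Optional[str] = None) -> str:
    """Assemble the task, a one-line skill index, and at most one full skill body."""
    index = "\n".join(f"- {s.name}: {s.description}" for s in SKILLS.values())
    parts = [f"Task: {task}", f"Available skills:\n{index}"]
    if selected is not None:
        parts.append(f"Loaded skill ({selected}):\n{SKILLS[selected].instructions}")
    return "\n\n".join(parts)
```

The always-present cost grows with the number of skills times one line, not with the total size of their instructions.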

Pattern 5: The observe-and-correct loop in the instructions

Reliable skills tell the agent how to check its own work. Rather than just "reconcile the invoices," a good skill says "after reconciling, recompute the totals; if the discrepancies you found do not add up to the difference between the two ledger totals, you made an error; find it before reporting." This builds a verification step into the agent's own loop, catching mistakes before they reach the deliverable. The pattern generalizes: wherever a result can be cheaply checked, instruct the agent to check it, because the check costs little and an unverified total is far harder to trust.
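The same check can also be enforced in code, for instance inside the tool that accepts the agent's result. The sketch below takes one reading of the rule, in which the reported discrepancies must add up to the gap between the two ledger totals; the function name and argument shapes are assumptions.

```python
from decimal import Decimal


def verify_reconciliation(
    ledger_a: list[Decimal],
    ledger_b: list[Decimal],
    discrepancies: list[Decimal],
) -> None:
    """Raise if the reported discrepancies do not explain the gap between ledger totals."""
    # Decimal avoids float rounding noise in currency sums.
    gap = sum(ledger_a, Decimal("0")) - sum(ledger_b, Decimal("0"))
    explained = sum(discrepancies, Decimal("0"))
    if gap != explained:
        raise ValueError(
            f"Ledger totals differ by {gap}, but reported discrepancies sum to {explained}; "
            "recheck the reconciliation before reporting."
        )
```

Returning the error text to the agent gives it a concrete target for the next pass instead of a vague "try again."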

Pattern 6: Fail loud, never silent

The worst agent behavior is confidently producing a plausible-but-wrong result when something upstream broke. The pattern is to make every skill and tool fail loudly: if an input is missing, a format is off, or a connector errors, the agent should stop and say so rather than improvise around the gap. Encode this explicitly: "if the CSV has fewer columns than expected, do not guess; report the mismatch and halt." Loud failure turns a silent data-corruption bug into a visible, fixable message, which is exactly the trade you want in any system touching real work.
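On the tool side, the CSV rule might look like the following sketch. It uses Python's standard `csv` module; the expected column names and the `InputMismatch` exception are hypothetical.

```python
import csv
import io

# Hypothetical column set for an invoice export.
EXPECTED_COLUMNS = ["invoice_id", "amount", "due_date"]


class InputMismatch(Exception):
    """Raised instead of guessing when the input does not match expectations."""


def read_invoices(csv_text: str) -> list[dict]:
    """Parse invoice rows, halting with a clear message on any shape mismatch."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise InputMismatch(f"CSV is missing expected columns {missing}; found {header}")
    rows = []
    for row in reader:
        # DictReader fills fields missing from a short row with None by default.
        if any(row[col] is None for col in EXPECTED_COLUMNS):
            raise InputMismatch(
                f"Row ending at line {reader.line_num} has fewer fields than the header"
            )
        rows.append(row)
    return rows
```

The exception message is what the agent should relay to the user, so it names both what was expected and what was found.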

Frequently asked questions

How big should a single skill be?

Big enough to fully cover one capability, small enough that its instructions read as a single coherent document. When you find unrelated sections or the word "meanwhile," split it. Focused skills load cleanly, improve independently, and keep the model's context sharp.

Why are narrow tools better than one flexible tool?

Narrow, named tools give the model an unambiguous menu, give you per-tool scopes and gating, and keep irrelevant tool schemas out of context. A single free-text god-tool hides intent, complicates safety, and invites the model to do the wrong thing in the wrong place.

What is the most common cause of unreliable agent runs?

Context pollution from oversized tool outputs. When a tool returns a whole document instead of the relevant field, it buries the signal and degrades every later decision. Returning small, structured payloads fixes accuracy and cost at the same time.

Where should policies and formats live — prompt or skill?

In skills. The prompt should carry the immediate task; durable know-how like policies, formats, and edge cases belongs in skills that load on demand. This keeps the always-present context lean while still giving each task deep, specific guidance.

The same patterns, on voice and chat

CallSphere applies exactly these structural patterns — narrow tools, small structured results, know-how in skills — to voice and chat agents that handle live calls, call tools mid-conversation, and book work continuously. See how it comes together at callsphere.ai.
