Prompt and Context Design for Cowork Plugins
What to put in a Claude Cowork plugin's context and what to leave out: the three context layers, a decision tree, and post-launch tuning.
Two Claude Cowork plugins can have identical skills and connectors and behave completely differently, because one was given a tight, well-chosen context and the other was stuffed with everything its author could think of. Context is not free real estate — every token you add competes for the model's attention and can crowd out the thing that actually mattered. The hardest, most underrated skill in building enterprise plugins is deciding what to put in context, what to leave out, and why. That decision is what this post is about.
The instinct most engineers bring from traditional software is wrong here. In conventional code, extra documentation never makes a function less reliable, so adding more always feels safe. With an agent, more context past a point makes it less reliable, because the relevant signal gets diluted. The goal is the smallest context that fully equips the model for the task in front of it.
The three layers of context
It helps to separate context into three layers. The always-on layer is what's present in every turn: the plugin's core purpose and the handful of rules that must never be violated ("never invent a figure; every number comes from a tool"). The on-demand layer is the skill bodies and reference files that load only when a request triggers them. The ephemeral layer is tool output and sub-agent results that enter context for a moment and should leave once distilled.
Most context problems are a layer mistake: putting an edge-case rule in the always-on layer where it burns budget every turn, or letting ephemeral tool output linger as if it were always-on. Sorting each piece of context into the right layer is half the battle.
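To make the three layers concrete, here is a minimal Python sketch of how one turn's context might be assembled. Everything in it is an assumption for illustration: `Layer`, `ContextItem`, and `assemble_context` are hypothetical names, not part of any Cowork API, and the keyword matching simply stands in for however your platform decides a skill is relevant.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    ALWAYS_ON = "always_on"  # present in every turn
    ON_DEMAND = "on_demand"  # loaded only when a request triggers it
    EPHEMERAL = "ephemeral"  # tool output, dropped once distilled


@dataclass
class ContextItem:
    text: str
    layer: Layer
    triggers: tuple = ()  # hypothetical: keywords that load an on-demand item


def assemble_context(items, request, tool_output=()):
    """Return the context lines for a single turn."""
    request_lower = request.lower()
    # Always-on items go in unconditionally.
    lines = [i.text for i in items if i.layer is Layer.ALWAYS_ON]
    # On-demand items go in only if one of their triggers matches.
    lines += [
        i.text
        for i in items
        if i.layer is Layer.ON_DEMAND
        and any(t in request_lower for t in i.triggers)
    ]
    # Ephemeral output is passed in per turn and never stored on the item list.
    lines += list(tool_output)
    return lines


ITEMS = [
    ContextItem("Never invent a figure; every number comes from a tool.",
                Layer.ALWAYS_ON),
    ContextItem("Renewal summaries open with the renewal date and current ARR.",
                Layer.ON_DEMAND, triggers=("renewal",)),
]

print(assemble_context(ITEMS, "Draft a renewal summary for this account"))
print(assemble_context(ITEMS, "Say hello to the team"))
```

The second call returns only the always-on rule: the renewal guidance costs nothing on turns that don't need it.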
What belongs in context
Put in context the things the model genuinely cannot infer: your house style for a document, the exact sections a renewal summary must contain, the policy rules that govern a decision, the names and meanings of your plan tiers. These are facts and conventions specific to your enterprise that no amount of general capability will produce. This is where context earns its keep — it turns a capable general model into one that knows your business.
flowchart TD
    A["Candidate context item"] --> B{"Can the model infer it?"}
    B -->|Yes| C["Leave it out"]
    B -->|No| D{"Needed every turn?"}
    D -->|Yes| E["Always-on layer"]
    D -->|No| F{"Triggered by request?"}
    F -->|Yes| G["On-demand skill"]
    F -->|No| H["Fetch via tool when needed"]
The decision tree above is the whole discipline compressed. Three questions (can the model infer it, is it needed every turn, and does a request trigger it) sort almost every candidate item to the right place. Run anything you're tempted to add through it.
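The same tree can be written as a small function, which is handy if you want to audit a list of candidate items in a script. This is only a sketch: `place_context_item` is a hypothetical helper, and the three booleans are judgments you still have to make yourself.

```python
def place_context_item(can_infer: bool,
                       needed_every_turn: bool,
                       triggered_by_request: bool) -> str:
    """Walk the decision tree and return where a candidate item belongs."""
    if can_infer:
        return "leave it out"
    if needed_every_turn:
        return "always-on layer"
    if triggered_by_request:
        return "on-demand skill"
    return "fetch via tool when needed"


# "What a JSON object is": the model already knows this.
print(place_context_item(True, False, False))
# "Never invent a figure": cannot be inferred, must hold every turn.
print(place_context_item(False, True, False))
# "Required sections of a renewal summary": only for renewal requests.
print(place_context_item(False, False, True))
# "This account's current record": neither, so fetch it.
print(place_context_item(False, False, False))
```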
What to leave out, and why
Leave out anything the model already knows. You don't need to explain what a JSON object is, how to write a polite email, or what a contract renewal generally is. Spending context teaching the model things it learned in pretraining is pure waste that dilutes your actual instructions. Trust the base capability and spend your tokens on the specifics it can't know.
Leave out data you can fetch. The temptation is to paste an account's full record into the skill "so it's there." Don't — fetch it through a tool when the request actually needs it. Static pasted data goes stale, bloats every turn, and often isn't the record the request needed anyway. A tool call gets fresh, relevant data exactly when required and keeps it in the ephemeral layer where it belongs. The same logic applies to long reference documents: link them through a skill's on-demand files rather than inlining them everywhere.
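As a rough illustration of fetching instead of pasting, the sketch below pulls a record only when asked and keeps just the fields the request needs. The in-memory `ACCOUNTS` dict stands in for a real connector, the record is made-up sample data, and `fetch_account` and `distill` are hypothetical names rather than anything Cowork provides.

```python
# Made-up sample data standing in for a CRM or billing system.
ACCOUNTS = {
    "acme-001": {
        "name": "Acme Corp",
        "plan_tier": "Enterprise",
        "renewal_date": "2025-03-01",
        "arr_usd": 120000,
        "notes": "Long account history that most requests never need.",
    }
}


def fetch_account(account_id: str) -> dict:
    """Hypothetical tool: return a fresh copy of one account record."""
    return dict(ACCOUNTS[account_id])


def distill(record: dict, needed_fields: list) -> dict:
    """Keep only the fields this request needs; the rest never enters context."""
    return {key: record[key] for key in needed_fields if key in record}


record = fetch_account("acme-001")               # fetched when the request needs it
summary_inputs = distill(record, ["renewal_date", "arr_usd"])
print(summary_inputs)
```

Only `summary_inputs` would be carried forward in context; the full record is discarded after the turn, which is what keeps it in the ephemeral layer.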
And leave out instructions for situations this plugin won't face. A renewal-drafting plugin doesn't need refund-handling rules. Every unrelated instruction is attention the model has to spend deciding it's irrelevant — and occasionally it'll decide wrong and apply it.
Writing the instructions that stay
For the context that does belong, how you write it matters as much as what it says. Be concrete and imperative. "Summaries must open with the renewal date and current ARR" beats "try to include relevant account details." Give the model the specific structure, the specific rules, the specific vocabulary — the things a sharp new hire would need written down because they couldn't guess them.
State the hard constraints unambiguously and put them where they'll always be seen — the always-on layer. "Never state a number that didn't come from a tool" is the kind of rule that must hold every single turn, so it lives at the top, not buried in a reference file. Reserve that always-on space for the few rules that truly are inviolable; if everything is marked critical, nothing is.
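One way to enforce "few rules, stated first" is to build the always-on text from a short list and refuse to grow it past a cap. The sketch below assumes a cap of five, which is an arbitrary number chosen for illustration; `build_always_on` is a hypothetical helper, not a Cowork feature.

```python
MAX_HARD_RULES = 5  # assumed cap; pick a number that suits your plugin


def build_always_on(purpose: str, hard_rules: list,
                    max_rules: int = MAX_HARD_RULES) -> str:
    """Render the always-on layer with the hard rules at the very top."""
    if len(hard_rules) > max_rules:
        raise ValueError(
            f"{len(hard_rules)} hard rules exceeds the cap of {max_rules}; "
            "move the less critical ones to an on-demand skill."
        )
    lines = ["Hard rules (apply every turn):"]
    lines += [f"{n}. {rule}" for n, rule in enumerate(hard_rules, start=1)]
    lines += ["", f"Purpose: {purpose}"]
    return "\n".join(lines)


print(build_always_on(
    purpose="Draft renewal summaries for account managers.",
    hard_rules=[
        "Never state a number that didn't come from a tool.",
        "Summaries must open with the renewal date and current ARR.",
    ],
))
```

The cap is the point of the design: when adding a sixth rule raises an error, you are forced to decide which rule is not actually inviolable.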
Tuning context after launch
Context design is iterative. Once the plugin is in real use, watch where it goes wrong; most failures point to a context fix. If it invents figures, your no-inventing rule isn't prominent enough, so move it up a layer. If it loads a skill at the wrong time, the trigger is too broad. If it's slow and rambling, you're probably carrying ephemeral output too long. The fix is almost always to remove something or move it to a leaner layer, not to add another paragraph. Resist the urge to patch every miss with more instruction; a bloated context is its own failure mode.
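If you keep a log of post-launch misses, the symptom-to-fix mapping described here can be kept as a small lookup so every report gets triaged the same way. The symptom labels and the `diagnose` helper below are hypothetical; the suggested fixes are just the ones from this section.

```python
# Symptom labels are made up for this sketch; the fixes come from the text above.
CONTEXT_FIXES = {
    "invents figures":
        "Make the no-inventing rule more prominent: move it up a layer.",
    "loads skill at wrong time":
        "Narrow the skill's trigger; it is too broad.",
    "slow and rambling":
        "Drop ephemeral tool output sooner, once it has been distilled.",
}

DEFAULT_FIX = ("Look for something to remove or move to a leaner layer "
               "before adding more instruction.")


def diagnose(symptom: str) -> str:
    """Return the suggested context fix for an observed failure."""
    return CONTEXT_FIXES.get(symptom.strip().lower(), DEFAULT_FIX)


print(diagnose("Invents figures"))
print(diagnose("ignores house style"))
```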
Frequently asked questions
What should I put in a Cowork plugin's context?
The things the model can't infer: your house style, the required structure of outputs, your enterprise's policy rules, and your specific vocabulary. Context should turn a capable general model into one that knows your business — not re-teach it general knowledge.
Why is adding more context sometimes harmful?
Because every token competes for the model's attention. Past a point, extra context dilutes the relevant signal and makes the agent less reliable, not more. The goal is the smallest context that fully equips the model for the task.
Should I paste data into the skill or fetch it with a tool?
Fetch it. Pasted data goes stale, bloats every turn, and may not be the record the request needs. A tool call returns fresh, relevant data exactly when required and keeps it in the short-lived ephemeral layer.
Where should hard constraints live?
In the always-on layer, stated concretely and unambiguously, so they're present every turn. Reserve that space for the few truly inviolable rules — if everything is marked critical, the model can't tell what actually is.
Bringing agentic AI to your phone lines
Lean, well-chosen context is exactly what keeps CallSphere's voice and chat agents on-script and on-brand while they answer every call, pull live data through tools, and book work for you. Hear disciplined context design at callsphere.ai.
Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.