Prompt and Context Design for Claude Code: What to Include

Ask anyone who has driven Claude Code on a real project what they got wrong first, and a surprising number say the same thing: they put too much in front of the model, not too little. The instinct is that more context means smarter answers. In agentic work the opposite is often true. The skill that quietly made a six-week build succeed — in the hands of a non-technical PM, no less — was deciding what to include in context, what to deliberately leave out, and why. This post is about that judgment.

Context is the agent's working memory, and like any working memory it has a finite, valuable budget. Even with a million-token window, what you place in front of the model on each turn shapes its attention. Fill it with noise and the signal drowns; starve it of the project's actual rules and it improvises. Good context design is the art of keeping exactly the right things present at exactly the right time.

Context is a budget, not a bucket

The foundational mindset shift is to stop treating context as a place to dump everything that might be relevant and start treating it as a budget to spend carefully. Context design is the practice of curating which instructions, files, and prior results occupy the model's window each turn, optimizing for relevance and signal rather than sheer completeness. A focused window of the three files that matter beats the entire repository pasted in.

The PM learned this the hard way. Early on she'd paste long documents and whole folders "so Claude has everything," and the agent grew vague and slow, latching onto irrelevant details. The fix was counterintuitive: include less. Give the model the specific file it's editing, the standing project rules, and the immediate task, then trust the runtime to fetch more when it needs it. The agent's edits got sharper the moment she stopped over-feeding the window.
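To make the budget idea concrete, here is a minimal sketch of choosing what goes into the window. It is not how Claude Code selects context internally; the `ContextItem` type, the token counts, the relevance scores, and the file name `routes/jobs.py` are all hypothetical, chosen only to illustrate the trade-off.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    name: str         # hypothetical label, e.g. a file path or "project rules"
    tokens: int       # rough size estimate; how you count tokens is up to you
    relevance: float  # 0.0 to 1.0, your own judgment of relevance to the task


def select_context(items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Greedily keep the most relevant items that still fit in the budget."""
    chosen: list[ContextItem] = []
    remaining = budget
    for item in sorted(items, key=lambda i: i.relevance, reverse=True):
        if item.tokens <= remaining:
            chosen.append(item)
            remaining -= item.tokens
    return chosen


# Made-up candidates: three focused items and one "paste everything" folder.
candidates = [
    ContextItem("project rules", 400, 1.0),
    ContextItem("routes/jobs.py (file being edited)", 1200, 0.9),
    ContextItem("failing test output", 300, 0.8),
    ContextItem("entire docs folder", 60000, 0.2),
]
picked = select_context(candidates, budget=4000)
print([item.name for item in picked])
```

With a 4,000-token budget the three focused items fit and the docs folder does not, which is the same outcome the PM reached by hand.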

What belongs in context, always

A few things earn a permanent seat. The standing project rules (stack, conventions, and an explicit never-do list) belong in every session, because they constrain the agent's choices without you repeating yourself. These live in a project memory file the runtime loads automatically; in Claude Code that file is CLAUDE.md. Keep it short and high-signal; a bloated rules file is just noise with a halo.

The current task and recent conversation belong too, scoped tightly to what's being built right now. And the specific artifact under work — the file being edited, the failing test's output — belongs in context exactly when it's relevant. The PM's habit was to phrase tasks narrowly so the relevant artifact was obvious: "fix the validation in the job-creation route," not "make the app better." Narrow framing is itself a context decision; it tells the runtime what to pull in.

flowchart TD
  A["New turn begins"] --> B["Load standing project rules"]
  B --> C["Add current task & recent turns"]
  C --> D{"Need a specific file or result?"}
  D -->|Yes| E["Fetch just that artifact just-in-time"]
  D -->|No| F["Leave the window lean"]
  E --> G{"Window getting noisy?"}
  F --> G
  G -->|Yes| H["Summarize & write decisions to memory file"]
  G -->|No| I["Send to model & act"]
  H --> I

The branch from "Window getting noisy?" (G) to "Summarize & write decisions to memory file" (H) is the move most teams miss. When the window fills with old back-and-forth, you don't just let it compact silently. You capture the durable decisions in the memory file first, so the next turn re-grounds from a clean summary instead of losing the reasoning. Persist the conclusions, discard the deliberation.
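The flowchart can be sketched as a single function. This is an illustration under stated assumptions, not the runtime's actual behavior: the `DECISION:` prefix, the `max_turns` threshold, the `memory.md` file name, and the sample conversation and test output are all invented for the example.

```python
import tempfile
from pathlib import Path
from typing import Optional

DECISION_PREFIX = "DECISION:"  # hypothetical marker for turns worth persisting


def build_turn_context(rules: str, task: str, recent_turns: list[str],
                       memory_file: Path, artifact: Optional[str] = None,
                       max_turns: int = 4) -> str:
    """Assemble one turn's context: rules, task, recent turns, then artifact."""
    turns = list(recent_turns)
    if len(turns) > max_turns:  # "Window getting noisy?" -> Yes
        decisions = [t for t in turns if t.startswith(DECISION_PREFIX)]
        # Persist the conclusions to the memory file first...
        with memory_file.open("a", encoding="utf-8") as fh:
            for decision in decisions:
                fh.write(decision + "\n")
        # ...then discard the deliberation, keeping only the latest exchange.
        tail = turns[-2:]
        turns = [d for d in decisions if d not in tail] + tail

    parts = [rules, f"Task: {task}", *turns]
    if artifact is not None:  # "Need a specific file or result?" -> Yes
        parts.append(artifact)
    return "\n".join(parts)


memory = Path(tempfile.mkdtemp()) / "memory.md"  # hypothetical memory file
history = [
    "Should we use SQLite or Postgres?",
    "DECISION: use Postgres with snake_case columns",
    "Which route first?",
    "DECISION: start with the job-creation route",
    "The validation test is failing.",
    "I'll look at the validation logic.",
]
context = build_turn_context(
    rules="Rules: never commit secrets.",
    task="fix the validation in the job-creation route",
    recent_turns=history,
    memory_file=memory,
    artifact="AssertionError: expected 400, got 200",  # made-up test output
)
print(context)
```

The superseded questions drop out of the window, while both decisions survive twice over: once in the trimmed context and once in the memory file for later sessions.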

What to deliberately leave out

Just as important is what you keep out. Leave out files unrelated to the current task — the agent can request them if needed, and their presence only dilutes attention. Leave out long pastes of documentation when a one-line rule captures what matters; "we use Postgres, snake_case columns" beats forty pages of database manual. Leave out stale conversation that's been superseded by later decisions, because contradictory context makes the model hedge.

And leave out secrets, always. Credentials belong in the MCP server's environment, not in the prompt — both for security and because anything in context can surface in a summary or log. The PM's rule of thumb was simple and effective: if including something doesn't change what the agent should do next, leave it out. That single question, asked honestly, prunes most of the clutter that degrades agent performance.
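One way to enforce the secrets rule is to scan a draft prompt before sending it. The sketch below is an assumption-laden illustration: the two patterns are examples only and would miss many real credential formats, and the `DB_PASSWORD` value is fake.

```python
import re

# Illustrative patterns only; a real scanner would need a much broader set.
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|secret|password|token)\w*\s*[:=]\s*\S+"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def find_secrets(text: str) -> list[str]:
    """Return the lines of text that look like they contain a credential."""
    return [
        line
        for line in text.splitlines()
        if any(pattern.search(line) for pattern in SECRET_PATTERNS)
    ]


draft = (
    "Fix the job-creation route.\n"
    "DB_PASSWORD=hunter2\n"  # fake credential, for demonstration only
    "We use Postgres, snake_case columns."
)
flagged = find_secrets(draft)
print(flagged)
```

Any flagged line should move out of the prompt and into the environment of the process that needs it. A pattern check like this is a safety net, not a substitute for keeping credentials out of the conversation in the first place.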

Designing prompts that pull the right context

Prompts and context aren't separate disciplines — a well-shaped prompt is partly a context-selection instruction. When the PM asked "following our existing API route convention, add a feedback endpoint," the phrase "existing convention" directed the runtime and model toward the right reference code. Naming the relevant module, the relevant pattern, or the relevant constraint inside the prompt is how you steer what gets pulled into the window.

The complementary technique is to ask the agent to externalize its working context for long tasks: "write your plan to a scratch file and work through it." The plan then lives in a durable artifact rather than only in the volatile conversation, so even after compaction the agent re-reads its own plan and stays on course. This is context design as a workflow, not just a one-shot decision — you're actively shaping what persists. Done well, it lets a multi-day build hold a coherent thread that no single context window could ever contain on its own.
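The scratch-file idea is simple enough to sketch. In practice the agent writes and updates the plan itself; the helpers below only illustrate why a file outlives the conversation. The file name `PLAN.md`, the checklist format, and the step names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def write_plan(plan_file: Path, steps: list[str]) -> None:
    """Write the plan as an unchecked checklist, one step per line."""
    plan_file.write_text("".join(f"[ ] {s}\n" for s in steps), encoding="utf-8")


def mark_done(plan_file: Path, step: str) -> None:
    """Tick off a finished step in place."""
    lines = plan_file.read_text(encoding="utf-8").splitlines()
    updated = [f"[x] {step}" if line == f"[ ] {step}" else line for line in lines]
    plan_file.write_text("\n".join(updated) + "\n", encoding="utf-8")


def next_step(plan_file: Path) -> Optional[str]:
    """Re-read the plan from disk and return the first unchecked step."""
    for line in plan_file.read_text(encoding="utf-8").splitlines():
        if line.startswith("[ ] "):
            return line[4:]
    return None


plan = Path(tempfile.mkdtemp()) / "PLAN.md"  # hypothetical scratch file
write_plan(plan, ["add feedback endpoint", "add validation", "write tests"])
mark_done(plan, "add feedback endpoint")
print(next_step(plan))
```

Because `next_step` reads from disk rather than from conversation history, it returns the same answer before and after a compaction, which is what keeps a long task on course.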

When context goes wrong: the symptoms

It helps to recognize the failure modes. An over-stuffed context shows up as vagueness — the agent gives generic answers and misses the specific thing you care about, because the signal is buried. An under-grounded context shows up as drift — the agent violates a convention you established earlier, because the rule was never in standing context or got compacted away. A contradictory context shows up as hedging and flip-flopping, because the window contains two incompatible truths.

The remedy for all three is the same loop the PM ran instinctively: keep standing rules short and present, fetch task artifacts just-in-time, and persist decisions to durable memory before the window fills. When the agent started drifting, her first question was never "is the model dumb today?" but "what's in its context right now, and what's missing?" That diagnostic instinct — treating context as the first suspect — is what experienced agent operators develop, and it's learnable even without a coding background.

Frequently asked questions

Doesn't a million-token window make context curation unnecessary?

No. A large window raises the ceiling but doesn't remove the cost of noise — irrelevant content still dilutes the model's attention and can bury the detail that matters. Curation is about signal-to-noise, not just capacity. The best results come from a focused window of exactly what's relevant, with the runtime fetching more on demand.

How do I keep important decisions from being lost to compaction?

Write them into a durable project memory file before the conversation gets compacted. Compaction summarizes and drops older turns, so anything that lives only in chat can vanish. Persisting decisions and conventions to a file the runtime reloads each session keeps them load-bearing across many sessions.

What's the simplest test for whether something belongs in context?

Ask whether including it changes what the agent should do on the next turn. If yes, include it; if no, leave it out. Secrets are the firm exception — they never belong in context regardless, and live in the MCP server's environment instead. This one question prunes most context clutter.

Bringing sharp context design to live calls

CallSphere applies the same context discipline to voice and chat agents — lean standing rules, just-in-time data, and persisted decisions — so an assistant stays grounded across a real, multi-turn conversation. Hear it work at callsphere.ai.

Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.
