Skip to content
Agentic AI
Agentic AI7 min read0 views

Context Design for Claude Opus: What to Include or Cut

Context engineering for Claude Opus: what to include, what to cut, how to manage a growing window, and why curation beats cramming for agent quality.

The biggest quality lever in agentic coding is not the prompt you type — it is everything else already sitting in the context window when Opus reads it. Two engineers can give Claude Code the identical request and get wildly different results because one fed it a clean, relevant window and the other fed it a cluttered one. Context design is the discipline of deciding what information enters the model's working memory, in what form, and at what moment. It is underrated precisely because it is invisible, and it is where the best operators quietly win.

This piece is about the judgment calls: what to include, what to cut, and the reasoning behind each. Opus 4.8 can work against a very large window, but capacity is a trap. The fact that you can include the whole repository does not mean you should, and understanding why is the difference between an agent that stays sharp and one that wanders.

Why more context is not better context

Context engineering is the practice of curating exactly what a model sees at each step so its limited attention is spent on information that actually changes the answer. Every token you add competes with every other token for the model's focus. A window padded with tangential files, stale conversation, and verbose logs does not make Opus more informed; it makes the signal harder to find.

There is a real phenomenon where relevant facts buried in a long, noisy window get effectively overlooked. The model has the information in principle but reasons as if it does not, because the important line is drowned. This is why a tightly scoped window of five pertinent files routinely beats a sprawling one with fifty. Capacity is about what the model can hold; quality is about what you choose to put there.

What belongs in the window

Include the things that change the decision. For a code task, that is the specific files being modified, the immediate callers or callees that constrain the change, the relevant test, and the conventions that govern how it should be written. Include the definition of done — what "correct" means for this task — because an agent optimizing toward an unstated goal will optimize toward the wrong one.

Hear it before you finish reading

Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.

Try Live Demo →

Stable project knowledge belongs in CLAUDE.md, where it is injected automatically: the build and test commands, the coding conventions, the architectural constraints a newcomer would violate by accident. Task-specific knowledge belongs in the prompt and the files Opus reads on demand. The art is matching the lifetime of the information to where it lives — durable facts persist, volatile facts arrive fresh and leave when the task ends.

flowchart TD
  A["Candidate info"] --> B{"Changes the decision?"}
  B -->|No| C["Leave it out"]
  B -->|Yes| D{"Stable or volatile?"}
  D -->|Stable| E["Put in CLAUDE.md"]
  D -->|Volatile| F["Load on demand this task"]
  E --> G["Lean, relevant window"]
  F --> G
  G --> H["Opus reasons sharply"]

What to leave out, and why

Cut anything the model can fetch on demand. Because Claude Code lets Opus search and read files as it works, pre-loading the entire codebase is wasteful; it pays the attention cost up front for material that may never matter. Trust retrieval. Cut verbose tool outputs — a tool that returns a thousand log lines should be filtered to the lines that bear on the task before they reach the window.

Cut stale conversation. A long session accumulates dead ends, abandoned plans, and superseded edits, and all of it competes for attention with the live task. When the thread has drifted, the right move is to summarize the durable conclusions and start a fresh session rather than dragging the whole history forward. Leave out information whose only effect is to make the model hedge, second-guess, or rehash decisions already made.

Managing context as a session grows

Context is not set once; it evolves over a session, usually by growing. The harness summarizes earlier turns as the window fills, which preserves continuity but inevitably loses detail. The practical implication is that long-running sessions degrade: precision drops as summaries replace specifics, and cost rises as every turn carries the accumulated weight.

The countermeasure is deliberate resets. When you finish a unit of work, capture what matters — the decisions made, the conventions discovered — into a durable place like CLAUDE.md or a short note, then begin the next task with a clean window seeded only by that distilled knowledge. This is the agentic version of writing things down so you can forget them safely. It keeps each task running against a sharp, purpose-built context instead of an ever-thickening soup.

Context design and security

What you put in context is also an attack surface. If the agent reads untrusted content — a web page, a third-party file, a user-submitted ticket — instructions hidden in that content can attempt to hijack the model. Treat external text as data, not commands, and be deliberate about which untrusted sources you allow into the window at all. The same curation instinct that protects quality also protects safety: less unvetted material in context means less room for injected instructions to take hold.

Still reading? Stop comparing — try CallSphere live.

CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.

Pair this with the least-privilege tool design from good MCP practice, and context becomes a coherent safety story rather than a leak. The window is the model's reality; everything in it shapes both what the agent does and what it can be tricked into doing. Designing it well is not housekeeping — it is one of the core engineering skills of working with Opus in Claude Code.

Frequently asked questions

Why not just load the whole repo since Opus has a huge window?

Capacity is not quality. Every token competes for attention, and relevant facts buried in a large noisy window get effectively overlooked. A lean window of the right files produces sharper reasoning than a sprawling one.

What is the difference between CLAUDE.md and per-task context?

CLAUDE.md holds stable, durable knowledge injected automatically every turn, like build commands and conventions. Per-task context is volatile detail loaded on demand for one task and discarded after. Match the information's lifetime to where it lives.

When should I reset the context window?

When a session has drifted or grown long. Summarize the durable conclusions into CLAUDE.md or a note, then start a fresh session. This avoids the precision loss and rising cost that come with an ever-accumulating window.

How does context design relate to security?

Untrusted content in the window can carry hidden instructions that try to hijack the agent. Treat external text as data, limit which unvetted sources enter context, and the same curation that protects quality also reduces injection risk.

Bringing agentic AI to your phone lines

Sharp context design is what keeps a live voice agent on track when a call goes sideways. CallSphere applies these context-engineering principles to voice and chat — agents that hold exactly the right information per call, use tools mid-conversation, and book work without losing the thread. See it live at callsphere.ai.


Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.

Share

Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available -- no signup required.