---
title: "Prompt and Context Design for Claude Agents"
description: "Learn what to put in a Claude agent's context and what to leave out, and why: working-set thinking, compression, and system prompts that grant bounded agency."
canonical: https://callsphere.ai/blog/prompt-and-context-design-for-claude-agents
category: "Agentic AI"
tags: ["agentic ai", "claude", "context engineering", "prompt design", "system prompt", "agent reliability"]
author: "CallSphere Team"
published: 2026-03-05T09:32:44.000Z
updated: 2026-06-06T21:47:43.916Z
---

# Prompt and Context Design for Claude Agents

> Learn what to put in a Claude agent's context and what to leave out, and why: working-set thinking, compression, and system prompts that grant bounded agency.

Ask experienced agent builders what separates a flaky agent from a dependable one, and most will eventually point at the same thing: what is in the context window. The model is fixed; the tools are fixed; the variable you control on every turn is the information Claude sees. Putting the wrong things in — or leaving the right things out — is the quiet cause of agents that hallucinate, wander, or cost three times what they should. This post is a focused treatment of prompt and context design: what belongs, what does not, and the reasoning behind each call.

## The context window is a working set, not an archive

The most useful reframe is to stop thinking of the context as a transcript and start thinking of it as a working set — the minimal collection of facts and instructions Claude needs to take the next correct action. Everything in the window is re-read by the model on every single turn, so each token competes for attention with every other token. A window stuffed with stale tool output and resolved sub-questions does not just cost money; it actively dilutes the signal the model needs right now. Less, but more relevant, beats more.

This is why context design is an active discipline rather than a one-time prompt-writing task. On each turn you are deciding what to carry forward, what to summarize, and what to drop. The model has no way to know that a result from twenty steps ago is no longer relevant unless you stop showing it. Curating the working set is the highest-leverage habit in agent engineering, and it is almost entirely under your control.

## What belongs in context

A focused agent context has a predictable composition. It holds the **role and constraints** — who the agent is and the rules it must not break. It holds the **current goal**, stated plainly. It holds the **tool definitions** for the tools this step might need. It holds **recent, relevant results** — the last few tool outputs that bear on the next decision. And it holds **retrieved facts**: the specific documents or records pulled in for this task. If a piece of information does not help the model choose its next action, it probably does not belong.

```mermaid
flowchart TD
  A["New turn begins"] --> B["Keep role + hard constraints"]
  B --> C["Keep current goal"]
  C --> D{"Result still relevant?"}
  D -->|Yes| E["Keep recent result"]
  D -->|No| F["Summarize or drop it"]
  E --> G["Add only needed tool schemas"]
  F --> G
  G --> H["Send focused working set to Claude"]
```

Order within the context carries meaning too. Put the durable, high-priority material — role and the hardest constraint — at the top where it anchors everything that follows, and keep the volatile, task-specific material toward the end where the most recent information naturally lives. A constraint stated once at the top and reinforced near the relevant action is obeyed far more reliably than the same words buried in the middle of a long block.

## What to deliberately leave out

Knowing what to exclude is as important as knowing what to include. Leave out resolved sub-conversations — once a lookup answered its question, the raw exchange can be compressed to its conclusion. Leave out tools the current step cannot use; a planning step does not need the schema for `charge_card`, and showing it only invites a premature call. Leave out verbose raw payloads when a three-field summary captures what the model needs. And leave out secrets and credentials entirely — they belong in the executor, never in the window the model reasons over.

The subtle one is leaving out instructions the agent does not need yet. If your agent occasionally writes incident reports, the detailed report format does not belong in the base prompt of every run — it belongs in a Skill that loads only when an incident report is actually being written. Carrying rarely-needed procedure in every prompt is a tax you pay on every turn for a payoff you get on few. Progressive disclosure keeps the working set sharp.

## Compression: how to keep continuity without bloat

Long-running agents would overflow any window if they carried raw history, so compression is non-negotiable. The reliable pattern is a rolling state summary: every few steps, distill the conversation into a compact "here is what we know and what is left" note and carry the summary forward instead of the full transcript. You can do this with the same agent or offload it to a cheaper model like Haiku to keep costs down. Continuity survives; the bloat does not.

Be deliberate about what the summary preserves: decisions made, facts established, open questions, and constraints already discovered. Drop the chatter — the intermediate reasoning that led to a conclusion is rarely needed once the conclusion is recorded. A good summary reads like the state of a task someone could pick up cold, which is exactly the property you want when the agent resumes after a long task or a restart.

## Designing the system prompt for agency

The system prompt is where you grant and bound the agent's autonomy, and the tone should be operational, not conversational. State the role, the constraints as imperatives, the available actions, the output contract, and the stopping condition. Be explicit about authority limits — what the agent may do alone versus what it must escalate — because ambiguity here is where agents overreach. A prompt that reads like a clear job description with firm boundaries produces an agent that acts confidently within them and stops at the edges.

Resist the urge to over-specify the path. Tell the agent the goal, the constraints, and the tools, then trust the model to plan. Scripting every step in the prompt fights the very capability you are paying for and makes the agent brittle the moment reality deviates from your script. Specify the destination and the guardrails; let Claude find the route.

## Frequently asked questions

### Why does extra context hurt instead of help?

Because the model re-reads the entire window on every turn, and every token competes for attention. Stale or irrelevant content dilutes the signal the model needs for its next action and raises cost on each step. A smaller, more relevant working set produces sharper, cheaper decisions than a larger archive.

### Should I script the agent's steps in the system prompt?

No. State the goal, constraints, available tools, and stopping condition, then let Claude plan the route. Over-scripting fights the model's planning ability and makes the agent brittle when reality differs from your script. Specify the destination and the guardrails, not every turn.

### How do I keep continuity on a long task without overflowing context?

Use a rolling state summary: every few steps, compress the conversation into a compact note of decisions, facts, and open questions, and carry that forward instead of the raw transcript. Offloading the summarization to a cheaper model keeps it inexpensive while preserving the continuity the agent needs.

## Bringing agentic AI to your phone lines

CallSphere applies this same context discipline to **voice and chat** — its agents carry just the right working set into each conversation, act with tools mid-call, and book real work around the clock without losing the thread. See it live at [callsphere.ai](https://callsphere.ai).

---

*Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.*

---

Source: https://callsphere.ai/blog/prompt-and-context-design-for-claude-agents