Prompt & Context Design for Claude Legal Agents

Ask two engineers why their Claude legal agents give answers of different quality on the same contract, and more often than not the difference is not the model. It is what each one chose to put in the context window. Legal work punishes sloppy context harder than almost any other domain. Include too little and the agent guesses at a firm standard it was never shown. Include too much and the relevant clause drowns under a hundred pages of boilerplate. Prompt and context design is the quiet discipline that separates a legal agent you can trust from one that produces confident, plausible, wrong analysis.

This post is about that discipline specifically — not architecture, not tools, but the editorial judgment of what goes into the window each turn and why. The governing principle is simple to state and hard to practice: every token in context should earn its place by improving the current decision, and everything else should be reachable through a tool rather than resident in the window.

What always belongs in context

A small set of things should be present every single turn. First, the agent's role and rules — that it assists licensed attorneys, cites every clause it references, and frames conclusions as analysis rather than advice. These are non-negotiable and stable, so they live in the system prompt unchanged across matters. Second, the specific question and its scope — the matter, the document type, what the attorney actually wants to know. Vague scope produces vague analysis, so the orchestrator should resolve and state the scope explicitly rather than leaving Claude to infer it.

Third, the relevant slice of the playbook: the firm's standard positions for exactly the clauses in play, retrieved fresh, not the entire playbook. And fourth, the clauses under review with their provenance attached. That is the core working set: rules, scope, applicable standards, and the specific text. Notice how small this is. A focused context of the few standards and clauses that bear on the question tends to outperform a sprawling one, because the model's attention is not spread across material that has nothing to do with the question.
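
The four-part working set described above can be sketched as a small assembly function. This is a minimal illustration, not a prescribed layout: the `Clause` type, the `build_working_set` name, the section labels, and the sample matter details are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    source: str  # hypothetical reference format: "<document>:<paragraph>"
    text: str


# Stable rules: identical across matters, so they belong in the system prompt.
SYSTEM_RULES = (
    "You assist licensed attorneys. Cite every clause you reference. "
    "Frame conclusions as analysis, not advice."
)


def build_working_set(question: str, scope: str,
                      standards: dict[str, str],
                      clauses: list[Clause]) -> str:
    """Assemble the per-turn context: scope, applicable standards, clause text."""
    lines = [f"SCOPE: {scope}", f"QUESTION: {question}", "", "APPLICABLE STANDARDS:"]
    lines += [f"- {topic}: {position}" for topic, position in standards.items()]
    lines += ["", "CLAUSES UNDER REVIEW:"]
    lines += [f"[{c.source}] {c.text}" for c in clauses]
    return "\n".join(lines)


# Illustrative values only; the matter, jurisdiction, and clause are invented.
working_set = build_working_set(
    question="Does the liability cap deviate from our standard?",
    scope="Matter 1042, vendor MSA, New York law",
    standards={"Limitation of liability": "Cap at 12 months of fees."},
    clauses=[Clause("MSA:9.2", "Liability is capped at 3 months of fees.")],
)
print(working_set)
```

The stable rules stay in `SYSTEM_RULES` and are never mixed into the per-turn text, so only the scope, standards, and clauses change from one question to the next.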

What to deliberately leave out

The harder skill is exclusion. Leave out the full document when only a few clauses are at issue — let the agent pull more through a tool if it discovers it needs them. Leave out the entire playbook when only the indemnification and liability sections apply. Leave out prior conversation turns that no longer bear on the current decision; a long, stale history confuses the model about what it is being asked now. Leave out internal metadata, formatting noise, and system plumbing that the model does not need to reason about.

The reason exclusion matters so much in legal work is that irrelevant context does not merely waste tokens — it actively degrades accuracy. When the relevant clause sits among ninety pages of unrelated terms, the model is more likely to anchor on the wrong passage or miss the deviation entirely. A disciplined context-builder asks of every block: does this change the answer to the question being asked? If not, it stays out of the window and behind a tool. This is also the cheaper path, but cost is the lesser benefit; correctness is the real one.
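
One way to make the "does this change the answer?" test concrete is to tag each candidate block with the clause topics it covers and load only the blocks that match the question. The sketch below assumes such topic tags already exist; the `split_by_relevance` name and the block shape are hypothetical, and real relevance judgments will usually need more than a tag match.

```python
def split_by_relevance(blocks, topics_in_play):
    """Return (loaded_ids, deferred_ids) for one question.

    Loaded blocks go into the window; deferred blocks stay in the store,
    reachable through a tool if the agent asks for them.
    """
    wanted = {t.lower() for t in topics_in_play}
    loaded, deferred = [], []
    for block in blocks:
        tags = {t.lower() for t in block["topics"]}
        if tags & wanted:
            loaded.append(block["id"])
        else:
            deferred.append(block["id"])
    return loaded, deferred


# Invented clause identifiers and topic tags, for illustration only.
blocks = [
    {"id": "MSA:9.2", "topics": ["liability"]},
    {"id": "MSA:11.1", "topics": ["indemnification"]},
    {"id": "MSA:14.3", "topics": ["notices"]},
    {"id": "MSA:17.0", "topics": ["force majeure"]},
]
loaded, deferred = split_by_relevance(blocks, ["Liability", "Indemnification"])
print(loaded)    # ['MSA:9.2', 'MSA:11.1']
print(deferred)  # ['MSA:14.3', 'MSA:17.0']
```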

flowchart TD
  A["Attorney question + scope"] --> B["Stable rules (system prompt)"]
  A --> C{"What does THIS question need?"}
  C --> D["Relevant playbook sections only"]
  C --> E["Clauses under review + provenance"]
  C -->|Not needed now| F["Leave in store, reachable via tool"]
  B --> G["Assembled lean context"]
  D --> G
  E --> G
  G --> H{"Claude: enough to answer?"}
  H -->|No| F
  H -->|Yes| I["Sourced analysis"]

Make context carry provenance, always

In legal work, a clause without a citation is unusable, and the only reliable way to get cited output is to put cited input into context. Every piece of retrieved text should arrive with its source reference attached — document and paragraph — so that when Claude reasons over it, it can carry that reference straight into its answer. If context blocks are anonymous slabs of text, the model has nothing to cite and will either omit citations or, worse, invent them.

This is why provenance is a context-design concern and not just a retrieval one. The way you format a retrieved clause in the window — text plus a clearly labeled source — directly shapes whether the agent's output is verifiable. A practical habit is to wrap each retrieved item so its identifier is unmistakable and instruct the agent, in the stable rules, to reference that identifier whenever it relies on the item. Provenance designed into the context is what makes the downstream citation check meaningful; without it, the check has nothing to verify against.
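
The wrapping habit might look like the following sketch. The `<clause source="...">` tag, the `document:paragraph` identifier format, and the wording of the citation rule are assumptions chosen for illustration; any format works as long as the identifier is unmistakable and the stable rules refer to it.

```python
# A line for the stable rules, telling the agent how to use the identifier.
CITATION_RULE = (
    "Whenever you rely on a clause, cite its source attribute in square "
    "brackets, for example [MSA:9.2]. Never cite a source that is not in context."
)


def wrap_clause(document: str, paragraph: str, text: str) -> str:
    """Attach a document-and-paragraph identifier to a retrieved clause."""
    source = f"{document}:{paragraph}"
    return f'<clause source="{source}">\n{text.strip()}\n</clause>'


print(wrap_clause("MSA", "9.2", "Liability is capped at 3 months of fees. "))
# <clause source="MSA:9.2">
# Liability is capped at 3 months of fees.
# </clause>
```

Because the identifier travels with the text, a later check can compare the citations in an answer against the identifiers that were actually in the window.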

Design context for the tool loop, not the single shot

A subtle shift in mindset improves every legal agent: design context assuming the agent can fetch more, rather than assuming this turn is the only chance. Because Claude operates in a tool loop, you do not need to pre-load everything it might conceivably want. You load the lean working set, and you trust the agent to recognize a gap and pull the missing clause or standard through a tool. This is what makes aggressive exclusion safe — exclusion is not deprivation when the excluded material is one tool call away.

The corollary is that your tools must make the missing material easy to find, which loops context design back into tool design. If the agent might need a related amendment, there must be a tool that fetches it by reference. If it might need a comparable clause from another contract, semantic search must be available. Good context design and good tool design are two halves of one decision: keep the window focused, and make everything outside it reachable. Prompt and context design for a legal agent is the practice of choosing the minimal, provenance-bearing working set that lets the model answer the current question accurately, with everything else available on demand.
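
The two tools mentioned above could be declared and dispatched roughly as follows. The definitions use the `name` / `description` / `input_schema` shape that the Anthropic Messages API accepts for tools; everything else is hypothetical: the tool names, the in-memory stores, and the keyword match that stands in for real semantic search.

```python
TOOLS = [
    {
        "name": "fetch_amendment",
        "description": "Fetch an amendment by its reference, e.g. 'MSA/A1'.",
        "input_schema": {
            "type": "object",
            "properties": {"reference": {"type": "string"}},
            "required": ["reference"],
        },
    },
    {
        "name": "search_comparable_clauses",
        "description": "Find comparable clauses in other contracts.",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
]

# Invented stand-ins for the firm's document store.
AMENDMENTS = {"MSA/A1": "Amendment 1 raises the liability cap to 12 months of fees."}
CLAUSE_BANK = {
    "NDA:7.1": "Liability is capped at 12 months of fees.",
    "SOW:3.4": "Notices must be sent in writing.",
}


def wrap(source: str, text: str) -> str:
    """Tool results carry provenance, just like pre-loaded context."""
    return f'<clause source="{source}">\n{text}\n</clause>'


def run_tool(name: str, args: dict) -> str:
    """Execute one tool call requested by the agent and return text for the window."""
    if name == "fetch_amendment":
        ref = args["reference"]
        text = AMENDMENTS.get(ref)
        return wrap(ref, text) if text else f"No amendment found for reference {ref!r}."
    if name == "search_comparable_clauses":
        # Naive keyword overlap; a real system would use semantic search here.
        words = set(args["query"].lower().split())
        hits = [wrap(src, txt) for src, txt in CLAUSE_BANK.items()
                if words & set(txt.lower().split())]
        return "\n".join(hits) or "No comparable clauses found."
    raise ValueError(f"unknown tool: {name}")


print(run_tool("fetch_amendment", {"reference": "MSA/A1"}))
```

Returning tool results in the same wrapped form as pre-loaded clauses means material fetched mid-loop is as citable as material that was in the window from the start.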

Common context mistakes that produce bad legal answers

A few failure patterns recur. Dumping the entire contract 'to be safe' buries the relevant clause and lowers accuracy. Embedding the whole playbook in the system prompt bloats every turn and couples policy to code. Carrying a long conversation history forward verbatim confuses the model about the current ask. Stripping provenance from retrieved text makes citations impossible. And omitting explicit scope leaves the agent guessing at jurisdiction or document type. Each of these is an editorial error, not a model limitation — and each is fixable by returning to the governing question: does this token improve the current decision?
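
Several of these mistakes are mechanical enough to catch before a request is sent. The sketch below is one hypothetical pre-flight check; the `ctx` dictionary shape, the `lint_context` name, and the default limits are assumptions for illustration, not recommended thresholds.

```python
def lint_context(ctx: dict, max_history_turns: int = 4,
                 max_clauses: int = 12) -> list[str]:
    """Flag the recurring context mistakes in an assembled request."""
    problems = []
    if not ctx.get("scope", "").strip():
        problems.append("missing explicit scope")
    clauses = ctx.get("clauses", [])
    if len(clauses) > max_clauses:
        problems.append(f"{len(clauses)} clauses loaded; limit is {max_clauses}")
    for i, clause in enumerate(clauses):
        if not clause.get("source"):
            problems.append(f"clause {i} has no provenance")
    if len(ctx.get("history", [])) > max_history_turns:
        problems.append("stale history carried forward")
    if ctx.get("playbook_in_system_prompt"):
        problems.append("playbook embedded in system prompt")
    return problems


# An invented request with two of the mistakes present.
ctx = {
    "scope": "",
    "clauses": [
        {"source": "MSA:9.2", "text": "Liability is capped at 3 months of fees."},
        {"source": "", "text": "Vendor shall indemnify Customer."},
    ],
    "history": [],
}
print(lint_context(ctx))
# ['missing explicit scope', 'clause 1 has no provenance']
```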

The teams whose legal agents earn attorney trust are the ones who treat context assembly as a first-class engineering surface, tuned and tested like any other. They version their system prompts, measure accuracy against an evaluation set as they adjust what goes in the window, and resist the temptation to fill a large context window simply because the space is there. Restraint, in this domain, is a feature.
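
Measuring accuracy as the context changes can start from a very small harness. In the sketch below the `answer` callable is a stub that stands in for a real model call, exact-match scoring is a simplification, and the two cases are invented, so the printed score only demonstrates the mechanics and is not a measurement.

```python
from typing import Callable


def accuracy(cases: list[dict],
             build_context: Callable[[dict], str],
             answer: Callable[[str, str], str]) -> float:
    """Fraction of evaluation cases answered as expected for one context strategy."""
    if not cases:
        raise ValueError("evaluation set is empty")
    correct = sum(
        answer(build_context(case), case["question"]) == case["expected"]
        for case in cases
    )
    return correct / len(cases)


def stub_answer(context: str, question: str) -> str:
    # Stand-in for a model call: flags a deviation if the text says "3 months".
    return "deviation" if "3 months" in context else "standard"


cases = [
    {"question": "Is the cap standard?",
     "clause": "Cap is 3 months of fees.", "expected": "deviation"},
    {"question": "Is the cap standard?",
     "clause": "Cap is 12 months of fees.", "expected": "standard"},
]

# Swap in a different build_context to compare context strategies on the same cases.
score = accuracy(cases, lambda case: case["clause"], stub_answer)
print(score)  # 1.0
```

Keeping `build_context` as a parameter is the point of the design: the evaluation set and the scoring stay fixed while the context strategy is the only thing that varies.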

Frequently asked questions

Should I put the whole contract in context to be safe?

No. Including the full document when only a few clauses are at issue buries the relevant text and lowers accuracy. Load the clauses that bear on the question and let the agent pull more through a tool if it finds a gap — exclusion is safe because the rest is one tool call away.

Where should the firm's playbook live?

In retrieval, loaded as the specific sections relevant to the current question — not embedded whole in the system prompt. This keeps each turn's context lean, lets lawyers edit standards without a redeploy, and avoids coupling legal policy to engineering code.

How do I get the agent to cite its sources reliably?

Put cited material into context. Every retrieved clause should arrive with a clear source reference, and the stable rules should instruct the agent to carry that reference into any answer that relies on it. Cited input plus an explicit citation rule produces citable, verifiable output.
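
A simple downstream check can then confirm that every citation in an answer points at something that was actually in the window. This sketch assumes clauses were wrapped as `<clause source="...">` in context and that the agent cites sources in square brackets; both conventions are illustrative assumptions.

```python
import re

SOURCE_IN_CONTEXT = re.compile(r'<clause source="([^"]+)">')
CITATION_IN_ANSWER = re.compile(r"\[([^\[\]]+)\]")


def unverifiable_citations(context: str, answer: str) -> list[str]:
    """Return citations in the answer that match no source present in context."""
    known = set(SOURCE_IN_CONTEXT.findall(context))
    return [c for c in CITATION_IN_ANSWER.findall(answer) if c not in known]


context = '<clause source="MSA:9.2">\nLiability is capped at 3 months of fees.\n</clause>'
answer = "The cap in [MSA:9.2] is below standard; see also [MSA:9.4]."
print(unverifiable_citations(context, answer))  # ['MSA:9.4']
```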

Focused context, on the phone

The same restraint — load only what the current decision needs, keep provenance, let tools fetch the rest — is what keeps any agent accurate under pressure. CallSphere brings this discipline to voice and chat, with assistants that pull just the right information mid-conversation and book work at any hour. See it live at callsphere.ai.

Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.
