Context Design for a Claude Finance Narrative Agent

Two finance agents can run the same model, call the same tools, and produce wildly different narratives. The difference is almost always context design — what you choose to put in front of Claude on each call, and just as importantly, what you deliberately withhold. A bloated context produces vague, hedging commentary that buries the real story; a starved one produces confident nonsense. This post is about getting that balance right for the specific job of explaining financial results.

The core principle: context is a budget, not a bucket

Even with a million-token window, every token you add competes for the model's attention. The instinct to "give it everything" — the full chart of accounts, all prior MD&A, every budget memo — reliably makes narratives worse, not better. The skill is curation: for each line you ask Claude to explain, assemble the smallest set of inputs that fully determines a correct, specific explanation, and stop there. A focused context produces a focused sentence.

Think of context design as answering one question per call: "What does a careful analyst need on their desk to explain this single variance?" They need the number and its comparatives, the budget assumption, what they wrote last quarter, and any operational note that touches this account. They do not need the freight ledger or the full board deck. Build the context to match that desk, and the output starts to sound like that analyst.

What belongs in context

Four things earn their place. First, the fact for this line — actuals, prior, budget, computed variances, units, and signs, fully spelled out so nothing is inferred. Second, the relevant prior commentary, the two or three most recent notes on this account, which give the narrative continuity and let it say "again" or "reversing last quarter's trend." Third, the budget assumption or operational driver behind the line, so causal claims have something real to attach to. Fourth, the rules and voice that govern how to write it.

flowchart TD
  A["Material line to explain"] --> B["Pull fact for this line"]
  B --> C["Retrieve 2-3 prior notes by account"]
  C --> D["Add budget / driver context"]
  D --> E{"Anything irrelevant?"}
  E -->|Yes| F["Drop it from context"]
  E -->|No| G["Assemble minimal context"]
  F --> G
  G --> H["Claude writes grounded comment"]

What to leave out, and why

Leave out everything not tied to the line you're explaining. The full chart of accounts adds noise and tempts the model to draw cross-line comparisons it wasn't asked for and can't verify. Whole prior reports drown the relevant note in pages of unrelated text. Raw, un-normalized ledger rows reintroduce exactly the ambiguity your normalization layer removed. And speculative material — analyst opinions, unverified market commentary — is poison for a document that must be defensible, because the model will happily weave it in as fact.

Hear it before you finish reading

Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.

Try Live →

Try Live Demo →

A subtler exclusion: don't include the numbers for lines you've gated out as immaterial. If you pass them "for completeness," the model tends to comment on them anyway, and your tight narrative balloons with notes about trivial swings. The materiality gate should shape context, not just the final edit. What the model never sees, it can't ramble about.

Designing the instruction layer

The instructions deserve as much care as the data. State the voice precisely — a careful analyst writing for leadership, plain and specific, no filler. State the invariants as hard rules: cite only provided numbers, attribute causes only to provided context, hedge explicitly when a cause is unknown, and keep length proportional to materiality. Crucially, give the model permission to say "the driver is not evident in the available data." Without that explicit out, models fabricate causes to satisfy the implicit demand for an explanation.

Keep these instructions stable and separate from the per-line data. Putting the durable rules and voice in the system prompt lets them be cached across every line of the statement and keeps the per-call context to just the small, changing facts. This separation is both a quality and a cost decision — stable instructions plus minimal per-line data is the efficient frontier for this workload.

Continuity without contamination

Continuity is what makes finance commentary feel intelligent — recognizing that margin has slipped three quarters running, or that a cost the team flagged last period finally reversed. You get it by retrieving prior commentary keyed to the same canonical account. But continuity carries a risk: if last quarter's note contained an error or an outdated assumption, naively including it propagates the mistake. Treat retrieved prior notes as context to reference, not ground truth to repeat, and have the model reconcile them against the current numbers rather than parrot them.

A practical guard is to pass prior notes with their period stamps and instruct the model to flag when current data contradicts a prior claim. That turns memory from a liability into a strength: the narrative not only carries the story forward but catches when the story has changed, which is exactly the kind of insight a sharp analyst provides and a careless one misses.

Tuning context with real reviews

Context design isn't set once; it's tuned against output. Log the exact context for every call and, when a reviewer marks a comment as vague or wrong, look at what was in front of the model. Vague output usually means too much irrelevant context diluting attention; wrong causal claims usually mean a missing driver the model filled in. Each correction tells you to add a specific input or remove a noisy one. Over a few close cycles this feedback loop converges on a lean, reliable context recipe.

Still reading? Stop comparing — try CallSphere live.

CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.

Try Live Demo → Book 30-min Walkthrough See Pricing

Resist the temptation to fix every problem by adding more to the context — that's the move that slowly degrades the whole system. More often the fix is subtraction: removing the report that was drowning the relevant note, or gating out the immaterial lines that were inviting clutter. The best finance narrative agents run on surprisingly little context per call, precisely curated, which is what lets them stay sharp and fast on a deadline.

Frequently asked questions

If the context window is huge, why not include everything?

Because every token competes for attention. Including the full chart of accounts and whole prior reports reliably makes narratives vaguer and tempts the model into unverifiable cross-line claims. Curated, minimal context produces sharper, more specific commentary.

How do I stop the agent from inventing a cause for a variance?

Provide the real driver in context and explicitly permit the model to say the cause isn't evident in the data. Models fabricate causes mainly when the prompt implicitly demands an explanation and gives them no honest way to decline.

How much prior commentary should I retrieve?

Usually the two or three most recent notes for that specific account — enough for continuity, not so much that the relevant point drowns. Pass them with period stamps and ask the model to reconcile against current numbers rather than repeat them.

How do I know my context design is working?

Log the exact context per call and review it whenever a comment is marked vague or wrong. Vague output signals too much noise; wrong causes signal a missing input. Tune by subtraction first, addition second.

Bringing agentic AI to your phone lines

Tight, well-curated context is what separates an agent that sounds smart from one that is. CallSphere applies the same discipline to voice and chat agents — giving them exactly the context they need to answer every call, use tools mid-conversation, and book work around the clock. See it live at callsphere.ai.

Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.

Context Design for a Claude Finance Narrative Agent

The core principle: context is a budget, not a bucket

What belongs in context

What to leave out, and why

Designing the instruction layer

Continuity without contamination

Tuning context with real reviews

Frequently asked questions

If the context window is huge, why not include everything?

How do I stop the agent from inventing a cause for a variance?

How much prior commentary should I retrieve?

How do I know my context design is working?

Bringing agentic AI to your phone lines

Try CallSphere AI Voice Agents

Related Articles You May Like

Where Claude Code GTM engineering is heading next

Where Claude Cowork is heading and how to prepare

Measuring Claude Cowork success: metrics that prove it

How to measure success of Claude Code GTM workflows

Claude Cowork walkthrough: from problem to shipped

End-to-end Claude Code GTM workflow: a real rebuild