Skip to content
Agentic AI
Agentic AI6 min read0 views

Prompt and Context Design for Claude Finance Agents (Cowork Plugins Finance)

What to put in a Claude finance agent's context and what to leave out: tiny system prompts, skills vs prompts, retrieval, and ordering for accuracy.

The fastest way to make a Claude finance agent worse is to give it more. More background, more pasted reports, more "just in case" instructions — each addition dilutes attention and pushes the figures that matter further from the model's focus. Context design is the quiet skill that separates a finance agent you trust from one that confidently misreports a balance. This post is about what to put in context, what to deliberately leave out, and the reasoning behind each call.

Key takeaways

  • Context is a budget, not a junk drawer — every token competes for the model's attention.
  • Keep the system prompt tiny: role, tone, hard guardrails, nothing more.
  • Load procedure from skills on demand and facts from tools fresh; don't paste either inline.
  • Use retrieval to pull only the policy passage the task needs, not the whole policy manual.
  • Put the most decision-critical data last, where the model attends most strongly.

Treat context as a budget

Even with a large context window, attention is finite and roughly competitive: the more you load, the less weight any single fact carries. Claude's models can hold a great deal, but "can hold" is not "should hold." For finance, where a single transposed figure matters, the goal is the smallest context that contains everything decision-relevant and nothing else. Every paragraph you add should earn its place by changing what the agent decides.

The discipline is to ask, for each candidate piece of context: does the agent need this to make the next decision correctly? If it can be fetched when needed, fetch it. If it's only relevant sometimes, gate it behind a skill. If it's never decision-relevant, cut it.

The four context layers and what belongs in each

It helps to think in layers, because each has a different lifetime and a different rule. The system prompt is always on, so it must be tiny. Skills load conditionally. Tool results arrive fresh per call. Retrieved snippets are pulled just-in-time. Mapping content to the right layer is most of the job.

Hear it before you finish reading

Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.

Try Live Demo →
flowchart TD
  A["Finance request"] --> B{"Always relevant?"}
  B -->|Yes| C["System prompt: role + guardrails"]
  B -->|No| D{"Procedural?"}
  D -->|Yes| E["Load matching skill"]
  D -->|No| F{"A current fact?"}
  F -->|Yes| G["Fetch via tool"]
  F -->|No| H["Retrieve policy snippet"]
  C --> I["Assemble minimal context"]
  E --> I
  G --> I
  H --> I

Keep the system prompt small and sharp

A finance system prompt should fit comfortably on one screen. It states who the agent is (a careful finance assistant), how it behaves (cite every figure, never estimate), and the non-negotiables (never post without approval). That's it. Resist the urge to paste the chart of accounts or last quarter's results here — those are facts that go stale and belong in tools.

You are a careful finance assistant for [Company].
Report only figures returned by tools this session, and cite each.
Never estimate a number; if unavailable, say so.
Never post or approve entries without explicit human approval.
Follow the loaded skill's steps exactly.

Five lines. Everything else the agent needs, it acquires at runtime from the right layer.

Retrieval over pasting

Finance policy manuals are long and mostly irrelevant to any single task. Pasting the whole revenue-recognition policy to answer one accrual question wastes budget and buries the relevant clause. Instead, retrieve the specific passage. A skill can reference a small set of cited policy files, and a retrieval step pulls only the matching section into context. The agent reads the clause it needs, cites it, and moves on — the other forty pages never compete for attention.

ContentDon'tDo
Chart of accountsPaste in promptFetch via tool
Close procedureInline every runSkill, loaded on demand
Rev-rec policyPaste whole manualRetrieve the clause
Prior resultsHardcode in promptFetch fresh per run

Order matters: critical data last

Models attend more strongly to the start and especially the end of context. For finance, place the data the agent must act on — the trial balance rows it's reconciling, the entry it's reviewing — near the end, right before the instruction to produce output. Background and procedure go earlier. This small ordering choice measurably reduces the chance the agent overlooks the figure that drives its conclusion.

Still reading? Stop comparing — try CallSphere live.

CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.

Common pitfalls

  • The kitchen-sink prompt. Stuffing every conceivably useful fact into context lowers accuracy. Add only what changes a decision.
  • Stale pasted data. Hardcoding a trial balance means it's wrong by next run. Fetch facts through tools every time.
  • Whole-manual policy dumps. Pasting a full policy buries the one relevant clause. Retrieve the passage instead.
  • Burying the key data in the middle. Critical figures placed mid-context get less attention. Put them last.
  • Procedure in the system prompt. It's always-on weight for something only sometimes relevant. Move it to a skill.

Tighten your context in 6 steps

  1. Cut the system prompt to role, behavior, and hard guardrails only.
  2. Move every procedure out of prompts and into skills that load conditionally.
  3. Replace any pasted financial data with tool calls that fetch it fresh.
  4. Swap whole-document policy dumps for retrieval of the specific clause.
  5. Reorder context so the decision-critical data sits last.
  6. Re-run a known case and confirm accuracy held or improved with less context.

Frequently asked questions

If the context window is huge, why ration it?

Because attention is competitive. A bigger window lets you hold more, but irrelevant content still dilutes focus on the figures that matter. Smaller, sharper context generally yields more accurate finance output.

How do I decide skill vs. system prompt?

Ask whether the content is relevant to every request. Role and guardrails are — they go in the system prompt. A close procedure is relevant only during close — it goes in a skill that loads when needed.

Does retrieval add latency?

A little, but it usually nets out faster and cheaper because the agent processes far less text and reasons over a focused, relevant snippet instead of wading through an entire manual.

Bringing agentic AI to your phone lines

This same context discipline — tiny prompts, on-demand skills, fresh tool data — is what keeps CallSphere's agentic voice and chat assistants sharp as they answer every call, pull the right record mid-conversation, and book work 24/7. See it live at callsphere.ai.


Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.

Share

Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available -- no signup required.