Claude Code 1M Context: A Step-by-Step Setup Guide
A follow-along guide to configure Claude Code for long sessions and the 1M-token context window: setup, memory, compaction, resume, and verify.
Reading about the 1M-token context window is one thing; getting your own Claude Code session to use it well is another. This is a hands-on walkthrough you can follow start to finish. By the end you will have a session configured for long-running work, project memory that survives compaction, the large-context path enabled, and a habit for resuming and forking sessions without losing the thread. No theory dumps — just the steps, in order, with the reasoning attached so you know why each one matters.
Step 1: Start a clean session in the right directory
Launch Claude Code from the root of the project you actually want it to reason over. The starting directory is the agent's anchor: it scopes file discovery, command execution, and the relative paths in every tool call. Starting one level too high pulls in unrelated trees and wastes context; starting too deep blinds the agent to shared modules it needs.
On launch, the tool reads project configuration and any memory files, enumerates available tools and MCP servers, and opens a fresh transcript. Confirm the working directory and the model before you type your first real request. For a long architectural session you want the most capable model; for fast, repetitive edits a smaller, cheaper model keeps latency low. This single choice sets the cost and quality envelope for the entire session.
Step 2: Write project memory before you write prompts
Before issuing your first task, give the agent durable context. Create a project memory file describing the architecture, conventions, build and test commands, and any do-not-touch areas. This file is re-injected as stable context on every turn, which means it survives compaction when raw conversation does not. Five minutes here saves you from repeating yourself for the next three hours.
Keep it crisp and factual. List the package layout, the testing command, naming conventions, and the two or three landmines specific to your codebase. Avoid prose the agent can infer from the code itself; spend the words on things it cannot discover by reading files, like deployment quirks or an intentional-but-unusual pattern.
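As a rough illustration of the shape such a file can take, the sketch below renders one from a few Python data structures. It assumes the memory file is named `CLAUDE.md` at the project root, which is the conventional name Claude Code reads; every path, command, and rule in it is a hypothetical placeholder to swap for facts about your own codebase.

```python
from pathlib import Path


def render_memory(layout, commands, conventions, landmines):
    """Render a short, factual project memory document as Markdown text."""
    lines = ["# Project memory", "", "## Layout"]
    lines += [f"- `{path}`: {purpose}" for path, purpose in layout.items()]
    lines += ["", "## Commands"]
    lines += [f"- {name}: `{cmd}`" for name, cmd in commands.items()]
    lines += ["", "## Conventions"]
    lines += [f"- {rule}" for rule in conventions]
    lines += ["", "## Do not touch"]
    lines += [f"- {item}" for item in landmines]
    return "\n".join(lines) + "\n"


# Hypothetical project facts, for illustration only.
MEMORY = render_memory(
    layout={"services/api/": "HTTP handlers", "libs/core/": "shared domain models"},
    commands={"test": "make test", "build": "make build"},
    conventions=["Handlers never import from other services directly."],
    landmines=["migrations/legacy/ is frozen; deployment replays it verbatim."],
)


def write_memory(project_root, text=MEMORY):
    """Write the memory text to CLAUDE.md (assumed file name) in project_root."""
    target = Path(project_root) / "CLAUDE.md"
    target.write_text(text, encoding="utf-8")
    return target
```

Writing the file by hand works just as well. The point of the structure is that each section holds something the agent cannot learn by reading the code.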
flowchart TD
A["Start session in project root"] --> B["Write project memory file"]
B --> C["Connect MCP servers & tools"]
C --> D["Give scoped first prompt"]
D --> E{"Context filling up?"}
E -->|No| F["Continue iterating"] --> E
E -->|Yes| G["Trigger compaction / summarize"]
G --> H["Resume or fork session later"]
Step 3: Connect the tools and MCP servers you need
An agent is only as capable as the tools it can reach. Wire up the MCP servers relevant to this task — a database connector, an issue tracker, your internal API — before you start, so the agent can pull live data instead of guessing. Each connected server contributes its tool schemas to the prompt, so connect what you need and skip what you do not; unused schemas are dead weight in the context budget.
Verify the connection with a small read-only request first. Ask the agent to list tables, fetch one record, or describe an endpoint. Confirming the wiring early means that when you hand it the real task, a failure is a logic problem, not a misconfigured server you discover an hour in.
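The "connect what you need, skip what you do not" rule can be sketched as a small filter that produces a project-level MCP configuration. This assumes a JSON file with a top-level `mcpServers` map of server name to `command` and `args`, which you should confirm against the current Claude Code documentation. The server names, commands, and the `needed` flag are all hypothetical.

```python
import json


def build_mcp_config(servers):
    """Build an MCP config dict, keeping only servers marked as needed."""
    return {
        "mcpServers": {
            name: {"command": spec["command"], "args": list(spec.get("args", []))}
            for name, spec in servers.items()
            if spec.get("needed", False)
        }
    }


# Hypothetical candidate servers; none of these commands refer to real packages.
CANDIDATES = {
    "tickets": {"command": "tickets-mcp-server", "args": ["--read-only"], "needed": True},
    "warehouse-db": {"command": "warehouse-mcp-server", "args": [], "needed": True},
    "design-assets": {"command": "assets-mcp-server", "args": [], "needed": False},
}

config = build_mcp_config(CANDIDATES)
print(json.dumps(config, indent=2))
```

Leaving `design-assets` out keeps its tool schemas from ever entering the prompt for a task that does not use them.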
Step 4: Frame the first prompt to fill context deliberately
Now use the large window on purpose. For a 1M-token task — say, a cross-cutting refactor — point the agent at the specific directories and ask it to read them into context before proposing changes. A good first prompt names the goal, the relevant files or modules, the constraints, and the definition of done. The agent reads, builds a mental model across the whole surface area, and only then acts.
Resist the urge to dump the entire repository if you do not need it. The window is large enough to hold a lot, but every token you load is a token of cost and a token competing for the model's attention. Load the slice the task touches plus its immediate dependencies. Precision here is what separates an effective big-context session from an expensive, unfocused one.
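Before pointing the agent at a set of directories, it can help to estimate how much of the window they would occupy. The sketch below is a back-of-the-envelope estimator. It assumes roughly four characters per token, which is a crude heuristic and not a real tokenizer. The suffix list and the 50% reserve are illustrative choices, not recommendations from Anthropic.

```python
import os

CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer
SOURCE_SUFFIXES = (".py", ".ts", ".go", ".md")  # hypothetical; adjust per project


def estimate_tokens(text):
    """Approximate the token count of a string from its length."""
    return len(text) // CHARS_PER_TOKEN


def estimate_slice(directories, suffixes=SOURCE_SUFFIXES):
    """Return {directory: estimated tokens} for files whose names match suffixes."""
    totals = {}
    for directory in directories:
        total = 0
        for root, _dirs, files in os.walk(directory):
            for name in files:
                if name.endswith(suffixes):
                    path = os.path.join(root, name)
                    with open(path, encoding="utf-8", errors="ignore") as handle:
                        total += estimate_tokens(handle.read())
        totals[directory] = total
    return totals


def fits_budget(totals, budget=1_000_000, reserve=0.5):
    """True if the slice leaves the `reserve` fraction of the window free."""
    return sum(totals.values()) <= budget * (1 - reserve)
```

Calling `estimate_slice` on the directories you plan to name in the prompt shows which one dominates the load. If `fits_budget` returns `False`, narrow the slice before starting instead of compacting after the fact.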
Step 5: Manage the budget as the session grows
Watch how full the context is getting. As a long session accumulates tool output and edits, the assembled prompt climbs toward the budget. When it does, compact deliberately: ask the agent to summarize progress, the decisions made, and the remaining work into a concise checkpoint. In Claude Code the built-in `/compact` command does this, and it accepts optional instructions about what to keep. Compaction collapses a long, verbose history into a dense note, freeing room while preserving the thread.
Two habits keep marathon sessions healthy. First, prune stale tool output — a giant file you read ten steps ago rarely needs to stay verbatim. Second, checkpoint at natural boundaries, like after finishing one module before starting the next. Both are forms of the same discipline: spend context on what the next step needs, not on a complete archive of everything that happened.
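The two habits can be expressed as a toy bookkeeping model. This is purely illustrative: Claude Code manages its own context and does not expose a transcript in this form. The `Entry` type, the stub size, the ten-step age limit, and the 80% threshold are all assumptions chosen to make the logic concrete.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    step: int    # step at which the entry was added
    kind: str    # "user", "assistant", or "tool_output"
    tokens: int  # estimated size of the entry


STUB_TOKENS = 50  # assumed cost of replacing an output with a one-line summary


def prune_stale(entries, current_step, max_age=10):
    """Shrink tool outputs at least max_age steps old to a stub; keep the rest."""
    pruned = []
    for entry in entries:
        stale = entry.kind == "tool_output" and current_step - entry.step >= max_age
        if stale and entry.tokens > STUB_TOKENS:
            pruned.append(Entry(entry.step, entry.kind, STUB_TOKENS))
        else:
            pruned.append(entry)
    return pruned


def should_compact(entries, budget=1_000_000, threshold=0.8, at_boundary=False):
    """Compact at a natural task boundary or once usage crosses the threshold."""
    used = sum(entry.tokens for entry in entries)
    return at_boundary or used >= budget * threshold
```

In this model a single stale 200,000-token file read shrinks to a 50-token stub, which can be the difference between needing a checkpoint now and having room for the next module.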
Step 6: Resume, fork, and verify
When you stop, the session persists. Resuming reloads the transcript and working state so the agent continues where it left off, summaries intact. From the command line, `claude --continue` picks up the most recent conversation in the current directory, and `claude --resume` lets you choose an earlier one. Forking is the power move: branch from a known-good checkpoint to try a risky change without polluting the main thread, then keep whichever branch worked. Treat sessions like git for your reasoning: cheap to branch, easy to abandon.
Finally, verify. After any substantial change, have the agent run the tests and build it learned about from your memory file. A long-context session that edits twenty files is only valuable if the result actually compiles and passes — make verification the last step of every task, not an afterthought.
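If you want the same verification available outside the agent, for example as a pre-commit check, a small runner is enough. The sketch below runs each command in order and stops at the first failure. The `pytest` and `build` invocations are hypothetical stand-ins for whatever test and build commands your memory file lists.

```python
import subprocess
import sys


def run_checks(commands, cwd="."):
    """Run each (name, argv) pair in order; stop at the first failure."""
    results = []
    for name, argv in commands:
        proc = subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
        passed = proc.returncode == 0
        results.append((name, passed))
        if not passed:
            # Show the tail of the output, where the error usually appears.
            print(f"{name} failed:\n{(proc.stdout + proc.stderr)[-2000:]}")
            break
    return results


def all_passed(results, commands):
    """True only if every command ran and every one succeeded."""
    return len(results) == len(commands) and all(ok for _, ok in results)


# Hypothetical commands; replace with the ones from your own memory file.
CHECKS = [
    ("tests", [sys.executable, "-m", "pytest", "-q"]),
    ("build", [sys.executable, "-m", "build"]),
]
```

Calling `all_passed(run_checks(CHECKS), CHECKS)` gives a single yes-or-no answer. Stopping at the first failure keeps the output short enough to paste back into the session.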
Frequently asked questions
Do I need to enable anything special to use the 1M-token window?
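Possibly. Whether the 1M-token window is available depends on the model selected for the session and on your account, so confirm the model before you start, as in Step 1. Beyond that, you start a session and load context into it; the large window is the capacity the agent draws on as the prompt grows. The work is on your side: load the right files deliberately and manage the budget, because the window is a resource to spend wisely, not free space to fill.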
What should go in a project memory file?
Durable, hard-to-infer facts: package layout, build and test commands, naming conventions, deployment quirks, and any do-not-touch areas. It is re-injected as stable context every turn, so it survives compaction and keeps the agent aligned across long sessions.
When should I compact a session manually?
When the context is filling up or at natural task boundaries. Ask the agent to summarize progress and remaining work into a checkpoint; this collapses verbose history into a dense note, freeing budget while preserving the decisions and open threads.
Can I safely experiment without ruining a long session?
Yes — fork from a known-good checkpoint. The branch inherits the accumulated context, so you can try a risky change in isolation and keep or discard it, much like a git branch for the agent's working state.
Bringing agentic AI to your phone lines
This same step-by-step rigor — scoped context, durable memory, deliberate budget management — is what powers CallSphere's voice and chat agents, which hold a conversation, use tools mid-call, and book work 24/7. See it live at callsphere.ai.
Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.