How Claude Skills Work Internally vs Prompts and MCP
Inside Claude Agent Skills: progressive disclosure, the metadata index, and how Skills differ from prompts, Projects, MCP, and subagents.
If you have spent any time wiring up Claude Code, you have probably hit the same wall everyone hits: your system prompt grows until it becomes a 4,000-word wall of instructions, half of which are irrelevant to whatever the user just asked. You paste in API conventions, a coding style guide, a deployment runbook, and three different file formats Claude needs to know about — and every single token rides along in context whether or not it is needed. Agent Skills exist to solve exactly that problem, and to understand them you have to look at how they actually work under the hood rather than treating them as "prompts with extra steps."
What a Skill actually is at the file level
An Agent Skill is a folder. At minimum it contains a SKILL.md file with YAML frontmatter (a name and a description) followed by a body of instructions written in Markdown. Optionally the folder also holds scripts, reference documents, templates, and any other resources the Skill needs. Put together, a Skill is a self-contained unit that Claude discovers by its metadata and loads into context only when a task makes it relevant. That last clause is the whole point: a Skill is not always-on text. It is text that gets pulled in on demand.
Compare that to a raw prompt. A prompt is a flat string you hand to the model, fully resident in the context window for the entire request. There is no discovery step, no conditional loading, no associated files. A Skill, by contrast, has structure the runtime understands — a name to match against, a description to decide relevance, a body to inject, and a directory the agent can read from or execute.
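To make the file-level shape concrete, here is a minimal sketch in Python that splits a SKILL.md into its frontmatter fields and its body. It assumes the frontmatter holds only flat `key: value` lines, so the standard library is enough; a real loader would likely use a proper YAML parser. The skill name `csv-report`, the paths inside the body, and the helper `parse_skill_md` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str
    body: str


def parse_skill_md(text: str) -> Skill:
    """Split a SKILL.md string into frontmatter fields and a Markdown body.

    Assumption: frontmatter is flat `key: value` pairs between two '---' lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter line")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter is not closed with '---'") from None

    meta = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a key: value line: {line!r}")
        meta[key.strip()] = value.strip()

    for required in ("name", "description"):
        if not meta.get(required):
            raise ValueError(f"frontmatter is missing {required!r}")

    body = "\n".join(lines[end + 1:]).strip()
    return Skill(meta["name"], meta["description"], body)


# Hypothetical Skill used only to exercise the parser.
EXAMPLE = """---
name: csv-report
description: Use when the user asks to summarize a CSV file into a report.
---
# CSV report

1. Run scripts/summarize.py on the input file.
2. Format the result with templates/report.md.
"""

skill = parse_skill_md(EXAMPLE)
print(skill.name, "|", skill.description)
```

The name and description are the only parts a runtime would need to keep resident; the body and the files it points to can stay on disk until the Skill is chosen.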
That structural difference has consequences beyond tidiness. Because a Skill is a directory rather than a string, it can carry executable code, versioned reference material, and templates that travel together as one unit. You can put a Skill under source control, review changes to it in a pull request, and ship it to a whole team the same way you ship any other code. A prompt buried in someone's settings or pasted into a chat has none of those properties — it is invisible to your tooling and impossible to govern. Skills turn ad-hoc instructions into first-class, auditable artifacts.
Progressive disclosure: the core mechanism
The internal trick that makes Skills scale is progressive disclosure, which works in three tiers. Tier one is the metadata index. When Claude Code starts, it scans the available Skills and reads only their frontmatter — name and description, typically under a hundred tokens each. Dozens of Skills can sit in this index for the cost of a paragraph. Tier two is the body: the moment Claude decides a Skill is relevant, it loads the full SKILL.md instructions into context. Tier three is the resources: scripts and reference files the body points to, which Claude reads or runs only at the exact moment they are needed.
flowchart TD
    A["Agent starts"] --> B["Scan Skills, index name+description only"]
    B --> C["User sends task"]
    C --> D{"Does any Skill description match intent?"}
    D -->|No| E["Answer from base context"]
    D -->|Yes| F["Load full SKILL.md body"]
    F --> G{"Body references a script or doc?"}
    G -->|No| H["Act on instructions"]
    G -->|Yes| I["Read file or run script in skill dir"]
    I --> H
This is why a team can install fifty Skills without blowing the context budget. The expensive content stays on disk until the description match fires. The selection itself is model-driven: Claude reads the index and reasons about whether a Skill's description covers the current request, the same way it would decide which tool to call. There is no separate classifier model and no keyword router — relevance is a judgment the agent makes in-band.
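The three tiers can be sketched as three separate read operations. This is an illustrative model in Python, not how Claude Code is implemented: the function names, the folder layout, and the `csv-report` Skill are all assumptions. The selection step is deliberately left out, because relevance is a judgment the model makes from the index, not something a loader computes.

```python
import tempfile
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict:
    """Tier 1: read only the `key: value` lines between the two '---' lines."""
    meta = {}
    with skill_md.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            raise ValueError(f"{skill_md} has no frontmatter")
        for line in fh:
            if line.strip() == "---":
                break  # stop before the body, so it is never read here
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta


def build_index(root: Path) -> dict:
    """Map skill name -> (description, folder) for every folder with a SKILL.md."""
    index = {}
    for skill_md in sorted(root.glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        index[meta["name"]] = (meta["description"], skill_md.parent)
    return index


def load_body(index: dict, name: str) -> str:
    """Tier 2: load the instructions after the frontmatter, once a Skill is chosen."""
    text = (index[name][1] / "SKILL.md").read_text(encoding="utf-8")
    return text.split("---", 2)[2].strip()


def read_resource(index: dict, name: str, relative_path: str) -> str:
    """Tier 3: read one referenced file, only when a step needs it."""
    return (index[name][1] / relative_path).read_text(encoding="utf-8")


# Build a throwaway Skill folder (hypothetical contents) and walk the tiers.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "csv-report"
    (folder / "reference").mkdir(parents=True)
    (folder / "SKILL.md").write_text(
        "---\nname: csv-report\ndescription: Summarize a CSV file into a report.\n---\n"
        "Follow reference/columns.md for the column meanings.\n",
        encoding="utf-8",
    )
    (folder / "reference" / "columns.md").write_text(
        "amount: order total in cents\n", encoding="utf-8"
    )

    index = build_index(Path(tmp))
    index_view = {name: desc for name, (desc, _) in index.items()}
    body = load_body(index, "csv-report")
    resource = read_resource(index, "csv-report", "reference/columns.md")

print(index_view)
```

Only `index_view` would need to sit in context from the start; `body` and `resource` correspond to content that is pulled in later, each by its own explicit read.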
It is worth dwelling on why this three-tier shape matters more than it first appears. The naive alternative, loading everything that might be relevant, fails for two compounding reasons. The first is cost: every always-resident token is billed on every single turn of a long conversation, so a fat standing prompt taxes the entire session, not just the one message that needed it. The second is the attention problem: a model spreads its focus across whatever is in front of it, so padding the context with rarely-relevant procedure can degrade its handling of the parts that are relevant. Progressive disclosure sidesteps both. The metadata index is cheap enough to ignore, the body arrives only when earned, and the heavy resources never load unless a concrete step demands them. The result is a system whose context cost grows with the work actually being done rather than with the size of your Skill library.
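A back-of-the-envelope calculation shows the size of the cost gap. Every number below is an assumption picked for illustration (50 Skills, 80 tokens of metadata and 2,000 tokens of body each, a 20-turn session, one Skill triggered at turn 5), and the model ignores prompt caching and output tokens entirely.

```python
def session_input_tokens(turns, resident_tokens, triggered=()):
    """Total input tokens re-sent over a session.

    resident_tokens: tokens present on every turn.
    triggered: (first_turn, tokens) pairs for content that joins the context
    at `first_turn` (1-based) and stays until the end of the session.
    """
    total = turns * resident_tokens
    for first_turn, tokens in triggered:
        total += tokens * (turns - first_turn + 1)
    return total


# Hypothetical figures, not measurements.
TURNS, SKILLS, META_TOKENS, BODY_TOKENS = 20, 50, 80, 2000

# Naive: every Skill body is resident on every turn.
naive = session_input_tokens(TURNS, SKILLS * BODY_TOKENS)

# Progressive: only metadata is resident; one body joins at turn 5.
progressive = session_input_tokens(TURNS, SKILLS * META_TOKENS, [(5, BODY_TOKENS)])

print(naive, progressive)  # prints: 2000000 112000
```

Under these assumptions the always-loaded setup re-sends roughly eighteen times as many input tokens, and the gap widens as the library grows, because only the metadata term scales with the number of Skills.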
Where this sits relative to MCP and Projects
People conflate Skills with MCP because both extend Claude, but they operate on different axes. The Model Context Protocol gives Claude new capabilities — it connects the agent to external systems through a server that exposes tools, resources, and prompts over a defined wire protocol. MCP answers "what can the agent reach?" A Skill answers "how should the agent behave and what does it need to know to do this task well?" The cleanest mental model: MCP brings the hands, Skills bring the know-how. A Skill frequently exists precisely to teach Claude how to drive an MCP server's tools correctly — which sequence to call them in, what the fields mean, what the gotchas are.
Projects (and the Claude Cowork plugin equivalent) are a third axis: persistent, always-loaded context scoped to a workspace. Project knowledge is more like a permanent prompt, present for every message in that Project. Skills are conditional and portable; a Skill folder is meant to move between Claude Code, Cowork, and the Agent SDK and behave the same way in each. So the architecture stacks: Projects set the standing context, MCP extends reach, and Skills inject task-specific procedure exactly when triggered.
Seeing them as orthogonal axes rather than competing features is the unlock. You do not choose between a Project and a Skill any more than you choose between a database and a function — they answer different questions and routinely appear together. A well-designed setup might carry a Project that establishes the company's voice and standing constraints, several MCP servers wiring in the CRM and the billing system, and a dozen Skills that each encode how to perform a specific recurring task using those tools. Each layer has a single clear responsibility, which is exactly what makes the whole thing maintainable as it grows.
Skills versus subagents
Subagents are an execution-isolation mechanism, not a knowledge mechanism. When Claude Code spawns a subagent, it hands a task off to a fresh context window with its own token budget. The subagent can run in parallel with its siblings and returns a summarized result to the orchestrator. That isolation is the value: the subagent's intermediate reasoning never pollutes the main thread. A Skill, by contrast, lives inside whatever agent loads it. The two compose naturally: an orchestrator can spawn a subagent, and that subagent can load the Skills relevant to its slice of the job. One is about who runs the work; the other is about what knowledge that worker pulls in.
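The composition can be modeled with a toy orchestrator. This is a sketch of the isolation idea only, not of Claude Code's internals: `run_subagent`, `orchestrate`, the task strings, and the Skill names are hypothetical, and the substring check is a crude stand-in for the model's relevance judgment.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(task: str, skills: dict) -> str:
    """Work in a private context and return only a short summary."""
    context = [f"task: {task}"]  # fresh context, not shared with anyone
    for name, body in skills.items():
        if name in task:  # stand-in for the model's relevance judgment
            context.append(f"skill body: {body}")
    context.append("...intermediate reasoning and tool output...")
    # Only this summary leaves the subagent; `context` is discarded.
    return f"{task}: done using {len(context) - 2} skill(s)"


def orchestrate(tasks: list, skills: dict) -> list:
    """Run one subagent per task in parallel and collect their summaries."""
    main_context = []
    with ThreadPoolExecutor() as pool:
        # pool.map yields results in the same order as `tasks`.
        for summary in pool.map(run_subagent, tasks, [skills] * len(tasks)):
            main_context.append(summary)
    return main_context


skills = {
    "billing": "How to reconcile invoices.",
    "release-notes": "How to draft release notes.",
}
main_context = orchestrate(["audit billing", "draft release-notes"], skills)
print(main_context)
```

The orchestrator's `main_context` ends up holding two one-line summaries, while each Skill body and all intermediate work stayed inside the subagent that needed them.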
Why the boundaries matter in practice
Getting these distinctions right changes how you architect a system. If you find yourself stuffing tool-usage instructions into a system prompt, that is almost always Skill-shaped content that should be progressively disclosed. If you are duplicating the same procedure across multiple Projects, extract it into a portable Skill. If a long multi-step task is drowning your main context in noise, the answer is a subagent, not a bigger prompt. And if Claude simply cannot reach a system at all, you need MCP — no amount of prompting substitutes for an actual connection. Misdiagnosing the axis is the most common reason agentic setups become bloated and brittle.
Frequently asked questions
Does loading a Skill cost the same as a long prompt?
Only once the Skill is actually triggered. Until then you pay only for the name and description in the metadata index, typically under a hundred tokens per Skill. A long static prompt, by contrast, costs its full length on every request whether relevant or not. That asymmetry is the entire efficiency argument for Skills.
Can a Skill run code?
Yes. The Skill folder can include scripts, and the instructions can direct Claude to execute them as a deterministic step rather than reasoning through the work token by token. This is ideal for tasks like data transforms or file generation where a few lines of Python are more reliable and cheaper than asking the model to do it manually.
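As an illustration of the kind of script a Skill might bundle, here is a small, deterministic CSV aggregation in Python. The file name a Skill would give it (say, `scripts/summarize.py`), the function `summarize`, and the sample data are all hypothetical.

```python
import csv
import io
import json
from collections import defaultdict


def summarize(csv_text: str, group_by: str, value: str) -> dict:
    """Sum the `value` column for each distinct entry in the `group_by` column."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[group_by]] += float(row[value])
    return dict(totals)


# Sample input, invented for the example.
SAMPLE = "region,amount\nnorth,10.5\nsouth,4\nnorth,2.5\n"

result = summarize(SAMPLE, "region", "amount")
print(json.dumps(result, sort_keys=True))  # prints: {"north": 13.0, "south": 4.0}
```

The instructions in SKILL.md would then tell Claude to run the script and use its output, so the arithmetic is done by code rather than by the model.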
How does Claude decide which Skill to load?
It reads the descriptions in the metadata index and judges relevance against the current task, the same in-context reasoning it uses to choose a tool. Because of this, a vague description is one of the most common reasons a Skill fails to fire. Write descriptions that name the concrete trigger conditions.
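The difference between a vague and a concrete description can be made visible with a small lint check. The heuristics here (a minimum word count and a list of trigger phrases) are assumptions invented for this sketch, not rules Claude applies; the point is only to show what "naming the trigger conditions" looks like in practice.

```python
# Hypothetical phrases that suggest the description states when to use the Skill.
TRIGGER_HINTS = ("use when", "use for", "when the user", "whenever")


def lint_description(description: str, min_words: int = 8) -> list:
    """Return a list of warnings; an empty list means no heuristic fired."""
    warnings = []
    text = description.strip().lower()
    if len(text.split()) < min_words:
        warnings.append("too short to name a concrete trigger")
    if not any(hint in text for hint in TRIGGER_HINTS):
        warnings.append("does not say when the Skill should be used")
    return warnings


vague = "Helps with documents."
concrete = "Use when the user asks to fill, merge, or extract text from a PDF form."

print(lint_description(vague))
print(lint_description(concrete))
```

The vague description trips both heuristics, while the concrete one passes because it states the situation in which the Skill applies and the specific actions it covers.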
Do I need MCP to use Skills?
No. Skills are independent of MCP — a Skill that only contains instructions and reference docs needs no servers at all. They pair powerfully, but you can ship purely knowledge-based Skills with zero infrastructure.
Bringing these patterns to your phone lines
CallSphere takes the same progressive-disclosure and tool-orchestration ideas and applies them to voice and chat agents that answer every call, load the right procedure mid-conversation, and book real work around the clock. See it running at callsphere.ai.
Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.