Build Your First Claude Code Skill: A Walkthrough
Step-by-step guide to building a Claude Code skill: folder layout, SKILL.md frontmatter, bundled scripts, testing the trigger, and shipping it.
Reading about Agent Skills is one thing; getting one to fire on the right task, run your script, and produce the output you wanted is another. In this walkthrough we build a real skill from an empty folder to a tested, shippable capability. The example skill takes raw support-ticket exports and turns them into a weekly triage summary, but the steps generalize to anything you do repeatedly and want Claude Code to do the same way every time.
Step 1: Pick a task narrow enough to encode
The best first skill is a procedure you already perform by hand and could explain to a new hire in a page. "Summarize tickets" is too broad; "convert the weekly tickets.csv export into a triage summary grouped by severity, with a top-five themes section" is the right size. A good skill has a clear input, a clear output, and a sequence of steps in between. If you cannot write the steps down for a human, the model cannot follow them either.
Write the procedure out in plain prose first, before touching any files. List the inputs it expects, the transformations it performs, the edge cases it must handle (empty file, missing column, malformed dates), and the exact shape of the output. This prose becomes the spine of your SKILL.md body, and doing it first keeps you from skipping the awkward details that cause skills to misbehave in production.
Step 2: Create the folder and SKILL.md
Make a directory under your project's .claude/skills, for example ticket-triage, and inside it create SKILL.md. The file begins with YAML frontmatter and then the instructions:
---
name: ticket-triage
description: Use when converting the weekly tickets.csv support export into a triage summary grouped by severity with a top themes section.
---
# Ticket triage
When the user asks for a weekly triage summary, follow these steps...
The diagram below shows the build-and-test loop you will run through repeatedly while developing the skill.
flowchart TD
A["Write SKILL.md frontmatter + steps"] --> B["Add bundled script if needed"]
B --> C["Restart Claude Code to rescan skills"]
C --> D["Prompt with a realistic task"]
D --> E{"Did the skill trigger?"}
E -->|No| F["Sharpen the description"] --> C
E -->|Yes| G{"Output correct?"}
G -->|No| H["Tighten the steps"] --> C
G -->|Yes| I["Commit and ship the skill"]
Notice that the description in the frontmatter names the exact file (tickets.csv) and the exact output (triage summary by severity). That specificity is deliberate: the description is the signal Claude uses to decide whether to load this skill at all, so vague wording here is the most common reason a freshly built skill never triggers.
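If you expect to create more than one skill, the folder-and-file step can be scripted. The following is a minimal sketch, not an official tool: the helper name scaffold_skill and the template constant are hypothetical, and the only layout it assumes is the one described above (a SKILL.md with name and description frontmatter inside .claude/skills/<skill-name>).

```python
from pathlib import Path

# Template mirroring the SKILL.md shown in this step; the body line is a placeholder.
SKILL_TEMPLATE = """---
name: {name}
description: {description}
---
# {title}
When the user asks for a weekly triage summary, follow these steps...
"""


def scaffold_skill(project_root, name, description):
    """Create .claude/skills/<name>/SKILL.md under project_root and return its path.

    Hypothetical helper for illustration only.
    """
    skill_dir = Path(project_root) / ".claude" / "skills" / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    if skill_file.exists():
        # Refuse to overwrite instructions someone has already edited by hand.
        raise FileExistsError(f"{skill_file} already exists")
    # "ticket-triage" becomes "Ticket triage" for the heading line.
    title = name.replace("-", " ").capitalize()
    skill_file.write_text(
        SKILL_TEMPLATE.format(name=name, description=description, title=title),
        encoding="utf-8",
    )
    return skill_file
```

Calling scaffold_skill(Path.cwd(), "ticket-triage", "Use when converting the weekly tickets.csv support export into a triage summary grouped by severity with a top themes section.") from the project root would produce the file shown earlier in this step, ready for you to fill in the body.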
Step 3: Write the instruction body
The body is where you turn your prose procedure into numbered steps the model can execute. Be explicit about ordering and decisions, and spell out what to do when the input is malformed. If a step is ambiguous, the model will improvise, and improvisation is exactly what a skill exists to eliminate. Use imperative voice: "Parse the CSV. Group rows by the severity column. For each group, count tickets and list the three most recent."
Keep the body focused on the procedure, not on background theory. The model already knows what a CSV is; it does not know your severity taxonomy or that finance tickets always go in their own bucket. Encode the local, specific, hard-to-guess knowledge and trust the base model for the general parts. A body that re-explains common concepts wastes tokens and buries the instructions that actually matter.
Step 4: Add a bundled script for the deterministic parts
Some work should not be done by the model at all. Counting rows, parsing dates, and computing percentages are deterministic and cheap to script. Put a small script next to SKILL.md — say summarize.py — and have the instructions tell Claude to run it: "Run python summarize.py tickets.csv and use its JSON output to write the summary." This is a key pattern: let code handle what code is good at, and let the model handle judgment, phrasing, and synthesis.
Bundling a script also makes the skill far more reliable across runs. A pure-prompt skill that asks the model to tally numbers will occasionally miscount; a script that emits exact counts will not. The model's job becomes reading trustworthy structured output and turning it into readable prose, which it does extremely well. The split keeps both halves in their strengths.
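To make the split concrete, here is one way summarize.py might look. Treat it as a sketch under stated assumptions rather than the definitive script: the column names (id, severity, created_at), the YYYY-MM-DD date format, and the JSON keys are all invented for illustration and will need to match your real export.

```python
import csv
import json
import sys
from collections import defaultdict
from datetime import datetime

# Assumed column names; change these to match your real tickets.csv.
REQUIRED_COLUMNS = {"id", "severity", "created_at"}
DATE_FORMAT = "%Y-%m-%d"  # assumed date format in the export


def summarize(path):
    """Return per-severity counts, percentages, and the three most recent ticket ids."""
    groups = defaultdict(list)
    skipped = 0
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            # Covers both an empty file and an export that lacks a column.
            return {"error": f"missing columns: {sorted(missing)}"}
        for row in reader:
            try:
                created = datetime.strptime(row["created_at"], DATE_FORMAT)
            except (TypeError, ValueError):
                skipped += 1  # malformed or blank date: report it, do not guess
                continue
            severity = (row["severity"] or "").strip() or "unknown"
            groups[severity].append((created, row["id"]))

    total = sum(len(tickets) for tickets in groups.values())
    by_severity = {}
    for severity, tickets in groups.items():
        tickets.sort(key=lambda item: item[0], reverse=True)  # newest first
        by_severity[severity] = {
            "count": len(tickets),
            "percent": round(100 * len(tickets) / total, 1),
            "most_recent": [ticket_id for _, ticket_id in tickets[:3]],
        }
    return {"total": total, "skipped_rows": skipped, "by_severity": by_severity}


def main(argv):
    if len(argv) != 2:
        print("usage: python summarize.py tickets.csv", file=sys.stderr)
        return 2
    print(json.dumps(summarize(argv[1]), indent=2))
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Reporting skipped_rows and an explicit error key, instead of raising, gives the instruction body something concrete to branch on: the SKILL.md steps can say what to tell the user when either field is present.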
Step 5: Test the trigger and the output
Restart Claude Code so it rescans the skills directory, then prompt it with a realistic request. First check that the skill triggers: Claude should either mention loading it or visibly follow its steps. If it does not, suspect the description first and add the nouns and verbs your prompt used. Once it triggers reliably, compare the output against a known-good example. Feed it a malformed file too, and confirm the edge-case handling you wrote actually fires.
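When a trigger fails, it can help to see which words from your test prompt never appear in the description. The helper below is a rough debugging aid only: the function names and stopword list are hypothetical, and Claude's decision to load a skill is not a keyword lookup, so treat the result as a hint about wording rather than a model of how triggering works.

```python
import re

# Small, hand-picked stopword list; extend it as needed (assumption, not a standard list).
STOPWORDS = {
    "a", "an", "the", "for", "of", "to", "me", "my", "this", "that", "and",
    "in", "on", "from", "with", "please", "can", "you", "i", "into", "it", "is",
}


def content_words(text):
    """Lowercase the text and return its words, minus stopwords and edge punctuation."""
    raw = re.findall(r"[a-z0-9._'-]+", text.lower())
    cleaned = {word.strip("._'-") for word in raw}
    return {word for word in cleaned if word and word not in STOPWORDS}


def missing_from_description(prompt, description):
    """List prompt words that the description never mentions, in sorted order."""
    return sorted(content_words(prompt) - content_words(description))


description = (
    "Use when converting the weekly tickets.csv support export into a "
    "triage summary grouped by severity with a top themes section."
)
prompt = "Build this week's triage report from tickets.csv"
print(missing_from_description(prompt, description))
# prints ['build', 'report', "week's"]
```

Here the prompt says "report" while the description only says "summary", which is the kind of mismatch worth closing by adding the word to the description before you restart and retest.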
Treat triggering and correctness as two separate test passes, because they fail for different reasons and have different fixes. Run the skill against three or four real exports, not just one, so you catch the variations real data throws at you. Only once it behaves on varied inputs is it ready to commit alongside the project so your whole team inherits it.
Step 6: Ship it and iterate
Commit the skill folder into the repository under .claude/skills so it travels with the project and every teammate's Claude Code picks it up automatically. From here, iteration is cheap: when the skill makes a mistake, you usually fix one sentence in the body or add one edge case, restart, and retest. Because the skill is plain text and a small script, code review covers it like any other change, and you get a clean history of how the capability evolved.
Frequently asked questions
Where do I put a skill so Claude Code finds it?
Place the skill folder under your project's .claude/skills directory, your user-level ~/.claude/skills directory, or inside a plugin. Claude Code scans these locations at startup and adds each skill's name and description to its index.
Why isn't my new skill triggering?
Almost always the description is too vague or missing the words your prompt uses. Rewrite it to name the specific inputs, outputs, and actions involved, then restart Claude Code so it rescans and retest with a realistic request.
Should logic live in the body or in a bundled script?
Put deterministic work — counting, parsing, math — in a bundled script the model runs, and keep judgment, phrasing, and synthesis in the instruction body. This split makes the skill both reliable and cheap to run.
Do I need to restart Claude Code after editing a skill?
Yes, restart so the skills directory is rescanned and the metadata index is rebuilt. After restarting, prompt with a realistic task to confirm both that the skill triggers and that its output is correct.