---
title: "Build Your First Claude Agent Skill: A Walkthrough (Skills For Organizations)"
description: "Step-by-step guide to building a production Claude Agent Skill — frontmatter, procedural body, scripts, reference files, and testing discovery and execution."
canonical: https://callsphere.ai/blog/build-your-first-claude-agent-skill-a-walkthrough-skills-for-organizat
category: "Agentic AI"
tags: ["agentic ai", "claude", "agent skills", "tutorial", "anthropic", "claude code", "developer workflow"]
author: "CallSphere Team"
published: 2026-03-15T08:23:11.000Z
updated: 2026-06-07T01:28:22.846Z
---

# Build Your First Claude Agent Skill: A Walkthrough (Skills For Organizations)

> Step-by-step guide to building a production Claude Agent Skill — frontmatter, procedural body, scripts, reference files, and testing discovery and execution.

Reading about Agent Skills is one thing; shipping one your whole team relies on is another. This is a hands-on walkthrough that takes you from an empty directory to a working, tested skill that Claude loads on its own when the task calls for it. We'll build a realistic example — a skill that turns messy support-ticket exports into a clean weekly summary — and you'll see every file, every command, and every decision along the way.

## Key takeaways

- Start with the frontmatter description first — it is the trigger that decides whether your skill is ever used.
- Keep the body a procedure, not an essay: numbered steps Claude can follow deterministically.
- Move deterministic work (parsing, validation, math) into scripts so it stays out of context and out of the model's reasoning.
- Test discovery and execution separately — a skill can load correctly but execute wrong, or vice versa.
- Iterate by reading the transcript: if Claude skipped a step, the body wasn't explicit enough.

## Step 1: scaffold the folder

Every skill begins as a directory. Create the structure first so you have somewhere to put each piece as you build it. The minimum is a single `SKILL.md`; we'll add a script and a reference file because our example needs both.

```
mkdir -p ticket-summary/scripts ticket-summary/reference
touch ticket-summary/SKILL.md
touch ticket-summary/scripts/parse_tickets.py
touch ticket-summary/reference/summary-format.md
```

Naming matters more than it looks. The folder name becomes part of how you and your teammates refer to the skill, and it should read like a capability — `ticket-summary`, not `helper` or `v2`. Treat the whole directory as a unit of organizational knowledge you'll version-control and review like any other code.

## Step 2: write the frontmatter

The frontmatter is the first thing to get right because it controls discovery. When deciding whether to load the skill, Claude sees only the `name` and `description`, so the description must read like a precise trigger condition that names the input it expects and the output it produces.

```
---
name: ticket-summary
description: Converts raw support-ticket CSV or JSON exports into a structured weekly summary with volume, top issues, and SLA breaches. Use when the user shares a ticket export and asks for a digest or report.
---
```

Notice the description does two jobs: it states what the skill does and the exact situation that should trigger it. Avoid soft phrasing like "helps with support data." If you can't tell from the description alone when the skill should fire, neither can Claude.
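A quick check can catch a missing or vague description before you ever test discovery. The sketch below is a minimal, hypothetical linter: `parse_frontmatter` and `lint_frontmatter` are names introduced here for illustration, the parser handles only flat `key: value` lines rather than full YAML, and the "Use when" check is a heuristic borrowed from the example description above, not an official rule.

```python
def parse_frontmatter(text):
    """Return the key/value pairs between the leading '---' fences.

    Handles only flat `key: value` lines, not full YAML (an assumption
    that holds for the two-field frontmatter used in this walkthrough).
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' line")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("frontmatter is missing its closing '---' line")


def lint_frontmatter(text):
    """Return a list of human-readable problems (empty means none found)."""
    fields = parse_frontmatter(text)
    problems = []
    for required in ("name", "description"):
        if not fields.get(required):
            problems.append(f"missing field: {required}")
    description = fields.get("description", "")
    # Heuristic only: the example description states its trigger with "Use when".
    if description and "use when" not in description.lower():
        problems.append("description does not state a trigger ('Use when ...')")
    return problems


if __name__ == "__main__":
    sample = "---\nname: ticket-summary\ndescription: Helps with support data.\n---\n"
    for problem in lint_frontmatter(sample):
        print(problem)
```

Run against the soft phrasing warned about above, the linter flags the missing trigger. It cannot tell you whether the trigger is the right one; only a discovery test can.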

## Step 3: write the body as a procedure

The body is loaded only after the skill is selected, so it can be detailed, but it should read like a runbook, not prose. Give Claude an explicit ordered procedure, tell it which script to run, and tell it where to find the output format. The diagram below shows the flow this body encodes.

```mermaid
flowchart TD
  A["User shares ticket export"] --> B{"Description matches?"}
  B -->|Yes| C["Load SKILL.md body"]
  C --> D["Run parse_tickets.py on the file"]
  D --> E["Read structured JSON output"]
  E --> F["Open reference/summary-format.md"]
  F --> G["Compose weekly digest"]
  G --> H["Return summary to user"]
```

Here is a body that matches that flow. Each step is unambiguous, the heavy lifting is delegated to the script, and the formatting rules live in a linked file so the body stays short.

```
# Ticket Summary

When the user provides a ticket export:

1. Run `scripts/parse_tickets.py <export-file>` to normalize the export.
   It outputs JSON with: total, by_category, sla_breaches, top_issues.
2. If the script errors, report the exact error and ask for a clean export.
   Do not attempt to parse the raw file yourself.
3. Read `reference/summary-format.md` for the required section order.
4. Produce the summary using that format. Round percentages to whole numbers.
5. Always include the SLA-breach count even if it is zero.
```
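The body points at files by relative path, and a mistyped path only surfaces once Claude tries to follow the step. As a rough sketch, the check below scans `SKILL.md` for paths under `scripts/` and `reference/` and reports any that do not exist. `find_missing_paths` is a hypothetical helper, and the regular expression assumes the two folder names used in this walkthrough.

```python
import re
import tempfile
from pathlib import Path

# Relative paths under the two folders this walkthrough uses.
# The trailing \w keeps sentence punctuation out of the match.
PATH_PATTERN = re.compile(r"\b((?:scripts|reference)/[\w./-]*\w)")


def find_missing_paths(skill_dir):
    """Return paths referenced in SKILL.md that are not files on disk."""
    skill_dir = Path(skill_dir)
    body = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    referenced = sorted(set(PATH_PATTERN.findall(body)))
    return [p for p in referenced if not (skill_dir / p).is_file()]


if __name__ == "__main__":
    # Build a throwaway skill folder so the sketch runs on its own.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "scripts").mkdir()
        (root / "scripts" / "parse_tickets.py").touch()
        (root / "SKILL.md").write_text(
            "1. Run `scripts/parse_tickets.py`.\n"
            "2. Read `reference/summary-format.md`.\n",
            encoding="utf-8",
        )
        # The reference file was never created, so it is reported.
        print(find_missing_paths(root))
```

Pointing the same function at your real `ticket-summary` directory gives a cheap pre-flight check before an execution test.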

## Step 4: push deterministic work into a script

Parsing a CSV, counting categories, and computing SLA breaches are deterministic operations. Letting the model do them by hand wastes context and invites arithmetic mistakes. A script does them reliably, and only its compact JSON output enters context. This is one of the biggest quality levers in skill design: keep the model reasoning about what to do, and let code handle what is mechanical.

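```python
import csv
import json
import sys
from collections import Counter

if len(sys.argv) != 2:
    sys.exit("usage: parse_tickets.py <export.csv>")

# newline="" is what the csv module expects; the with-block closes the file.
with open(sys.argv[1], newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    missing = {"category", "sla"} - set(reader.fieldnames or [])
    if missing:
        # Fail loudly so the skill body can surface the error to the user.
        sys.exit(f"missing required columns: {sorted(missing)}")
    rows = list(reader)

by_cat = Counter(r["category"] for r in rows)
breaches = sum(1 for r in rows if r["sla"] == "breached")
print(json.dumps({
    "total": len(rows),
    "by_category": dict(by_cat),
    "sla_breaches": breaches,
    "top_issues": [c for c, _ in by_cat.most_common(5)],
}))
```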

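The body told Claude to surface script errors rather than fall back to manual parsing. That instruction is what keeps a malformed file from silently producing a wrong summary. Be explicit about failure handling in the body: the script's job is to fail loudly, and the body's job is to react sanely. Note also that the script as written reads CSV only, so either narrow the Step 2 description to CSV or add a JSON branch before advertising both formats.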
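The contract between script and body is simple: exit code zero with JSON on stdout, or a non-zero exit with a message on stderr. The sketch below shows that contract from the caller's side, as you might use it in your own test harness. `run_parser` and `ParserError` are hypothetical names, and the two stand-in commands exist only so the example runs without the real script or an export file.

```python
import json
import subprocess
import sys


class ParserError(RuntimeError):
    """Raised when the parsing script exits with a non-zero status."""


def run_parser(command):
    """Run a parsing command and return its JSON output as a dict.

    `command` is an argument list, for example
    [sys.executable, "scripts/parse_tickets.py", "export.csv"].
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the script's own message instead of guessing at a result.
        raise ParserError(result.stderr.strip() or f"exit code {result.returncode}")
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Stand-in commands: one succeeds with JSON, one fails with a message.
    ok = [sys.executable, "-c", "import json; print(json.dumps({'total': 3}))"]
    bad = [sys.executable, "-c", "import sys; sys.exit('missing required columns')"]
    print(run_parser(ok))
    try:
        run_parser(bad)
    except ParserError as err:
        print(f"parser failed: {err}")
```

The wrapper never tries to recover from a failure, which mirrors step 2 of the body: report the exact error and stop.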

## Step 5: write the reference file

The reference file holds detail the body shouldn't carry: the exact section order, tone, and any boilerplate. Because it loads only when the body opens it, you can be as thorough as you like without paying for it on every invocation. Keep one concern per reference file so future skills can reuse it.

For our example, `summary-format.md` specifies the digest layout — headline volume number first, then top issues with counts, then SLA section, then a one-line trend note. Putting this in its own file also means a non-engineer can adjust the report format without touching the procedure or the script.
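To make that layout concrete, here is a rough sketch that renders the parser's JSON in the order just described. In the skill itself Claude composes the digest from the reference file; this function is only an illustration, handy for producing an expected output to compare against. `render_digest` is a hypothetical name, the label wording is invented, the sample values are made up, and the trend note is passed in because the parser does not compute one.

```python
def render_digest(stats, trend_note):
    """Lay out parser output: volume, top issues, SLA section, trend note."""
    total = stats["total"]
    lines = [f"Total tickets: {total}", "", "Top issues:"]
    for category in stats["top_issues"]:
        count = stats["by_category"][category]
        # Whole-number percentages, as the skill body requires.
        # Note: Python's round() sends exact halves to the nearest even number.
        share = round(100 * count / total) if total else 0
        lines.append(f"- {category}: {count} ({share}%)")
    # The SLA line is always present, even when the count is zero.
    lines += ["", f"SLA breaches: {stats['sla_breaches']}", "", f"Trend: {trend_note}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample values in the shape the parsing script emits.
    stats = {
        "total": 7,
        "by_category": {"billing": 5, "login": 2},
        "sla_breaches": 0,
        "top_issues": ["billing", "login"],
    }
    print(render_digest(stats, "volume is flat week over week"))
```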

## Step 6: test discovery, then execution

Two failure modes, two tests. First, test discovery: paste a realistic prompt ("here's last week's ticket export, can I get the digest") and confirm Claude loads the skill at all. If it doesn't, the description is the problem — sharpen it. Second, test execution: give it a real file and confirm the script runs, the output is read, and the format matches. If a step is skipped, the body wasn't explicit enough.

The fastest debugging tool is the transcript. Read what Claude actually did turn by turn. Did it call the script or try to eyeball the CSV? Did it open the reference file? Each deviation maps directly to a line you can tighten in the body. Iterate there rather than rewriting from scratch.

A useful habit is to keep a short set of fixture inputs — one clean export, one malformed, one empty — and run all three after every change to the skill. Discovery and execution can both regress quietly when you edit the description or reorder the procedure, and a thirty-second fixture run catches it before your teammates do. Once the skill is stable, those same fixtures become the regression suite that protects it as the underlying systems evolve.
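The fixture habit can be sketched as a tiny regression run. To stay self-contained, the example re-implements the parsing logic as `summarize` with an explicit column check, which is an assumption of this sketch; in practice you would invoke the real script on fixture files. The fixture contents and the `run_fixtures` helper are made up for illustration.

```python
import csv
import io
from collections import Counter

REQUIRED_COLUMNS = {"category", "sla"}


def summarize(csv_text):
    """Stand-in for the parsing script's logic, operating on CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    rows = list(reader)
    by_cat = Counter(r["category"] for r in rows)
    return {
        "total": len(rows),
        "by_category": dict(by_cat),
        "sla_breaches": sum(1 for r in rows if r["sla"] == "breached"),
        "top_issues": [c for c, _ in by_cat.most_common(5)],
    }


# Made-up fixtures: one clean export, one malformed, one empty.
FIXTURES = {
    "clean": "category,sla\nbilling,breached\nbilling,met\nlogin,met\n",
    "malformed": "id,subject\n1,cannot log in\n",
    "empty": "",
}


def run_fixtures():
    """Return {fixture name: 'ok' or the error message} for every fixture."""
    results = {}
    for name, text in FIXTURES.items():
        try:
            summarize(text)
            results[name] = "ok"
        except ValueError as err:
            results[name] = f"error: {err}"
    return results


if __name__ == "__main__":
    for name, outcome in run_fixtures().items():
        print(f"{name}: {outcome}")
```

Only the clean fixture should report `ok`; the other two should fail with a clear message. That covers the parsing half of execution. Discovery still needs a real prompt against Claude.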

## Ship it in 6 steps

1. Scaffold the folder with `SKILL.md`, a `scripts/` dir, and a `reference/` dir.
2. Write a frontmatter description that names the trigger, the input, and the output.
3. Write the body as a numbered procedure that delegates heavy work to scripts.
4. Implement the script to do deterministic work and fail loudly on bad input.
5. Put formatting and detail in reference files the body opens on demand.
6. Test discovery and execution separately, then iterate from the transcript.

| Concern | Put it in | Why |
| --- | --- | --- |
| When to fire | Frontmatter description | It's the discovery index |
| What to do | SKILL.md body | Loaded on selection |
| Mechanical work | scripts/ | Deterministic, off-context |
| Formatting detail | reference/ | Loaded only when needed |

## Common pitfalls

- **Writing the body before the description.** If discovery never fires, the body never runs. Get the trigger right first.
- **Letting the model parse data by hand.** Move parsing and math into scripts; it's faster, cheaper, and far less error-prone.
- **Silent fallbacks.** Tell the body to surface script errors instead of guessing, or bad input yields confident-but-wrong output.
- **Cramming format rules into the body.** Long bodies bloat every load — link a reference file instead.
- **Testing only the happy path.** Feed it a malformed export to confirm your error handling actually triggers.

## Frequently asked questions

### How long should a SKILL.md body be?

As short as it can be while remaining an unambiguous procedure — often well under a page. Anything detailed or reusable belongs in a linked reference file, and anything mechanical belongs in a script.

### Do I need a script in every skill?

No. Skills that are purely about judgment or formatting can be all instructions. Add a script the moment a step is deterministic, repetitive, or numeric — that's where scripts pay off.

### How do I know if my description is good?

Read it cold and ask: could I tell exactly when this should fire and what it takes in and gives back? If not, rewrite it to name the trigger, input, and output explicitly.

### What's the fastest way to debug a skill?

Read the transcript. Every place Claude skipped a step or improvised maps to a line in the body you can make more explicit. Iterate there rather than starting over.

---

*Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.*
