---
title: "End-to-End: Securing a Codebase With Claude, Step by Step"
description: "A realistic end-to-end walkthrough of using Claude to find, triage, and fix a real source-code vulnerability — from messy problem to shipped, verified fix."
canonical: https://callsphere.ai/blog/end-to-end-securing-a-codebase-with-claude-step-by-step
category: "Agentic AI"
tags: ["agentic ai", "claude", "code security", "claude code", "ssrf", "appsec", "walkthrough"]
author: "CallSphere Team"
published: 2026-05-27T17:46:22.000Z
updated: 2026-06-06T21:47:41.600Z
---

# End-to-End: Securing a Codebase With Claude, Step by Step

> A realistic end-to-end walkthrough of using Claude to find, triage, and fix a real source-code vulnerability — from messy problem to shipped, verified fix.

Most writing about LLM code security stays abstract. This post does the opposite: it follows one realistic scenario end to end, from the moment a vulnerability class is suspected to the moment a verified fix ships and the gap is closed for good. The team is a mid-sized SaaS company; the codebase is a Python and TypeScript monorepo with the usual accumulated history. The goal is to show how Claude actually fits into the work, where humans stay in control, and what 'done' really looks like. No magic, no perfect model — just a competent agent inside a well-designed loop.

The trigger is mundane. A penetration test flags a single server-side request forgery (SSRF) issue in an image-fetching endpoint. The security lead's real worry is not that one bug; it's the suspicion that the same risky pattern — taking a user-supplied URL and fetching it without validation — exists in a dozen other places nobody has audited. This is exactly the kind of pattern-spread problem where an LLM earns its keep, and exactly where a naive scan returns either nothing or a wall of noise.

## Step one: frame the problem so the agent can reason

The first move is not to prompt Claude with 'find all SSRF.' That invites hallucinated breadth. Instead, the engineer writes a tight brief: here is the confirmed SSRF in the image endpoint, here is exactly why it's exploitable (the URL is fetched server-side with no allowlist and the response is returned to the caller), and here is the question: where else does this same trust assumption appear? Anchoring on the confirmed example gives the model a reference it can generalize from with discipline rather than inventing patterns.
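To make the brief concrete, the confirmed bug might look something like the sketch below. Everything in it is hypothetical: the name `fetch_image`, the injectable `opener` parameter (added only so the example runs without a network), and the fake opener are illustrative assumptions, not the team's actual code.

```python
from typing import Callable
from urllib.request import urlopen


def _default_opener(url: str) -> bytes:
    # Real network fetch: the server requests whatever URL it is given.
    with urlopen(url, timeout=5) as resp:
        return resp.read()


def fetch_image(url: str, opener: Callable[[str], bytes] = _default_opener) -> bytes:
    """VULNERABLE (illustrative): fetches a user-supplied URL as-is.

    Two properties make this an exploitable SSRF:
      1. No allowlist or address check before the server-side fetch.
      2. The raw response body is handed back to the caller.
    """
    return opener(url)


def fake_opener(url: str) -> bytes:
    # Stand-in for the network so the example is runnable offline.
    return f"response from {url}".encode()


# An attacker-supplied internal URL goes straight through to the fetch.
leaked = fetch_image("http://169.254.169.254/latest/meta-data/", opener=fake_opener)
print(leaked)
```

Handing the agent a snippet like this, plus the two numbered reasons in the docstring, is what turns "find all SSRF" into a specific pattern it can look for.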

The engineer also scopes access deliberately. Through an MCP server, Claude gets read-only access to the repository's source paths and nothing else — no network tools, no write access yet. The task runs in Claude Code so the agent can navigate the codebase, follow imports, and trace how the suspect URL values flow, rather than reasoning over a single pasted file.

## Step two: discovery with data-flow reasoning

Claude works outward from the confirmed bug. It identifies the helper that performs the unvalidated fetch, then traces every caller. For each candidate, it reasons about whether the input is genuinely user-controlled and whether the response is exposed — the two conditions that turn a fetch into an exploitable SSRF. This is where an LLM outperforms a regex: it distinguishes the endpoint that fetches an admin-supplied, already-validated internal URL (not a finding) from the one that fetches whatever a logged-in user pastes (a finding).
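The "trace every caller" step can be approximated mechanically, which is useful as a cross-check on the agent's list. The sketch below is not how Claude performs the trace; it is a purely syntactic reverse call graph built with Python's `ast` module, and the function names in `SOURCE` (`unsafe_fetch`, `load_avatar`, and so on) are invented for illustration. It only sees direct calls by bare name, which is exactly the gap semantic reasoning has to fill.

```python
import ast
from collections import defaultdict, deque

SOURCE = '''
def unsafe_fetch(url):
    ...

def load_avatar(url):
    return unsafe_fetch(url)

def import_feed(feed_url):
    return unsafe_fetch(feed_url)

def avatar_endpoint(request):
    return load_avatar(request["url"])

def health_check():
    return "ok"
'''


def build_reverse_call_graph(source: str) -> dict[str, set[str]]:
    """Map each called function name to the functions that call it."""
    callers: dict[str, set[str]] = defaultdict(set)
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for node in ast.walk(func):
            # Only direct calls by bare name are handled in this sketch.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callers[node.func.id].add(func.name)
    return callers


def transitive_callers(source: str, target: str) -> set[str]:
    """Breadth-first walk up the call graph, starting from the target helper."""
    graph = build_reverse_call_graph(source)
    seen: set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for caller in graph.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


print(sorted(transitive_callers(SOURCE, "unsafe_fetch")))
# ['avatar_endpoint', 'import_feed', 'load_avatar']
```

A list like this says which functions can reach the helper, not whether the URL is user-controlled or the response exposed. Those two judgments still need per-site reasoning.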

```mermaid
flowchart TD
  A["Confirmed SSRF in image endpoint"] --> B["Brief Claude with the confirmed example"]
  B --> C["Read-only repo access via MCP"]
  C --> D["Trace callers of the unsafe fetch helper"]
  D --> E{"User-controlled & response exposed?"}
  E -->|No| F["Mark not-exploitable, note why"]
  E -->|Yes| G["Add to triage list"]
  G --> H["Human reviews & confirms"]
  H --> I["Claude proposes scoped patch"]
  I --> J["Tests + review gate, then ship"]
```

The output is not a verdict; it's a ranked triage list. Claude returns nine candidate sites, classifies four as clearly exploitable, three as conditionally exploitable depending on deployment, and two as not-exploitable with a written justification for each. Crucially, it shows its reasoning per site, so a human can audit the logic rather than trusting a label.
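One way to keep that list auditable is to store the two conditions and the reasoning as data, and derive the label from them. The following is a minimal sketch under that assumption; the `Candidate` type, the `Verdict` names, and the three example sites are all hypothetical and stand in for the nine real entries.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Verdict(IntEnum):
    # Lower value sorts first, so the most urgent items lead the list.
    EXPLOITABLE = 0
    CONDITIONAL = 1
    NOT_EXPLOITABLE = 2


@dataclass(frozen=True)
class Candidate:
    site: str
    user_controlled: Optional[bool]   # None means "depends on deployment"
    response_exposed: Optional[bool]  # None means "depends on deployment"
    reasoning: str

    @property
    def verdict(self) -> Verdict:
        conditions = (self.user_controlled, self.response_exposed)
        if any(c is False for c in conditions):
            return Verdict.NOT_EXPLOITABLE
        if any(c is None for c in conditions):
            return Verdict.CONDITIONAL
        return Verdict.EXPLOITABLE


candidates = [
    Candidate("reports/export.py:render_pdf", True, None,
              "Body is returned only if the debug proxy is enabled."),
    Candidate("admin/sync.py:pull_config", False, True,
              "URL comes from already-validated admin configuration."),
    Candidate("webhooks/preview.py:unfurl", True, True,
              "Logged-in user pastes the URL; body is returned in the preview."),
]

# Rank by verdict first, then by site name for a stable order.
triage = sorted(candidates, key=lambda c: (c.verdict, c.site))
for c in triage:
    print(f"{c.verdict.name:16} {c.site}  {c.reasoning}")
```

Because the verdict is computed from the recorded conditions, a reviewer who disagrees with a label has to change a specific field and its justification, which keeps the disagreement visible.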

## Step three: human triage where it counts

Now the AppSec engineer takes over the judgment calls. They accept the four clear findings immediately — the reasoning checks out and matches the pen-test result. For the three conditional ones, they pull in the service owners, because whether an internal-only endpoint is exploitable depends on network topology the model doesn't fully know. Two are confirmed, one is dismissed as genuinely unreachable. This is the irreplaceable human layer: the model proposes a thorough, well-reasoned map, and a person makes the calls that depend on context the codebase doesn't contain.

By the end of triage there are six confirmed SSRF sites (the four clear findings plus two of the three conditional ones), up from the single one the pen test found. That multiplication, one reported bug becoming a systematically discovered class, is the real return on putting Claude in the loop.

## Step four: from finding to fix

With confirmed findings, the engineer grants Claude scoped write access to propose patches, still behind a merge gate. Rather than asking for six independent fixes, the engineer directs a better design: introduce a single validated-fetch utility that enforces an allowlist and blocks internal address ranges, then refactor all six sites to use it. Claude drafts the utility, writes unit tests for the SSRF defenses (including the metadata-endpoint and localhost cases attackers love), and migrates each call site. Each change lands as its own reviewable pull request.

The engineer reviews every patch. Two need correction (in one, Claude initially missed that the call site passed the URL through a wrapper that reintroduced the raw value), and the engineer has the agent fix both. This back-and-forth is normal and healthy; the agent accelerates the work without owning the final correctness, which still rests with the human who approves the merge.

## Step five: verify, then close the gap permanently

Shipping the patch is not the end. The team writes a regression test that fails if the unsafe fetch helper is ever called directly again, and they encode the new rule as an Agent Skill: any future code that fetches a user-supplied URL must route through the validated utility. Now the next time Claude reviews a pull request that reintroduces the pattern, it flags it automatically, with the institutional rule loaded as context. The one-time cleanup becomes a durable control.
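A regression guard of the kind described might look like the sketch below. It assumes the raw helper is called `unsafe_fetch` and that only one module, `net/validated_fetch.py`, is permitted to call it; both names are hypothetical. The files are passed in as an in-memory mapping so the example is self-contained, whereas a real test would read them from the repository.

```python
import ast

BANNED_HELPER = "unsafe_fetch"               # hypothetical name of the raw helper
ALLOWED_FILES = {"net/validated_fetch.py"}   # only the safe wrapper may call it


def direct_calls(path: str, source: str) -> list[str]:
    """Return 'path:line' for each direct call to the banned helper."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=path)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both `unsafe_fetch(...)` and `module.unsafe_fetch(...)`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name == BANNED_HELPER:
            hits.append(f"{path}:{node.lineno}")
    return hits


def find_violations(files: dict[str, str]) -> list[str]:
    """Collect banned calls from every file outside the allowed set."""
    violations = []
    for path, source in sorted(files.items()):
        if path not in ALLOWED_FILES:
            violations.extend(direct_calls(path, source))
    return violations


repo = {
    "net/validated_fetch.py":
        "def fetch(url):\n    return unsafe_fetch(validate_url(url))\n",
    "api/images.py":
        "from net.validated_fetch import fetch\n\ndef handler(url):\n    return fetch(url)\n",
    "api/legacy.py":
        "import net.http as http\n\ndef handler(url):\n    return http.unsafe_fetch(url)\n",
}

print(find_violations(repo))  # ['api/legacy.py:4']
```

In a test suite, the assertion would simply be that `find_violations` returns an empty list. A name-based check like this misses aliased imports and dynamic calls, which is one reason to pair it with the review-time rule rather than rely on it alone.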

Stepping back, the shape of the work is the lesson. The agent did the breadth — tracing flows across a large codebase faster than any human could — and the disciplined human did the depth: the context-dependent judgment and the final sign-off. **An end-to-end LLM security workflow** is one where the model handles systematic discovery and drafting while humans own triage and the merge gate, and every fix is verified and converted into a lasting guardrail.

## Frequently asked questions

### Why not just ask Claude to 'find all vulnerabilities' in the repo?

Open-ended prompts invite hallucinated breadth and unranked noise. Anchoring the agent on one confirmed, well-explained example and asking it to find that specific pattern produces disciplined, generalizable results you can actually triage. Specificity in the brief is what separates a useful run from a wall of false positives.

### How much can the agent be trusted to fix on its own?

Let it draft patches and tests, but keep every change behind a human merge gate. In a realistic run the agent's patches usually need at least some correction — a missed wrapper, an edge case — which a reviewer catches. The model accelerates the fix; the human still owns final correctness.

### What turns a one-time cleanup into a permanent improvement?

Encode the new rule as a regression test and an Agent Skill the reviewer loads on future pull requests. Then the pattern can't silently return, because the next review flags it automatically with your institutional standard as context. Closing the gap matters as much as fixing the instances.

### Does this require Claude Code specifically, or any LLM?

The pattern needs an agent that can navigate a real codebase — follow imports, trace data flow, and run tests — not just answer over a pasted file. Claude Code provides that repository-aware loop with scoped MCP access, which is why the discovery step works across many files rather than one.

## Bringing this loop to your phone lines

CallSphere runs the same propose-then-verify agentic loop on **voice and chat** — assistants that discover intent, take scoped actions mid-conversation, and hand off to humans at the moments that matter, every call, all day. See it live at [callsphere.ai](https://callsphere.ai).

---

*Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.*

---

Source: https://callsphere.ai/blog/end-to-end-securing-a-codebase-with-claude-step-by-step
