---
title: "Building a Code Review Bot with the Claude API"
description: "Step-by-step guide to building an automated code review bot using the Claude API. Covers GitHub integration, diff analysis, security scanning, style enforcement, and delivering actionable feedback on pull requests."
canonical: https://callsphere.ai/blog/building-code-review-bot-claude-api
category: "Agentic AI"
tags: ["Code Review", "Claude API", "GitHub", "DevOps", "AI Engineering", "Automation"]
author: "CallSphere Team"
published: 2026-01-28T00:00:00.000Z
updated: 2026-05-07T08:10:34.287Z
---

# Building a Code Review Bot with the Claude API

> Step-by-step guide to building an automated code review bot using the Claude API. Covers GitHub integration, diff analysis, security scanning, style enforcement, and delivering actionable feedback on pull requests.

## Why Build an AI Code Review Bot?

Manual code review is a bottleneck in every engineering team. Senior engineers spend 5-10 hours per week reviewing pull requests. Reviews are inconsistent -- what one reviewer catches, another misses. And review latency delays merges, slowing the entire development cycle.

An AI code review bot does not replace human reviewers. It augments them by catching the mechanical issues (bugs, security vulnerabilities, style violations, missing tests) so that human reviewers can focus on architecture, design, and business logic.

## Architecture Overview

The system has four components:

```mermaid
flowchart LR
    PR(["GitHub
PR event"])
    HOOK["Webhook
listener"]
    DIFF["Diff
analyzer"]
    CLAUDE["Claude review
engine"]
    COMMENT(["GitHub comment
writer"])
    PR --> HOOK --> DIFF --> CLAUDE --> COMMENT
    style CLAUDE fill:#4f46e5,stroke:#4338ca,color:#fff
    style COMMENT fill:#059669,stroke:#047857,color:#fff
```

1. **GitHub Webhook Listener**: Receives PR events from GitHub
2. **Diff Analyzer**: Extracts and structures the code changes
3. **Claude Review Engine**: Analyzes code and generates feedback
4. **GitHub Comment Writer**: Posts review comments on the PR

```
GitHub PR Event -> Webhook -> Diff Analyzer -> Claude API -> GitHub Comments
```

## Step 1: GitHub Webhook Listener

```python
from fastapi import FastAPI, Request, HTTPException
import hmac
import hashlib
import os

app = FastAPI()
GITHUB_WEBHOOK_SECRET = os.environ["GITHUB_WEBHOOK_SECRET"]

@app.post("/webhook/github")
async def handle_github_webhook(request: Request):
    # Verify webhook signature
    signature = request.headers.get("X-Hub-Signature-256", "")
    body = await request.body()

    expected = "sha256=" + hmac.new(
        GITHUB_WEBHOOK_SECRET.encode(),
        body,
        hashlib.sha256
    ).hexdigest()

    if not hmac.compare_digest(signature, expected):
        raise HTTPException(status_code=403, detail="Invalid signature")

    payload = await request.json()
    event_type = request.headers.get("X-GitHub-Event")

    if event_type == "pull_request" and payload["action"] in ("opened", "synchronize"):
        await review_pull_request(
            repo=payload["repository"]["full_name"],
            pr_number=payload["pull_request"]["number"],
            base_sha=payload["pull_request"]["base"]["sha"],
            head_sha=payload["pull_request"]["head"]["sha"],
        )

    return {"status": "ok"}
```
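
To exercise the endpoint locally before pointing GitHub at it, you can sign a test payload the same way GitHub does (`sign_payload` is a hypothetical helper for testing, not something the bot itself needs):

```python
import hashlib
import hmac

def sign_payload(secret: str, body: bytes) -> str:
    """Compute the X-Hub-Signature-256 value GitHub would send for this body."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest
```

Send the result as the `X-Hub-Signature-256` header when POSTing a sample `pull_request` payload to `/webhook/github`.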

## Step 2: Diff Analyzer

```python
import httpx

GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]

async def get_pr_diff(repo: str, pr_number: int) -> list[dict]:
    """Fetch the PR diff and parse it into structured file changes."""
    async with httpx.AsyncClient() as client:
        # Get list of changed files
        response = await client.get(
            f"https://api.github.com/repos/{repo}/pulls/{pr_number}/files",
            headers={
                "Authorization": f"token {GITHUB_TOKEN}",
                "Accept": "application/vnd.github.v3+json",
            },
            params={"per_page": 100},  # default page size is 30; large PRs get truncated
        )
        files = response.json()

    changes = []
    for file in files:
        if file["status"] == "removed":
            continue  # Skip deleted files

        changes.append({
            "filename": file["filename"],
            "status": file["status"],  # added, modified, renamed
            "additions": file["additions"],
            "deletions": file["deletions"],
            "patch": file.get("patch", ""),  # The actual diff
            "language": detect_language(file["filename"]),
        })

    return changes

def detect_language(filename: str) -> str:
    ext_map = {
        ".py": "python", ".ts": "typescript", ".tsx": "typescript",
        ".js": "javascript", ".jsx": "javascript", ".go": "go",
        ".rs": "rust", ".java": "java", ".rb": "ruby",
    }
    for ext, lang in ext_map.items():
        if filename.endswith(ext):
            return lang
    return "unknown"
```
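
The review prompt below asks for line numbers "from the + side" of the diff, so the bot needs a way to map a patch back to new-file line numbers. A small helper for that, parsing unified-diff hunk headers (`map_patch_lines` is our name, not a GitHub API):

```python
import re

def map_patch_lines(patch: str) -> dict[int, str]:
    """Map new-file line numbers to the added lines in a unified diff patch."""
    mapping: dict[int, str] = {}
    new_line = 0
    for line in patch.splitlines():
        hunk = re.match(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@", line)
        if hunk:
            new_line = int(hunk.group(1))  # hunk header gives the starting new-file line
            continue
        if line.startswith("+"):
            mapping[new_line] = line[1:]
            new_line += 1
        elif not line.startswith("-"):
            new_line += 1  # context lines advance the new-file counter too
    return mapping
```

This lets review comments attach to lines that actually exist on the PR's head commit.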

## Step 3: Claude Review Engine

This is the core of the system. We send each file's diff to Claude with specialized review instructions.

```python
from anthropic import Anthropic
import json

client = Anthropic()

REVIEW_SYSTEM_PROMPT = """You are an expert code reviewer. For each code diff provided,
analyze the changes and identify:

1. **Bugs**: Logic errors, off-by-one errors, null pointer issues, race conditions
2. **Security**: SQL injection, XSS, auth bypasses, secrets exposure, input validation
3. **Performance**: N+1 queries, unnecessary allocations, missing indexes, O(n^2) algorithms
4. **Style**: Naming conventions, code organization, readability
5. **Missing tests**: New logic paths that lack test coverage

For each issue found, provide:
- severity: "critical", "warning", or "suggestion"
- line: the line number in the diff (from the + side)
- description: clear explanation of the issue
- suggestion: specific code fix when possible

Return your review as a JSON array of issues. If the code looks good, return an empty array.
Do NOT fabricate issues -- only report genuine problems."""

async def review_file(filename: str, patch: str, language: str) -> list[dict]:
    """Review a single file's changes."""
    if not patch:
        return []  # Binary files and very large diffs come back without a patch

    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=4096,
        system=REVIEW_SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": f"File: {filename}\nLanguage: {language}\n\nDiff:\n{patch}",
        }],
    )

    try:
        return json.loads(response.content[0].text)
    except json.JSONDecodeError:
        return []  # Unparseable output: surface nothing rather than noise

async def review_batch(batch: list[dict]) -> list[dict]:
    """Review each file in a batch and collect all findings."""
    issues = []
    for change in batch:
        issues += await review_file(
            change["filename"], change["patch"], change["language"]
        )
    return issues

async def review_in_batches(changes: list[dict], max_tokens_per_call: int = 8000):
    """Yield batched review results, keeping each batch within a token budget."""
    current_batch: list[dict] = []
    current_tokens = 0

    for change in changes:
        patch_tokens = len(change["patch"]) // 4  # rough estimate: ~4 chars per token

        if current_tokens + patch_tokens > max_tokens_per_call and current_batch:
            # Review the current batch before adding more files would overflow it
            yield await review_batch(current_batch)
            current_batch = []
            current_tokens = 0

        current_batch.append(change)
        current_tokens += patch_tokens

    if current_batch:
        yield await review_batch(current_batch)
```

## Reducing False Positives

The biggest challenge with AI code review is false positives. Every false positive erodes developer trust in the tool. Strategies to minimize them:

1. **Include project context**: Add a `.ai-review-config.yml` that describes coding standards, acceptable patterns, and known exceptions
2. **Use file-type-specific prompts**: A Python review prompt differs from a TypeScript review prompt
3. **Filter low-confidence findings**: Ask Claude to rate its confidence (1-10) and only surface issues above 7
4. **Learn from dismissals**: Track which comments developers dismiss and adjust the prompt accordingly
5. **Limit scope**: Focus on security and bugs initially. Add style checks only after the bot has earned trust
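
Strategy 3 becomes a one-line filter once the review prompt is extended to ask for a `confidence` field (a field name we are assuming here, not part of the prompt above):

```python
def filter_findings(issues: list[dict], min_confidence: int = 7) -> list[dict]:
    """Keep only findings Claude rated above the confidence threshold."""
    return [i for i in issues if i.get("confidence", 0) > min_confidence]
```

Findings with no confidence rating default to 0 and are dropped, which biases the bot toward silence -- the right default when fighting false positives.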

## Cost Analysis

For an average PR with 10 changed files and 500 lines of diff:

| Component | Tokens | Cost (Sonnet) |
| --- | --- | --- |
| System prompt (cached) | 500 | $0.00015 |
| 10 file diffs | 5,000 | $0.015 |
| 10 review outputs | 3,000 | $0.045 |
| **Total per PR** | **8,500** | **$0.06** |

At 50 PRs per day, the monthly cost is approximately $90 -- less than one hour of a senior engineer's time. The ROI is immediate and substantial.
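
The table's arithmetic can be reproduced directly. The per-million-token rates below are Sonnet's published prices at the time of writing (cache reads billed at 10% of the input rate) -- re-check current pricing before budgeting:

```python
# USD per million tokens -- pinned to Sonnet pricing at time of writing
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00
CACHED_INPUT_RATE = 0.30  # cache reads: 10% of the input rate

def pr_review_cost(cached_tokens: int, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one PR review."""
    return (
        cached_tokens * CACHED_INPUT_RATE
        + input_tokens * INPUT_RATE
        + output_tokens * OUTPUT_RATE
    ) / 1_000_000

per_pr = pr_review_cost(cached_tokens=500, input_tokens=5_000, output_tokens=3_000)
monthly = per_pr * 50 * 30  # 50 PRs/day, ~$90/month
```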

---

Source: https://callsphere.ai/blog/building-code-review-bot-claude-api
