---
title: "Building a Research Agent with Claude: Web Search, Analysis, and Report Generation"
description: "Build a complete research agent that searches the web, evaluates sources, synthesizes findings, and generates structured reports using Claude and the Anthropic SDK."
canonical: https://callsphere.ai/blog/building-research-agent-claude-web-search-analysis
category: "Learn Agentic AI"
tags: ["Claude", "Research Agent", "Web Search", "Report Generation", "Python"]
author: "CallSphere Team"
published: 2026-03-17T00:00:00.000Z
updated: 2026-05-06T01:02:45.135Z
---

# Building a Research Agent with Claude: Web Search, Analysis, and Report Generation

> Build a complete research agent that searches the web, evaluates sources, synthesizes findings, and generates structured reports using Claude and the Anthropic SDK.

## What a Research Agent Does

A research agent automates the cycle that human researchers follow: formulate questions, search for information, evaluate source credibility, extract key findings, and synthesize everything into a coherent report. With Claude's large context window and reasoning capabilities, you can build agents that handle this entire pipeline — from raw web search results to polished analysis.

The agent we will build uses three tools: web search, page content extraction, and report formatting. Claude orchestrates them in a loop, deciding when to search for more information and when it has enough to write the final report.

## Architecture Overview

The research agent follows a plan-search-synthesize pattern:

```mermaid
flowchart LR
    TOPIC(["Research topic"])
    PLAN["Plan
sub-queries"]
    SEARCH["web_search"]
    FETCH["fetch_page"]
    SYNTH["Synthesize
findings"]
    REPORT(["save_report"])
    TOPIC --> PLAN --> SEARCH --> FETCH --> SYNTH --> REPORT
    SYNTH -->|Need more info| SEARCH
    style SYNTH fill:#4f46e5,stroke:#4338ca,color:#fff
    style REPORT fill:#059669,stroke:#047857,color:#fff
```

1. **Planning**: Claude breaks the research question into sub-queries
2. **Searching**: The agent searches the web for each sub-query
3. **Extraction**: Relevant pages are fetched and key content is extracted
4. **Synthesis**: Claude analyzes all gathered information and produces a report
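These stages can be wired together as a simple pipeline skeleton. This is an illustrative sketch only — in the full agent below, Claude itself decides the flow inside a tool-use loop — but it shows the data that moves between stages. Each stage is injected as a callable so you can swap in the real implementations:

```python
from typing import Callable

def research_pipeline(
    topic: str,
    plan: Callable[[str], list],             # topic -> list of sub-queries
    search: Callable[[str], list],           # sub-query -> list of result dicts
    extract: Callable[[str], str],           # url -> page text
    synthesize: Callable[[str, list], str],  # topic + texts -> report
) -> str:
    sub_queries = plan(topic)
    results = [r for q in sub_queries for r in search(q)]
    texts = [extract(r["url"]) for r in results]
    return synthesize(topic, texts)

# Toy stand-ins to show the data flow end to end
report = research_pipeline(
    "solar storage",
    plan=lambda t: [f"{t} costs", f"{t} trends"],
    search=lambda q: [{"url": f"https://example.com/{q.replace(' ', '-')}"}],
    extract=lambda url: f"content from {url}",
    synthesize=lambda t, texts: f"Report on {t} ({len(texts)} sources)",
)
print(report)  # → Report on solar storage (2 sources)
```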

## Defining the Research Tools

```python
import anthropic
import json
import os
import requests

client = anthropic.Anthropic()

tools = [
    {
        "name": "web_search",
        "description": "Search the web for information. Returns a list of results with titles, URLs, and snippets. Use specific, targeted queries for best results.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query string"
                },
                "num_results": {
                    "type": "integer",
                    "description": "Number of results to return (1-10)",
                    "default": 5
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "fetch_page",
        "description": "Fetch and extract the main text content from a URL. Use this to get details from a search result.",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "description": "The URL to fetch"
                }
            },
            "required": ["url"]
        }
    },
    {
        "name": "save_report",
        "description": "Save the final research report to a file. Call this when research is complete and the report is written.",
        "input_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Report title"},
                "content": {"type": "string", "description": "Full report in markdown"},
                "sources": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "List of source URLs used"
                }
            },
            "required": ["title", "content", "sources"]
        }
    }
]
```
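Before wiring up execution, you can sanity-check tool inputs locally. A minimal, hand-rolled check against a schema's `required` list (a full JSON Schema validator would also catch type mismatches) is enough to surface missing fields in your own tests:

```python
def missing_required(schema: dict, inputs: dict) -> list:
    """Return the required properties absent from a tool call's inputs."""
    return [k for k in schema.get("required", []) if k not in inputs]

web_search_schema = {
    "type": "object",
    "properties": {"query": {"type": "string"}},
    "required": ["query"],
}

print(missing_required(web_search_schema, {"num_results": 3}))  # → ['query']
print(missing_required(web_search_schema, {"query": "solar"}))  # → []
```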

## Implementing Tool Execution

Each tool maps to a real function. For web search, you can use any search API — here we use a generic pattern:

```python
def execute_tool(name: str, inputs: dict) -> dict:
    if name == "web_search":
        return perform_web_search(inputs["query"], inputs.get("num_results", 5))
    elif name == "fetch_page":
        return fetch_page_content(inputs["url"])
    elif name == "save_report":
        return save_research_report(inputs)
    return {"error": f"Unknown tool: {name}"}

def perform_web_search(query: str, num_results: int) -> dict:
    # Replace with your preferred search API (Brave, Serper, SerpAPI)
    api_key = os.environ["SEARCH_API_KEY"]
    response = requests.get(
        "https://api.search.brave.com/res/v1/web/search",
        headers={"X-Subscription-Token": api_key},
        params={"q": query, "count": num_results},
    )
    results = response.json().get("web", {}).get("results", [])
    return {
        "results": [
            {"title": r["title"], "url": r["url"], "snippet": r.get("description", "")}
            for r in results
        ]
    }

def fetch_page_content(url: str) -> dict:
    try:
        resp = requests.get(url, timeout=10, headers={"User-Agent": "ResearchBot/1.0"})
        resp.raise_for_status()
        # trafilatura strips navigation and boilerplate, returning the main article text
        from trafilatura import extract
        text = extract(resp.text) or ""
        return {"content": text[:8000], "url": url}  # Truncate to manage tokens
    except Exception as e:
        return {"error": str(e), "url": url}

def save_research_report(data: dict) -> dict:
    filename = data["title"].lower().replace(" ", "_")[:50] + ".md"
    with open(filename, "w") as f:
        f.write(f"# {data['title']}\n\n")
        f.write(data["content"])
        f.write("\n\n## Sources\n\n")
        for url in data["sources"]:
            f.write(f"- {url}\n")
    return {"saved": filename, "word_count": len(data["content"].split())}
```

## The Research Agent Loop

The agent loop runs until Claude either saves a report or exhausts its research budget:

```python
def run_research_agent(topic: str, max_turns: int = 20) -> str:
    system = """You are a thorough research agent. Given a topic:
1. Break it into 3-5 specific sub-questions
2. Search for each sub-question
3. Fetch the most relevant pages for detailed information
4. Synthesize findings into a comprehensive report
5. Save the report using the save_report tool

Always cite your sources. Prioritize recent, authoritative sources."""

    messages = [{"role": "user", "content": f"Research this topic: {topic}"}]

    for turn in range(max_turns):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system=system,
            tools=tools,
            messages=messages,
        )

        if response.stop_reason == "end_turn":
            return response.content[0].text

        messages.append({"role": "assistant", "content": response.content})

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result),
                })

        if not tool_results:
            # Stopped without requesting a tool (e.g. max_tokens hit);
            # sending an empty user message back would be an API error
            break

        messages.append({"role": "user", "content": tool_results})

    return "Research agent stopped without completing the report."
```

## Source Evaluation Strategy

Strong research agents do not just collect information — they evaluate it. Add instructions that guide Claude to assess source quality:

```python
system_with_evaluation = """When evaluating sources, consider:
- Domain authority (academic, government, established media vs blogs)
- Publication date (prefer sources from the last 12 months)
- Author credentials (named experts vs anonymous content)
- Corroboration (do multiple independent sources agree?)

If sources conflict, note the disagreement and explain which
position has stronger evidence."""
```

Claude will naturally apply these criteria when writing the synthesis, noting where sources agree and where they diverge.
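You can also pre-rank results mechanically before Claude sees them. A rough domain heuristic (the tiers below are illustrative, not a definitive authority ranking) pushes likely-stronger sources to the front of the result list:

```python
from urllib.parse import urlparse

# Illustrative tiers only; tune these suffixes for your research domain
HIGH_AUTHORITY_SUFFIXES = (".gov", ".edu")

def authority_score(url: str) -> int:
    host = urlparse(url).hostname or ""
    if host.endswith(HIGH_AUTHORITY_SUFFIXES):
        return 2
    if host.endswith((".org",)):
        return 1
    return 0

results = [
    {"url": "https://randomblog.net/post"},
    {"url": "https://www.nih.gov/news"},
]
ranked = sorted(results, key=lambda r: authority_score(r["url"]), reverse=True)
print(ranked[0]["url"])  # → https://www.nih.gov/news
```

This complements, rather than replaces, the prompt-level criteria: the heuristic orders what Claude reads first, and the prompt governs how it weighs conflicting claims.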

## FAQ

### How do I prevent the agent from searching endlessly?

Set a `max_turns` limit as shown in the code above. You can also add a `searches_remaining` counter in your system prompt and decrement it with each search call. Another approach is to track total tokens used and stop when approaching a budget threshold.
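A search budget can also be enforced in code rather than relying on the prompt alone. A small wrapper around the `execute_tool` function (a sketch; the error message wording is up to you) returns an error result once the budget is spent, which Claude sees and reacts to by writing up what it has:

```python
def make_budgeted_executor(execute_tool, search_limit: int = 8):
    remaining = {"searches": search_limit}

    def budgeted(name: str, inputs: dict) -> dict:
        if name == "web_search":
            if remaining["searches"] <= 0:
                return {"error": "Search budget exhausted; write the report with what you have."}
            remaining["searches"] -= 1
        return execute_tool(name, inputs)

    return budgeted

# Demo with a stub tool executor
stub = lambda name, inputs: {"ok": name}
run = make_budgeted_executor(stub, search_limit=1)
print(run("web_search", {"query": "x"}))  # → {'ok': 'web_search'}
print(run("web_search", {"query": "y"}))  # → {'error': 'Search budget exhausted; ...'}
```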

### What search APIs work best with research agents?

Brave Search API and Serper.dev both provide reliable, affordable web search. For academic research, consider Google Scholar via SerpAPI. The choice depends on your use case — Brave is best for general web content, while specialized APIs work better for niche domains like medical or legal research.

### How do I handle rate limits during intensive research?

Implement exponential backoff in your `perform_web_search` function and add a short delay between consecutive searches. For Claude API rate limits, catch `anthropic.RateLimitError` and retry with backoff. The Anthropic SDK has built-in retry logic that handles transient errors automatically.
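The backoff itself can be a small generic wrapper. This sketch retries on any exception with exponential delay plus jitter; in real code you would narrow the `except` clause to `requests.exceptions.RequestException` or `anthropic.RateLimitError` as appropriate:

```python
import random
import time

def retry_with_backoff(fn, max_retries: int = 4, base_delay: float = 0.5):
    """Call fn(); on failure sleep base_delay * 2**attempt (+ jitter) and retry."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # Out of retries; surface the original error
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

# Demo with a function that fails twice, then succeeds
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("rate limited")
    return "ok"

print(retry_with_backoff(flaky, base_delay=0.01))  # → ok
```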


---

Source: https://callsphere.ai/blog/building-research-agent-claude-web-search-analysis
