
Building a Research Agent with Claude: Web Search, Analysis, and Report Generation

Build a complete research agent that searches the web, evaluates sources, synthesizes findings, and generates structured reports using Claude and the Anthropic SDK.

What a Research Agent Does

A research agent automates the cycle that human researchers follow: formulate questions, search for information, evaluate source credibility, extract key findings, and synthesize everything into a coherent report. With Claude's large context window and reasoning capabilities, you can build agents that handle this entire pipeline — from raw web search results to polished analysis.

The agent we will build uses three tools: web search, page content extraction, and report formatting. Claude orchestrates them in a loop, deciding when to search for more information and when it has enough to write the final report.

Architecture Overview

The research agent follows a plan-search-synthesize pattern:

  1. Planning: Claude breaks the research question into sub-queries
  2. Searching: The agent searches the web for each sub-query
  3. Extraction: Relevant pages are fetched and key content is extracted
  4. Synthesis: Claude analyzes all gathered information and produces a report
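The four phases above can be sketched as a plain pipeline before bringing Claude into the picture. Every function below is a hypothetical stub; in the real agent, planning and synthesis are delegated to the model via the tool loop shown later.

```python
# Minimal sketch of the plan-search-synthesize pipeline. All functions are
# stubs standing in for model calls and real search/extraction tools.
def plan(topic: str) -> list[str]:
    return [f"{topic}: definition", f"{topic}: recent developments"]

def search(query: str) -> list[dict]:
    return [{"url": "https://example.com", "snippet": f"notes on {query}"}]

def extract(results: list[dict]) -> list[str]:
    return [r["snippet"] for r in results]

def synthesize(topic: str, notes: list[str]) -> str:
    return f"# {topic}\n" + "\n".join(f"- {n}" for n in notes)

def research(topic: str) -> str:
    notes = []
    for query in plan(topic):
        notes.extend(extract(search(query)))
    return synthesize(topic, notes)
```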

Defining the Research Tools

import anthropic
import json
import os
import requests

client = anthropic.Anthropic()

tools = [
    {
        "name": "web_search",
        "description": "Search the web for information. Returns a list of results with titles, URLs, and snippets. Use specific, targeted queries for best results.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query string"
                },
                "num_results": {
                    "type": "integer",
                    "description": "Number of results to return (1-10)",
                    "default": 5
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "fetch_page",
        "description": "Fetch and extract the main text content from a URL. Use this to get details from a search result.",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "description": "The URL to fetch"
                }
            },
            "required": ["url"]
        }
    },
    {
        "name": "save_report",
        "description": "Save the final research report to a file. Call this when research is complete and the report is written.",
        "input_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Report title"},
                "content": {"type": "string", "description": "Full report in markdown"},
                "sources": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "List of source URLs used"
                }
            },
            "required": ["title", "content", "sources"]
        }
    }
]
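Claude populates each tool call's `input` according to the `input_schema` you declare, but it can occasionally omit a field, so it is worth validating before dispatch. A lightweight guard might look like the sketch below (production code could use the `jsonschema` package instead; `check_required` is a hypothetical helper):

```python
# Lightweight guard: report required fields from an input_schema that are
# missing from a tool call's inputs, so we can return an error tool_result
# instead of crashing on a KeyError.
def check_required(inputs: dict, schema: dict) -> list[str]:
    return [key for key in schema.get("required", []) if key not in inputs]

web_search_schema = {
    "type": "object",
    "properties": {"query": {"type": "string"}, "num_results": {"type": "integer"}},
    "required": ["query"],
}

missing = check_required({"num_results": 3}, web_search_schema)
# A non-empty list means the call should be rejected with an error result
```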

Implementing Tool Execution

Each tool maps to a real function. For web search, you can use any search API — here we use a generic pattern:

def execute_tool(name: str, inputs: dict) -> dict:
    if name == "web_search":
        return perform_web_search(inputs["query"], inputs.get("num_results", 5))
    elif name == "fetch_page":
        return fetch_page_content(inputs["url"])
    elif name == "save_report":
        return save_research_report(inputs)
    return {"error": f"Unknown tool: {name}"}

def perform_web_search(query: str, num_results: int) -> dict:
    # Replace with your preferred search API (Brave, Serper, SerpAPI)
    api_key = os.environ["SEARCH_API_KEY"]
    response = requests.get(
        "https://api.search.brave.com/res/v1/web/search",
        headers={"X-Subscription-Token": api_key},
        params={"q": query, "count": num_results},
    )
    results = response.json().get("web", {}).get("results", [])
    return {
        "results": [
            {"title": r["title"], "url": r["url"], "snippet": r.get("description", "")}
            for r in results
        ]
    }

def fetch_page_content(url: str) -> dict:
    try:
        resp = requests.get(url, timeout=10, headers={"User-Agent": "ResearchBot/1.0"})
        resp.raise_for_status()
        # trafilatura strips boilerplate and returns the main article text;
        # readability-lxml is a common alternative
        from trafilatura import extract
        text = extract(resp.text) or ""
        return {"content": text[:8000], "url": url}  # Truncate to manage tokens
    except Exception as e:
        return {"error": str(e), "url": url}

def save_research_report(data: dict) -> dict:
    filename = data["title"].lower().replace(" ", "_")[:50] + ".md"
    with open(filename, "w") as f:
        f.write(f"# {data['title']}\n\n")
        f.write(data["content"])
        f.write("\n\n## Sources\n\n")
        for url in data["sources"]:
            f.write(f"- {url}\n")
    return {"saved": filename, "word_count": len(data["content"].split())}
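One caveat with the filename logic above: lowercasing and replacing spaces does not remove characters like `/` or `:`, which can break the file write on most systems. A safer slug helper might look like this sketch (`slugify` is a hypothetical name, not part of the code above):

```python
import re

# Hypothetical helper: reduce a report title to a filesystem-safe slug.
# Keeps letters, digits, and underscores; collapses everything else.
def slugify(title: str, max_len: int = 50) -> str:
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return slug[:max_len] or "report"
```

You would then build the filename as `slugify(data["title"]) + ".md"`.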

The Research Agent Loop

The agent loop runs until Claude either saves a report or exhausts its research budget:


def run_research_agent(topic: str, max_turns: int = 20) -> str:
    system = """You are a thorough research agent. Given a topic:
1. Break it into 3-5 specific sub-questions
2. Search for each sub-question
3. Fetch the most relevant pages for detailed information
4. Synthesize findings into a comprehensive report
5. Save the report using the save_report tool

Always cite your sources. Prioritize recent, authoritative sources."""

    messages = [{"role": "user", "content": f"Research this topic: {topic}"}]

    for turn in range(max_turns):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system=system,
            tools=tools,
            messages=messages,
        )

        if response.stop_reason == "end_turn":
            # Collect only text blocks; the first block is not guaranteed
            # to be text
            return "".join(b.text for b in response.content if b.type == "text")

        messages.append({"role": "assistant", "content": response.content})

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result),
                })

        messages.append({"role": "user", "content": tool_results})

    return "Research agent reached maximum turns without completing."
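A practical concern with this loop is context growth: every fetched page lands in the conversation via `json.dumps(result)`, and a handful of 8,000-character pages adds up quickly. One option, sketched below with a hypothetical `clip_result` helper, is to clamp each tool result before appending it:

```python
# Hypothetical helper: clamp large tool results before they are appended to
# the conversation, so fetched pages cannot blow past the context window.
def clip_result(result: dict, max_chars: int = 4000) -> dict:
    clipped = dict(result)
    content = clipped.get("content")
    if isinstance(content, str) and len(content) > max_chars:
        clipped["clipped"] = True
        clipped["content"] = content[:max_chars] + "\n[truncated]"
    return clipped
```

In the loop, you would wrap the call as `json.dumps(clip_result(result))`.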

Source Evaluation Strategy

Strong research agents do not just collect information — they evaluate it. Add instructions that guide Claude to assess source quality:

system_with_evaluation = """When evaluating sources, consider:
- Domain authority (academic, government, established media vs blogs)
- Publication date (prefer sources from the last 12 months)
- Author credentials (named experts vs anonymous content)
- Corroboration (do multiple independent sources agree?)

If sources conflict, note the disagreement and explain which
position has stronger evidence."""

Claude will naturally apply these criteria when writing the synthesis, noting where sources agree and where they diverge.
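You can also pre-rank results in code before Claude ever sees them, so higher-authority domains surface first. The heuristic below is a deliberately simple sketch; the suffix list and `source_rank` function are illustrative assumptions, not a substitute for Claude's own evaluation:

```python
from urllib.parse import urlparse

# Illustrative heuristic: sort search results so that domains with common
# authority suffixes appear before everything else.
AUTHORITY_SUFFIXES = (".gov", ".edu", ".org")

def source_rank(url: str) -> int:
    host = urlparse(url).hostname or ""
    return 0 if host.endswith(AUTHORITY_SUFFIXES) else 1

results = [
    {"url": "https://randomblog.example.com/post"},
    {"url": "https://www.nih.gov/study"},
]
ranked = sorted(results, key=lambda r: source_rank(r["url"]))
```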

FAQ

How do I prevent the agent from searching endlessly?

Set a max_turns limit as shown in the code above. You can also add a searches_remaining counter in your system prompt and decrement it with each search call. Another approach is to track total tokens used and stop when approaching a budget threshold.
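The counter approach can also be enforced in code rather than relying on the prompt. A sketch, using a hypothetical `SearchBudget` wrapper around the search tool:

```python
# Hypothetical hard cap on searches, enforced in the tool layer so the
# agent cannot exceed it regardless of what the prompt says.
class SearchBudget:
    def __init__(self, limit: int = 10):
        self.remaining = limit

    def spend(self) -> bool:
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True

budget = SearchBudget(limit=2)

def guarded_search(query: str) -> dict:
    if not budget.spend():
        return {"error": "Search budget exhausted; write the report with what you have."}
    return {"results": []}  # stand-in for perform_web_search(query, 5)
```

Returning an error result (rather than raising) lets Claude see that the budget is gone and pivot to writing the report.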

What search APIs work best with research agents?

Brave Search API and Serper.dev both provide reliable, affordable web search. For academic research, consider Google Scholar via SerpAPI. The choice depends on your use case — Brave is best for general web content, while specialized APIs work better for niche domains like medical or legal research.

How do I handle rate limits during intensive research?

Implement exponential backoff in your perform_web_search function and add a short delay between consecutive searches. For Claude API rate limits, catch anthropic.RateLimitError and retry with backoff. The Anthropic SDK has built-in retry logic that handles transient errors automatically.
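A minimal backoff wrapper might look like this sketch. `TransientError` is a placeholder for whatever your search client raises on 429s (or for `anthropic.RateLimitError` on the model side):

```python
import random
import time

class TransientError(Exception):
    """Stand-in for a rate-limit or transient HTTP error."""

def with_backoff(fn, retries: int = 4, base: float = 1.0):
    for attempt in range(retries):
        try:
            return fn()
        except TransientError:
            if attempt == retries - 1:
                raise
            # Exponential delay (base, 2*base, 4*base, ...) plus jitter
            time.sleep(base * (2 ** attempt) + random.uniform(0, base))
```

Usage would be e.g. `with_backoff(lambda: perform_web_search(query, 5))`.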


#Claude #ResearchAgent #WebSearch #ReportGeneration #Python #AgenticAI #LearnAI #AIEngineering


Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
