Learn Agentic AI

Claude PDF and Document Analysis Agent: Processing Complex Documents at Scale

Build a document analysis agent that uploads PDFs to Claude, performs page-level analysis, extracts tables and structured data, and compares information across multiple documents.

Claude's Native PDF Understanding

Claude can process PDF documents directly through the Messages API. Rather than converting PDFs to text first (losing formatting, tables, and layout information), Claude analyzes the rendered pages as images while simultaneously processing any embedded text. This dual understanding — visual layout plus textual content — makes it exceptionally capable at extracting structured data from complex documents.

This capability is particularly valuable for contracts, financial reports, research papers, invoices, and any document where layout carries meaning.

Uploading PDFs to Claude

PDFs are sent as base64-encoded content in the message:

import anthropic
import base64

client = anthropic.Anthropic()

def analyze_pdf(file_path: str, question: str) -> str:
    with open(file_path, "rb") as f:
        pdf_data = base64.standard_b64encode(f.read()).decode()

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_data,
                    }
                },
                {
                    "type": "text",
                    "text": question,
                }
            ]
        }]
    )
    return response.content[0].text

Claude processes each page of the PDF, understanding both the text content and the visual layout. This means it can correctly interpret tables, charts, headers, footnotes, and multi-column layouts.
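Because the PDF is base64-encoded inline, request size matters: base64 inflates the payload by about a third, and requests have a hard size cap (roughly 32 MB at the time of writing; the constant below is an assumption you should verify against the current docs). A small pre-flight check avoids failed uploads:

```python
import math

# Assumed request size limit for inline PDF uploads (~32 MB); verify against current docs.
MAX_REQUEST_BYTES = 32 * 1024 * 1024

def base64_size(raw_bytes: int) -> int:
    """Size in bytes after base64 encoding: 4 output characters per 3 input bytes."""
    return 4 * math.ceil(raw_bytes / 3)

def fits_in_request(pdf_bytes: bytes) -> bool:
    """True if the encoded PDF stays under the assumed request limit."""
    return base64_size(len(pdf_bytes)) <= MAX_REQUEST_BYTES
```

Run this on the raw file bytes before encoding; if it fails, split the document first.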

Page-Level Analysis

For large documents, you often need targeted answers about specific sections or page ranges rather than one global summary. A simple pattern sends the same PDF several times, with one focused question per call:

def analyze_pages(file_path: str, analyses: list[dict]) -> list[dict]:
    """Run multiple analyses on a single PDF."""
    with open(file_path, "rb") as f:
        pdf_data = base64.standard_b64encode(f.read()).decode()

    results = []
    for analysis in analyses:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            messages=[{
                "role": "user",
                "content": [
                    {
                        "type": "document",
                        "source": {
                            "type": "base64",
                            "media_type": "application/pdf",
                            "data": pdf_data,
                        }
                    },
                    {
                        "type": "text",
                        "text": analysis["question"],
                    }
                ]
            }]
        )
        results.append({
            "analysis": analysis["name"],
            "result": response.content[0].text
        })
    return results

# Usage
results = analyze_pages("annual_report.pdf", [
    {"name": "financial_summary", "question": "Extract all revenue figures, costs, and profit margins from the financial statements."},
    {"name": "risk_factors", "question": "List all risk factors mentioned in the document with their severity."},
    {"name": "key_metrics", "question": "What are the key performance indicators and their year-over-year changes?"},
])

Structured Data Extraction with Tools

Combine PDF analysis with tool use to extract structured data that can be programmatically processed:


extraction_tool = {
    "name": "extract_invoice_data",
    "description": "Extract structured data from an invoice document",
    "input_schema": {
        "type": "object",
        "properties": {
            "vendor_name": {"type": "string"},
            "invoice_number": {"type": "string"},
            "invoice_date": {"type": "string", "description": "ISO format date"},
            "due_date": {"type": "string", "description": "ISO format date"},
            "line_items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "description": {"type": "string"},
                        "quantity": {"type": "number"},
                        "unit_price": {"type": "number"},
                        "total": {"type": "number"}
                    },
                    "required": ["description", "quantity", "unit_price", "total"]
                }
            },
            "subtotal": {"type": "number"},
            "tax": {"type": "number"},
            "total": {"type": "number"},
            "currency": {"type": "string"}
        },
        "required": ["vendor_name", "invoice_number", "invoice_date", "line_items", "total"]
    }
}

def extract_invoice(pdf_path: str) -> dict:
    with open(pdf_path, "rb") as f:
        pdf_data = base64.standard_b64encode(f.read()).decode()

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2048,
        tools=[extraction_tool],
        tool_choice={"type": "tool", "name": "extract_invoice_data"},
        messages=[{
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_data,
                    }
                },
                {"type": "text", "text": "Extract all invoice data from this document."}
            ]
        }]
    )

    for block in response.content:
        if block.type == "tool_use":
            return block.input
    return {}

Forcing tool use with tool_choice guarantees structured JSON output that you can insert directly into a database or feed to a downstream system.
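Schema conformance does not guarantee arithmetic correctness, so it is worth cross-checking the numbers Claude extracted. A minimal consistency check over the tool output (field names match the schema above; the tolerance is an assumption for float rounding):

```python
def validate_invoice(data: dict, tolerance: float = 0.01) -> list[str]:
    """Return a list of arithmetic inconsistencies found in extracted invoice data."""
    problems = []
    items = data.get("line_items", [])
    # Each line item's total should equal quantity * unit_price.
    for i, item in enumerate(items):
        expected = item["quantity"] * item["unit_price"]
        if abs(expected - item["total"]) > tolerance:
            problems.append(f"line {i}: total {item['total']} != {expected:.2f}")
    # The subtotal, if present, should equal the sum of line totals.
    if "subtotal" in data:
        line_sum = sum(item["total"] for item in items)
        if abs(line_sum - data["subtotal"]) > tolerance:
            problems.append(f"subtotal {data['subtotal']} != line sum {line_sum:.2f}")
    return problems
```

An empty list means the extraction is internally consistent; anything else should trigger a re-extraction or human review before the data reaches a database.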

Multi-Document Comparison

One of Claude's strongest capabilities is comparing information across multiple documents in a single conversation:

def compare_documents(pdf_paths: list[str], comparison_prompt: str) -> str:
    content = []

    for i, path in enumerate(pdf_paths):
        with open(path, "rb") as f:
            pdf_data = base64.standard_b64encode(f.read()).decode()

        content.append({
            "type": "document",
            "source": {
                "type": "base64",
                "media_type": "application/pdf",
                "data": pdf_data,
            }
        })
        content.append({
            "type": "text",
            "text": f"The above is Document {i + 1}: {path}",
        })

    content.append({"type": "text", "text": comparison_prompt})

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{"role": "user", "content": content}]
    )
    return response.content[0].text

# Compare two contracts
result = compare_documents(
    ["contract_v1.pdf", "contract_v2.pdf"],
    "Compare these two contract versions. List every change including "
    "additions, deletions, and modifications to terms. Flag any changes "
    "that affect liability, payment terms, or termination clauses."
)

Scaling Document Processing

For batch document processing, combine PDF analysis with the Batches API:

def batch_analyze_pdfs(pdf_paths: list[str], question: str) -> str:
    requests = []
    for i, path in enumerate(pdf_paths):
        with open(path, "rb") as f:
            pdf_data = base64.standard_b64encode(f.read()).decode()

        requests.append({
            "custom_id": f"pdf-{i}-{path}",
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 2048,
                "messages": [{
                    "role": "user",
                    "content": [
                        {
                            "type": "document",
                            "source": {
                                "type": "base64",
                                "media_type": "application/pdf",
                                "data": pdf_data,
                            }
                        },
                        {"type": "text", "text": question}
                    ]
                }]
            }
        })

    batch = client.messages.batches.create(requests=requests)
    return batch.id

This approach processes hundreds of PDFs at 50% of standard API pricing, and the Batches API manages throughput and rate limits for you. Keep a mapping from each custom_id to its source path so you can match results back to files.
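The create call returns immediately, so a second step waits for completion and gathers answers. A sketch assuming the SDK's batches.retrieve / batches.results methods and the result shapes documented for the Batches API (verify field names against the current SDK):

```python
import time

def wait_for_batch(client, batch_id: str, poll_seconds: int = 30) -> dict:
    """Poll until the batch finishes, then return {custom_id: answer text or None}."""
    while True:
        batch = client.messages.batches.retrieve(batch_id)
        if batch.processing_status == "ended":  # assumed terminal status value
            break
        time.sleep(poll_seconds)
    return collect_results(client.messages.batches.results(batch_id))

def collect_results(results) -> dict:
    """Map each custom_id to its answer text, or None for failed requests."""
    out = {}
    for entry in results:
        if entry.result.type == "succeeded":
            out[entry.custom_id] = entry.result.message.content[0].text
        else:  # errored, canceled, or expired
            out[entry.custom_id] = None
    return out
```

collect_results is deliberately separate so failure handling (retries, logging) can evolve without touching the polling loop.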

FAQ

What is the maximum PDF size Claude can process?

Each PDF is converted to images internally. Claude can handle PDFs up to approximately 100 pages per request, though performance is optimal with shorter documents. For very large documents, split them into sections and process each section separately, then use a final synthesis step.
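The split-then-synthesize pattern can reuse analyze_pdf from earlier: summarize each section separately, then combine the summaries in one final call. Building that final prompt is pure string work (the wording here is just one reasonable template):

```python
def build_synthesis_prompt(section_summaries: list[str], question: str) -> str:
    """Combine per-section summaries into one prompt for a final synthesis call."""
    parts = [
        f"Section {i + 1} summary:\n{summary}"
        for i, summary in enumerate(section_summaries)
    ]
    parts.append(f"Using only the section summaries above, answer: {question}")
    return "\n\n".join(parts)
```

The synthesis call is then a plain text-only message, so it stays well under the size and page limits no matter how large the original document was.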

Can Claude extract data from scanned PDFs without OCR?

Yes. Because Claude processes PDF pages as images, it can read text from scanned documents directly, with no OCR preprocessing required. This works for most print-quality scans; very low-resolution or heavily distorted documents may need image enhancement first.

How accurate is table extraction from PDFs?

Claude's table extraction is highly accurate for standard table layouts — rows, columns, headers, and merged cells are handled well. Complex nested tables or tables that span multiple pages may require additional prompting to handle correctly. Always validate extracted numerical data against known totals when accuracy is critical.


#Claude #PDFProcessing #DocumentAnalysis #DataExtraction #Python #AgenticAI #LearnAI #AIEngineering

Written by

CallSphere Team
