Building a Document Comparison Agent: AI-Powered Contract and Document Diff

Build an AI agent that extracts text from documents, aligns corresponding sections, detects meaningful differences between versions, and generates clear summaries highlighting what changed and why it matters.

Beyond Simple Text Diff

Standard diff tools compare text line by line. They will tell you that line 47 changed from "30 days" to "45 days" — but they will not tell you this is a payment terms extension that affects your cash flow. A document comparison agent understands context. It groups changes by section, classifies their significance (cosmetic, substantive, material), and explains the business impact of each change.

This is especially valuable for contract review, policy updates, regulatory filings, and any document where the meaning of changes matters as much as their location.
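To make the limitation concrete, here is a small sketch of what a plain line diff reports for the payment terms example above. The two clause texts are invented for illustration; only the standard library's difflib is used.

```python
import difflib

# Hypothetical clause text, invented for this illustration
original = [
    "4. Payment Terms",
    "Invoices are due within 30 days of receipt.",
]
revised = [
    "4. Payment Terms",
    "Invoices are due within 45 days of receipt.",
]

# A plain line diff reports *where* the text changed...
diff = list(
    difflib.unified_diff(
        original, revised, fromfile="original", tofile="revised", lineterm=""
    )
)
for line in diff:
    print(line)
# ...but nothing in this output says the change is a payment terms extension.
```

The diff marks the old line with "-" and the new one with "+". Classifying that change and explaining its impact is the part left to the agent.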

Text Extraction Tool

Documents arrive in various formats. This tool extracts clean text from PDFs, DOCX files, and plain text:

from pathlib import Path
from agents import Agent, Runner, function_tool

@function_tool
def extract_text(file_path: str) -> str:
    """Extract text content from a document file.
    Supports .txt, .pdf, and .docx formats."""
    path = Path(file_path)
    suffix = path.suffix.lower()

    try:
        if suffix == ".txt":
            return path.read_text(encoding="utf-8")

        elif suffix == ".pdf":
            import pymupdf
            doc = pymupdf.open(file_path)
            pages = []
            for page in doc:
                pages.append(page.get_text())
            doc.close()
            return "\n\n".join(pages)

        elif suffix == ".docx":
            from docx import Document
            doc = Document(file_path)
            paragraphs = [p.text for p in doc.paragraphs if p.text.strip()]
            return "\n\n".join(paragraphs)

        else:
            return f"Unsupported format: {suffix}"

    except Exception as e:
        return f"Extraction error: {e}"
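Text extracted from PDFs often carries layout noise (trailing spaces, words hyphenated across line breaks, runs of blank lines) that shows up later as spurious differences. The helper below is a minimal sketch of a cleanup pass you could run before storing text; normalize_extracted_text is a hypothetical name, not part of the tools above, and the rules are assumptions you may need to tune for your documents.

```python
import re

def normalize_extracted_text(text: str) -> str:
    """Reduce layout noise so a later diff reflects wording, not formatting."""
    # Use "\n" for every line ending
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break: "termi-\nnation" -> "termination"
    # (assumption: this also merges genuine compounds that happen to wrap,
    # e.g. "non-\ncompete" becomes "noncompete")
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs within each line and trim the ends
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.split("\n")]
    text = "\n".join(lines)
    # Collapse three or more newlines into a single paragraph break
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Line breaks are kept on purpose, because the diff and section-count logic in this article work line by line.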

Section Alignment Tool

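Contracts and legal documents are structured into sections. Before sections can be aligned between versions, both versions need to be stored under stable labels. The tools below keep each version in memory and report a rough section count based on common heading patterns (numbered clauses, "Section", "Article"):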

import re
import difflib

_documents: dict[str, str] = {}

@function_tool
def load_document(label: str, file_path: str) -> str:
    """Load and store a document for comparison. Use labels like
    'original' and 'revised'."""
    from pathlib import Path
    path = Path(file_path)
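    # Compare the suffix case-insensitively so "NOTES.TXT" is accepted too
    if path.suffix.lower() == ".txt":
        text = path.read_text(encoding="utf-8")
    else:
        # Other formats are not read here; the caller must extract them first
        return f"Use extract_text for {path.suffix} files, then call store_text."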

    _documents[label] = text
    word_count = len(text.split())
    section_count = len(re.split(r"\n(?=\d+\.|Section |Article |ARTICLE )", text))
    return f"Loaded '{label}': {word_count} words, ~{section_count} sections."

@function_tool
def store_text(label: str, text: str) -> str:
    """Store already-extracted text under a label for comparison."""
    _documents[label] = text
    return f"Stored '{label}': {len(text.split())} words."
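The tools above only count sections. A minimal sketch of the split-and-align step itself could look like the following. It reuses the same heading pattern and pairs sections whose first line matches exactly; split_sections and align_sections are hypothetical helpers, and exact heading matching is a simplifying assumption (a renumbered or reworded heading will appear as one removed and one added section).

```python
import re
from typing import Optional

# Same heading pattern the section count in load_document uses
SECTION_BREAK = re.compile(r"\n(?=\d+\.|Section |Article |ARTICLE )")

def split_sections(text: str) -> dict[str, str]:
    """Map each section's first line (its heading) to the section's full text."""
    sections: dict[str, str] = {}
    for chunk in SECTION_BREAK.split(text):
        chunk = chunk.strip()
        if chunk:
            heading = chunk.splitlines()[0].strip()
            sections[heading] = chunk
    return sections

def align_sections(
    original: str, revised: str
) -> list[tuple[str, Optional[str], Optional[str]]]:
    """Pair sections by heading; None marks a section missing from one side."""
    a = split_sections(original)
    b = split_sections(revised)
    # Keep the original's order, then append headings that only exist in the revision
    headings = list(a) + [h for h in b if h not in a]
    return [(h, a.get(h), b.get(h)) for h in headings]
```

Each aligned pair can then be diffed on its own, which keeps a change in one clause from being reported against unrelated text.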

Difference Detection Tool

This tool finds the actual differences between two document versions:

@function_tool
def compute_diff(label_a: str, label_b: str) -> str:
    """Compute differences between two loaded documents.
    Returns additions, deletions, and modifications."""
    if label_a not in _documents or label_b not in _documents:
        available = ", ".join(_documents.keys())
        return f"Missing document. Available: {available}"

    lines_a = _documents[label_a].splitlines()
    lines_b = _documents[label_b].splitlines()

    differ = difflib.unified_diff(
        lines_a, lines_b,
        fromfile=label_a, tofile=label_b,
        lineterm="",
    )
    diff_lines = list(differ)

    if not diff_lines:
        return "Documents are identical."

    # Summarize changes
    additions = sum(1 for l in diff_lines if l.startswith("+") and not l.startswith("+++"))
    deletions = sum(1 for l in diff_lines if l.startswith("-") and not l.startswith("---"))

    # Group diff lines into hunks; each hunk starts at an "@@" header.
    # current_change stays None until the first hunk, so the "---"/"+++"
    # file header lines are not counted as a change block.
    changes = []
    current_change = None
    for line in diff_lines:
        if line.startswith("@@"):
            if current_change:
                changes.append("\n".join(current_change))
            current_change = [line]
        elif current_change is not None:
            current_change.append(line)
    if current_change:
        changes.append("\n".join(current_change))

    output = (
        f"Diff Summary: {additions} additions, {deletions} deletions, "
        f"{len(changes)} changed sections\n\n"
    )
    # Show first 10 change blocks
    for i, change in enumerate(changes[:10]):
        output += f"--- Change {i+1} ---\n{change}\n\n"

    if len(changes) > 10:
        output += f"... and {len(changes) - 10} more change blocks."

    return output
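A line-level diff shows whole lines as removed and added, even when only one number changed. As a sketch of how a changed line pair could be narrowed down to the words that differ, the hypothetical helper below runs difflib.SequenceMatcher over word tokens instead of characters.

```python
import difflib

def word_changes(old_line: str, new_line: str) -> list[tuple[str, str, str]]:
    """Return (operation, old_words, new_words) for each run of differing words."""
    old_words = old_line.split()
    new_words = new_line.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Tags are "replace", "delete", "insert", or "equal"; skip unchanged runs
        if tag != "equal":
            changes.append(
                (tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
            )
    return changes

print(word_changes(
    "Invoices are due within 30 days of receipt.",
    "Invoices are due within 45 days of receipt.",
))
# [('replace', '30', '45')]
```

Passing compact results like this to the model, alongside the surrounding hunk, gives it the exact before and after values to quote in the report.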

Similarity Scoring Tool

Quantify how different two documents are overall:

@function_tool
def similarity_score(label_a: str, label_b: str) -> str:
    """Calculate overall similarity between two documents."""
    if label_a not in _documents or label_b not in _documents:
        return "Missing document."

    text_a = _documents[label_a]
    text_b = _documents[label_b]

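    # Character-level similarity. autojunk=False turns off difflib's
    # "popular element" heuristic, which can skew ratios on long texts.
    ratio = difflib.SequenceMatcher(None, text_a, text_b, autojunk=False).ratio()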

    # Word-level comparison
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    jaccard = len(words_a & words_b) / len(words_a | words_b) if (words_a | words_b) else 0

    return (
        f"Similarity between '{label_a}' and '{label_b}':\n"
        f"  Character-level similarity: {ratio:.1%}\n"
        f"  Word overlap (Jaccard): {jaccard:.1%}\n"
        f"  Unique to '{label_a}': {len(words_a - words_b)} words\n"
        f"  Unique to '{label_b}': {len(words_b - words_a)} words"
    )
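The two metrics answer different questions, which is why the tool reports both. The sketch below repeats the two calculations as standalone helpers (jaccard and sequence_ratio are illustrative names) and applies them to two invented sentences that contain the same words in a different order.

```python
import difflib

def jaccard(text_a: str, text_b: str) -> float:
    """Word-set overlap: ignores word order and repetition."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

def sequence_ratio(text_a: str, text_b: str) -> float:
    """Character-sequence similarity: sensitive to order."""
    return difflib.SequenceMatcher(None, text_a, text_b, autojunk=False).ratio()

# Invented example: identical clauses, swapped order
version_a = "The buyer pays the seller. The seller ships the goods."
version_b = "The seller ships the goods. The buyer pays the seller."

print(f"Jaccard: {jaccard(version_a, version_b):.1%}")
print(f"Sequence ratio: {sequence_ratio(version_a, version_b):.1%}")
```

Here the word overlap is complete while the sequence ratio falls below 100%. A high Jaccard score with a noticeably lower sequence ratio is a hint that content was reordered rather than rewritten.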

Assembling the Document Comparison Agent

doc_agent = Agent(
    name="Document Comparator",
    instructions="""You are a document comparison agent specializing in contracts
and legal documents. When given two document versions:

1. Extract text from both documents using extract_text.
2. Store them with store_text using labels 'original' and 'revised'.
3. Call similarity_score for an overall comparison metric.
4. Call compute_diff to get the detailed differences.
5. Analyze each change block and classify it as:
   - Cosmetic: formatting, typos, rephrasing with same meaning
   - Substantive: meaningful change to terms, obligations, or rights
   - Material: high-impact change affecting financial terms, liability,
     termination, or indemnification
6. Produce a report with:
   - Executive Summary (overall similarity, number of material changes)
   - Material Changes (each with before/after text and impact analysis)
   - Substantive Changes (grouped by section)
   - Cosmetic Changes (brief list)
   - Risk Assessment (what the changes mean for the parties involved)""",
    tools=[extract_text, load_document, store_text, compute_diff, similarity_score],
)
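The instructions ask for a free-text report. If you would rather post-process the classifications in code, one option is to represent each change as a small record. The following is a sketch of such a structure using only the standard library; Significance, ClassifiedChange, and group_by_significance are hypothetical names, and wiring them to the agent's output is not shown here.

```python
from dataclasses import dataclass
from enum import Enum

class Significance(str, Enum):
    """The three levels named in the agent instructions."""
    COSMETIC = "cosmetic"
    SUBSTANTIVE = "substantive"
    MATERIAL = "material"

@dataclass
class ClassifiedChange:
    section: str          # heading or clause the change belongs to
    before: str           # text in the original version
    after: str            # text in the revised version
    significance: Significance
    impact: str = ""      # short explanation of why the change matters

def group_by_significance(
    changes: list[ClassifiedChange],
) -> dict[Significance, list[ClassifiedChange]]:
    """Bucket changes by level, keeping empty buckets for the report."""
    grouped: dict[Significance, list[ClassifiedChange]] = {
        level: [] for level in Significance
    }
    for change in changes:
        grouped[change.significance].append(change)
    return grouped
```

Grouping this way mirrors the report layout in step 6: material changes first, then substantive, then cosmetic.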

Example Usage

result = Runner.run_sync(
    doc_agent,
    "Compare the original contract at /docs/contract_v1.pdf with the "
    "revised version at /docs/contract_v2.pdf. Focus on any changes to "
    "payment terms, liability clauses, and termination conditions.",
)
print(result.final_output)

In an illustrative run, the agent extracts text from both PDFs, reports a 94.2% similarity score, and identifies 12 change blocks. It classifies 2 as material (payment terms extended from 30 to 60 days, liability cap increased from $1M to $5M), 5 as substantive (including a new force majeure clause and updated data handling provisions), and 5 as cosmetic. The risk assessment highlights the cash flow impact of the extended payment terms.

FAQ

Can this agent handle scanned PDFs without selectable text?

Not directly — scanned PDFs require OCR. Add a preprocessing step using pytesseract or a cloud OCR service like Google Document AI. Extract the text via OCR first, then feed it to the comparison agent through store_text.

How does the agent handle documents with completely different structures?

The diff tool works best when documents share a similar structure. For documents with reorganized sections, add a section-matching tool that uses semantic similarity (embeddings) to align sections by content rather than position before computing differences.
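A section-matching step of that kind could be sketched as follows. To stay self-contained, this example uses word-count vectors and cosine similarity as a toy stand-in for real embeddings. The function names, the greedy pairing strategy, and the 0.5 threshold are all assumptions for illustration, and swapping in an embedding model would also mean replacing the cosine helper with one that works on dense vectors.

```python
import math
from collections import Counter
from typing import Optional

def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())

def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0

def match_sections(
    original: list[str], revised: list[str], threshold: float = 0.5
) -> list[tuple[int, Optional[int]]]:
    """Greedily pair each original section with its most similar unused
    revised section; None means nothing scored at or above the threshold."""
    revised_vecs = [bag_of_words(s) for s in revised]
    used: set[int] = set()
    pairs: list[tuple[int, Optional[int]]] = []
    for i, section in enumerate(original):
        vec = bag_of_words(section)
        best_j, best_score = None, threshold
        for j, other in enumerate(revised_vecs):
            if j in used:
                continue
            score = cosine(vec, other)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            used.add(best_j)
        pairs.append((i, best_j))
    return pairs
```

Revised sections that end up unmatched are candidates for "newly added" in the report, and original sections paired with None are candidates for "removed".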

This agent provides a strong first pass that saves hours of manual review. However, for legally binding decisions, always have a qualified attorney review the agent's findings. The agent excels at surfacing changes that might be missed during manual review, not at replacing legal judgment.



Written by

CallSphere Team
