AI Agent for Document Generation: Contracts, Proposals, and Reports on Demand

Build an AI agent that generates professional documents like contracts, proposals, and reports by combining template engines, dynamic data injection, and PDF rendering with version tracking.

From Manual Documents to Automated Generation

Every business produces documents: contracts for new clients, proposals for deals, weekly reports for stakeholders, and invoices for accounting. These documents follow consistent templates but require unique data for each instance. A document generation agent combines template engines for structure, LLM reasoning for dynamic content, and PDF rendering for professional output.

This guide walks through building a complete document generation agent that accepts structured data, fills templates, generates custom sections with AI, renders PDFs, and tracks versions.

Defining Document Templates

We use Jinja2 as the template engine. Each template is an HTML file with placeholders for dynamic data, styled with CSS so that HTML-to-PDF conversion produces professional output. A small registry declares each document type, its template file, and its fields:

from jinja2 import Environment, FileSystemLoader
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any

@dataclass
class DocumentTemplate:
    name: str
    template_file: str
    required_fields: list[str]
    ai_sections: list[str] = field(default_factory=list)

TEMPLATES = {
    "contract": DocumentTemplate(
        name="Service Agreement",
        template_file="contract.html",
        required_fields=["client_name", "client_address", "service_description",
                         "start_date", "end_date", "total_amount"],
        ai_sections=["scope_of_work", "termination_clause"],
    ),
    "proposal": DocumentTemplate(
        name="Business Proposal",
        template_file="proposal.html",
        required_fields=["prospect_name", "company", "problem_statement",
                         "budget_range"],
        ai_sections=["executive_summary", "proposed_solution", "timeline"],
    ),
    "report": DocumentTemplate(
        name="Weekly Report",
        template_file="report.html",
        required_fields=["team_name", "week_start", "metrics", "highlights"],
        ai_sections=["analysis", "recommendations"],
    ),
}

# autoescape=True escapes characters such as & and < in injected values
env = Environment(loader=FileSystemLoader("templates"), autoescape=True)

Each template declares which fields are required from the user and which sections should be generated by the AI. This separation keeps humans in control of factual data while delegating narrative writing to the LLM.
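
The template files themselves are not shown in this guide. As a rough illustration only, a stripped-down `contract.html` might look like the string below. The markup, the `demo_env` name, and all sample values are assumptions; `DictLoader` is used here only so the sketch runs without a `templates` directory on disk.

```python
from jinja2 import DictLoader, Environment

# Hypothetical, stripped-down stand-in for templates/contract.html.
CONTRACT_HTML = """\
<h1>Service Agreement</h1>
<p>Client: {{ client_name }}, {{ client_address }}</p>
<p>Services: {{ service_description }}</p>
<p>Term: {{ start_date }} to {{ end_date }}</p>
<p>Total: {{ total_amount }}</p>
<h2>Scope of Work</h2>
<p>{{ scope_of_work }}</p>
<p>Generated on {{ generated_date }}</p>
"""

# DictLoader serves templates from memory; autoescape=True escapes
# characters such as "&" and "<" in the injected values.
demo_env = Environment(
    loader=DictLoader({"contract.html": CONTRACT_HTML}),
    autoescape=True,
)

sample_html = demo_env.get_template("contract.html").render(
    client_name="Acme & Co",
    client_address="1 Example Street",
    service_description="Monthly reporting",
    start_date="2025-01-01",
    end_date="2025-12-31",
    total_amount="$12,000",
    scope_of_work="Placeholder text for the AI-generated section.",
    generated_date="January 01, 2025",
)
print(sample_html)
```

Every name inside `{{ ... }}` corresponds either to a required field or to an AI section, which is why the registry lists both.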

Generating AI-Powered Sections

The agent generates document sections based on the structured data and the document type. Each section gets a targeted prompt:

from openai import OpenAI

client = OpenAI()

SECTION_PROMPTS = {
    "executive_summary": (
        "Write a concise executive summary for a business proposal. "
        "Focus on the client's problem and why our solution is the best fit. "
        "Keep it under 150 words. Use a professional but approachable tone."
    ),
    "proposed_solution": (
        "Describe the proposed solution in detail. Include methodology, "
        "deliverables, and key differentiators. Use bullet points for clarity."
    ),
    "scope_of_work": (
        "Write a clear scope of work clause for a service agreement. "
        "Be specific about what is included and what is excluded."
    ),
    "termination_clause": (
        "Write a standard termination clause. Include notice period, "
        "grounds for termination, and obligations upon termination."
    ),
    "analysis": (
        "Analyze the metrics and highlights provided. Identify trends, "
        "areas of concern, and positive developments."
    ),
    "recommendations": (
        "Based on the analysis, provide 3-5 actionable recommendations "
        "for the next week. Be specific and prioritized."
    ),
    "timeline": (
        "Create a realistic project timeline with milestones. "
        "Include discovery, implementation, testing, and launch phases."
    ),
}

def generate_section(section_name: str, context: dict[str, Any]) -> str:
    """Generate a document section using an LLM."""
    prompt = SECTION_PROMPTS.get(section_name, f"Write the {section_name} section.")
    context_str = "\n".join(f"{k}: {v}" for k, v in context.items())

    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0.4,
        messages=[
            {"role": "system", "content": prompt},
            {"role": "user", "content": f"Document context:\n{context_str}"},
        ],
    )
    # message.content is optional in the API response, so fall back to ""
    return response.choices[0].message.content or ""

Building the Document Assembly Pipeline

The assembly pipeline validates input data, generates any missing AI sections, renders the template to HTML, and computes a version hash. PDF rendering is a separate step that consumes the result:

import hashlib
import json

@dataclass
class GeneratedDocument:
    template_name: str
    html_content: str
    data: dict[str, Any]
    version_hash: str
    created_at: str

def assemble_document(template_key: str, data: dict[str, Any]) -> GeneratedDocument:
    """Assemble a complete document from template and data."""
    template_def = TEMPLATES[template_key]

    # Validate required fields
    missing = [f for f in template_def.required_fields if f not in data]
    if missing:
        raise ValueError(f"Missing required fields: {missing}")

    # Generate AI sections on a copy so the caller's dict is not mutated
    data = dict(data)
    for section in template_def.ai_sections:
        if section not in data:
            data[section] = generate_section(section, data)

    # Render HTML template
    template = env.get_template(template_def.template_file)
    html = template.render(**data, generated_date=datetime.now().strftime("%B %d, %Y"))

    # Compute version hash for tracking
    # default=str keeps non-JSON values (dates, Decimals) from raising TypeError
    content_hash = hashlib.sha256(
        json.dumps(data, sort_keys=True, default=str).encode()
    ).hexdigest()[:12]

    return GeneratedDocument(
        template_name=template_def.name,
        html_content=html,
        data=data,
        version_hash=content_hash,
        created_at=datetime.now().isoformat(),
    )
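
To make the hashing step concrete, here is a minimal sketch of the same technique in isolation. The `version_hash` helper name and the sample dictionaries are hypothetical, and `default=str` is an addition here so that values such as dates do not raise during serialization.

```python
import hashlib
import json
from typing import Any

def version_hash(data: dict[str, Any]) -> str:
    """Short, key-order-independent fingerprint of a document's data."""
    # sort_keys=True makes the JSON text identical regardless of dict order
    canonical = json.dumps(data, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

base = {"client_name": "Acme", "total_amount": 12000}
reordered = {"total_amount": 12000, "client_name": "Acme"}
changed = {"client_name": "Acme", "total_amount": 15000}

same_hash = version_hash(base) == version_hash(reordered)
different_hash = version_hash(base) != version_hash(changed)
print(same_hash, different_hash)
```

Because the pipeline hashes the data after the AI sections have been added, regenerating a section with different wording also produces a new hash, even when the user-supplied fields are unchanged.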

Rendering PDFs with WeasyPrint

WeasyPrint converts HTML and CSS directly to PDF. It supports CSS paged media, so page size, margins, page breaks, headers, and footers are all controlled from the stylesheet:

from weasyprint import HTML
from pathlib import Path

def render_pdf(document: GeneratedDocument, output_dir: str = "output") -> str:
    """Render an assembled document to PDF."""
    # parents=True also creates missing parent folders for nested output paths
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    filename = (
        f"{document.template_name.replace(' ', '_').lower()}"
        f"_{document.version_hash}.pdf"
    )
    filepath = Path(output_dir) / filename

    HTML(string=document.html_content).write_pdf(str(filepath))
    return str(filepath)

Version Tracking and Storage

Every generated document is tracked with its input data, version hash, and metadata. This enables auditing and regeneration:

import sqlite3

def init_db(db_path: str = "documents.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            template_name TEXT NOT NULL,
            version_hash TEXT NOT NULL,
            input_data TEXT NOT NULL,
            pdf_path TEXT,
            created_at TEXT NOT NULL
        )
    """)
    conn.commit()
    return conn

def save_document_record(conn: sqlite3.Connection, doc: GeneratedDocument, pdf_path: str):
    conn.execute(
        "INSERT INTO documents (template_name, version_hash, input_data, pdf_path, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (doc.template_name, doc.version_hash, json.dumps(doc.data), pdf_path, doc.created_at),
    )
    conn.commit()
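
Auditing and regeneration both start with reading those records back. The sketch below shows one way to do that; the `document_history` and `load_input_data` helpers and the sample rows are hypothetical, and an in-memory database is used only to keep the example self-contained.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("""
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        template_name TEXT NOT NULL,
        version_hash TEXT NOT NULL,
        input_data TEXT NOT NULL,
        pdf_path TEXT,
        created_at TEXT NOT NULL
    )
""")

# Two hypothetical versions of the same contract.
rows = [
    ("Service Agreement", "aaa111aaa111", json.dumps({"total_amount": 12000}),
     "output/service_agreement_aaa111aaa111.pdf", "2025-01-01T09:00:00"),
    ("Service Agreement", "bbb222bbb222", json.dumps({"total_amount": 15000}),
     "output/service_agreement_bbb222bbb222.pdf", "2025-01-02T09:00:00"),
]
conn.executemany(
    "INSERT INTO documents (template_name, version_hash, input_data, pdf_path, created_at) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)
conn.commit()

def document_history(conn: sqlite3.Connection, template_name: str) -> list[tuple[str, str]]:
    """Return (version_hash, created_at) pairs for a template, newest first."""
    cursor = conn.execute(
        "SELECT version_hash, created_at FROM documents "
        "WHERE template_name = ? ORDER BY created_at DESC, id DESC",
        (template_name,),
    )
    return cursor.fetchall()

def load_input_data(conn: sqlite3.Connection, version_hash: str):
    """Return the stored data for one version, or None if the hash is unknown."""
    row = conn.execute(
        "SELECT input_data FROM documents WHERE version_hash = ? "
        "ORDER BY id DESC LIMIT 1",
        (version_hash,),
    ).fetchone()
    return json.loads(row[0]) if row else None

history = document_history(conn, "Service Agreement")
latest_data = load_input_data(conn, history[0][0])
print(history[0][0], latest_data)
```

Since the stored data already contains the AI-generated sections, passing it back through the assembly step should reproduce the same content without new LLM calls. Sorting by `created_at` works here because ISO 8601 timestamps in a single format sort chronologically as text.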

FAQ

Is it safe to use AI-generated legal clauses in contracts?

Never deploy AI-generated legal text without lawyer review. Use the AI to generate first drafts based on your approved clause library, then flag all AI-generated sections for human review. Store approved clause variants as few-shot examples in your prompts to improve consistency.
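
One way to supply approved clauses as few-shot examples is to place them in the message list as earlier user/assistant turns. This is a sketch under assumptions: the `APPROVED_TERMINATION_CLAUSES` library, its structure, and the `build_few_shot_messages` helper are hypothetical, and the clause text is placeholder wording rather than vetted legal language.

```python
from typing import Any

# Hypothetical approved clause library: each entry pairs a short context
# description with clause text that has already passed legal review.
APPROVED_TERMINATION_CLAUSES = [
    {
        "context": "12-month retainer, 30-day notice",
        "clause": "Either party may terminate this Agreement with 30 days' written notice.",
    },
]

def build_few_shot_messages(
    system_prompt: str,
    examples: list[dict[str, str]],
    context: dict[str, Any],
) -> list[dict[str, str]]:
    """Put approved clauses before the real request as user/assistant turns."""
    messages = [{"role": "system", "content": system_prompt}]
    for example in examples:
        messages.append(
            {"role": "user", "content": f"Document context:\n{example['context']}"}
        )
        messages.append({"role": "assistant", "content": example["clause"]})
    # The real request goes last, formatted like the context in generate_section
    context_str = "\n".join(f"{k}: {v}" for k, v in context.items())
    messages.append({"role": "user", "content": f"Document context:\n{context_str}"})
    return messages

messages = build_few_shot_messages(
    "Write a standard termination clause.",
    APPROVED_TERMINATION_CLAUSES,
    {"client_name": "Acme", "end_date": "2025-12-31"},
)
print([m["role"] for m in messages])
```

The resulting list could be passed as `messages` in place of the two-message list used in `generate_section`. The output is still a draft that needs human review.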

Can I add custom branding like logos and company colors?

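Yes. The HTML templates support full CSS, including custom fonts, colors, and embedded images. Embed images as base64 data URIs in the template, or reference files in the templates directory. Because the pipeline renders from an HTML string, relative file paths resolve only if you also pass a base_url argument to WeasyPrint's HTML class. WeasyPrint applies print-media CSS, so print media queries and page rules can provide page-specific styling.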

How do I handle document revisions and track changes?

Store each version with its input data and version hash. To show changes between versions, diff the rendered HTML or the input data dictionaries. The hash is computed over the full data dictionary, including any AI-generated sections, so it changes whenever an input field or a generated section changes, which makes modifications easy to detect.
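
Diffing the input data can be as simple as comparing two dictionaries key by key. The `diff_data` helper and the sample versions below are hypothetical, meant as a starting point rather than a full change-tracking system.

```python
from typing import Any

def diff_data(old: dict[str, Any], new: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Map each changed key to (old_value, new_value).

    A key missing on one side is reported with None for that side, so a
    stored None and a missing key look the same in this simple sketch.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes

v1 = {"client_name": "Acme", "total_amount": 12000, "end_date": "2025-12-31"}
v2 = {"client_name": "Acme", "total_amount": 15000, "start_date": "2025-01-01"}

changes = diff_data(v1, v2)
for field_name, (before, after) in changes.items():
    print(f"{field_name}: {before!r} -> {after!r}")
```

For the rendered HTML, the standard library's `difflib.unified_diff` offers a comparable line-by-line view.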

