---
title: "File Upload Handling in FastAPI for AI Agents: Processing Documents and Images"
description: "Handle file uploads in FastAPI for AI agent document processing and image analysis. Learn type validation, size limits, chunked uploads for large files, and async processing pipelines for uploaded content."
canonical: https://callsphere.ai/blog/file-upload-handling-fastapi-ai-agents-documents-images
category: "Learn Agentic AI"
tags: ["FastAPI", "File Upload", "Document Processing", "AI Agents", "Python"]
author: "CallSphere Team"
published: 2026-03-17T00:00:00.000Z
updated: 2026-05-06T01:02:44.850Z
---

# File Upload Handling in FastAPI for AI Agents: Processing Documents and Images

> Handle file uploads in FastAPI for AI agent document processing and image analysis. Learn type validation, size limits, chunked uploads for large files, and async processing pipelines for uploaded content.

## File Uploads for AI Agent Workloads

AI agents frequently need to process user-uploaded files: PDFs for research agents, images for vision analysis, CSV files for data agents, or code files for coding assistants. FastAPI handles file uploads through Starlette's `UploadFile` class, which provides async file reading, automatic temp file management, and streaming for large files.

The key challenge is not just receiving the file but validating it, storing it safely, and feeding it into your AI processing pipeline efficiently.

## Basic File Upload Endpoint

Start with a simple upload endpoint that accepts a file alongside agent parameters:

```mermaid
flowchart LR
    CLIENT(["Client"])
    APP["FastAPI app
upload endpoint"]
    VAL["Validation
type and size checks"]
    STORE[(File storage)]
    QUEUE[(Background queue)]
    AGENT["AI agent
processing"]
    CLIENT --> APP --> VAL --> STORE
    STORE --> QUEUE --> AGENT
    AGENT --> CLIENT
    style APP fill:#4f46e5,stroke:#4338ca,color:#fff
    style VAL fill:#f59e0b,stroke:#d97706,color:#1f2937
    style STORE fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
```

```python
from fastapi import APIRouter, File, Form, HTTPException, UploadFile

router = APIRouter()

@router.post("/agent/upload")
async def upload_and_process(
    file: UploadFile = File(...),
    agent_type: str = Form(default="document"),
    instructions: str = Form(default="Summarize this document"),
):
    content = await file.read()

    if not content:
        raise HTTPException(400, "Empty file")

    # document_agent is your application's processing service, defined elsewhere
    result = await document_agent.process(
        content=content,
        filename=file.filename,
        instructions=instructions,
    )

    return {
        "filename": file.filename,
        "size_bytes": len(content),
        "result": result,
    }
```

## File Type and Size Validation

Never trust client-provided file types. Validate both the extension and the actual file content:

```python
from pathlib import Path

from fastapi import HTTPException, UploadFile
import magic  # python-magic (requires the libmagic system library)

ALLOWED_TYPES = {
    "application/pdf": [".pdf"],
    "text/plain": [".txt", ".md", ".csv"],
    "text/csv": [".csv"],
    "image/png": [".png"],
    "image/jpeg": [".jpg", ".jpeg"],
}

MAX_FILE_SIZE = 20 * 1024 * 1024  # 20 MB

async def validate_upload(file: UploadFile) -> bytes:
    # Read content
    content = await file.read()

    # Check size
    if len(content) > MAX_FILE_SIZE:
        raise HTTPException(
            413,
            f"File too large. Maximum size: "
            f"{MAX_FILE_SIZE // (1024*1024)}MB",
        )

    # Check actual MIME type using file content
    detected_type = magic.from_buffer(content, mime=True)

    if detected_type not in ALLOWED_TYPES:
        raise HTTPException(
            415,
            f"Unsupported file type: {detected_type}. "
            f"Allowed: {', '.join(ALLOWED_TYPES.keys())}",
        )

    # Verify extension matches content
    ext = Path(file.filename or "").suffix.lower()
    allowed_exts = ALLOWED_TYPES[detected_type]
    if ext not in allowed_exts:
        raise HTTPException(
            400,
            f"Extension {ext} does not match "
            f"detected type {detected_type}",
        )

    # Reset file position for downstream processing
    await file.seek(0)
    return content
```

The `python-magic` library reads file headers to determine the actual type, preventing renamed malicious files from bypassing extension checks.
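To see why header checks defeat renamed files, here is a hand-rolled sniffing sketch built on the same idea. This is illustrative only: the signature table below covers just three formats, while `python-magic` consults a full signature database and should be preferred in production.

```python
from typing import Optional

# Well-known leading bytes for a few formats (a tiny subset of what
# libmagic knows about).
MAGIC_SIGNATURES = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}

def sniff_mime(content: bytes) -> Optional[str]:
    """Return a MIME type based on leading bytes, or None if unrecognized."""
    for signature, mime in MAGIC_SIGNATURES.items():
        if content.startswith(signature):
            return mime
    return None

# A PNG renamed to report.pdf is still detected as a PNG,
# because detection looks at content, not the filename:
png_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
print(sniff_mime(png_bytes))        # image/png
print(sniff_mime(b"%PDF-1.7 ..."))  # application/pdf
```

The extension cross-check in `validate_upload` then catches the mismatch between the detected type and the claimed `.pdf` extension.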

## Multiple File Upload

AI agents that compare documents or process batches need multi-file upload:

```python
from typing import List

@router.post("/agent/batch-upload")
async def batch_upload(
    files: List[UploadFile] = File(...),
    instructions: str = Form(default="Compare these documents"),
):
    if len(files) > 10:
        raise HTTPException(400, "Maximum 10 files per batch")

    processed_files = []
    total_size = 0

    for file in files:
        content = await validate_upload(file)
        total_size += len(content)

        if total_size > 50 * 1024 * 1024:  # 50MB total limit
            raise HTTPException(
                413, "Total upload size exceeds 50MB limit"
            )

        processed_files.append({
            "filename": file.filename,
            "content": content,
            "size": len(content),
        })

    result = await document_agent.process_batch(
        files=processed_files,
        instructions=instructions,
    )
    return result
```

## Storing Uploaded Files

For files that need to persist beyond the request, save them to disk or object storage:

```python
import uuid
from datetime import datetime, timezone
from pathlib import Path

import aiofiles

UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)

async def save_upload(
    file: UploadFile, subdirectory: str = ""
) -> Path:
    # Generate safe filename
    safe_name = f"{uuid.uuid4()}{Path(file.filename).suffix}"
    save_dir = UPLOAD_DIR / subdirectory
    save_dir.mkdir(parents=True, exist_ok=True)
    file_path = save_dir / safe_name

    async with aiofiles.open(file_path, "wb") as f:
        while chunk := await file.read(8192):
            await f.write(chunk)

    return file_path

@router.post("/agent/upload-and-store")
async def upload_store_process(
    file: UploadFile = File(...),
    db: AsyncSession = Depends(get_db),
):
    await validate_upload(file)
    await file.seek(0)

    file_path = await save_upload(file, subdirectory="documents")

    # Record in database
    doc = Document(
        filename=file.filename,
        stored_path=str(file_path),
        size_bytes=file_path.stat().st_size,
        uploaded_at=datetime.now(timezone.utc),
    )
    db.add(doc)
    await db.flush()

    return {"document_id": str(doc.id), "filename": file.filename}
```

Reading the file in 8KB chunks with `aiofiles` prevents loading the entire file into memory at once, which matters for large uploads.
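The same bounded-memory pattern applies anywhere you touch upload contents, for example computing a content hash for deduplication. A stdlib-only sketch (synchronous here for brevity; with `aiofiles` the loop body is identical):

```python
import hashlib
import tempfile

CHUNK_SIZE = 8192  # same 8KB chunks as save_upload

def hash_file_chunked(path: str) -> str:
    """Compute a SHA-256 digest holding at most CHUNK_SIZE bytes at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            digest.update(chunk)
    return digest.hexdigest()

# Demo: hash a 1MB temp file without ever loading it fully into memory.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * (1024 * 1024))
    tmp_path = tmp.name

print(hash_file_chunked(tmp_path))
```

Storing such a digest alongside the `Document` row lets you detect duplicate uploads before re-running an expensive agent pipeline.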

## Async Document Processing Pipeline

Combine file upload with background processing for a complete document agent workflow:

```python
from fastapi import BackgroundTasks, File, Form, UploadFile

@router.post("/agent/analyze-document", status_code=202)
async def analyze_document(
    background_tasks: BackgroundTasks,
    file: UploadFile = File(...),
    analysis_type: str = Form(default="summary"),
    db: AsyncSession = Depends(get_db),
):
    content = await validate_upload(file)
    await file.seek(0)

    # Save file
    file_path = await save_upload(file, "analysis")

    # Create task record
    task = AnalysisTask(
        filename=file.filename,
        stored_path=str(file_path),
        analysis_type=analysis_type,
        status="pending",
    )
    db.add(task)
    await db.flush()
    task_id = str(task.id)

    # Process in background
    background_tasks.add_task(
        run_document_analysis,
        task_id=task_id,
        file_path=str(file_path),
        analysis_type=analysis_type,
    )

    return {"task_id": task_id, "status": "pending"}
```
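The worker itself is not shown above, so here is one possible shape for `run_document_analysis`. The in-memory `task_store` dict stands in for the database, and the result string stands in for real extraction plus LLM analysis; in a real app the task would open a fresh DB session and update the `AnalysisTask` row instead.

```python
# In-memory stand-in for the AnalysisTask table (illustrative only).
task_store: dict = {}

async def run_document_analysis(
    task_id: str, file_path: str, analysis_type: str
) -> None:
    task_store[task_id] = {"status": "running"}
    try:
        with open(file_path, "rb") as f:
            content = f.read()
        # Stand-in for real text extraction and agent processing:
        result = f"{analysis_type}: processed {len(content)} bytes"
        task_store[task_id] = {"status": "completed", "result": result}
    except Exception as exc:
        # Record failures so clients polling GET /tasks/{task_id}
        # always reach a terminal state.
        task_store[task_id] = {"status": "failed", "error": str(exc)}
```

Catching exceptions inside the task matters: `BackgroundTasks` has no retry or dead-letter mechanism, so an unhandled error would leave the task stuck in `pending` forever from the client's perspective.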

## FAQ

### How do I handle very large file uploads without running out of memory?

Use chunked reading with `await file.read(chunk_size)` in a loop instead of `await file.read()`, which loads the entire file into memory. For files over 100MB, consider a chunked upload protocol where the client uploads in parts, or use presigned URLs to upload directly to object storage like S3, then pass the object key to your API for processing.
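The server side of a chunked upload protocol boils down to collecting numbered parts and reassembling them. A stdlib-only sketch, assuming the client names its parts `part_0`, `part_1`, and so on (a production version would also track the expected part count and verify checksums):

```python
from pathlib import Path

def assemble_chunks(chunk_dir: str, output_path: str) -> int:
    """Concatenate part files named part_0, part_1, ... into one file.

    Returns the total number of bytes written.
    """
    # Sort numerically, not lexically, so part_10 comes after part_9.
    parts = sorted(
        Path(chunk_dir).glob("part_*"),
        key=lambda p: int(p.name.split("_")[1]),
    )
    total = 0
    with open(output_path, "wb") as out:
        for part in parts:
            data = part.read_bytes()
            out.write(data)
            total += len(data)
    return total
```

Once assembled, the file can flow into the same `validate_upload`-style checks as a direct upload.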

### Can I accept both a file and a JSON body in the same request?

FastAPI does not allow combining `UploadFile` with a JSON request body in the same endpoint because multipart form data and JSON bodies use different content types. Use `Form()` parameters alongside `File()`, or accept the JSON as a string `Form` field and parse it with Pydantic manually. Another approach is a two-step flow: upload the file first and get back a file ID, then send a JSON request referencing that file ID.
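A dependency-free sketch of the JSON-as-`Form`-field approach (with Pydantic v2 you would call `YourModel.model_validate_json(raw)` instead; `parse_options` and the field names here are hypothetical):

```python
import json

def parse_options(raw: str) -> dict:
    """Parse a JSON string received as a Form field into a validated dict.

    In the endpoint you would declare `options: str = Form(...)` and
    call parse_options(options), turning ValueError into a 422 response.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"options must be valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("options must be a JSON object")
    return data

print(parse_options('{"language": "en", "max_pages": 5}'))
```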

### How do I extract text from uploaded PDFs for the AI agent?

Use libraries like `PyMuPDF` (fitz) or `pdfplumber` for text extraction. Read the uploaded bytes, open the PDF, iterate through pages, and extract text. For scanned PDFs without embedded text, you need OCR with a library like `pytesseract`. Process PDF extraction in a background task because it can be CPU-intensive for large documents with many pages.
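A minimal sketch of the PyMuPDF path described above (install with `pip install pymupdf`; for scanned PDFs `page.get_text()` returns empty strings and you would fall back to OCR):

```python
def extract_pdf_text(data: bytes) -> str:
    """Extract embedded text from PDF bytes using PyMuPDF."""
    import fitz  # imported lazily so the module loads without PyMuPDF installed

    text_parts = []
    # Open from in-memory bytes -- no need to write the upload to disk first.
    with fitz.open(stream=data, filetype="pdf") as doc:
        for page in doc:
            text_parts.append(page.get_text())
    return "\n".join(text_parts)
```

Called from the background worker, the extracted text becomes the input to the agent's summarization or analysis prompt.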

