API Pagination for AI Agent Data: Cursor-Based, Offset, and Keyset Pagination

Compare cursor-based, offset, and keyset pagination strategies for AI agent APIs. Includes FastAPI implementations, performance analysis, and guidance on choosing the right approach for your data access patterns.

Why Pagination Matters for AI Agent APIs

AI agents generate enormous volumes of data: conversation histories, tool call logs, evaluation results, and audit trails. Returning all records in a single response is impractical. Without pagination, a single query for an agent's conversation history could return millions of messages, consuming excessive memory, saturating the network, and timing out.

Pagination splits large result sets into manageable pages. The three dominant strategies — offset-based, cursor-based, and keyset pagination — each offer different performance characteristics and consistency guarantees.

Offset-Based Pagination: Simple but Fragile

Offset pagination uses a page number or offset combined with a limit. It is the most intuitive approach and maps directly to SQL's LIMIT and OFFSET clauses.

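from fastapi import Depends, FastAPI, Query
from pydantic import BaseModel
from sqlalchemy import select, func
from sqlalchemy.ext.asyncio import AsyncSession

# Message (ORM model) and get_db (session dependency) are assumed to be
# defined elsewhere in the application.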

app = FastAPI()

class PaginatedResponse(BaseModel):
    data: list[dict]
    total: int
    offset: int
    limit: int
    has_more: bool

@app.get("/v1/agents/{agent_id}/messages")
async def list_messages_offset(
    agent_id: str,
    offset: int = Query(0, ge=0),
    limit: int = Query(20, ge=1, le=100),
    db: AsyncSession = Depends(get_db),
):
    total = await db.scalar(
        select(func.count())
        .select_from(Message)
        .where(Message.agent_id == agent_id)
    )

    rows = await db.execute(
        select(Message)
        .where(Message.agent_id == agent_id)
        .order_by(Message.created_at.desc())
        .offset(offset)
        .limit(limit)
    )
    messages = rows.scalars().all()

    return PaginatedResponse(
        data=[m.to_dict() for m in messages],
        total=total,
        offset=offset,
        limit=limit,
        has_more=offset + limit < total,
    )

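The problem with offset pagination is performance degradation at scale. OFFSET 1000000 forces the database to scan and discard one million rows before returning results, and the separate COUNT query adds a second pass over the matching rows on every request. It also suffers from consistency issues when records change while the client is paginating: an insert ahead of the client's position pushes already-seen rows onto the next page (duplicates), and a delete pulls unseen rows onto a page the client has already fetched (skips).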
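The page-shift problem is easy to reproduce without a database. The following is a minimal sketch that emulates ORDER BY id DESC with OFFSET and LIMIT over an in-memory list; the helper name offset_page and the sample ids are illustrative assumptions, not part of the API above.

```python
def offset_page(rows: list[int], offset: int, limit: int) -> list[int]:
    """Emulate ORDER BY id DESC OFFSET :offset LIMIT :limit on an in-memory table."""
    newest_first = sorted(rows, reverse=True)
    return newest_first[offset:offset + limit]

# Case 1: a row is inserted between two page requests.
table = list(range(1, 11))                      # message ids 1..10
page_1 = offset_page(table, offset=0, limit=3)  # newest three ids
table.append(11)                                # a new message arrives
page_2 = offset_page(table, offset=3, limit=3)
duplicated = set(page_1) & set(page_2)          # ids the client sees twice

# Case 2: a row is deleted between two page requests.
table = list(range(1, 11))
first = offset_page(table, offset=0, limit=3)
table.remove(10)                                # an already-seen message is deleted
second = offset_page(table, offset=3, limit=3)
skipped = 7 not in first + second               # id 7 is never returned
```

In the first case id 8 appears on both pages; in the second, id 7 falls into the gap between them.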

Cursor-Based Pagination: Consistent and Scalable

Cursor pagination uses an opaque token representing the position of the last item on the current page. The server decodes the cursor to determine where to start the next page, avoiding the performance cliff of large offsets.

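import base64
import json
from datetime import datetime
from fastapi import HTTPException

def encode_cursor(created_at: str, id: str) -> str:
    # Pack the sort key (created_at, id) of the last item into an opaque token.
    payload = json.dumps({"created_at": created_at, "id": id})
    return base64.urlsafe_b64encode(payload.encode()).decode()

def decode_cursor(cursor: str) -> dict:
    # Cursors are client-supplied, so reject anything malformed with a 400.
    try:
        raw = json.loads(base64.urlsafe_b64decode(cursor.encode()).decode())
        # Parse the timestamp so it is bound as a datetime, not a string.
        created_at = datetime.fromisoformat(raw["created_at"])
        return {"created_at": created_at, "id": raw["id"]}
    except (ValueError, KeyError, TypeError):
        raise HTTPException(status_code=400, detail="Invalid cursor")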

class CursorPaginatedResponse(BaseModel):
    data: list[dict]
    next_cursor: str | None
    has_more: bool

@app.get("/v1/agents/{agent_id}/conversations")
async def list_conversations_cursor(
    agent_id: str,
    cursor: str | None = Query(None),
    limit: int = Query(20, ge=1, le=100),
    db: AsyncSession = Depends(get_db),
):
    query = (
        select(Conversation)
        .where(Conversation.agent_id == agent_id)
        .order_by(
            Conversation.created_at.desc(),
            Conversation.id.desc(),
        )
    )

    if cursor:
        decoded = decode_cursor(cursor)
        query = query.where(
            (Conversation.created_at < decoded["created_at"])
            | (
                (Conversation.created_at == decoded["created_at"])
                & (Conversation.id < decoded["id"])
            )
        )

    rows = await db.execute(query.limit(limit + 1))
    items = rows.scalars().all()

    has_more = len(items) > limit
    items = items[:limit]

    next_cursor = None
    if has_more and items:
        last = items[-1]
        next_cursor = encode_cursor(
            last.created_at.isoformat(), str(last.id)
        )

    return CursorPaginatedResponse(
        data=[c.to_dict() for c in items],
        next_cursor=next_cursor,
        has_more=has_more,
    )

The trick of fetching limit + 1 items lets you determine whether more pages exist without running a separate count query.
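To see why the extra row is enough, here is a framework-free sketch of the same idea over a sorted list of ids. The names fetch_page and drain are hypothetical helpers for illustration only; a real endpoint would run the equivalent query against the database.

```python
from typing import Optional

def fetch_page(ids: list[int], after: Optional[int], limit: int):
    """Return (items, next_cursor, has_more) for ids sorted ascending."""
    remaining = [i for i in ids if after is None or i > after]
    window = remaining[: limit + 1]   # ask for one row more than the page size
    has_more = len(window) > limit    # the extra row only signals "more exists"
    items = window[:limit]            # never return the extra row to the client
    next_cursor = items[-1] if has_more else None
    return items, next_cursor, has_more

def drain(ids: list[int], limit: int) -> list[list[int]]:
    """Follow next_cursor until has_more is False and collect every page."""
    pages, cursor = [], None
    while True:
        items, cursor, has_more = fetch_page(ids, cursor, limit)
        pages.append(items)
        if not has_more:
            return pages

# Six rows with a page size of three: exactly two requests, no empty trailing page.
pages = drain(list(range(1, 7)), limit=3)
```

Because the second page comes back with has_more set to False, the client stops without issuing a third request just to discover that nothing is left.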

Keyset Pagination: Maximum Database Performance

Keyset pagination is a variant of cursor pagination that directly uses column values rather than opaque tokens. It requires a strict, unique ordering and leverages database indexes for maximum efficiency.

@app.get("/v1/agents/{agent_id}/tool-calls")
async def list_tool_calls_keyset(
    agent_id: str,
    after_id: int | None = Query(None),
    limit: int = Query(50, ge=1, le=200),
    db: AsyncSession = Depends(get_db),
):
    query = (
        select(ToolCall)
        .where(ToolCall.agent_id == agent_id)
        .order_by(ToolCall.id.asc())
    )

    if after_id is not None:
        query = query.where(ToolCall.id > after_id)

    rows = await db.execute(query.limit(limit + 1))
    items = rows.scalars().all()
    has_more = len(items) > limit
    items = items[:limit]

    return {
        "data": [t.to_dict() for t in items],
        "next_after_id": items[-1].id if has_more else None,
        "has_more": has_more,
    }

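This generates a simple WHERE agent_id = :agent_id AND id > :after_id ORDER BY id LIMIT :limit query. With a composite index on (agent_id, id), the database can seek directly to the first matching row instead of scanning and discarding everything before it, so the query performs consistently regardless of how deep into the dataset you paginate.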
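The same query shape can be tried end to end with the standard library's sqlite3 module. This is a sketch under assumptions: the table layout, the index name ix_tool_calls_agent_id_id, and the helper fetch_keyset_page are all made up for the demonstration, and SQLite stands in for whatever database the API actually uses.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tool_calls ("
    "id INTEGER PRIMARY KEY, agent_id TEXT NOT NULL, name TEXT NOT NULL)"
)
# Composite index matching the filter column followed by the ordering column.
conn.execute("CREATE INDEX ix_tool_calls_agent_id_id ON tool_calls (agent_id, id)")
conn.executemany(
    "INSERT INTO tool_calls (agent_id, name) VALUES (?, ?)",
    [("agent-a" if n % 2 else "agent-b", f"call-{n}") for n in range(1, 21)],
)

def fetch_keyset_page(agent_id: str, after_id: Optional[int], limit: int):
    """Return (ids, next_after_id) using WHERE id > :after_id ORDER BY id."""
    sql = "SELECT id FROM tool_calls WHERE agent_id = ?"
    params: list = [agent_id]
    if after_id is not None:
        sql += " AND id > ?"
        params.append(after_id)
    sql += " ORDER BY id ASC LIMIT ?"
    params.append(limit + 1)          # one extra row to detect another page
    ids = [row[0] for row in conn.execute(sql, params)]
    has_more = len(ids) > limit
    ids = ids[:limit]
    return ids, (ids[-1] if has_more else None)

# Walk every page for one agent.
collected, request_count, after = [], 0, None
while True:
    ids, after = fetch_keyset_page("agent-a", after, limit=4)
    collected.extend(ids)
    request_count += 1
    if after is None:
        break
```

Each request carries only the last id it saw, so the cost of a page does not depend on how many pages came before it.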

Choosing the Right Strategy

Use offset pagination for admin dashboards and internal tools where datasets are small, users need to jump to specific pages, and simplicity is valued over performance.

Use cursor pagination for public APIs consumed by AI agents that iterate through large datasets sequentially. It provides stable results and consistent performance.

Use keyset pagination when you control both the API and the client, your ordering column is indexed and unique, and you need maximum query performance on tables with millions of rows.

FAQ

Can I mix pagination strategies in the same API?

Yes, but be consistent within each resource. For example, use cursor pagination for conversation messages (which are append-heavy and sequentially accessed) and offset pagination for a paginated admin dashboard that needs page jumping. Document the strategy clearly in your OpenAPI spec for each endpoint.

How do I handle filtering with cursor pagination?

Apply filters before cursor conditions. The cursor encodes position within the filtered result set. If a user changes filters mid-pagination, they must start from the beginning with no cursor. Never reuse a cursor from a different filter combination — the underlying position may point to a record that no longer matches the new filter.
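One way to enforce this rule is to store a fingerprint of the active filters inside the cursor and reject the cursor when the fingerprint no longer matches. The sketch below is an assumption about how that could look; the function names and the payload keys pos and f are hypothetical. It is not tamper-proof: a production API would likely sign the token as well.

```python
import base64
import hashlib
import json

def filter_fingerprint(filters: dict) -> str:
    """Stable short hash of the filter set, independent of key order."""
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

def encode_filtered_cursor(position: dict, filters: dict) -> str:
    """Pack the position together with the fingerprint of the filters in use."""
    payload = {"pos": position, "f": filter_fingerprint(filters)}
    return base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()

def decode_filtered_cursor(cursor: str, filters: dict) -> dict:
    """Return the position, or raise ValueError if the filters have changed."""
    payload = json.loads(base64.urlsafe_b64decode(cursor.encode()).decode())
    if payload.get("f") != filter_fingerprint(filters):
        raise ValueError("cursor was issued for a different filter set")
    return payload["pos"]

token = encode_filtered_cursor({"id": 42}, {"status": "failed", "tool": "search"})
# Same filters in a different key order still match.
position = decode_filtered_cursor(token, {"tool": "search", "status": "failed"})

try:
    decode_filtered_cursor(token, {"status": "succeeded"})
    rejected = False
except ValueError:
    rejected = True
```

An endpoint can translate the ValueError into a 400 response that tells the client to restart pagination without a cursor.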

What page size should I default to for AI agent APIs?

Start with 20 to 50 items per page, with a maximum of 100 to 200. AI agents processing data in bulk may benefit from larger pages to reduce HTTP round trips, but excessively large pages increase memory pressure and response latency. Let clients specify the page size via a limit query parameter with a sane default and a hard maximum.
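The trade-off is simple arithmetic. The sketch below counts the requests needed to read a result set at different page sizes and shows one possible way to apply a default and a hard maximum. Both helper names are hypothetical. Note that the endpoints above use Query(..., le=100), which rejects an oversized limit with a validation error; clamping silently, as clamp_limit does, is an alternative design choice.

```python
import math
from typing import Optional

def round_trips(total_records: int, page_size: int) -> int:
    """Number of requests needed to read total_records at a given page size."""
    return math.ceil(total_records / page_size)

def clamp_limit(requested: Optional[int], default: int = 20, maximum: int = 100) -> int:
    """Apply the default when no limit is given and cap it at the hard maximum."""
    if requested is None:
        return default
    return max(1, min(requested, maximum))

# Requests needed to read 10,000 records at three page sizes.
trips = {size: round_trips(10_000, size) for size in (20, 100, 200)}
```

Moving from 20 to 200 items per page cuts the request count for 10,000 records from 500 to 50, at the cost of a response body ten times larger.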

Written by

CallSphere Team