
SQLiteSession: Building Persistent Conversations for AI Agents

Learn how to use SQLiteSession in the OpenAI Agents SDK to build persistent multi-turn conversations with automatic history retrieval, storage, and session limits.

Why Agent Memory Matters

Every meaningful conversation depends on memory. When a user asks your AI agent "What did I just say?" or "Can you change the second item?", the agent needs access to the conversation history. Without persistence, every interaction starts from zero — the agent has no idea who it is talking to or what has been discussed.

The OpenAI Agents SDK solves this with sessions — pluggable backends that store and retrieve conversation history automatically. The simplest and most portable option is SQLiteSession, which uses SQLite as the storage engine.

SQLiteSession Basics

SQLiteSession comes built into the OpenAI Agents SDK. It supports two modes: in-memory (for testing and ephemeral conversations) and file-based (for true persistence across process restarts).


In-Memory Sessions

An in-memory session lives only as long as the Python process. It is perfect for unit tests and short-lived scripts where you need multi-turn behavior but do not need data to survive a restart.

from agents import Agent, Runner, SQLiteSession

# In-memory session: no db_path argument, so history lives only in this process
session = SQLiteSession("demo-conversation")

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant. Remember what the user tells you.",
)

async def chat(user_message: str) -> str:
    result = await Runner.run(agent, user_message, session=session)
    return result.final_output

File-Based Sessions

For real persistence, pass a file path as the second argument to SQLiteSession. The database file is created automatically if it does not exist.

from agents import SQLiteSession

# File-based session: survives process restarts
session = SQLiteSession("user-123", "./conversations.db")

That single change means your agent remembers conversations across restarts, deployments, and even server migrations (just copy the .db file).
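Because the whole store is a single file, backups are one step. A minimal sketch using the standard library's online-backup API (the file names here are just examples):

```python
import sqlite3

# Copy a live SQLite database safely, even while the agent is using it
src = sqlite3.connect("./conversations.db")
dst = sqlite3.connect("./conversations-backup.db")
with dst:
    src.backup(dst)  # page-by-page online backup
src.close()
dst.close()
```

Unlike a plain file copy, Connection.backup() is safe to run while another process holds the database open.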

Automatic History Retrieval and Storage

The key design principle of sessions in the Agents SDK is that they are transparent. You do not need to manually load history before a run or save it after. The runner handles both automatically.


When you call Runner.run() with a session:

  1. Before the run: The runner calls session.get_items() to load all prior conversation items.
  2. During the run: The agent sees the full history as context and generates a response.
  3. After the run: The runner calls session.add_items(new_items) to persist the new turn.

import asyncio
from agents import Agent, Runner, SQLiteSession

# The conversation id is bound to the session object itself
session = SQLiteSession("user-123-conversation-1", "./my_agent.db")

agent = Agent(
    name="MemoryBot",
    instructions="You remember everything the user tells you. When asked to recall, be specific.",
)

async def main():
    # Turn 1
    result = await Runner.run(agent, "My favorite color is blue.", session=session)
    print(result.final_output)

    # Turn 2: the agent automatically sees Turn 1
    result = await Runner.run(agent, "What is my favorite color?", session=session)
    print(result.final_output)  # "Your favorite color is blue."

asyncio.run(main())

No manual history threading. No message array management. The session handles it.

Multi-Turn Conversation Example

Let us build a more realistic example — a travel planning agent that accumulates preferences over multiple turns.

import asyncio
from agents import Agent, Runner, SQLiteSession

session = SQLiteSession("trip-planning-session-42", "./travel_planner.db")

travel_agent = Agent(
    name="TravelPlanner",
    instructions="""You are a travel planning assistant. As the user shares preferences,
    build up a mental model of their ideal trip. Summarize what you know when asked.
    Be specific about dates, budget, and destinations mentioned.""",
)

async def multi_turn_demo():
    turns = [
        "I want to visit Japan in October.",
        "My budget is around $3000 for flights and hotels.",
        "I love hiking and traditional temples.",
        "Can you summarize what you know about my trip so far?",
    ]

    for message in turns:
        print(f"User: {message}")
        result = await Runner.run(travel_agent, message, session=session)
        print(f"Agent: {result.final_output}\n")

asyncio.run(multi_turn_demo())

Each turn builds on the previous ones. The fourth message triggers a summary that references Japan, October, the $3000 budget, and the hiking and temples preferences — all pulled from the session automatically.

Limiting History with the limit Parameter

Long conversations accumulate tokens fast. Every session backend exposes get_items(), whose limit parameter caps the number of items retrieved: pass limit=N to load only the N most recent items instead of the whole history.

from agents import Agent, Runner, SQLiteSession

agent = Agent(name="Assistant", instructions="You are a helpful assistant.")
session = SQLiteSession("user-456", "./conversations.db")

# Retrieve only the 20 most recent items instead of the full history
recent_items = await session.get_items(limit=20)

# Run the agent on the trimmed context plus the new user message
result = await Runner.run(
    agent,
    recent_items + [{"role": "user", "content": "What were we discussing?"}],
)

This is critical for production systems where conversations can span hundreds of turns. Without a limit, you risk exceeding the model's context window or paying for unnecessary input tokens.

Choosing the Right Limit

Scenario                         Recommended limit
Quick Q&A bot                    10–20 items
Customer support agent           30–50 items
Long-running project assistant   50–100 items
Unlimited context                No limit; pair with history compaction
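When you cap retrieved history, one common refinement is to keep any leading system item while cutting down to the most recent N items, so the agent's standing instructions survive the trim. A plain-Python sketch (not an SDK feature):

```python
def trim_history(items: list[dict], limit: int) -> list[dict]:
    """Keep the most recent `limit` items, preserving a leading system item."""
    if len(items) <= limit:
        return items
    # Preserve the system preamble, if any, then fill the rest from the tail
    head = items[:1] if items and items[0].get("role") == "system" else []
    return head + items[len(items) - (limit - len(head)):]

history = [{"role": "system", "content": "Be concise."}] + [
    {"role": "user", "content": f"msg {i}"} for i in range(10)
]
trimmed = trim_history(history, limit=4)
print([m["content"] for m in trimmed])  # ['Be concise.', 'msg 7', 'msg 8', 'msg 9']
```

The same idea generalizes to keeping the first k items plus the last N-k, or to summarizing the dropped middle before discarding it.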

Full Working Chatbot Example

Here is a complete, production-style chatbot that combines SQLiteSession file persistence with proper async handling.

import asyncio
import uuid
from agents import Agent, Runner, SQLiteSession

DB_PATH = "./chatbot_sessions.db"

assistant = Agent(
    name="ChatBot",
    instructions="""You are a friendly and helpful conversational assistant.
    You remember the user's name, preferences, and prior requests.
    When the user returns to a topic discussed earlier, reference it naturally.
    Keep responses concise but warm.""",
)

def new_session() -> SQLiteSession:
    """Start a fresh conversation bound to the shared database file."""
    return SQLiteSession(str(uuid.uuid4()), DB_PATH)

async def handle_message(session: SQLiteSession, user_input: str) -> str:
    """Process a single user message and return the agent response."""
    result = await Runner.run(assistant, user_input, session=session)
    return result.final_output

async def main():
    print("ChatBot ready. Type 'quit' to exit, 'new' for a new session.\n")
    session = new_session()
    print(f"Session: {session.session_id}\n")

    while True:
        user_input = input("You: ").strip()
        if user_input.lower() == "quit":
            break
        if user_input.lower() == "new":
            session = new_session()
            print(f"\nNew session: {session.session_id}\n")
            continue
        if not user_input:
            continue

        response = await handle_message(session, user_input)
        print(f"Bot: {response}\n")

asyncio.run(main())

What Happens Under the Hood

  1. The user types a message.
  2. Runner.run() calls session.get_items(), which runs a SELECT against the SQLite file to load this session's prior items in insertion order.
  3. The retrieved items are prepended to the conversation context.
  4. The agent generates a response using the full context.
  5. The new user message and agent response are persisted via session.add_items().
  6. On the next turn, the cycle repeats with the updated history.
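The load/persist cycle above is easy to picture with plain sqlite3. The schema below is illustrative only (the SDK's real tables differ); it just mirrors steps 2 and 5 in miniature:

```python
import json
import sqlite3

class MiniSession:
    """Toy session store sketching the load/persist cycle.

    Illustrative schema only; not the SDK's actual implementation.
    """

    def __init__(self, session_id: str, db_path: str = ":memory:"):
        self.session_id = session_id
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS items (session_id TEXT, item_json TEXT)"
        )

    def add_items(self, items: list[dict]) -> None:
        # Step 5: persist the new turn as JSON rows
        self.conn.executemany(
            "INSERT INTO items VALUES (?, ?)",
            [(self.session_id, json.dumps(i)) for i in items],
        )
        self.conn.commit()

    def get_items(self, limit=None) -> list[dict]:
        # Step 2: load prior items in insertion order, optionally capped
        rows = self.conn.execute(
            "SELECT item_json FROM items WHERE session_id = ? ORDER BY rowid",
            (self.session_id,),
        ).fetchall()
        if limit is not None:
            rows = rows[-limit:]  # keep only the most recent items
        return [json.loads(r[0]) for r in rows]

s = MiniSession("demo")
s.add_items([{"role": "user", "content": "hi"},
             {"role": "assistant", "content": "hello"}])
print(len(s.get_items()))         # 2
print(len(s.get_items(limit=1)))  # 1
```

Storing each item as a JSON blob keyed by session_id is what makes the backend conversation-agnostic: the store never needs to understand message structure, only ordering.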

When to Use SQLiteSession

SQLiteSession is the right choice when:

  • You are building a single-server application or CLI tool
  • You want zero-dependency persistence (SQLite is built into Python)
  • You need a quick prototype with real persistence
  • Your conversations are bound to a single process or machine
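The zero-dependency point is easy to check: the sqlite3 module ships with CPython's standard library, so nothing extra needs to be installed.

```python
import sqlite3

# sqlite3 is part of CPython's standard library; no install required
print(sqlite3.sqlite_version)  # version of the bundled SQLite engine
```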

For distributed systems where multiple workers need access to the same sessions, look at RedisSession or SQLAlchemySession instead. But for a remarkable number of use cases — personal assistants, development tools, local chatbots, and MVP products — SQLiteSession is all you need.


Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.

