
SQLiteSession: Building Persistent Conversations for AI Agents

Learn how to use SQLiteSession in the OpenAI Agents SDK to build persistent multi-turn conversations with automatic history retrieval, storage, and session limits.

Why Agent Memory Matters

Every meaningful conversation depends on memory. When a user asks your AI agent "What did I just say?" or "Can you change the second item?", the agent needs access to the conversation history. Without persistence, every interaction starts from zero — the agent has no idea who it is talking to or what has been discussed.

The OpenAI Agents SDK solves this with sessions — pluggable backends that store and retrieve conversation history automatically. The simplest and most portable option is SQLiteSession, which uses SQLite as the storage engine.

SQLiteSession Basics

SQLiteSession comes built into the OpenAI Agents SDK. It supports two modes: in-memory (for testing and ephemeral conversations) and file-based (for true persistence across process restarts).


In-Memory Sessions

An in-memory session lives only as long as the Python process. It is perfect for unit tests and short-lived scripts where you need multi-turn behavior but do not need data to survive a restart.

from agents import Agent, Runner, SQLiteSession

# In-memory session: no db_path argument, so history lives only in this process
session = SQLiteSession("demo-conversation")

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant. Remember what the user tells you.",
)

async def chat(user_message: str) -> str:
    result = await Runner.run(agent, user_message, session=session)
    return result.final_output

File-Based Sessions

For real persistence, pass a file path as the second argument to SQLiteSession. The database file is created automatically if it does not exist.

from agents import SQLiteSession

# File-based session: survives process restarts
session = SQLiteSession("user-123", "./conversations.db")

That single change means your agent remembers conversations across restarts, deployments, and even server migrations (just copy the .db file).
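Because the whole store is a single file, backups are one step. A minimal sketch using the standard library's online-backup API (the file names here are just examples):

```python
import sqlite3

# Copy a live SQLite database safely, even while the agent is using it
src = sqlite3.connect("./conversations.db")
dst = sqlite3.connect("./conversations-backup.db")
with dst:
    src.backup(dst)  # page-by-page online backup
src.close()
dst.close()
```

Unlike a plain file copy, Connection.backup() is safe to run while another process holds the database open.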

Automatic History Retrieval and Storage

The key design principle of sessions in the Agents SDK is that they are transparent. You do not need to manually load history before a run or save it after. The runner handles both automatically.


When you call Runner.run() with a session:

  1. Before the run: The runner calls session.get_items() to load all prior conversation items.
  2. During the run: The agent sees the full history as context and generates a response.
  3. After the run: The runner calls session.add_items(new_items) to persist the new turn.

import asyncio
from agents import Agent, Runner, SQLiteSession

# The conversation id is bound to the session object itself
session = SQLiteSession("user-123-conversation-1", "./my_agent.db")

agent = Agent(
    name="MemoryBot",
    instructions="You remember everything the user tells you. When asked to recall, be specific.",
)

async def main():
    # Turn 1
    result = await Runner.run(agent, "My favorite color is blue.", session=session)
    print(result.final_output)

    # Turn 2: the agent automatically sees Turn 1
    result = await Runner.run(agent, "What is my favorite color?", session=session)
    print(result.final_output)  # "Your favorite color is blue."

asyncio.run(main())

No manual history threading. No message array management. The session handles it.

Multi-Turn Conversation Example

Let us build a more realistic example — a travel planning agent that accumulates preferences over multiple turns.

import asyncio
from agents import Agent, Runner, SQLiteSession

session = SQLiteSession("trip-planning-session-42", "./travel_planner.db")

travel_agent = Agent(
    name="TravelPlanner",
    instructions="""You are a travel planning assistant. As the user shares preferences,
    build up a mental model of their ideal trip. Summarize what you know when asked.
    Be specific about dates, budget, and destinations mentioned.""",
)

async def multi_turn_demo():
    turns = [
        "I want to visit Japan in October.",
        "My budget is around $3000 for flights and hotels.",
        "I love hiking and traditional temples.",
        "Can you summarize what you know about my trip so far?",
    ]

    for message in turns:
        print(f"User: {message}")
        result = await Runner.run(travel_agent, message, session=session)
        print(f"Agent: {result.final_output}\n")

asyncio.run(multi_turn_demo())

Each turn builds on the previous ones. The fourth message triggers a summary that references Japan, October, the $3000 budget, and the hiking and temples preferences — all pulled from the session automatically.

Limiting History with the limit Parameter

Long conversations accumulate tokens fast. Every session backend exposes get_items(), whose limit parameter caps the number of items retrieved: pass limit=N to load only the N most recent items instead of the whole history.

from agents import Agent, Runner, SQLiteSession

agent = Agent(name="Assistant", instructions="You are a helpful assistant.")
session = SQLiteSession("user-456", "./conversations.db")

# Retrieve only the 20 most recent items instead of the full history
recent_items = await session.get_items(limit=20)

# Run the agent on the trimmed context plus the new user message
result = await Runner.run(
    agent,
    recent_items + [{"role": "user", "content": "What were we discussing?"}],
)

This is critical for production systems where conversations can span hundreds of turns. Without a limit, you risk exceeding the model's context window or paying for unnecessary input tokens.

Choosing the Right Limit

Scenario                         Recommended limit
Quick Q&A bot                    10–20 items
Customer support agent           30–50 items
Long-running project assistant   50–100 items
Unlimited context                No limit; pair with history compaction
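When you cap retrieved history, one common refinement is to keep any leading system item while cutting down to the most recent N items, so the agent's standing instructions survive the trim. A plain-Python sketch (not an SDK feature):

```python
def trim_history(items: list[dict], limit: int) -> list[dict]:
    """Keep the most recent `limit` items, preserving a leading system item."""
    if len(items) <= limit:
        return items
    # Preserve the system preamble, if any, then fill the rest from the tail
    head = items[:1] if items and items[0].get("role") == "system" else []
    return head + items[len(items) - (limit - len(head)):]

history = [{"role": "system", "content": "Be concise."}] + [
    {"role": "user", "content": f"msg {i}"} for i in range(10)
]
trimmed = trim_history(history, limit=4)
print([m["content"] for m in trimmed])  # ['Be concise.', 'msg 7', 'msg 8', 'msg 9']
```

The same idea generalizes to keeping the first k items plus the last N-k, or to summarizing the dropped middle before discarding it.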

Full Working Chatbot Example

Here is a complete, production-style chatbot that combines SQLiteSession file persistence with proper async handling.

import asyncio
import uuid
from agents import Agent, Runner, SQLiteSession

DB_PATH = "./chatbot_sessions.db"

assistant = Agent(
    name="ChatBot",
    instructions="""You are a friendly and helpful conversational assistant.
    You remember the user's name, preferences, and prior requests.
    When the user returns to a topic discussed earlier, reference it naturally.
    Keep responses concise but warm.""",
)

def new_session() -> SQLiteSession:
    """Start a fresh conversation bound to the shared database file."""
    return SQLiteSession(str(uuid.uuid4()), DB_PATH)

async def handle_message(session: SQLiteSession, user_input: str) -> str:
    """Process a single user message and return the agent response."""
    result = await Runner.run(assistant, user_input, session=session)
    return result.final_output

async def main():
    print("ChatBot ready. Type 'quit' to exit, 'new' for a new session.\n")
    session = new_session()
    print(f"Session: {session.session_id}\n")

    while True:
        user_input = input("You: ").strip()
        if user_input.lower() == "quit":
            break
        if user_input.lower() == "new":
            session = new_session()
            print(f"\nNew session: {session.session_id}\n")
            continue
        if not user_input:
            continue

        response = await handle_message(session, user_input)
        print(f"Bot: {response}\n")

asyncio.run(main())

What Happens Under the Hood

  1. The user types a message.
  2. Runner.run() calls session.get_items(), which runs a SELECT against the SQLite file to load this session's prior items in insertion order.
  3. The retrieved items are prepended to the conversation context.
  4. The agent generates a response using the full context.
  5. The new user message and agent response are persisted via session.add_items().
  6. On the next turn, the cycle repeats with the updated history.
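The load/persist cycle above is easy to picture with plain sqlite3. The schema below is illustrative only (the SDK's real tables differ); it just mirrors steps 2 and 5 in miniature:

```python
import json
import sqlite3

class MiniSession:
    """Toy session store sketching the load/persist cycle.

    Illustrative schema only; not the SDK's actual implementation.
    """

    def __init__(self, session_id: str, db_path: str = ":memory:"):
        self.session_id = session_id
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS items (session_id TEXT, item_json TEXT)"
        )

    def add_items(self, items: list[dict]) -> None:
        # Step 5: persist the new turn as JSON rows
        self.conn.executemany(
            "INSERT INTO items VALUES (?, ?)",
            [(self.session_id, json.dumps(i)) for i in items],
        )
        self.conn.commit()

    def get_items(self, limit=None) -> list[dict]:
        # Step 2: load prior items in insertion order, optionally capped
        rows = self.conn.execute(
            "SELECT item_json FROM items WHERE session_id = ? ORDER BY rowid",
            (self.session_id,),
        ).fetchall()
        if limit is not None:
            rows = rows[-limit:]  # keep only the most recent items
        return [json.loads(r[0]) for r in rows]

s = MiniSession("demo")
s.add_items([{"role": "user", "content": "hi"},
             {"role": "assistant", "content": "hello"}])
print(len(s.get_items()))         # 2
print(len(s.get_items(limit=1)))  # 1
```

Storing each item as a JSON blob keyed by session_id is what makes the backend conversation-agnostic: the store never needs to understand message structure, only ordering.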

When to Use SQLiteSession

SQLiteSession is the right choice when:

  • You are building a single-server application or CLI tool
  • You want zero-dependency persistence (SQLite is built into Python)
  • You need a quick prototype with real persistence
  • Your conversations are bound to a single process or machine
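The zero-dependency point is easy to check: the sqlite3 module ships with CPython's standard library, so nothing extra needs to be installed.

```python
import sqlite3

# sqlite3 is part of CPython's standard library; no install required
print(sqlite3.sqlite_version)  # version of the bundled SQLite engine
```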

For distributed systems where multiple workers need access to the same sessions, look at RedisSession or SQLAlchemySession instead. But for a remarkable number of use cases — personal assistants, development tools, local chatbots, and MVP products — SQLiteSession is all you need.


Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.

