
LangGraph Checkpointing: Persistence and Time Travel for Agent Workflows

Implement persistence and time travel in LangGraph using MemorySaver, SqliteSaver, and PostgresSaver to checkpoint agent state, replay past executions, and recover from failures.

Why Checkpointing Matters

Without checkpointing, a LangGraph workflow is ephemeral. If the process crashes mid-execution, all state is lost and you must start over. Checkpointing solves this by saving a snapshot of the graph state at each step of execution. This enables three capabilities: crash recovery, conversation memory across sessions, and time travel to inspect or replay past states.
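
To make the crash-recovery idea concrete, here is a minimal sketch in plain Python. It is a toy model, not LangGraph's implementation: the `run_with_checkpoints` helper, the step functions, and the list used as a checkpoint store are all hypothetical names introduced for illustration.

```python
from typing import Callable

Step = Callable[[dict], dict]

def run_with_checkpoints(steps: list[Step], state: dict, checkpoints: list[dict]) -> dict:
    """Run steps in order, saving a copy of the state after each one.

    `checkpoints` is a plain list standing in for a checkpointer backend.
    If it already holds entries, execution resumes after the last saved
    step instead of starting over.
    """
    start = len(checkpoints)
    if checkpoints:
        state = dict(checkpoints[-1])  # restore the last saved snapshot
    for step in steps[start:]:
        state = step(state)
        checkpoints.append(dict(state))  # snapshot after every step
    return state

def add_one(state: dict) -> dict:
    return {**state, "n": state["n"] + 1}

def double(state: dict) -> dict:
    return {**state, "n": state["n"] * 2}

def crash(state: dict) -> dict:
    raise RuntimeError("simulated crash")

def subtract_three(state: dict) -> dict:
    return {**state, "n": state["n"] - 3}

checkpoints: list[dict] = []
try:
    run_with_checkpoints([add_one, double, crash], {"n": 1}, checkpoints)
except RuntimeError:
    pass  # the first two snapshots are still in `checkpoints`

# Retry with a working third step; the first two steps are not re-run.
final = run_with_checkpoints([add_one, double, subtract_three], {"n": 1}, checkpoints)
```

The second call ignores its initial state because a snapshot already exists, which is the same trade a real checkpointer makes: saved state wins over fresh input for work that has already completed.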

MemorySaver: In-Memory Checkpointing

The simplest checkpointer stores state in a Python dictionary. It is perfect for development and testing:

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

class State(TypedDict):
    messages: Annotated[list, add_messages]

def echo(state: State) -> dict:
    last = state["messages"][-1].content
    return {"messages": [{"role": "assistant", "content": f"Echo: {last}"}]}

builder = StateGraph(State)
builder.add_node("echo", echo)
builder.add_edge(START, "echo")
builder.add_edge("echo", END)

memory = MemorySaver()
graph = builder.compile(checkpointer=memory)

All state is lost when the process exits. Use this only for development.

Thread IDs for Conversation Isolation

Each conversation gets its own thread ID. This lets multiple users share the same graph instance:

from langchain_core.messages import HumanMessage

# Conversation 1
config1 = {"configurable": {"thread_id": "user-alice"}}
graph.invoke({"messages": [HumanMessage(content="Hi, I'm Alice")]}, config1)
graph.invoke({"messages": [HumanMessage(content="What's my name?")]}, config1)

# Conversation 2 — completely isolated
config2 = {"configurable": {"thread_id": "user-bob"}}
graph.invoke({"messages": [HumanMessage(content="Hi, I'm Bob")]}, config2)

Each thread maintains its own state history. Alice and Bob never see each other's messages.
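
Conceptually, the thread ID acts as a key into the checkpoint store. The sketch below models that with a dictionary; `ToyThreadStore` is a hypothetical class for illustration only and does not reflect how LangGraph's savers are implemented. It reuses the same `{"configurable": {"thread_id": ...}}` config shape shown above.

```python
class ToyThreadStore:
    """Keeps a separate message list per thread ID."""

    def __init__(self) -> None:
        self._threads: dict[str, list[str]] = {}

    def append(self, config: dict, message: str) -> None:
        thread_id = config["configurable"]["thread_id"]
        self._threads.setdefault(thread_id, []).append(message)

    def messages(self, config: dict) -> list[str]:
        thread_id = config["configurable"]["thread_id"]
        # Return a copy so callers cannot mutate stored history.
        return list(self._threads.get(thread_id, []))

store = ToyThreadStore()
alice = {"configurable": {"thread_id": "user-alice"}}
bob = {"configurable": {"thread_id": "user-bob"}}

store.append(alice, "Hi, I'm Alice")
store.append(alice, "What's my name?")
store.append(bob, "Hi, I'm Bob")
```

Because every read and write goes through the thread ID, one store (or one compiled graph) can serve many conversations without their histories mixing.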

SqliteSaver: Persistent Local Storage

For persistence that survives process restarts, use the SQLite checkpointer:

from langgraph.checkpoint.sqlite import SqliteSaver
import sqlite3

conn = sqlite3.connect("checkpoints.db", check_same_thread=False)
sqlite_saver = SqliteSaver(conn)

graph = builder.compile(checkpointer=sqlite_saver)

# State persists to disk
config = {"configurable": {"thread_id": "persistent-thread"}}
graph.invoke({"messages": [HumanMessage(content="Remember this")]}, config)

# Later, even after restart, the conversation continues
result = graph.invoke(
    {"messages": [HumanMessage(content="What did I say?")]},
    config,
)

The SQLite file contains the full state history for every thread, including all intermediate checkpoints.


PostgresSaver: Production Persistence

For production deployments, use PostgreSQL:

from langgraph.checkpoint.postgres import PostgresSaver

DB_URI = "postgresql://user:password@localhost:5432/langgraph_db"

with PostgresSaver.from_conn_string(DB_URI) as pg_saver:
    pg_saver.setup()  # Creates tables on first run
    graph = builder.compile(checkpointer=pg_saver)

    config = {"configurable": {"thread_id": "prod-session-123"}}
    result = graph.invoke(
        {"messages": [HumanMessage(content="Process this order")]},
        config,
    )

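PostgresSaver stores checkpoints in ordinary PostgreSQL tables, so concurrent access and transactional writes are handled by the database. The from_conn_string helper shown here opens a single connection; if you need connection pooling, check the documentation of your installed version for how to pass a pool to the saver. Call setup() once to create the required checkpoint tables.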
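
In a real deployment you would usually not hard-code credentials in DB_URI. The following is a minimal sketch of assembling the connection string from environment variables. The `build_db_uri` helper is hypothetical; the variable names follow the common libpq convention (PGUSER, PGPASSWORD, and so on), and the defaults are assumptions chosen to match the example above.

```python
import os
from collections.abc import Mapping
from urllib.parse import quote

def build_db_uri(env: Mapping[str, str] = os.environ) -> str:
    """Assemble a PostgreSQL URI, percent-encoding the credentials."""
    user = quote(env["PGUSER"], safe="")
    password = quote(env["PGPASSWORD"], safe="")
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = env.get("PGDATABASE", "langgraph_db")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"

# Passing a dict keeps the example independent of the real environment.
example_uri = build_db_uri({"PGUSER": "app", "PGPASSWORD": "p@ss/word"})
```

Percent-encoding matters because characters such as `@` or `/` in a password would otherwise be parsed as part of the URI structure.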

Time Travel: Inspecting Past States

Each step of graph execution creates a checkpoint. You can list and inspect all checkpoints for a thread; the history is returned newest first:

config = {"configurable": {"thread_id": "my-thread"}}

# Get current state
current = graph.get_state(config)
print("Current messages:", len(current.values["messages"]))

# List all checkpoints (state history)
history = list(graph.get_state_history(config))
for i, state in enumerate(history):
    print(f"Checkpoint {i}: {len(state.values['messages'])} messages")
    # metadata["source"] describes how the checkpoint arose (for example input or loop), not which node ran
    print(f"  Source: {state.metadata.get('source', 'unknown')}")

Replaying from a Past Checkpoint

You can resume execution from any historical checkpoint by providing its ID:

# History is ordered newest first, so index 2 is two checkpoints before the latest
history = list(graph.get_state_history(config))
past_state = history[2]  # Go back two steps (assumes at least three checkpoints exist)

# Resume from that point with new input
past_config = {
    "configurable": {
        "thread_id": "my-thread",
        "checkpoint_id": past_state.config["configurable"]["checkpoint_id"],
    }
}

result = graph.invoke(
    {"messages": [HumanMessage(content="Try a different approach")]},
    past_config,
)

This creates a new branch in the state history. The original checkpoints remain untouched, giving you a full audit trail of every execution path.
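
The branching behavior can be pictured as a tree of checkpoints linked by parent pointers. The sketch below is a toy model under that assumption; `ToyCheckpoint` and `ToyHistory` are hypothetical names, and the real storage format is not shown here.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

@dataclass(frozen=True)
class ToyCheckpoint:
    checkpoint_id: int
    parent_id: Optional[int]
    messages: tuple[str, ...]

class ToyHistory:
    """Append-only checkpoint tree: committing never alters older entries."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.checkpoints: dict[int, ToyCheckpoint] = {}

    def commit(self, parent_id: Optional[int], message: str) -> ToyCheckpoint:
        parent = self.checkpoints.get(parent_id) if parent_id is not None else None
        base = parent.messages if parent else ()
        checkpoint = ToyCheckpoint(next(self._ids), parent_id, base + (message,))
        self.checkpoints[checkpoint.checkpoint_id] = checkpoint
        return checkpoint

history_tree = ToyHistory()
first = history_tree.commit(None, "plan A: step 1")
second = history_tree.commit(first.checkpoint_id, "plan A: step 2")

# Resuming from `first` adds a sibling of `second` rather than replacing it.
branch = history_tree.commit(first.checkpoint_id, "plan B: step 2")
```

Both `second` and `branch` share `first` as their parent, so either path can be inspected later.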

FAQ

Does checkpointing add significant overhead?

MemorySaver adds negligible overhead. SqliteSaver and PostgresSaver add serialization and I/O time that grows with state size. For typical chat agents with dozens of messages, the per-checkpoint cost is usually small compared with a model call, but measure it for your own workload. For agents with very large state objects, keep the state lean and store bulk data externally.
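
One way to keep state lean is to write large payloads to external storage and keep only a small reference in the graph state. This is a minimal sketch using the local filesystem; the `offload` and `load` helpers and the `report_ref` key are hypothetical, and a production system might use object storage instead.

```python
import hashlib
import tempfile
from pathlib import Path

def offload(payload: bytes, directory: Path) -> dict:
    """Write payload to disk and return a small reference to it."""
    digest = hashlib.sha256(payload).hexdigest()
    path = directory / f"{digest}.bin"
    path.write_bytes(payload)
    return {"path": str(path), "sha256": digest, "size": len(payload)}

def load(ref: dict) -> bytes:
    """Read the payload back and verify it against the recorded hash."""
    data = Path(ref["path"]).read_bytes()
    if hashlib.sha256(data).hexdigest() != ref["sha256"]:
        raise ValueError("stored payload does not match its recorded hash")
    return data

blob_dir = Path(tempfile.mkdtemp())
report = b"x" * 1_000_000  # stand-in for a large document or dataset

# Only the reference goes into the checkpointed state, not the megabyte itself.
state = {"messages": [], "report_ref": offload(report, blob_dir)}
```

Each checkpoint then serializes a few hundred bytes of reference data regardless of how large the payload is.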

Can I delete old checkpoints to save storage?

There is no built-in pruning API in the core library. For PostgresSaver, you can write SQL queries to delete checkpoints older than a retention period. For SqliteSaver, you can run a cleanup job against the database file directly.
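
A cleanup job for SQLite could look like the sketch below, which keeps only the newest few checkpoints per thread. It assumes a table named `checkpoints` with `thread_id` and `checkpoint_id` columns whose IDs sort chronologically; that is an assumption, so inspect the schema created by your installed saver version (including any related tables that also need cleaning) before running anything like this. The `prune_thread` helper is hypothetical.

```python
import sqlite3

def prune_thread(conn: sqlite3.Connection, thread_id: str, keep: int) -> int:
    """Delete all but the newest `keep` checkpoints of one thread."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            """
            DELETE FROM checkpoints
            WHERE thread_id = ?
              AND checkpoint_id NOT IN (
                  SELECT checkpoint_id FROM checkpoints
                  WHERE thread_id = ?
                  ORDER BY checkpoint_id DESC
                  LIMIT ?
              )
            """,
            (thread_id, thread_id, keep),
        )
    return cursor.rowcount

# Demo against an in-memory table with the assumed (simplified) schema.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE checkpoints (thread_id TEXT, checkpoint_id TEXT, checkpoint BLOB)"
)
rows = [("t1", f"{i:04d}", b"") for i in range(10)] + [("t2", "0001", b"")]
demo_conn.executemany("INSERT INTO checkpoints VALUES (?, ?, ?)", rows)
demo_conn.commit()

deleted = prune_thread(demo_conn, "t1", keep=3)
remaining = [
    r[0]
    for r in demo_conn.execute(
        "SELECT checkpoint_id FROM checkpoints WHERE thread_id = 't1' ORDER BY checkpoint_id"
    )
]
```

Pruning by count rather than by age avoids depending on a timestamp column, which the assumed schema does not have. Run it against a backup first.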

Is the checkpoint format portable between saver backends?

Not directly. Each saver stores checkpoints in its own schema, so you cannot copy checkpoints from a SQLite file into PostgreSQL as-is. To migrate, read state from one saver and write it to the other programmatically.


Written by

CallSphere Team
