---
title: "Time-Decay Memory for Chat Agents: Ebbinghaus Curves in Practice"
description: "Good agent memory needs to forget. Time-decay weights recent memories higher; Ebbinghaus-style curves auto-evict stale entries; TTL tiers keep allergies forever and small-talk for an hour."
canonical: https://callsphere.ai/blog/vw6g-time-decay-memory-chat-agents-2026
category: "Agentic AI"
tags: ["Memory", "Time Decay", "Forgetting", "Chat Agents", "Ebbinghaus"]
author: "CallSphere Team"
published: 2026-04-19T00:00:00.000Z
updated: 2026-05-07T16:46:12.821Z
---

# Time-Decay Memory for Chat Agents: Ebbinghaus Curves in Practice

> **TL;DR** — Agents that never forget end up flooded with stale, irrelevant context. Time-decay memory weights recent memories higher (exponential decay on recency), uses TTL tiers for category-specific lifetimes ("dietary allergies" = forever; "today's mood" = 24 hours), and auto-evicts low-utility entries. The 2026 best-of-class agents use Ebbinghaus-curve decay with reinforcement on recall.

## The technique

Naive memory: dump every turn into a vector store, retrieve top-K each time. Three failures: (1) stale facts (the user moved cities a year ago); (2) salience inversion (the agent prefers a single vivid memory over a more recent contradicting one); (3) cost (memory grows without bound).

Time-decay memory multiplies semantic similarity by a recency function: `score = sim * exp(-lambda * age)`. `lambda` sets the half-life (`half_life = ln(2) / lambda`): small `lambda` for stable facts, large `lambda` for volatile state.

Ebbinghaus-curve memory goes further: each memory has a continuous decay rate. Successful recalls *reinforce* the memory (push the curve out); unused memories decay and eventually evict.
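The half-life relation makes `lambda` easy to pick. A minimal sketch (function names are illustrative, not a library API):

```python
import math

def decay_score(sim: float, lam: float, age_seconds: float) -> float:
    """Recency-weighted score: similarity times exponential decay."""
    return sim * math.exp(-lam * age_seconds)

def lambda_for_half_life(half_life_seconds: float) -> float:
    """Choose lambda so the score halves after `half_life_seconds`."""
    return math.log(2) / half_life_seconds

# A 7-day half-life: an identical match from a week ago scores half as much.
lam = lambda_for_half_life(7 * 86400)
fresh = decay_score(0.9, lam, 0)             # 0.9
week_old = decay_score(0.9, lam, 7 * 86400)  # 0.45
```

Reinforcement then amounts to shrinking `lam` on each successful recall, which is equivalent to pushing the half-life out.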

```mermaid
flowchart LR
  T[New turn] --> EX[Extract facts]
  EX --> TT{TTL tier}
  TT -->|allergy| INF[Infinite TTL]
  TT -->|preference| LONG[1y TTL]
  TT -->|context| SHORT[7d TTL]
  TT -->|chat-only| SES[session]
  INF --> S[(Memory store)]
  LONG --> S
  SHORT --> S
  Q[Query] --> R[Retrieve]
  R --> SC[score = sim * exp -lambda*age]
  SC --> RE[Reinforce on hit]
  RE --> S
```

## How it works

Each memory entry: `{ id, text, embedding, created_at, last_accessed_at, ttl_tier, decay_lambda, hit_count }`. At write time, an LLM tags the fact with a TTL tier (immutable / long / short / session) and an initial `decay_lambda`. At retrieval, the score is `cos(q, m.embedding) * exp(-m.decay_lambda * age)`, where `age = now - m.last_accessed_at` measured in the same units as `decay_lambda` (per-day rates are a sensible default). On a hit, `last_accessed_at` resets, `hit_count` increments, and `decay_lambda` decreases (the memory hardens). A nightly job evicts entries where `exp(-decay_lambda * age) < 0.05` and `hit_count == 0`.
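The entry schema and the retrieval-time math can be sketched as follows (field and function names are illustrative; `decay_lambda` is a per-day rate):

```python
import math
from dataclasses import dataclass

@dataclass
class Memory:
    id: str
    text: str
    ttl_tier: str          # immutable / long / short / session
    created_at: float      # unix seconds
    last_accessed_at: float
    decay_lambda: float    # per-day decay rate; 0 = immutable
    hit_count: int = 0

def effective_score(sim: float, m: Memory, now: float) -> float:
    """Similarity discounted by time since last access."""
    age_days = (now - m.last_accessed_at) / 86400
    return sim * math.exp(-m.decay_lambda * age_days)

def reinforce(m: Memory, now: float, hardening: float = 0.9) -> None:
    """A hit resets recency and flattens the decay curve."""
    m.last_accessed_at = now
    m.hit_count += 1
    m.decay_lambda *= hardening
```

With `decay_lambda = 0.05` and a 30-day-old memory, the multiplier is `exp(-1.5) ≈ 0.22` — a perfect semantic match still loses to a fresher, slightly weaker one.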

## CallSphere implementation

Every CallSphere voice/chat agent runs time-decay memory:

- **Allergies + insurance numbers** in Healthcare = infinite TTL
- **Preferred broker / preferred school district** in OneRoof = 1-year TTL
- **Last 5 ticket subjects** in UrackIT IT helpdesk = 30-day TTL
- **Mood, current task, in-call context** = session-only

Decay parameters live per vertical. Healthcare's medication-allergy memory has `lambda` = 0 (immutable). Real-estate buyer urgency ("we want to close in 30 days") has `lambda` = 0.05/day so it fades after the buying window.
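Per-vertical tuning can live in a plain config table. A sketch with the values from the text (the keys and defaults are illustrative, not CallSphere's actual schema):

```python
# Per-day decay rates keyed by (vertical, fact category).
# lambda = 0 means immutable; other values are examples.
DECAY_CONFIG = {
    ("healthcare", "medication_allergy"): 0.0,   # never fades
    ("real_estate", "buyer_urgency"):     0.05,  # ~2-week half-life
    ("it_helpdesk", "ticket_subject"):    0.02,
}

DEFAULT_LAMBDA = 0.01  # fallback for untuned categories

def lambda_for(vertical: str, category: str) -> float:
    return DECAY_CONFIG.get((vertical, category), DEFAULT_LAMBDA)
```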

37 agents · 90+ tools · 115+ DB tables · 6 verticals. **$149/$499/$1499**, [14-day trial](/trial), [22% affiliate](/affiliate). See multi-turn memory at work on [/demo](/demo).

## Build steps with code

```python
import math, time

# (decay_lambda per DAY, hard TTL in seconds)
TTL_TIERS = {
    "immutable": (0.0, None),          # never evict, no decay
    "long":      (0.001, 365*86400),   # 1 year
    "short":     (0.01, 30*86400),     # 30 days
    "session":   (0.1, 86400),         # 24 hours
}

def write_memory(text):
    tier = classify_ttl(text)        # LLM call: returns one of TTL_TIERS keys
    lam, ttl = TTL_TIERS[tier]
    db.insert("memory", {
        "text": text, "embedding": embed(text),
        "created_at": time.time(), "last_accessed_at": time.time(),
        "ttl_tier": tier, "decay_lambda": lam, "hit_count": 0,
        "expires_at": time.time() + ttl if ttl else None,  # hard TTL floor
    })

def retrieve(q, top_k=5):
    cands = vector_search(embed(q), k=50)
    now = time.time()
    scored = [
        # age in days, to match the per-day decay_lambda
        (m, m.cos_sim * math.exp(-m.decay_lambda * (now - m.last_accessed_at) / 86400))
        for m in cands
    ]
    top = sorted(scored, key=lambda x: -x[1])[:top_k]
    for m, _ in top:
        db.update("memory", m.id, {
            "last_accessed_at": now,
            "hit_count": m.hit_count + 1,
            "decay_lambda": m.decay_lambda * 0.9,  # reinforce: harden the curve
        })
    return [m for m, _ in top]
```

1. LLM-classify TTL on write. The classifier silently shapes every future ranking, so calibrate it.
2. Reinforce on retrieval; don't just return results — update `last_accessed_at`, `hit_count`, and `decay_lambda`.
3. Run a nightly evictor for entries with `hit_count == 0` and effective score below 0.05.
4. Cap memory size per user; on overflow, evict oldest session-tier entries first.
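Steps 3 and 4 can be sketched as a standalone job. A minimal version, assuming per-day decay rates and dict-shaped rows (the threshold and cap values are illustrative):

```python
import math, time

EVICT_THRESHOLD = 0.05

def nightly_evict(memories, now=None, cap=1000):
    """Return the ids to evict: decayed never-hit entries first, then the
    oldest session-tier entries if the per-user cap is still exceeded."""
    now = now if now is not None else time.time()
    evict = []
    for m in memories:
        if m["ttl_tier"] == "immutable":
            continue  # allergies, insurance numbers: never evicted
        age_days = (now - m["last_accessed_at"]) / 86400
        if m["hit_count"] == 0 and math.exp(-m["decay_lambda"] * age_days) < EVICT_THRESHOLD:
            evict.append(m["id"])
    survivors = [m for m in memories if m["id"] not in set(evict)]
    overflow = len(survivors) - cap
    if overflow > 0:
        # spillover: oldest session-tier entries go first
        session = sorted(
            (m for m in survivors if m["ttl_tier"] == "session"),
            key=lambda m: m["created_at"],
        )
        evict.extend(m["id"] for m in session[:overflow])
    return evict
```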

## Pitfalls

- **Wrong TTL classifier**: tagging "I love pizza" as immutable pollutes future calls. Calibrate.
- **Decay too aggressive**: agent forgets a real allergy. Always test on a golden set.
- **No staleness detection**: a "highly retrieved" memory is not necessarily *correct*. Add explicit contradiction handling.
- **Reinforcement loop**: mis-classified memory keeps getting hit, never decays. Add a max_hit_count guardrail.
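A minimal guardrail for the last two pitfalls, assuming dict-shaped rows and a hypothetical `contradicts` predicate (in practice an LLM or NLI call):

```python
MIN_LAMBDA = 0.001   # floor: even a heavily-hit memory keeps some decay
MAX_HIT_COUNT = 50   # cap: stops a mis-classified memory from hardening forever

def reinforce_with_guardrails(m, hardening=0.9):
    """Reinforce a memory on recall, but stop hardening past the cap."""
    if m["hit_count"] < MAX_HIT_COUNT:
        m["hit_count"] += 1
        m["decay_lambda"] = max(m["decay_lambda"] * hardening, MIN_LAMBDA)
    return m

def superseded_ids(new_fact, existing, contradicts):
    """Ids of stored facts the new fact contradicts; delete them before
    inserting, so retrieval never surfaces both sides."""
    return [m["id"] for m in existing if contradicts(new_fact, m["text"])]
```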

## FAQ

**Decay or TTL?** Both. TTL is the hard cutoff (guaranteed eviction); decay is the continuous score modifier in between.

**Embedding store or graph?** Hybrid — embedding for fuzzy recall, graph for entity-heavy recalls. See vw6g-15 on graph memory.

**Per-user or global?** Per-user always. Cross-user memory is a privacy violation.

**Cost?** ~$0.001 per memory write (the TTL classifier). Cheap.

**See it on /demo?** Yes — the multi-turn demo logs decay scores in the trace panel.

## Sources

- [State of AI Agent Memory 2026 - Mem0](https://mem0.ai/blog/state-of-ai-agent-memory-2026)
- [A Practical Guide to Memory for Autonomous LLM Agents - TDS](https://towardsdatascience.com/a-practical-guide-to-memory-for-autonomous-llm-agents/)
- [Architecture and Orchestration of Memory Systems in AI Agents - Analytics Vidhya](https://www.analyticsvidhya.com/blog/2026/04/memory-systems-in-ai-agents/)
- [Agent Memory: Why Your AI Has Amnesia - Oracle](https://blogs.oracle.com/developers/agent-memory-why-your-ai-has-amnesia)

