By Sagar Shankaran, Founder of CallSphere
Neo4j's agent-memory project ships short-term, long-term, and reasoning memory in one graph. Microsoft Agent Framework and LangChain both wire it in. Here is the production pattern.
Key takeaways
TL;DR: Neo4j Labs shipped neo4j-agent-memory in 2026, a graph-native memory layer for AI agents with three layers in one knowledge graph: short-term (full conversation chains), long-term (entity + preference graph), and reasoning (a record of how the agent solved past problems). It plugs into Microsoft Agent Framework, LangChain, Pydantic AI, Google ADK, Strands, and CrewAI.
Vector memory is fast for fuzzy recall but blind to relationships. Graph memory excels at multi-hop entity queries — "which patients did Dr. Lee see in February who also saw Dr. Park in March, and which of them have BCBS in-network?" That is a 4-hop join, expensive in vector + SQL, trivial in Cypher.
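As a rough illustration of the multi-hop question above, the query might look like the sketch below. The Patient, Provider, InsurancePlan, HAS_PLAN, and IN_NETWORK_WITH names come from the schema examples later in this post; the VISITED relationship, its date property, the carrier property, the year in the date ranges, and the reading of "in-network" as "in-network with the second provider" are all assumptions made for illustration.

```python
import re

# Hypothetical schema: VISITED, v.date and plan.carrier are illustrative names.
MULTI_HOP_QUERY = """
MATCH (p:Patient)-[v1:VISITED]->(:Provider {name: $first_provider}),
      (p)-[v2:VISITED]->(second:Provider {name: $second_provider})
WHERE date($first_from) <= v1.date < date($first_to)
  AND date($second_from) <= v2.date < date($second_to)
OPTIONAL MATCH (p)-[:HAS_PLAN]->(plan:InsurancePlan {carrier: $carrier})
               -[:IN_NETWORK_WITH]->(second)
RETURN DISTINCT p.name AS patient, plan IS NOT NULL AS in_network
"""

# Example parameters; the year is an assumption, the source names only the months.
params = {
    "first_provider": "Dr. Lee",
    "second_provider": "Dr. Park",
    "first_from": "2026-02-01",
    "first_to": "2026-03-01",
    "second_from": "2026-03-01",
    "second_to": "2026-04-01",
    "carrier": "BCBS",
}


def missing_params(query: str, supplied: dict) -> set:
    """Return the $placeholders in the query that have no supplied value."""
    return set(re.findall(r"\$(\w+)", query)) - set(supplied)
```

Checking `missing_params(MULTI_HOP_QUERY, params)` before sending the query is a cheap guard against a tool call that forgot an argument.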
The neo4j-agent-memory schema models three memory layers as connected sub-graphs in one graph:
Short-term: (:Session)-[:HAS_MESSAGE]->(:Message) chains
Long-term: (:Person)-[:WORKS_AT]->(:Org), (:Person)-[:PREFERS]->(:Preference), etc.
Reasoning: (:Plan)-[:STEP]->(:Action)-[:USED_TOOL]->(:Tool) with outcome metadata
flowchart LR
C[Conversation turn] --> EX[NER + relation extractor]
EX --> ST[(Short-term graph)]
EX --> LT[(Long-term graph)]
P[Agent plan] --> RM[(Reasoning graph)]
Q[Query] --> CY[Cypher router]
CY --> ST
CY --> LT
CY --> RM
CY --> A[Agent]
On every turn, a multi-stage entity extractor runs: spaCy or GLiNER handles the cheap cases, with an LLM fallback for the hard ones. Entities are upserted with embedding-based dedup ("Sagar S" -> "Sagar Shankaran"). Relations are extracted with a small LLM. New facts are appended; conflicting facts trigger a reconciliation prompt.
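The dedup step can be sketched in a few lines. This is not the library's implementation: `embed` here is a toy character-bigram stand-in for a real embedding model, and `EntityStore`, its `threshold`, and the "longest name wins" rule are hypothetical choices made only to show the shape of an upsert with similarity-based merging.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: character-bigram counts."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class EntityStore:
    """Hypothetical in-memory entity table keyed by canonical name."""

    def __init__(self, threshold: float = 0.6):
        self.threshold = threshold
        self.aliases = {}  # canonical name -> set of all names seen for it

    def upsert(self, name: str) -> str:
        """Insert a name, or merge it into the closest existing entity."""
        vec = embed(name)
        best, best_score = None, 0.0
        for canonical in self.aliases:
            score = cosine(vec, embed(canonical))
            if score > best_score:
                best, best_score = canonical, score
        if best is None or best_score < self.threshold:
            self.aliases[name] = {name}  # no close match: new entity
            return name
        # Close match: merge, keeping the longest name as canonical.
        names = self.aliases.pop(best) | {name}
        canonical = max(names, key=len)
        self.aliases[canonical] = names
        return canonical
```

With a real embedding model the threshold would need tuning against labeled pairs; too low and distinct people merge, too high and aliases pile up as duplicates.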
Retrieval is a typed Cypher query, not a similarity search. The agent has tools like graph_query_neighbors(entity), graph_find_path(a, b), graph_get_preferences(user). For fuzzy recall the agent calls a separate vector index over node text + descriptions.
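One way to picture those tools is as thin wrappers that each own one parameterized Cypher query. The tool names below are the ones mentioned above; the signatures, the `Entity` label, the `name` property, the 4-hop cap, and the injected `run` callable are assumptions for this sketch, not the library's API.

```python
from typing import Any, Callable, Dict, List

# Anything that takes (query, params) and returns rows as dicts.
Runner = Callable[[str, Dict[str, Any]], List[Dict[str, Any]]]


def graph_query_neighbors(run: Runner, entity: str, limit: int = 25):
    """Direct neighbors of an entity, with the connecting relation type."""
    query = (
        "MATCH (e:Entity {name: $entity})-[r]-(n) "
        "RETURN type(r) AS relation, n.name AS neighbor LIMIT $limit"
    )
    return run(query, {"entity": entity, "limit": limit})


def graph_find_path(run: Runner, a: str, b: str):
    """Shortest path between two entities, capped at 4 hops (assumed cap)."""
    query = (
        "MATCH p = shortestPath("
        "(a:Entity {name: $a})-[*..4]-(b:Entity {name: $b})) "
        "RETURN [n IN nodes(p) | n.name] AS path"
    )
    return run(query, {"a": a, "b": b})


def graph_get_preferences(run: Runner, user: str):
    """Preferences attached to a person node."""
    query = (
        "MATCH (:Person {id: $user})-[:PREFERS]->(pref:Preference) "
        "RETURN pref.name AS preference"
    )
    return run(query, {"user": user})
```

Values always travel as parameters rather than being spliced into the query text, so an entity name chosen by the model cannot alter the query. In production `run` could wrap a driver session, for example a function that calls `session.run(query, params).data()`.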
The reasoning layer is the underrated piece: every plan + tool-call + outcome is logged. The agent can query "have I solved a problem like this before?" and lift the playbook.
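A minimal in-memory sketch of that idea follows, mirroring the Plan, Action, and Tool shape of the reasoning sub-graph. The `ReasoningLog` class, the word-overlap (Jaccard) similarity, and the `min_overlap` default are hypothetical simplifications; a real deployment would store these as graph nodes and would likely match problems by embedding.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Action:
    description: str
    tool: str  # name of the tool this step used


@dataclass
class Plan:
    problem: str
    steps: List[Action]
    outcome: str  # "success" or "failure"


class ReasoningLog:
    """Hypothetical log of past plans, searchable by problem similarity."""

    def __init__(self):
        self.plans: List[Plan] = []

    def record(self, plan: Plan) -> None:
        self.plans.append(plan)

    def find_playbook(self, problem: str, min_overlap: float = 0.3) -> Optional[Plan]:
        """Return the most similar past plan that succeeded, if any."""
        words = set(problem.lower().split())
        best, best_score = None, 0.0
        for plan in self.plans:
            if plan.outcome != "success":
                continue  # only successful plans are worth reusing
            past = set(plan.problem.lower().split())
            union = words | past
            score = len(words & past) / len(union) if union else 0.0
            if score >= min_overlap and score > best_score:
                best, best_score = plan, score
        return best
```

Filtering on outcome is the important part: the log keeps failures for audit, but only successful plans are offered back to the agent as playbooks.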
CallSphere uses Neo4j as the cross-entity memory layer for verticals where relationships matter most:
Healthcare: (:Patient)-[:HAS_PLAN]->(:InsurancePlan)-[:IN_NETWORK_WITH]->(:Provider) and (:Patient)-[:PRESCRIBED]->(:Medication) for allergy/interaction checks.
Real estate: (:Buyer)-[:WORKING_WITH]->(:Agent)-[:BROKERAGE]->(:Brokerage), (:Listing)-[:IN]->(:Neighborhood)-[:ZONED_FOR]->(:School).
IT services: (:Incident)-[:ON]->(:Service)-[:DEPENDS_ON]->(:Service) for blast-radius reasoning.
37 agents · 90+ tools · 115+ DB tables · 6 verticals. $149/$499/$1499, 14-day trial, 22% affiliate. Vertical pages: /industries/it-services, /industries/real-estate.
from neo4j_agent_memory import AgentMemory
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "***"))
mem = AgentMemory(driver)

# Write
mem.add_long_term(
    user_id="patient_4421",
    text="Patient is allergic to penicillin and amoxicillin.",
    extract_entities=True,
)
mem.add_short_term(session_id="s_99", role="user", content="Same kid as last visit")

# Read with Cypher
with driver.session() as s:
    rows = s.run("""
        MATCH (p:Person {id: $uid})-[:ALLERGIC_TO]->(d:Drug)
        RETURN d.name AS drug
    """, uid="patient_4421").data()
asserted_at and source properties.
Graph or vector? Both. Graph for entity-heavy queries; vector for fuzzy recall.
Neo4j or Memgraph? Neo4j for ecosystem and labs (agent-memory, GenAI integrations); Memgraph for raw query throughput.
Cypher complexity? Mid. A senior engineer is productive in a week.
Cost? Neo4j Aura starts at a hobby tier; the self-hosted Community Edition is free.
See it on /demo? Yes — try a multi-hop query like "find providers in-network for both my plans."
Written by Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.