By Sagar Shankaran, Founder of CallSphere
Your agent's memory, embeddings, and conversation state all live in Postgres. Backups must include vector data and survive a full-region loss. Here's how CallSphere does PITR for 115+ tables.
Key takeaways
TL;DR: pg_basebackup plus WAL archiving covers vector data correctly. The hard part isn't taking backups; it's testing restores. Restore weekly, because only a tested restore is a restore that works.
flowchart TD
Client[Client] --> Edge[Cloudflare Worker]
Edge -->|WS upgrade| DO[Durable Object]
DO --> AI[(OpenAI Realtime WS)]
AI --> DO
DO --> Client
DO -.hibernation.-> Storage[(Persisted state)]
pgvector won the AI memory war by being boring: it's just Postgres with a vector type. That means standard PostgreSQL backup techniques (pg_dump, base backups, WAL archiving) work for vector columns too. Most teams know this and still get burned because:
In 2026 most teams need 35-day PITR (Aurora's maximum automated backup retention) or longer for compliance, and a 1-hour RTO for AI agent state.
Run three layers:
Also: monitor your backups themselves — replication lag, archive failures, restore test success.
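One way to watch archive failures is to poll the pg_stat_archiver view, which exposes last_archived_time and last_failed_time. The sketch below is illustrative only: the function name archive_status and the 300-second staleness threshold are assumptions, not CallSphere's actual monitoring.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify WAL archiving health from two numbers.
#   $1 = 1 if the most recent archive attempt failed, else 0
#   $2 = seconds since last_archived_time
#   $3 = max allowed age in seconds (assumed default: 300)
archive_status() {
  local failing="$1" age="$2" max_age="${3:-300}"
  if [ "$failing" -eq 1 ]; then
    echo "FAIL: latest archive attempt failed"
    return 1
  fi
  if [ "$age" -gt "$max_age" ]; then
    echo "STALE: last WAL archived ${age}s ago"
    return 1
  fi
  echo "OK"
}

# Example wiring (needs a reachable server, so it is left commented out):
# read -r failing age < <(psql -XAt -F ' ' -c "
#   SELECT COALESCE(last_failed_time > last_archived_time, false)::int,
#          COALESCE(EXTRACT(EPOCH FROM now() - last_archived_time), 0)::int
#   FROM pg_stat_archiver")
# archive_status "$failing" "$age"
```

The function exits non-zero on FAIL or STALE, so it can serve as a CronJob or probe command that alerts on failure.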
CallSphere runs Postgres 16 with pgvector 0.8.0 on a k3s StatefulSet. 115+ tables including agents, tools, calls, messages, embeddings (pgvector with HNSW indexes), and tenant-isolated vertical tables.
DR plan:
Per vertical:
:8084 - agent state in hc_agent_state table; embeddings of patient intake notes in hc_embeddings (3072-d, OpenAI text-embedding-3-large).
re_listings_embed; the 6-container NATS pod's planning state in re_plans.
sales_convo_embed; PM2 worker session metadata.
Last quarterly drill: full restore of 480GB DB in 38 minutes. Embedding queries returned correctly within 90 seconds of cluster ready.
$1499 enterprise tier on /pricing gets a documented DR plan + restore time SLA. Try the 14-day trial.
# postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'wal-g wal-push %p'
# k8s CronJob
wal-g backup-push /var/lib/postgresql/data
pg_dump -Fc \
--extension=vector \
--extension=pg_trgm \
-d callsphere > callsphere.dump
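# restore drill: fetch the latest base backup, then replay archived WAL
wal-g backup-fetch /restore LATEST
echo "restore_command = 'wal-g wal-fetch %f %p'" >> /restore/postgresql.auto.conf
# PostgreSQL 12+ only enters archive recovery when recovery.signal exists
touch /restore/recovery.signal
pg_ctl -D /restore start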
psql -c "SELECT count(*) FROM hc_embeddings;"
psql -c "SELECT id FROM hc_embeddings ORDER BY embedding <-> ARRAY[...]::vector LIMIT 1;"
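The restore commands above replay every archived WAL segment. For a point-in-time restore, PostgreSQL also needs a recovery target. A minimal sketch follows, assuming the base backup has already been fetched into the data directory. The function name prepare_pitr and the example timestamp are hypothetical; recovery_target_time and recovery_target_action are standard PostgreSQL settings.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepare a fetched base backup for point-in-time recovery.
#   $1 = data directory of the restored base backup (e.g. /restore)
#   $2 = recovery target timestamp, e.g. '2026-01-15 09:30:00+00'
prepare_pitr() {
  local datadir="$1" target="$2"
  {
    echo "restore_command = 'wal-g wal-fetch %f %p'"
    echo "recovery_target_time = '$target'"
    # promote once the target is reached so the cluster accepts writes
    echo "recovery_target_action = 'promote'"
  } >> "$datadir/postgresql.auto.conf"
  # PostgreSQL 12+ starts archive recovery only if this file exists
  touch "$datadir/recovery.signal"
}
```

A drill would then run something like `prepare_pitr /restore '2026-01-15 09:30:00+00' && pg_ctl -D /restore start`, followed by the sanity queries.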
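Q: Does pg_dump support vectors? A: Yes (pgvector ≥ 0.5). pg_dump writes vector columns through COPY in their text representation, so they restore like any other column. Make sure the destination has the pgvector extension installed, ideally at the same version.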
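Restoring a custom-format (-Fc) dump into a fresh database could look like the sketch below. The helper names run and restore_dump, the DRY_RUN switch, and the default of 4 jobs are assumptions for illustration; `pg_restore --jobs` and `CREATE EXTENSION IF NOT EXISTS` are standard PostgreSQL.

```shell
#!/usr/bin/env bash
# run: execute a command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical helper: restore a custom-format dump into an existing, empty database.
#   $1 = dump file, $2 = target database, $3 = parallel jobs (assumed default: 4)
restore_dump() {
  local dump="$1" db="$2" jobs="${3:-4}"
  # fails fast if pgvector is not installed on the destination host
  run psql -d "$db" -c "CREATE EXTENSION IF NOT EXISTS vector" || return 1
  # --jobs runs data loads and index builds concurrently for -Fc dumps
  run pg_restore --jobs="$jobs" --dbname="$db" "$dump"
}
```

Running `DRY_RUN=1 restore_dump callsphere.dump callsphere 8` prints the two commands without touching a database, which is a cheap way to review the plan before a drill.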
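Q: How long does a 1TB pgvector restore take? A: With wal-g and parallel apply, ~40 min for the base backup plus WAL replay catch-up. A physical restore brings the HNSW index files back with the data, so nothing is rebuilt. Restoring from a pg_dump is different: the HNSW index rebuild is the long pole there, so plan for 3–6 hours.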
Q: What's RPO for this setup? A: 60 seconds, provided archive_timeout is set to 60 so a WAL segment ships at least every minute even under light write load. You lose at most a minute of writes.
Q: Should I use logical replication for DR? A: Pair it with physical streaming. Logical for cross-version migrations; physical for fast cluster failover.
Q: HNSW indexes increase restore time — worth it? A: For < 5M vectors, no — IVFFlat rebuilds faster. For > 10M, HNSW is a must despite the longer rebuild.
© 2026 CallSphere LLC. All rights reserved.