AI Engineering · 11 min read

Supabase + pgvector for AI Agents: Memory, RAG, and Auth in One Backend (2026)

Supabase ships pgvector, RLS, and edge functions out of the box. Here's a working AI-agent memory layer with semantic recall, per-user RLS, and a match_documents RPC ready for LangChain.

TL;DR — Supabase = managed Postgres + pgvector + RLS + edge functions. For AI agent memory and RAG it's the lowest-friction stack in 2026 — two tables, one RPC, and your LangChain agent has long-term memory.

What you'll build

A Supabase project with a documents table for RAG, an agent_memories table for per-user long-term memory, an RLS policy that scopes both to auth.uid(), and a match_documents RPC that LangChain.js calls directly.

Schema

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  user_id UUID NOT NULL REFERENCES auth.users (id) ON DELETE CASCADE,
  content TEXT NOT NULL,
  metadata JSONB DEFAULT '{}',
  embedding vector(1536) NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX documents_hnsw_idx ON documents
  USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

CREATE POLICY user_owns_documents ON documents
  USING (user_id = auth.uid())
  WITH CHECK (user_id = auth.uid());

Architecture

flowchart LR
  USER[User in app] --> SB[Supabase Auth JWT]
  SB --> PG[(Postgres + pgvector)]
  PG --> RLS[RLS scopes to auth.uid]
  AGENT[LangChain agent] --> RPC[match_documents RPC]
  RPC --> PG
  PG --> RESULTS[Top-k user docs]
  RESULTS --> AGENT

Step 1 — Create the match RPC

CREATE OR REPLACE FUNCTION match_documents(
  query_embedding vector(1536),
  match_count int DEFAULT 5,
  filter jsonb DEFAULT '{}'
)
RETURNS TABLE (id bigint, content text, metadata jsonb, similarity float)
LANGUAGE plpgsql STABLE
AS $$
BEGIN
  RETURN QUERY
  SELECT d.id, d.content, d.metadata,
         1 - (d.embedding <=> query_embedding) AS similarity
  FROM documents d
  WHERE d.user_id = auth.uid()
    AND d.metadata @> filter
  ORDER BY d.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;
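As a sanity check on the similarity column: pgvector's `<=>` operator returns cosine distance, so the RPC reports `1 - distance`. The same score can be reproduced client-side when unit-testing expectations against small hand-made vectors (the function name here is mine):

```typescript
// Cosine similarity, the quantity the RPC returns as `similarity`.
// pgvector's `<=>` is cosine *distance*, i.e. 1 - this value.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Same direction → 1; orthogonal → 0.
console.log(cosineSimilarity([1, 0], [2, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 3])); // 0
```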

Step 2 — Wire into LangChain.js

import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { OpenAIEmbeddings } from "@langchain/openai";
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!,
  { global: { headers: { Authorization: `Bearer ${userJwt}` } } },
);

const store = new SupabaseVectorStore(new OpenAIEmbeddings(), {
  client: supabase,
  tableName: "documents",
  queryName: "match_documents",
});

const docs = await store.similaritySearch("appointment policy", 5);

The user's JWT carries the identity that auth.uid() resolves, so RLS plus the RPC keep results scoped to the signed-in user automatically.
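The RPC's `d.metadata @> filter` clause is Postgres JSONB containment. For the flat metadata objects typical in RAG it behaves like a subset check; a sketch of that semantics (helper name is mine, and the real `@>` operator also handles nested objects and arrays):

```typescript
// Flat-object approximation of Postgres JSONB containment (`@>`):
// every key/value pair in `filter` must also appear in `metadata`.
function containsFlat(
  metadata: Record<string, unknown>,
  filter: Record<string, unknown>,
): boolean {
  return Object.entries(filter).every(([k, v]) => metadata[k] === v);
}

console.log(containsFlat({ source: "faq", lang: "en" }, { source: "faq" })); // true
console.log(containsFlat({ source: "faq" }, { source: "blog" })); // false
```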

Step 3 — Edge function for ingestion

// supabase/functions/ingest/index.ts
import { OpenAI } from "https://esm.sh/openai@4";
import { createClient } from "https://esm.sh/@supabase/supabase-js@2";

Deno.serve(async (req) => {
  const { content } = await req.json();
  const oai = new OpenAI({ apiKey: Deno.env.get("OPENAI_API_KEY") });
  const v = (await oai.embeddings.create({
    model: "text-embedding-3-small", input: content,
  })).data[0].embedding;

  // The service-role client bypasses RLS, so resolve the caller from the
  // JWT and set user_id explicitly.
  const sb = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
  );
  // getUser expects the bare token, not the full "Bearer ..." header value.
  const jwt = req.headers.get("Authorization")?.replace("Bearer ", "") ?? "";
  const { data, error } = await sb.auth.getUser(jwt);
  if (error || !data.user) return new Response("unauthorized", { status: 401 });

  await sb.from("documents").insert({ user_id: data.user.id, content, embedding: v });
  return new Response("ok");
});
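Calling the function from a client is a plain POST with the user's JWT (never the service-role key) in the Authorization header. A sketch with a placeholder project URL and token; the actual fetch is commented out so the snippet stays self-contained:

```typescript
// Build the request a browser client would send to the ingest function.
// projectUrl and userJwt are placeholders, not real values.
const projectUrl = "https://YOUR_PROJECT.supabase.co";
const userJwt = "eyJ..."; // the signed-in user's access token

const ingestRequest = {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${userJwt}`, // user JWT, never service-role
  },
  body: JSON.stringify({ content: "Cancellations require 24 hours notice." }),
};
// await fetch(`${projectUrl}/functions/v1/ingest`, ingestRequest);

console.log(JSON.parse(ingestRequest.body).content);
```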

Step 4 — Add agent memory

CREATE TABLE agent_memories (
  id BIGSERIAL PRIMARY KEY,
  user_id UUID NOT NULL REFERENCES auth.users (id) ON DELETE CASCADE,
  agent TEXT NOT NULL,
  kind TEXT NOT NULL,  -- 'episodic' | 'semantic'
  content TEXT NOT NULL,
  embedding vector(1536) NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);
ALTER TABLE agent_memories ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_owns_memories ON agent_memories
  USING (user_id = auth.uid())
  WITH CHECK (user_id = auth.uid());
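Rows in agent_memories only help once they reach the model's context. One way to fold retrieved memories into a system prompt, using the episodic/semantic split from the kind column (the interface and helper names here are mine):

```typescript
interface AgentMemory {
  agent: string;
  kind: "episodic" | "semantic";
  content: string;
}

// Group retrieved memory rows into a prompt block so the agent can
// distinguish learned facts (semantic) from past events (episodic).
function formatMemories(memories: AgentMemory[]): string {
  const semantic = memories.filter((m) => m.kind === "semantic");
  const episodic = memories.filter((m) => m.kind === "episodic");
  const lines: string[] = [];
  if (semantic.length) {
    lines.push("Known facts about this user:");
    semantic.forEach((m) => lines.push(`- ${m.content}`));
  }
  if (episodic.length) {
    lines.push("Relevant past interactions:");
    episodic.forEach((m) => lines.push(`- ${m.content}`));
  }
  return lines.join("\n");
}
```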

Step 5 — Realtime updates

Supabase Realtime streams Postgres changes over websockets, but only for tables added to the supabase_realtime publication. Add agent_memories to it (Database → Replication in the dashboard, or ALTER PUBLICATION supabase_realtime ADD TABLE agent_memories;) and clients see new memories live. RLS still filters what each user receives.

Step 6 — Cost guard

Embedding API costs add up. Hash content before embedding to avoid duplicates:

import crypto from "node:crypto";

const hash = crypto.createHash("sha1").update(content).digest("hex");
// Before embedding, skip duplicates already stored with this hash:
// const { data } = await supabase.from("documents")
//   .select("id").eq("metadata->>hash", hash).limit(1);
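The same idea as an in-process gate; in production the seen-set would be a lookup against the hash stored in documents.metadata, but the logic is identical (names here are mine):

```typescript
import { createHash } from "node:crypto";

// In-process dedup gate: embed only content we haven't hashed before.
const seen = new Set<string>();

function shouldEmbed(content: string): boolean {
  const hash = createHash("sha1").update(content).digest("hex");
  if (seen.has(hash)) return false;
  seen.add(hash);
  return true;
}

console.log(shouldEmbed("refund policy")); // true — first time
console.log(shouldEmbed("refund policy")); // false — duplicate skipped
```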

Pitfalls

  • Service-role key in browser — never. Use anon key + RLS.
  • auth.uid() returns null — RPC was called with anon JWT. Pass the user's JWT in headers.
  • Index built on empty table — fine for HNSW, which has no training step; it's IVFFlat that needs data present at build time, because its lists are learned from the rows.
  • Realtime row size — Supabase realtime caps at 1MB per row.

CallSphere production note

UrackIT — CallSphere's non-HIPAA support vertical — runs Supabase + ChromaDB for fast prototyping and per-user memory. Healthcare and Behavioral Health stay on dedicated Postgres with the healthcare_voice Prisma schema; OneRoof keeps RLS for landlord isolation. Across 115+ DB tables · 37 agents · 90+ tools · 6 verticals, the same RRF + RLS pattern ports cleanly between hosts. Plans: $149/$499/$1,499 — 14-day trial, 22% affiliate.
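The RRF half of that pattern is Reciprocal Rank Fusion, which merges ranked result lists from vector and keyword retrieval by summing reciprocal ranks. A minimal sketch (the function name and the k = 60 default are conventional choices, not CallSphere specifics):

```typescript
// Reciprocal Rank Fusion: score each id as the sum of 1/(k + rank) across
// all ranked lists, so documents ranked highly by either retriever rise.
function rrf(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

const vectorHits = ["doc-a", "doc-b", "doc-c"];
const keywordHits = ["doc-b", "doc-d"];
console.log(rrf([vectorHits, keywordHits])[0]); // "doc-b" — ranked in both lists
```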

FAQ

Q: Free tier limits? 500 MB DB + 2 GB egress + 50k auth users. Fine for prototypes.

Q: How big can pgvector grow on Supabase? Pro plan + dedicated compute scales to 100M+ rows.

Q: Can I bring my own LLM? Yes — edge functions can call any provider; no Supabase lock-in.

Q: What about data residency? Supabase supports US, EU, AP regions; pick at project create time.

Q: Migrate off Supabase later? pg_dump + restore to any Postgres. No proprietary surfaces beyond auth + storage.
