Learn Agentic AI

pgvector Tutorial: Adding Vector Search to Your Existing PostgreSQL Database

Learn how to install pgvector, create vector columns, build IVFFlat and HNSW indexes, and run similarity queries directly inside PostgreSQL without adding another database to your stack.

Why pgvector Changes the Game

Most teams building AI applications assume they need a dedicated vector database. They spin up Pinecone or Weaviate alongside their existing PostgreSQL instance, then wrestle with keeping data synchronized across two systems. pgvector eliminates that complexity by bringing vector similarity search directly into PostgreSQL.

pgvector is a PostgreSQL extension that adds a vector data type, distance operators, and approximate nearest neighbor (ANN) indexes. Your embeddings live in the same database as your application data, which means you can JOIN vectors with user tables, filter by metadata columns, and wrap everything in a single transaction.
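That colocation shows up directly in SQL. As a hedged sketch, assuming a hypothetical users table and an owner_id column that the examples later in this tutorial do not define:

```sql
-- Hypothetical schema: documents(id, title, owner_id, embedding), users(id, plan)
-- One query combines vector similarity with an ordinary relational JOIN and filter
SELECT d.id, d.title
FROM documents d
JOIN users u ON u.id = d.owner_id
WHERE u.plan = 'pro'
ORDER BY d.embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 5;
```

With a separate vector database, the same query would require fetching candidates from the vector store and filtering them in application code.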

Installing pgvector

On Ubuntu or Debian with PostgreSQL already installed:

sudo apt-get install postgresql-16-pgvector

On macOS with Homebrew:

brew install pgvector

If you are running PostgreSQL in Docker, use an image that includes pgvector:

docker run -d --name pgvector-db \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  pgvector/pgvector:pg16

Once installed, enable the extension in your database:

CREATE EXTENSION IF NOT EXISTS vector;
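You can confirm the extension is active by checking the pg_extension catalog:

```sql
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
```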

Creating a Vector Column

Add an embedding column to any table. The dimension must match your embedding model — OpenAI text-embedding-3-small produces 1536 dimensions:

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT NOT NULL,
    embedding vector(1536),
    created_at TIMESTAMPTZ DEFAULT NOW()
);
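An existing table can gain the same column in place, with no migration or data export; the dimension must still match your embedding model:

```sql
ALTER TABLE documents ADD COLUMN embedding vector(1536);
```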

Inserting Embeddings with Python

Use the psycopg driver with pgvector support:

import psycopg
from pgvector.psycopg import register_vector
from openai import OpenAI

client = OpenAI()
conn = psycopg.connect("postgresql://user:secret@localhost/mydb")
register_vector(conn)

def embed_and_store(title: str, content: str):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=content
    )
    embedding = response.data[0].embedding

    conn.execute(
        "INSERT INTO documents (title, content, embedding) VALUES (%s, %s, %s)",
        (title, content, embedding)
    )
    conn.commit()

embed_and_store("pgvector Guide", "pgvector adds vector search to PostgreSQL...")

Querying by Similarity

pgvector provides three distance operators:


  • <-> — L2 (Euclidean) distance
  • <=> — cosine distance
  • <#> — negative inner product
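To make the operators concrete, here is what each one computes, reproduced in plain Python on two toy vectors. This is an illustration of the math, not pgvector itself:

```python
import math

def l2(a, b):
    # <-> : Euclidean (L2) distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # <=> : 1 minus cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

def neg_inner_product(a, b):
    # <#> : negative inner product (negated so smaller still means closer)
    return -sum(x * y for x, y in zip(a, b))

a, b = [1.0, 0.0], [0.0, 1.0]
print(l2(a, b))                # ~1.4142
print(cosine_distance(a, b))   # 1.0 (orthogonal vectors)
print(neg_inner_product(a, b))
```

Note that all three operators return a value where smaller means more similar, which is why the queries below ORDER BY distance ascending.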

For normalized embeddings, cosine distance is the most common choice:

SELECT id, title, embedding <=> '[0.1, 0.2, ...]'::vector AS distance
FROM documents
ORDER BY distance
LIMIT 10;

In Python:

def search(query: str, limit: int = 5):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )
    query_vec = response.data[0].embedding

    results = conn.execute(
        "SELECT id, title, embedding <=> %s::vector AS distance "
        "FROM documents ORDER BY distance LIMIT %s",
        (query_vec, limit)
    ).fetchall()
    return results

IVFFlat vs HNSW Indexes

Without an index, pgvector performs exact nearest neighbor search — scanning every row. For large datasets, you need an ANN index.

IVFFlat partitions vectors into lists, then searches only the nearest lists at query time:

CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

Set lists to roughly sqrt(row_count). With a fixed number of probes, more lists means faster queries but lower recall. Load data before creating an IVFFlat index: it clusters the vectors present at build time, so an index built on an empty or near-empty table recalls poorly.
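A common refinement of that rule of thumb, from the pgvector README, is rows / 1000 for up to one million rows and sqrt(rows) above that. Wrapped in a small helper (the function name is ours):

```python
import math

def suggested_lists(row_count: int) -> int:
    # Starting point suggested by the pgvector README:
    # rows / 1000 for up to 1M rows, sqrt(rows) beyond that.
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return int(math.sqrt(row_count))

print(suggested_lists(100_000))    # 100
print(suggested_lists(4_000_000))  # 2000
```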

HNSW builds a hierarchical navigable small world graph. It is more expensive to build but delivers better recall-speed tradeoffs; m sets the maximum number of connections per graph node, and ef_construction the size of the candidate list used while building the graph:

CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

HNSW can be created on an empty table and handles inserts incrementally. For most production workloads, HNSW is the better default.

Tuning Query Performance

Increase the search scope at query time for higher recall:

-- IVFFlat: search more lists (default is 1)
SET ivfflat.probes = 10;

-- HNSW: expand candidate list (default is 40)
SET hnsw.ef_search = 100;

Higher values increase recall but slow down queries. Profile with EXPLAIN ANALYZE to find the right balance.
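For example, to check that a query is actually hitting the index (the vector literal is illustrative, as before):

```sql
EXPLAIN ANALYZE
SELECT id, title
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;
-- An "Index Scan" node naming the hnsw or ivfflat index means the index was used;
-- a "Seq Scan" means the planner fell back to exact search over every row.
```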

FAQ

Can I use pgvector with an existing production PostgreSQL database?

Yes. pgvector is a standard PostgreSQL extension. You run CREATE EXTENSION vector, add vector columns with ALTER TABLE, and your existing tables, indexes, and queries continue working unchanged. There is no migration or data export required.

How many vectors can pgvector handle before performance degrades?

With HNSW indexes, pgvector handles millions of vectors with sub-10ms query latency on modest hardware. Teams have reported good performance up to 10-20 million rows on a single instance. Beyond that, consider partitioning or a dedicated vector database.
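Partitioning at that scale can be sketched with PostgreSQL's declarative partitioning. The tenant_id layout below is hypothetical, and each partition gets its own ANN index:

```sql
-- Hypothetical: hash-partition by tenant so each partition's index stays small
CREATE TABLE documents_part (
    id BIGSERIAL,
    tenant_id INT NOT NULL,
    content TEXT,
    embedding vector(1536),
    PRIMARY KEY (id, tenant_id)
) PARTITION BY HASH (tenant_id);

CREATE TABLE documents_part_0 PARTITION OF documents_part
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
-- Repeat for remainders 1 through 3, then create an HNSW index on each partition.
```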

Should I use IVFFlat or HNSW for my project?

Start with HNSW unless you have a specific reason not to. HNSW provides better recall at the same speed, supports incremental inserts, and can be created on an empty table. IVFFlat is useful when you need faster index build times and can tolerate slightly lower recall.


#Pgvector #PostgreSQL #VectorSearch #Embeddings #Database #AgenticAI #LearnAI #AIEngineering

Written by CallSphere Team