Learn Agentic AI

pgvector Tutorial: Adding Vector Search to Your Existing PostgreSQL Database

Learn how to install pgvector, create vector columns, build IVFFlat and HNSW indexes, and run similarity queries directly inside PostgreSQL without adding another database to your stack.

Why pgvector Changes the Game

Most teams building AI applications assume they need a dedicated vector database. They spin up Pinecone or Weaviate alongside their existing PostgreSQL instance, then wrestle with keeping data synchronized across two systems. pgvector eliminates that complexity by bringing vector similarity search directly into PostgreSQL.

pgvector is a PostgreSQL extension that adds a vector data type, distance operators, and approximate nearest neighbor (ANN) indexes. Your embeddings live in the same database as your application data, which means you can JOIN vectors with user tables, filter by metadata columns, and wrap everything in a single transaction.
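That colocation shows up directly in SQL. As a hedged sketch, assuming a hypothetical users table and an owner_id column that the examples later in this tutorial do not define:

```sql
-- Hypothetical schema: documents(id, title, owner_id, embedding), users(id, plan)
-- One query combines vector similarity with an ordinary relational JOIN and filter
SELECT d.id, d.title
FROM documents d
JOIN users u ON u.id = d.owner_id
WHERE u.plan = 'pro'
ORDER BY d.embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 5;
```

With a separate vector database, the same query would require fetching candidates from the vector store and filtering them in application code.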

Installing pgvector

On Ubuntu or Debian with PostgreSQL already installed:

sudo apt-get install postgresql-16-pgvector

On macOS with Homebrew:

brew install pgvector

If you are running PostgreSQL in Docker, use an image that includes pgvector:

docker run -d --name pgvector-db \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  pgvector/pgvector:pg16

Once installed, enable the extension in your database:

CREATE EXTENSION IF NOT EXISTS vector;
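You can confirm the extension is active by checking the pg_extension catalog:

```sql
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
```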

Creating a Vector Column

Add an embedding column to any table. The dimension must match your embedding model — OpenAI text-embedding-3-small produces 1536 dimensions:

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT NOT NULL,
    embedding vector(1536),
    created_at TIMESTAMPTZ DEFAULT NOW()
);
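An existing table can gain the same column in place, with no migration or data export; the dimension must still match your embedding model:

```sql
ALTER TABLE documents ADD COLUMN embedding vector(1536);
```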

Inserting Embeddings with Python

Use the psycopg driver with pgvector support:

import psycopg
from pgvector.psycopg import register_vector
from openai import OpenAI

client = OpenAI()
conn = psycopg.connect("postgresql://user:secret@localhost/mydb")
register_vector(conn)

def embed_and_store(title: str, content: str):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=content
    )
    embedding = response.data[0].embedding

    conn.execute(
        "INSERT INTO documents (title, content, embedding) VALUES (%s, %s, %s)",
        (title, content, embedding)
    )
    conn.commit()

embed_and_store("pgvector Guide", "pgvector adds vector search to PostgreSQL...")

Querying by Similarity

pgvector provides three distance operators:


  • <-> — L2 (Euclidean) distance
  • <=> — cosine distance
  • <#> — negative inner product
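To make the operators concrete, here is what each one computes, reproduced in plain Python on two toy vectors. This is an illustration of the math, not pgvector itself:

```python
import math

def l2(a, b):
    # <-> : Euclidean (L2) distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # <=> : 1 minus cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

def neg_inner_product(a, b):
    # <#> : negative inner product (negated so smaller still means closer)
    return -sum(x * y for x, y in zip(a, b))

a, b = [1.0, 0.0], [0.0, 1.0]
print(l2(a, b))                # ~1.4142
print(cosine_distance(a, b))   # 1.0 (orthogonal vectors)
print(neg_inner_product(a, b))
```

Note that all three operators return a value where smaller means more similar, which is why the queries below ORDER BY distance ascending.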

For normalized embeddings, cosine distance is the most common choice:

SELECT id, title, embedding <=> '[0.1, 0.2, ...]'::vector AS distance
FROM documents
ORDER BY distance
LIMIT 10;

In Python:

def search(query: str, limit: int = 5):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )
    query_vec = response.data[0].embedding

    results = conn.execute(
        "SELECT id, title, embedding <=> %s::vector AS distance "
        "FROM documents ORDER BY distance LIMIT %s",
        (query_vec, limit)
    ).fetchall()
    return results

IVFFlat vs HNSW Indexes

Without an index, pgvector performs exact nearest neighbor search — scanning every row. For large datasets, you need an ANN index.

IVFFlat partitions vectors into lists, then searches only the nearest lists at query time:

CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

Set lists to roughly sqrt(row_count). With a fixed number of probes, more lists means faster queries but lower recall. Load data before creating an IVFFlat index: it clusters the vectors present at build time, so an index built on an empty or near-empty table recalls poorly.
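A common refinement of that rule of thumb, from the pgvector README, is rows / 1000 for up to one million rows and sqrt(rows) above that. Wrapped in a small helper (the function name is ours):

```python
import math

def suggested_lists(row_count: int) -> int:
    # Starting point suggested by the pgvector README:
    # rows / 1000 for up to 1M rows, sqrt(rows) beyond that.
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return int(math.sqrt(row_count))

print(suggested_lists(100_000))    # 100
print(suggested_lists(4_000_000))  # 2000
```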

HNSW builds a hierarchical navigable small world graph. It is more expensive to build but delivers better recall-speed tradeoffs; m sets the maximum number of connections per graph node, and ef_construction the size of the candidate list used while building the graph:

CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

HNSW can be created on an empty table and handles inserts incrementally. For most production workloads, HNSW is the better default.

Tuning Query Performance

Increase the search scope at query time for higher recall:

-- IVFFlat: search more lists (default is 1)
SET ivfflat.probes = 10;

-- HNSW: expand candidate list (default is 40)
SET hnsw.ef_search = 100;

Higher values increase recall but slow down queries. Profile with EXPLAIN ANALYZE to find the right balance.
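For example, to check that a query is actually hitting the index (the vector literal is illustrative, as before):

```sql
EXPLAIN ANALYZE
SELECT id, title
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;
-- An "Index Scan" node naming the hnsw or ivfflat index means the index was used;
-- a "Seq Scan" means the planner fell back to exact search over every row.
```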

FAQ

Can I use pgvector with an existing production PostgreSQL database?

Yes. pgvector is a standard PostgreSQL extension. You run CREATE EXTENSION vector, add vector columns with ALTER TABLE, and your existing tables, indexes, and queries continue working unchanged. There is no migration or data export required.

How many vectors can pgvector handle before performance degrades?

With HNSW indexes, pgvector handles millions of vectors with sub-10ms query latency on modest hardware. Teams have reported good performance up to 10-20 million rows on a single instance. Beyond that, consider partitioning or a dedicated vector database.
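Partitioning at that scale can be sketched with PostgreSQL's declarative partitioning. The tenant_id layout below is hypothetical, and each partition gets its own ANN index:

```sql
-- Hypothetical: hash-partition by tenant so each partition's index stays small
CREATE TABLE documents_part (
    id BIGSERIAL,
    tenant_id INT NOT NULL,
    content TEXT,
    embedding vector(1536),
    PRIMARY KEY (id, tenant_id)
) PARTITION BY HASH (tenant_id);

CREATE TABLE documents_part_0 PARTITION OF documents_part
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
-- Repeat for remainders 1 through 3, then create an HNSW index on each partition.
```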

Should I use IVFFlat or HNSW for my project?

Start with HNSW unless you have a specific reason not to. HNSW provides better recall at the same speed, supports incremental inserts, and can be created on an empty table. IVFFlat is useful when you need faster index build times and can tolerate slightly lower recall.


#Pgvector #PostgreSQL #VectorSearch #Embeddings #Database #AgenticAI #LearnAI #AIEngineering

Written by CallSphere Team