Vector Database Benchmarks 2026: pgvector 0.9, Qdrant, Weaviate, Milvus, LanceDB

The five vector databases competing for production traffic in 2026, benchmarked on QPS, recall, hybrid search, and operational cost.

The Field

Five vector databases dominate production deployments in 2026: pgvector (Postgres extension), Qdrant, Weaviate, Milvus, and LanceDB. Each is the right answer for different shapes of workload. This is a side-by-side based on April 2026 benchmarks and production reports.

The Side-by-Side

flowchart TB
    pgvector[pgvector 0.9<br/>Postgres extension] --> SQL[Use case: SQL-shaped apps]
    Qdrant[Qdrant<br/>Rust] --> Hybrid[Use case: hybrid + late interaction]
    Weaviate[Weaviate<br/>Go] --> Module[Use case: modular + GraphQL]
    Milvus[Milvus<br/>Go/C++] --> Scale[Use case: largest scale]
    LanceDB[LanceDB<br/>Rust + Lance] --> Embed[Use case: embedded / data lake]

pgvector 0.9

The Postgres extension. Version 0.9 (early 2026) added IVFFlat improvements, sparse vector support, and substantial speed boosts. For most teams already on Postgres, this is the easiest path.
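
Sparse vectors store only their nonzero dimensions, which is what makes lexical-style signals cheap to hold alongside dense embeddings. A minimal pure-Python sketch of the sparse dot product underlying sparse-vector similarity (the `{dimension: value}` layout is our illustration, not pgvector's actual `sparsevec` internals):

```python
def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    """Dot product of two sparse vectors stored as {dimension: value} maps.

    Only dimensions present in both vectors contribute, so cost is
    O(min(nnz(a), nnz(b))) rather than O(total dimensions).
    """
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller map
    return sum(v * b[i] for i, v in a.items() if i in b)

# Two sparse vectors in a nominal 30,000-dimension vocabulary space
q = {17: 0.5, 4021: 1.2, 29998: 0.3}
d = {17: 2.0, 512: 0.7, 29998: 1.0}
print(sparse_dot(q, d))  # 0.5*2.0 + 0.3*1.0 = 1.3
```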

  • Strengths: just Postgres, ACID transactions, full SQL, easy ops
  • Weaknesses: lower QPS than purpose-built vector DBs at very large scale; advanced features lag
  • Performance: ~5K-15K QPS on a single Postgres instance with HNSW index for typical 1024-dim vectors
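
pgvector's `<=>` operator orders rows by cosine distance, and an HNSW index answers `ORDER BY embedding <=> query LIMIT k` approximately. The arithmetic behind the operator is just 1 minus cosine similarity; a self-contained sketch of exact brute-force ranking over a toy corpus (pure Python for illustration, not the extension's implementation):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as pgvector's <=> operator defines it: 1 - cos(a, b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Exact nearest-neighbor ranking over a tiny 2-dim corpus
corpus = {"doc1": [1.0, 0.0], "doc2": [0.7, 0.7], "doc3": [0.0, 1.0]}
query = [1.0, 0.1]
ranked = sorted(corpus, key=lambda k: cosine_distance(query, corpus[k]))
print(ranked)  # ['doc1', 'doc2', 'doc3']
```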

Qdrant

The leader on hybrid search and late-interaction support in 2026. Native multi-vector support makes ColBERT-V2-style late-interaction retrieval first-class.

  • Strengths: best hybrid + late-interaction support, single-binary deployment, strong Rust core
  • Weaknesses: smaller community than pgvector
  • Performance: ~30K-80K QPS at typical configurations
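
Late interaction keeps one vector per token and scores a document as the sum, over query tokens, of each token's best match among the document's token vectors (ColBERT's MaxSim). Qdrant's multi-vector points make this kind of scoring first-class; the reduction itself is simple, sketched here in pure Python with toy 2-dim embeddings:

```python
def maxsim_score(query_vecs, doc_vecs):
    """ColBERT-style late-interaction score: for each query token vector,
    take its maximum dot product over all document token vectors, then sum."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Two query token vectors, three document token vectors
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
score = maxsim_score(query_vecs, doc_vecs)
print(round(score, 2))  # max(0.9, 0.2, 0.5) + max(0.1, 0.8, 0.5) = 1.7
```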

Weaviate

Modular, GraphQL-first, integrates closely with embedding providers via "modules" (vectorizers, generators, rerankers).

  • Strengths: modular architecture, GraphQL queries, strong RAG-pattern support
  • Weaknesses: GraphQL API adds learning curve for SQL-native teams
  • Performance: ~25K-50K QPS

Milvus

The largest-scale option, with production deployments reaching hundreds of billions of vectors. Distributed-first architecture and a mature managed cloud offering (Zilliz Cloud).

  • Strengths: largest scale, mature distributed architecture, strong cloud offering
  • Weaknesses: heavier ops; deployment complexity higher than alternatives
  • Performance: 100K+ QPS at scale

LanceDB

The newer entrant, built on the Lance columnar format. Embedded-first (file-based) but with a server mode. Strong fit for data-lake architectures and ML workloads.

  • Strengths: embedded mode for zero-ops use cases, Lance format integrates with data lakes, fast columnar reads
  • Weaknesses: smaller ecosystem; newer
  • Performance: workload-dependent; very strong for read-heavy and batch

Feature Matrix

Feature                  pgvector          Qdrant    Weaviate  Milvus   LanceDB
HNSW                     yes               yes       yes       yes      yes
Sparse vectors           yes (0.9)         yes       yes       yes      yes
ColBERT-V2 multi-vector  partial           yes       partial   partial  partial
Hybrid (BM25 + dense)    yes               yes       yes       yes      partial
Distributed              partial (Citus)   partial   yes       yes      yes
Embedded mode            no                no        no        no       yes
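
The "Hybrid (BM25 + dense)" row means the engine fuses a lexical ranking and a vector ranking into one result list. Reciprocal Rank Fusion (RRF) is the common default for this; a minimal sketch of the fusion step:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(doc) = sum over rankings of 1 / (k + rank).

    k=60 is the constant from the original RRF paper and a common default."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["d3", "d1", "d7"]   # lexical (BM25) ranking
dense_hits = ["d1", "d9", "d3"]  # dense vector ranking
print(rrf_fuse([bm25_hits, dense_hits]))  # ['d1', 'd3', 'd9', 'd7']
```

Documents appearing near the top of both lists (d1, d3) outrank documents that appear high in only one, which is the whole point of the fusion.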

Choosing One

flowchart TD
    Q1{Already on Postgres?} -->|Yes| pg[pgvector]
    Q1 -->|No| Q2{Largest scale<br/>100B+ vectors?}
    Q2 -->|Yes| Mil[Milvus]
    Q2 -->|No| Q3{Hybrid + late interaction<br/>top priority?}
    Q3 -->|Yes| Qd[Qdrant]
    Q3 -->|No| Q4{Embedded /<br/>data-lake fit?}
    Q4 -->|Yes| LD[LanceDB]
    Q4 -->|No| We[Weaviate]

For most teams in 2026: pgvector if you have Postgres, Qdrant if you do not. Reach for Milvus only at very large scale.
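
The decision tree above can be written down directly; a sketch, where the function name and flags are ours for illustration and the questions are asked in flowchart order, first match winning:

```python
def pick_vector_db(on_postgres: bool,
                   vectors_100b_plus: bool,
                   hybrid_late_interaction_priority: bool,
                   embedded_or_data_lake: bool) -> str:
    """Mirror of the selection flowchart: each question gates the next."""
    if on_postgres:
        return "pgvector"
    if vectors_100b_plus:
        return "Milvus"
    if hybrid_late_interaction_priority:
        return "Qdrant"
    if embedded_or_data_lake:
        return "LanceDB"
    return "Weaviate"

print(pick_vector_db(False, False, True, False))  # Qdrant
```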

Operational Considerations

The choice often comes down to ops more than benchmarks:

  • pgvector: your DBA already operates Postgres; no new system to run
  • Qdrant: single binary; runs anywhere; operationally simple
  • Weaviate: cloud offering smooths ops; self-hosted is more involved
  • Milvus: serious distributed system; needs k8s and dedicated ops
  • LanceDB: embedded means no ops at all for some use cases

Cost Math

For a 10M-vector workload at 1024-dim with ~1K QPS at p99 < 100ms:

  • pgvector on a beefy Postgres instance: ~$1-2K/month
  • Qdrant on a managed plan or self-hosted: ~$1.5-3K/month
  • Weaviate Cloud: ~$2-5K/month
  • Milvus self-hosted: ~$2-4K/month (cluster + ops)
  • LanceDB on object storage: ~$200-500/month for batch-shaped reads
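
A sanity check behind these numbers is raw vector storage: ten million 1024-dim float32 vectors occupy about 41 GB before any index, and HNSW adds graph overhead on top, which is what pushes pgvector onto a "beefy" instance. The overhead factor below is an assumption for illustration, not a benchmark result:

```python
def vector_memory_gb(num_vectors: int, dims: int,
                     bytes_per_dim: int = 4, index_overhead: float = 1.5) -> float:
    """Raw float32 storage multiplied by an assumed HNSW overhead factor."""
    raw_bytes = num_vectors * dims * bytes_per_dim
    return raw_bytes * index_overhead / 1e9

raw = vector_memory_gb(10_000_000, 1024, index_overhead=1.0)
with_index = vector_memory_gb(10_000_000, 1024)
print(f"raw: {raw:.1f} GB, with ~1.5x index overhead: {with_index:.1f} GB")
# raw: 41.0 GB, with ~1.5x index overhead: 61.4 GB
```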

These numbers shift with hardware pricing. The absolute spread is large; pick on fit, not just price.

What CallSphere Uses

For our website's blog dedup and search we run pgvector inside our existing Postgres instance. For the multi-product agent memory layer, where read-heavy scaling matters more, we run Qdrant. We chose pgvector for the blog because it added no new system to operate, and Qdrant for the agent layer for its hybrid and multi-vector support.
