# Vector Database Benchmarks 2026: pgvector 0.9, Qdrant, Weaviate, Milvus, LanceDB
The five vector databases competing for production traffic in 2026, benchmarked on QPS, recall, hybrid search, and operational cost.
## The Field
Five vector databases dominate production deployments in 2026: pgvector (Postgres extension), Qdrant, Weaviate, Milvus, and LanceDB. Each is the right answer for different shapes of workload. This is a side-by-side based on April 2026 benchmarks and production reports.
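Two of the benchmark axes above deserve a precise definition. QPS is throughput; recall@k measures how much accuracy the approximate index gives up versus exact search: the fraction of the true top-k neighbors the index actually returns. A minimal, library-free sketch (the ID lists are made up for illustration; real benchmarks use the index's actual output):

```python
# Recall@k: fraction of the exact top-k neighbors that the ANN index returned.

def recall_at_k(exact_ids, ann_ids, k):
    """Overlap between exact top-k and ANN top-k, as a fraction of k."""
    return len(set(exact_ids[:k]) & set(ann_ids[:k])) / k

# Exact top-10 vs. what a (hypothetical) HNSW index returned:
exact = [3, 7, 1, 9, 4, 2, 8, 5, 6, 0]
ann   = [3, 7, 9, 1, 4, 2, 8, 5, 11, 12]   # two misses in the tail

print(recall_at_k(exact, ann, 10))  # 0.8
```

All five databases trade recall against QPS via index parameters (e.g. HNSW's `ef_search`), so quoted QPS numbers are only comparable at a stated recall level.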
## The Side-by-Side

```mermaid
flowchart TB
    pgvector[pgvector 0.9<br/>Postgres extension] --> SQL[Use case: SQL-shaped apps]
    Qdrant[Qdrant<br/>Rust] --> Hybrid[Use case: hybrid + late interaction]
    Weaviate[Weaviate<br/>Go] --> Module[Use case: modular + GraphQL]
    Milvus[Milvus<br/>Go/C++] --> Scale[Use case: largest scale]
    LanceDB[LanceDB<br/>Rust + Lance] --> Embed[Use case: embedded / data lake]
```
## pgvector 0.9
The Postgres extension. Version 0.9 (early 2026) added IVFFlat improvements, sparse vector support, and substantial speed boosts. For most teams already on Postgres, this is the easiest path.
- Strengths: just Postgres, ACID transactions, full SQL, easy ops
- Weaknesses: lower QPS than purpose-built vector DBs at very large scale; advanced features lag
- Performance: ~5K-15K QPS on a single Postgres instance with HNSW index for typical 1024-dim vectors
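A QPS figure only sizes a deployment when paired with latency. Little's law says the average number of in-flight queries equals throughput times mean latency, which bounds the connection-pool and worker count you need. A back-of-envelope sketch with illustrative numbers (not from a benchmark):

```python
# Little's law: average in-flight queries = throughput * mean latency.

def concurrent_queries(qps, mean_latency_ms):
    return qps * mean_latency_ms / 1000.0

# 10K QPS at 20 ms mean latency keeps ~200 queries in flight at once.
print(concurrent_queries(10_000, 20))  # 200.0
```

For pgvector this matters more than for the others: Postgres connections are comparatively heavy, so high in-flight counts usually mean putting a pooler (e.g. PgBouncer) in front.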
## Qdrant
The leader on hybrid search and late-interaction support in 2026. Native multi-vector support makes ColBERTv2-style retrieval first class.
- Strengths: best hybrid + late-interaction support, single-binary deployment, strong Rust core
- Weaknesses: smaller community than pgvector
- Performance: ~30K-80K QPS at typical configurations
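"Late interaction" here means ColBERT-style scoring: instead of one vector per document, each query token and document token keeps its own embedding, and the score sums each query token's best match against any document token ("MaxSim"). A toy sketch with made-up 2-d embeddings (real systems use ~128 dims per token):

```python
# ColBERT-style MaxSim: score(q, d) = sum over query-token vectors of the
# max dot product against any document-token vector.

def maxsim(query_vecs, doc_vecs):
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

query = [[1.0, 0.0], [0.0, 1.0]]    # two query tokens
doc_a = [[0.9, 0.1], [0.2, 0.8]]    # matches both tokens well
doc_b = [[0.5, 0.5], [0.5, 0.5]]    # matches neither strongly

print(maxsim(query, doc_a))  # ~1.7
print(maxsim(query, doc_b))  # ~1.0
```

Storing and searching many vectors per document is what "native multi-vector support" buys you; emulating it on a single-vector engine means one row per token plus application-side aggregation.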
## Weaviate
Modular, GraphQL-first, integrates closely with embedding providers via "modules" (vectorizers, generators, rerankers).
- Strengths: modular architecture, GraphQL queries, strong RAG-pattern support
- Weaknesses: GraphQL API adds learning curve for SQL-native teams
- Performance: ~25K-50K QPS
## Milvus
The largest-scale option, with production deployments at hundreds of billions of vectors. Distributed-first architecture, with a mature managed cloud product (Zilliz Cloud).
- Strengths: largest scale, mature distributed architecture, strong cloud offering
- Weaknesses: heavier ops; deployment complexity higher than alternatives
- Performance: 100K+ QPS at scale
## LanceDB
The newer entrant, built on the Lance columnar format. Embedded-first (file-based) but with a server mode. Strong fit for data-lake architectures and ML workloads.
- Strengths: embedded mode for zero-ops use cases, Lance format integrates with data lakes, fast columnar reads
- Weaknesses: smaller ecosystem; newer
- Performance: workload-dependent; very strong for read-heavy and batch
## Feature Matrix
| Feature | pgvector | Qdrant | Weaviate | Milvus | LanceDB |
|---|---|---|---|---|---|
| HNSW | yes | yes | yes | yes | yes |
| Sparse vectors | yes (0.9) | yes | yes | yes | yes |
| ColBERTv2 multi-vector | partial | yes | partial | partial | partial |
| Hybrid (BM25 + dense) | yes | yes | yes | yes | partial |
| Distributed | partial (Citus) | partial | yes | yes | yes |
| Embedded mode | no | no | no | no | yes |
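The "Hybrid (BM25 + dense)" row means the engine can fuse a lexical ranking with a vector ranking. Exact fusion methods vary by engine; one common generic approach is reciprocal rank fusion (RRF), sketched here in plain Python with made-up document IDs:

```python
# Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank(d)),
# where k is a damping constant (60 is the conventional default).

def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_top  = ["d2", "d1", "d5"]    # lexical (BM25) ranking
dense_top = ["d1", "d3", "d2"]    # dense vector ranking

print(rrf([bm25_top, dense_top]))  # ['d1', 'd2', 'd3', 'd5']
```

RRF only needs ranks, not scores, which is why it works across engines whose BM25 and cosine scores live on incomparable scales.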
## Choosing One

```mermaid
flowchart TD
    Q1{Already on Postgres?} -->|Yes| pg[pgvector]
    Q1 -->|No| Q2{Largest scale<br/>100B+ vectors?}
    Q2 -->|Yes| Mil[Milvus]
    Q2 -->|No| Q3{Hybrid + late interaction<br/>top priority?}
    Q3 -->|Yes| Qd[Qdrant]
    Q3 -->|No| Q4{Embedded /<br/>data-lake fit?}
    Q4 -->|Yes| LD[LanceDB]
    Q4 -->|No| We[Weaviate]
```
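The decision tree above, expressed as a function (same logic as the flowchart, nothing added):

```python
# The choosing-one flowchart as code: questions are evaluated in order.

def choose_vector_db(on_postgres, vectors_100b_plus,
                     hybrid_late_interaction_priority, embedded_or_data_lake):
    if on_postgres:
        return "pgvector"
    if vectors_100b_plus:
        return "Milvus"
    if hybrid_late_interaction_priority:
        return "Qdrant"
    if embedded_or_data_lake:
        return "LanceDB"
    return "Weaviate"

print(choose_vector_db(False, False, True, False))  # Qdrant
```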
For most teams in 2026: pgvector if you have Postgres, Qdrant if you do not. Reach for Milvus only at very large scale.
## Operational Considerations
The choice often comes down to ops more than benchmarks:
- pgvector: your DBA already operates Postgres; no new system
- Qdrant: single binary; runs anywhere; operationally simple
- Weaviate: cloud offering smooths ops; self-hosted is more involved
- Milvus: serious distributed system; needs k8s and dedicated ops
- LanceDB: embedded means no ops at all for some use cases
## Cost Math
For a 10M-vector workload at 1024-dim with ~1K QPS at p99 < 100ms:
- pgvector on a beefy Postgres instance: ~$1-2K/month
- Qdrant on a managed plan or self-hosted: ~$1.5-3K/month
- Weaviate Cloud: ~$2-5K/month
- Milvus self-hosted: ~$2-4K/month (cluster + ops)
- LanceDB on object storage: ~$200-500/month for batch-shaped reads
These numbers shift with hardware pricing. The absolute spread is large; pick on fit, not just price.
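To make the ranges comparable, it helps to convert them to cost per million queries at the stated sustained ~1K QPS. A back-of-envelope sketch using the midpoints of the ranges above (real utilization is rarely 100%, so real per-query costs run higher):

```python
# Cost per 1M queries at a sustained 1K QPS, from the monthly range midpoints.

QPS = 1_000
SECONDS_PER_MONTH = 30 * 24 * 3600
queries_per_month = QPS * SECONDS_PER_MONTH          # 2.592 billion

monthly_cost_usd = {                                  # midpoints of the ranges
    "pgvector": 1_500, "Qdrant": 2_250, "Weaviate": 3_500,
    "Milvus": 3_000, "LanceDB": 350,
}
for db, usd in monthly_cost_usd.items():
    per_million = usd / (queries_per_month / 1e6)
    print(f"{db:9s} ${per_million:.4f} per 1M queries")
```

Even the most expensive option lands well under a cent per thousand queries at full utilization, which is why fit beats price at this workload size.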
## What CallSphere Uses
For our website's blog deduplication and search we run pgvector inside our existing Postgres instance. For the multi-product agent memory layer, where read scaling matters more, we run Qdrant. pgvector won the blog use case because it added no new ops; Qdrant won the agent layer for its hybrid and multi-vector support.
## Sources
- pgvector documentation — https://github.com/pgvector/pgvector
- Qdrant documentation — https://qdrant.tech/documentation
- Weaviate documentation — https://weaviate.io/developers
- Milvus documentation — https://milvus.io/docs
- LanceDB documentation — https://lancedb.github.io/lancedb