Postgres + Citus for Sharding: Scale AI Workloads Past 10TB (2026)
Citus turns Postgres into a distributed database via row-based or schema-based sharding. Pick the right model for AI workloads, distribute pgvector tables, and watch query plans actually parallelize.
TL;DR — When a single Postgres node can't keep up — typically past 10 TB or 50k writes/sec — Citus shards the table across worker nodes while keeping the SQL surface intact. Schema-based sharding (Citus 12+) is the right default for multi-tenant AI workloads.
What you'll build
A 1 coordinator + 4 worker Citus cluster running pgvector across shards, with schema-per-tenant isolation and parallel ANN queries.
Schema
CREATE EXTENSION citus;
CREATE EXTENSION vector;
-- Schema-based sharding (Citus 12+)
CREATE SCHEMA tenant_acme;
CREATE TABLE tenant_acme.documents (
id BIGSERIAL PRIMARY KEY,
body TEXT,
embedding vector(1536)
);
SELECT citus_schema_distribute('tenant_acme');
-- Row-based sharding for shared tables
CREATE TABLE shared_events (
tenant_id UUID,
event_id UUID,
ts TIMESTAMPTZ,
payload JSONB,
PRIMARY KEY (tenant_id, event_id)
);
SELECT create_distributed_table('shared_events', 'tenant_id');
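To confirm what actually got distributed, query Citus's metadata views. The sketch below assumes the citus_tables and citus_shards views that ship with Citus 10 and later; it's a sanity check, not a required step.
-- Every distributed table, with its sharding model and shard count
SELECT table_name, citus_table_type, distribution_column, shard_count
FROM citus_tables;
-- Where each shard of shared_events landed
SELECT shardid, nodename, nodeport, shard_size
FROM citus_shards
WHERE table_name = 'shared_events'::regclass;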
Architecture
flowchart TD
APP[App] --> COORD[Coordinator]
COORD --> W1[Worker 1<br/>shards 0-7]
COORD --> W2[Worker 2<br/>shards 8-15]
COORD --> W3[Worker 3<br/>shards 16-23]
COORD --> W4[Worker 4<br/>shards 24-31]
W1 --> S1[(Shard data)]
W2 --> S2[(Shard data)]
W3 --> S3[(Shard data)]
W4 --> S4[(Shard data)]
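The 32-shard split in the diagram is simply the Citus default. Shard count is fixed at distribution time, so if you want a different layout, set citus.shard_count before calling create_distributed_table; a minimal sketch:
-- 32 is the Citus default; raise it here if you need more per-table parallelism
SET citus.shard_count = 32;
SHOW citus.shard_count;   -- what the next create_distributed_table call will use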
Step 1 — Cluster setup
# On coordinator + each worker
echo "shared_preload_libraries = 'citus'" >> postgresql.conf
sudo systemctl restart postgresql
# On every node
psql -c "CREATE EXTENSION citus;"
# On the coordinator only
psql -c "SELECT citus_set_coordinator_host('coord.internal', 5432);"
psql -c "SELECT citus_add_node('worker1.internal', 5432);"
psql -c "SELECT citus_add_node('worker2.internal', 5432);"
psql -c "SELECT citus_add_node('worker3.internal', 5432);"
psql -c "SELECT citus_add_node('worker4.internal', 5432);"
Step 2 — Distribute the table
-- Row-based: hash by tenant_id
SELECT create_distributed_table('shared_events', 'tenant_id');
-- Schema-based: each tenant gets a schema, distributed
CREATE SCHEMA tenant_acme;
-- tables created here...
SELECT citus_schema_distribute('tenant_acme');
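Citus 12 also ships a citus_schemas view; assuming that version, it's the quickest way to confirm a schema is distributed and to see how large each tenant has grown:
-- One row per distributed schema (Citus 12+)
SELECT schema_name, colocation_id, schema_size FROM citus_schemas;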
Step 3 — Build pgvector indexes per shard
CREATE INDEX documents_hnsw_idx
ON tenant_acme.documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
Citus propagates the CREATE INDEX to every shard, so each worker builds its own HNSW graph over just its local rows.
Step 4 — Parallel ANN queries
EXPLAIN SELECT id FROM tenant_acme.documents
ORDER BY embedding <=> $1::vector LIMIT 10;
For schema-based sharding, the query routes to the single worker that owns that schema. For row-based sharding, the planner runs the ORDER BY ... LIMIT on every shard in parallel, then merges the per-shard top-k results on the coordinator.
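The routing difference is easy to see on the row-based table: pin the distribution column and the plan collapses from a fan-out across all shards to a single-shard query. The UUID below is only a placeholder:
-- Fans out to every shard of shared_events; partial results merge on the coordinator
EXPLAIN (COSTS OFF)
SELECT count(*) FROM shared_events;
-- Routed to the single shard that owns this tenant
EXPLAIN (COSTS OFF)
SELECT count(*) FROM shared_events
WHERE tenant_id = '00000000-0000-0000-0000-000000000000';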
Step 5 — Co-located joins
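The snippet below references a messages table that doesn't appear elsewhere in this post; here is a minimal, illustrative definition so the co-location call and the join have something to work with (note the primary key must include the distribution column):
CREATE TABLE messages (
tenant_id UUID,
event_id UUID,
id BIGSERIAL,
body TEXT,
PRIMARY KEY (tenant_id, id)
);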
SELECT create_distributed_table('messages', 'tenant_id', colocate_with => 'shared_events');
-- Co-located join runs on each worker, no cross-node traffic
SELECT m.id, e.event_id FROM messages m
JOIN shared_events e USING (tenant_id, event_id)
WHERE m.tenant_id = $1;
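To check that both tables landed in the same co-location group, compare their colocation_id in the citus_tables view; matching IDs mean joins on tenant_id never leave a worker:
SELECT table_name, colocation_id
FROM citus_tables
WHERE table_name IN ('shared_events'::regclass, 'messages'::regclass);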
Step 6 — Rebalance shards
SELECT citus_add_node('worker5.internal', 5432);
SELECT rebalance_table_shards('shared_events');
Online rebalance — no downtime, throttled to keep the workers responsive.
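rebalance_table_shards blocks the session that calls it, so watch progress from a second session; Citus 11.1+ also offers a background variant. Both are sketched below, under those version assumptions:
-- From another session while the foreground rebalance runs
SELECT * FROM get_rebalance_progress();
-- Citus 11.1+ alternative: move shards in the background and poll for status
SELECT citus_rebalance_start();
SELECT * FROM citus_rebalance_status();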
Pitfalls
- Cross-shard transactions — possible but slower. Co-locate to avoid them.
- Foreign keys cross shards — only allowed within a co-location group.
- Schema-based with too few tenants — overhead beats benefit under ~20 tenants.
- Forgetting to distribute reference tables — small lookup tables should use create_reference_table (see the sketch after this list).
- SEQUENCE on a non-distribution column — use BIGSERIAL with care; for global uniqueness consider UUIDs.
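A sketch of the reference-table fix, using a hypothetical event_types lookup; every worker keeps a full copy, so joins against it stay local:
CREATE TABLE event_types (
type_code TEXT PRIMARY KEY,
description TEXT
);
SELECT create_reference_table('event_types');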
CallSphere production note
CallSphere's largest verticals (Healthcare and OneRoof) currently run on dedicated single-node Postgres + pgvector — Citus is on the roadmap for when cross-tenant aggregations exceed 30 TB. Schema-based sharding maps cleanly onto our 115+ DB tables model, with Healthcare's HIPAA workload (Prisma healthcare_voice) keeping its own isolated cluster, OneRoof RLS preserved per shard, and UrackIT's Supabase + ChromaDB stack untouched. 37 agents · 90+ tools · 6 verticals. Plans: $149/$499/$1,499 — 14-day trial, 22% affiliate.
FAQ
Q: Citus or vanilla read replicas? Replicas scale reads only. Citus scales writes and storage.
Q: Can I use Citus on managed Postgres? Azure Cosmos DB for PostgreSQL is managed Citus; on other providers you self-host the extension.
Q: Does pgvector work distributed? Yes — index per shard, parallel scan.
Q: Schema-based or row-based? Schema-based for clear tenant boundaries; row-based for high-cardinality shared workloads.
Q: When NOT to use Citus? Under ~10 TB and ~50k writes/sec, vertical scaling is cheaper and simpler.