By Sagar Shankaran, Founder of CallSphere
Crunchy Bridge ships Postgres + pg_parquet + Iceberg + warehouse engine for AI/analytics. A walkthrough of provisioning, vector index sizing, and a hybrid OLTP-plus-data-lake setup that doesn't need Snowflake.
Key takeaways
TL;DR — Crunchy Bridge gives you fully managed Postgres with first-class Iceberg, Parquet, and an analytics engine alongside vanilla OLTP. For AI teams it means transactional + warehouse on one platform, and pg_duckdb-style speed on analytic queries.
A Crunchy Bridge cluster running OLTP workloads with pgvector and Iceberg tables on S3 — agents query both with ordinary SQL via the warehouse extension.
-- OLTP table
CREATE TABLE conversations (
    id BIGSERIAL PRIMARY KEY,
    org_id UUID,
    body TEXT,
    embedding vector(1536),
    created_at TIMESTAMPTZ DEFAULT now()
);
-- Iceberg-mounted analytics view
CREATE FOREIGN TABLE analytics_events ()
    SERVER iceberg
    OPTIONS (location 's3://callsphere-lake/events/');
flowchart LR
APP[App writes] --> CB[(Crunchy Bridge OLTP)]
CB --> WAL[WAL]
WAL --> PARQ[pg_parquet S3 dump]
PARQ --> ICE[Iceberg tables]
AGENT[AI agent] --> CB
AGENT --> WH[Warehouse engine]
WH --> ICE
curl -X POST https://api.crunchybridge.com/clusters \
  -H "Authorization: Bearer $CB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "callsphere-prod",
    "plan_id": "memory-2",
    "provider_id": "aws",
    "region_id": "us-east-1",
    "postgres_version_id": 17,
    "extensions": ["vector", "pg_parquet", "pg_partman"]
  }'
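For teams that provision from a script rather than a shell, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the endpoint and payload fields are copied from the curl call above, `build_create_cluster_request` is a hypothetical helper name, and the function only builds the request without sending it.

```python
import json
import urllib.request

# Endpoint taken from the curl call above.
API_URL = "https://api.crunchybridge.com/clusters"


def build_create_cluster_request(token: str) -> urllib.request.Request:
    """Build (but do not send) the cluster-creation request.

    Hypothetical helper: the payload mirrors the fields shown in the
    article's curl example and is not checked against the live API.
    """
    payload = {
        "name": "callsphere-prod",
        "plan_id": "memory-2",
        "provider_id": "aws",
        "region_id": "us-east-1",
        "postgres_version_id": 17,
        "extensions": ["vector", "pg_parquet", "pg_partman"],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            # Explicit JSON content type; the body is a JSON document.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate makes the payload easy to inspect or unit test first.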
CREATE EXTENSION pg_parquet;
COPY (SELECT * FROM conversations WHERE created_at < now() - interval '30 days')
TO 's3://callsphere-lake/conversations_archive.parquet'
WITH (format 'parquet', compression 'zstd');
CREATE EXTENSION crunchy_iceberg;
SELECT iceberg_create_table(
    'conversations_archive',
    's3://callsphere-lake/conversations_archive/',
    schema => 'org_id uuid, body text, created_at timestamptz'
);
-- Hot data + cold archive in one query
SELECT created_at, body FROM conversations
WHERE org_id = $1 AND created_at > now() - interval '30 days'
UNION ALL
SELECT created_at, body FROM conversations_archive
WHERE org_id = $1 AND created_at > now() - interval '180 days';
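When the application builds this query dynamically, it can help to decide up front which side of the 30-day boundary a requested time window touches, so short lookups never hit S3. The following is a rough sketch under that assumption; `split_window` and `HOT_RETENTION` are hypothetical names, and the 30-day cutoff is taken from the archive step above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: rows older than this live only in the Iceberg archive.
HOT_RETENTION = timedelta(days=30)


def split_window(start: datetime, end: datetime, now: datetime):
    """Split [start, end) into (hot_range, cold_range).

    Either element is None when the window does not touch that tier.
    Assumes start < end.
    """
    cutoff = now - HOT_RETENTION
    cold = (start, min(end, cutoff)) if start < cutoff else None
    hot = (max(start, cutoff), end) if end > cutoff else None
    return hot, cold


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    hot, cold = split_window(now - timedelta(days=180), now, now)
    print("hot range: ", hot)
    print("cold range:", cold)
```

Because the two ranges meet exactly at the cutoff and never overlap, the two halves of the UNION ALL cannot return the same row twice, even if archived rows have not yet been deleted from the hot table.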
CREATE INDEX ON conversations USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 128);
SET hnsw.ef_search = 100;
SELECT id FROM conversations
ORDER BY embedding <=> $1::vector LIMIT 10;
Hot, indexed vectors stay on Bridge SSD; cold transcripts live cheaply on S3 Iceberg.
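To get a feel for how much memory the hot tier needs, a back-of-envelope estimate is usually enough. The sketch below assumes pgvector's documented storage cost of 4 bytes per dimension plus 8 bytes per vector, and approximates the HNSW graph as 2 × m neighbor pointers of 6 bytes each on the base layer. It ignores page overhead and upper layers, so treat the result as a rough lower bound rather than a measured figure; `estimate_hnsw_bytes` is a hypothetical helper.

```python
def estimate_hnsw_bytes(rows: int, dims: int = 1536, m: int = 16,
                        pointer_bytes: int = 6) -> int:
    """Rough lower bound on HNSW index size in bytes.

    Assumptions (not measured):
      - each vector costs 4 * dims + 8 bytes
      - each row keeps 2 * m base-layer neighbor pointers
      - page headers, fill factor, and upper layers are ignored
    """
    vector_bytes = 4 * dims + 8
    neighbor_bytes = 2 * m * pointer_bytes
    return rows * (vector_bytes + neighbor_bytes)


if __name__ == "__main__":
    for rows in (100_000, 1_000_000, 10_000_000):
        gib = estimate_hnsw_bytes(rows) / 2**30
        print(f"{rows:>10,} rows -> about {gib:.1f} GiB")
```

The estimate is dominated by the vectors themselves, which is why the dimension count of the embedding model matters more for plan selection than the `m` parameter does.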
Bridge ships continuous WAL archiving + 10-day PITR by default. For longer retention, schedule pg_dump to S3:
pg_dump --format=custom "$DATABASE_URL" | aws s3 cp - "s3://backups/cb-$(date +%Y%m%d).dump"
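Scheduled dumps accumulate, so a retention rule is worth writing down early. The sketch below picks out dump keys older than a retention window, relying on the `cb-YYYYMMDD.dump` naming used in the command above. The 90-day default and the `expired_backups` helper are assumptions for illustration; listing and deleting the objects in S3 is left out.

```python
from datetime import date, datetime, timedelta


def expired_backups(keys, today: date, keep_days: int = 90):
    """Return the keys whose embedded date is older than keep_days.

    Keys that do not match cb-YYYYMMDD.dump are skipped, never deleted.
    keep_days=90 is an illustrative default, not a recommendation.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for key in keys:
        name = key.rsplit("/", 1)[-1]
        if not (name.startswith("cb-") and name.endswith(".dump")):
            continue
        try:
            stamp = datetime.strptime(name[3:-5], "%Y%m%d").date()
        except ValueError:
            continue  # malformed date, leave the object alone
        if stamp < cutoff:
            expired.append(key)
    return expired


if __name__ == "__main__":
    sample = ["cb-20250101.dump", "cb-20260120.dump", "notes.txt"]
    print(expired_backups(sample, today=date(2026, 2, 1)))
```

Skipping anything that fails to parse is deliberate: a cleanup job should only ever remove objects it can positively identify as old dumps.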
Memory plans matter for HNSW; io plans matter for analytic scans. Pick by workload.
CallSphere's analytics warehouse runs on Crunchy Bridge: it ingests pgvector embeddings and transcripts from the OLTP primary daily, stores 12+ months of Parquet on S3, and answers cross-vertical queries across 115+ DB tables. Healthcare keeps PHI on a HIPAA-isolated Bridge cluster (Prisma healthcare_voice); OneRoof archives RLS-scoped events; UrackIT mirrors public chat to Bridge for analytics. 37 agents · 90+ tools · 6 verticals. Plans: $149/$499/$1,499, 14-day trial, 22% affiliate.
Q: Bridge vs RDS? Both run pgvector and pg_partman; Bridge additionally ships pg_parquet and the Iceberg tooling described above, while RDS wins on AWS-native integrations.
Q: Multi-region HA? Bridge supports cross-region replicas on Business plans.
Q: HIPAA / SOC 2? Both available on Business and above; BAA on request.
Q: Pricing model? Hourly compute + storage + transfer. Predictable, no per-query surcharges.
Q: How fast is failover? 30-60 sec with the HA add-on.
Crunchy Bridge for AI Workloads: Managed Postgres with pg_parquet and Iceberg (2026) is also a cost-per-conversation problem hiding in plain sight. Once you instrument tokens-in, tokens-out, tool calls, ASR seconds, and TTS seconds against booked-revenue per call, the right tradeoff between Realtime API and an async ASR + LLM + TTS pipeline becomes obvious — and it's almost never the same answer for healthcare as it is for salons.
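The instrumentation described here reduces to a small amount of arithmetic per call. The sketch below shows one way to roll the five usage counters into a cost and compare it with booked revenue. All class and field names are hypothetical, and the unit prices in the example are placeholders, not any vendor's actual rates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prices:
    """Unit prices in USD. Placeholder structure, fill in your own rates."""
    per_1k_tokens_in: float
    per_1k_tokens_out: float
    per_tool_call: float
    per_asr_second: float
    per_tts_second: float


@dataclass
class CallUsage:
    tokens_in: int
    tokens_out: int
    tool_calls: int
    asr_seconds: float
    tts_seconds: float
    booked_revenue_usd: float


def cost_usd(u: CallUsage, p: Prices) -> float:
    """Sum the metered components of a single conversation."""
    return (
        u.tokens_in / 1000 * p.per_1k_tokens_in
        + u.tokens_out / 1000 * p.per_1k_tokens_out
        + u.tool_calls * p.per_tool_call
        + u.asr_seconds * p.per_asr_second
        + u.tts_seconds * p.per_tts_second
    )


def cost_to_revenue(u: CallUsage, p: Prices) -> Optional[float]:
    """Cost as a fraction of booked revenue, or None if nothing was booked."""
    if u.booked_revenue_usd <= 0:
        return None
    return cost_usd(u, p) / u.booked_revenue_usd


if __name__ == "__main__":
    prices = Prices(0.5, 1.0, 0.25, 0.125, 0.125)  # illustrative numbers only
    call = CallUsage(2000, 1000, 4, 8, 8, booked_revenue_usd=50.0)
    print(f"cost ${cost_usd(call, prices):.2f}, ratio {cost_to_revenue(call, prices):.2%}")
```

Computing the ratio per vertical, rather than globally, is what exposes the difference between a healthcare call and a salon call.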
The big fork is managed (OpenAI Realtime, ElevenLabs Conversational AI) versus self-hosted on GPUs you operate. Managed wins on cold-start, model freshness, and zero-ops; self-hosted wins on unit economics past a certain conversation volume and on data residency for regulated verticals. CallSphere runs hybrid: Realtime for live calls, self-hosted Whisper + a hosted LLM for async, both routed through a Go gateway that enforces per-tenant rate limits.
Latency budgets are non-negotiable on voice. End-to-end target is sub-800ms ASR-to-first-token and sub-1.4s first-audio-out; anything beyond that and turn-taking feels stilted. GPU residency in the same region as your TURN servers matters more than choosing a slightly bigger model.
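Budgets like these are easiest to enforce when they are checked per turn and logged. A minimal sketch of such a check follows, assuming both timings are measured in milliseconds from the end of user speech; the function and constant names are hypothetical.

```python
# Targets from the text: sub-800 ms to first token, sub-1.4 s to first audio.
ASR_TO_FIRST_TOKEN_BUDGET_MS = 800
FIRST_AUDIO_OUT_BUDGET_MS = 1400


def budget_violations(asr_to_first_token_ms: float,
                      first_audio_out_ms: float) -> list:
    """Return a list of human-readable violations; empty means within budget."""
    violations = []
    if asr_to_first_token_ms >= ASR_TO_FIRST_TOKEN_BUDGET_MS:
        violations.append(
            f"first token at {asr_to_first_token_ms:.0f} ms "
            f"(budget < {ASR_TO_FIRST_TOKEN_BUDGET_MS} ms)"
        )
    if first_audio_out_ms >= FIRST_AUDIO_OUT_BUDGET_MS:
        violations.append(
            f"first audio at {first_audio_out_ms:.0f} ms "
            f"(budget < {FIRST_AUDIO_OUT_BUDGET_MS} ms)"
        )
    return violations


if __name__ == "__main__":
    for turn in [(620, 1180), (910, 1350), (750, 1600)]:
        print(turn, budget_violations(*turn) or "ok")
```

Tracking the share of turns with any violation, per region, gives a direct way to see whether GPU and TURN placement is paying off.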
Observability is the unglamorous backbone — every conversation produces logs, traces, sentiment scoring, and cost attribution piped to a per-tenant dashboard. HIPAA + SOC 2 aligned isolation keeps healthcare traffic separated from salon traffic at the storage layer, not just the API.
How does this apply to a CallSphere pilot specifically? Setup runs 3–5 business days, the trial is 14 days with no credit card, and pricing tiers are $149, $499, and $1,499 — so a vertical-specific pilot is a same-week decision, not a quarterly project. For a topic like "Crunchy Bridge for AI Workloads: Managed Postgres with pg_parquet and Iceberg (2026)", that means you're not starting from scratch — you're configuring an agent template that's already been hardened across thousands of conversations.
What does the typical first-week implementation look like? Day one is integration mapping (scheduler, CRM, messaging) and prompt tuning against your top 20 real call transcripts. Days two through five run in shadow mode, where the agent transcribes and recommends but a human still answers, so you can compare side-by-side. Go-live is the moment your eval pass-rate clears your internal bar.
Where does this break down at scale? The honest answer: it scales until your tool catalog gets stale. The agent is only as good as the integrations it can actually call, so the operational discipline is keeping schemas, webhooks, and fallback paths green. The platform handles the rest — observability, retries, multi-region routing — without your team owning the GPU layer.
Want to see how this maps to your stack? Book a live walkthrough at calendly.com/sagar-callsphere/new-meeting, or try the vertical-specific demo at escalation.callsphere.tech. 14-day trial, no credit card, pilot live in 3–5 business days.
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.