By Sagar Shankaran, Founder of CallSphere
ClickHouse 26.2 added time-based block flushing and ClickPipes covers Kafka, S3, and Postgres CDC natively. Here's how CallSphere streams 50k voice agent transcripts a day into a ClickHouse cluster with sub-second p95 query latency.
Key takeaways
TL;DR: ClickHouse 26.2 ships time-based block flushing (input_format_max_block_wait_ms), ClickPipes ingests Kafka and S3 natively, and BigLake-style federation lets you join cold archive with hot transcripts. For AI voice analytics, that means a single SQL surface from "this call ended 800 ms ago" to "all calls last quarter", at sub-second latency.
A voice agent stack that handles 10k–100k calls a day generates structured events that overwhelm an OLTP database such as Postgres once you start running aggregations: average sentiment by hour, top-10 intents in the last 5 minutes, lead-score histogram by vertical, talk/listen ratio by agent. ClickHouse is the canonical OLAP fit. The 2026 question is no longer "ClickHouse or not"; it is how you stream into it without batching delays or async-insert footguns.
The 26.2 release closed the last gap: low-throughput feeds (< 100 rows/sec) used to wait minutes for the default 1M-row block to flush. Time-based flushing fixes that, so a regional pod with 30 calls per minute still gets 3-second visibility.
flowchart LR
Voice[Voice agent<br/>OpenAI Realtime API] -->|partial transcripts| Kafka[(Kafka topic<br/>call.transcript.partial)]
Voice -->|final transcript| Kafka2[(Kafka topic<br/>call.completed)]
Kafka -->|ClickPipes| CH[(ClickHouse Cloud<br/>transcripts table<br/>MergeTree, ORDER BY call_id, ts)]
Kafka2 -->|ClickPipes| CH
CH --> MV[Materialized view:<br/>sentiment_5min_rollup]
CH --> Dash[Grafana / Metabase]
CH --> AGT[Internal AI agent<br/>read-only ClickHouse user]
Partial transcripts land within 800 ms of the speech token; final transcripts land within 2s of call end. A materialized view keeps a 5-minute sentiment rollup hot for the supervisor dashboard.
CallSphere runs 37 specialist agents across 6 verticals with 90+ tools and 115+ DB tables. Pricing is $149 Starter / $499 Growth / $1499 Scale with a 14-day trial and 22% affiliate program. Healthcare post-call analytics uses GPT-4o-mini to compute a sentiment score from -1.0 to 1.0 and a lead score from 0 to 100, written into the call_analytics table — all of it queryable from ClickHouse alongside transcripts. Browse plans at /pricing or take a /demo. Healthcare specifics live at /industries/healthcare.
Kafka topic call.transcript.partial with 12 partitions and 7-day retention.
Set input_format_max_block_wait_ms=3000 for low-throughput regional pods.
CREATE TABLE call_transcripts (
    call_id UUID,
    vertical LowCardinality(String),
    speaker LowCardinality(String), -- 'agent' | 'caller'
    ts DateTime64(3),               -- millisecond precision
    text String,
    sentiment Float32,              -- -1.0 to 1.0
    lead_score UInt8,               -- 0 to 100
    pii_redacted UInt8 DEFAULT 0
)
ENGINE = MergeTree
ORDER BY (call_id, ts)
PARTITION BY toYYYYMM(ts)
TTL ts + INTERVAL 365 DAY;
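As a rough illustration of how this table might be read, the sketch below shows a per-call lookup that benefits from the leading call_id in the order key, followed by one of the aggregations mentioned earlier (average sentiment by hour). The UUID literal is a made-up placeholder and the 24-hour window is an assumption, not something the pipeline prescribes.

```sql
-- Per-call lookup: call_id leads the ORDER BY, so the sparse primary index
-- can narrow the scan to the granules holding this call.
-- The UUID is a hypothetical placeholder.
SELECT ts, speaker, text
FROM call_transcripts
WHERE call_id = toUUID('00000000-0000-0000-0000-000000000001')
ORDER BY ts;

-- Average sentiment by hour and vertical over the last 24 hours
-- (window chosen only for illustration).
SELECT
    vertical,
    toStartOfHour(ts) AS hour,
    avg(sentiment) AS avg_sentiment,
    count() AS chunks
FROM call_transcripts
WHERE ts >= now() - INTERVAL 24 HOUR
GROUP BY vertical, hour
ORDER BY hour, vertical;
```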
CREATE MATERIALIZED VIEW sentiment_5min_rollup
ENGINE = AggregatingMergeTree
ORDER BY (vertical, bucket)
AS SELECT
    vertical,
    toStartOfFiveMinute(ts) AS bucket,
    avgState(sentiment) AS avg_sent,
    countState() AS n
FROM call_transcripts
GROUP BY vertical, bucket;
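Because the view stores partial aggregate states (avgState, countState) rather than final numbers, reads need the matching -Merge combinators and a GROUP BY. A minimal sketch of a dashboard query, assuming a one-hour lookback that is not part of the original setup:

```sql
-- Finalize the partial states written by the materialized view.
-- Rows for the same (vertical, bucket) may still sit in separate parts,
-- so the GROUP BY is required to combine them at query time.
SELECT
    vertical,
    bucket,
    avgMerge(avg_sent) AS avg_sentiment,
    countMerge(n) AS chunks
FROM sentiment_5min_rollup
WHERE bucket >= now() - INTERVAL 1 HOUR  -- lookback chosen for illustration
GROUP BY vertical, bucket
ORDER BY bucket DESC, vertical;
```

A view created this way only sees rows inserted after it exists, so older transcripts would need a separate backfill.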
One INSERT per transcript chunk: you'll generate millions of tiny parts and hit the merge backlog. Always batch via ClickPipes or async_insert.
ORDER BY ts only: the sparse index becomes useless for per-call lookups; lead with call_id.
No TTL: set TTL ts + INTERVAL 365 DAY, or your storage bill will keep growing.
Plain String for vertical/speaker: 10x bigger storage on string columns with low cardinality; use LowCardinality(String).
Why ClickHouse over Postgres + TimescaleDB? Once you cross 50M rows of transcripts, ClickHouse is 10–50x faster for analytical scans. Timescale is great up to 10M rows; past that, the columnar format wins.
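On the batching advice above: when a producer writes rows directly instead of going through ClickPipes, async_insert lets the server buffer many small inserts and write them as larger parts. The statement below is a hedged sketch; the row values are invented sample data, and wait_for_async_insert = 1 is one reasonable choice (the client waits for the buffer flush) rather than a documented CallSphere setting.

```sql
-- Small insert that the server buffers and flushes together with others.
-- All values are illustrative sample data.
INSERT INTO call_transcripts (call_id, vertical, speaker, ts, text, sentiment, lead_score)
SETTINGS async_insert = 1, wait_for_async_insert = 1
VALUES (generateUUIDv4(), 'healthcare', 'caller', now64(3), 'I need to reschedule my appointment', 0.1, 40);
```

With wait_for_async_insert = 0 the call returns sooner, but the client no longer learns about flush errors, which is the usual footgun with this mode.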
Can we query the call recording itself? No — store the audio in S3 and put the S3 URL in ClickHouse. Use ClickHouse only for the structured transcript and metrics.
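One possible layout for that pointer, sketched under assumptions: call_recordings and recording_url are hypothetical names, and the URL format in the comment is a placeholder. Keeping the URL in a per-call table avoids repeating it on every transcript chunk.

```sql
-- Hypothetical per-call table holding only a pointer to the audio in S3.
CREATE TABLE call_recordings (
    call_id UUID,
    recording_url String  -- placeholder format: 's3://<bucket>/<key>'
)
ENGINE = MergeTree
ORDER BY call_id;

-- Fetch a call's transcript together with the link to its recording.
-- The UUID is a made-up placeholder.
SELECT t.ts, t.speaker, t.text, r.recording_url
FROM call_transcripts AS t
INNER JOIN call_recordings AS r ON r.call_id = t.call_id
WHERE t.call_id = toUUID('00000000-0000-0000-0000-000000000001')
ORDER BY t.ts;
```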
How do we redact PII before it hits ClickHouse? Pipeline post #6 covers this. The short version: redact in Flink between Kafka and ClickHouse, never relying on ClickHouse to do it.
Latency goal? Sub-second p95 for dashboard queries; 3s ingest visibility for streaming feeds.
Multi-tenant? Yes. Add a tenant_id column at the front of the order key and use row policies for per-tenant isolation.
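A minimal sketch of that multi-tenant layout, with assumed names throughout: call_transcripts_mt, the 'acme' tenant, and the acme_reader role are all hypothetical. The table repeats the transcript columns with tenant_id leading the order key, and a row policy restricts what the role can read.

```sql
-- Hypothetical multi-tenant variant of the transcripts table.
CREATE TABLE call_transcripts_mt (
    tenant_id LowCardinality(String),
    call_id UUID,
    vertical LowCardinality(String),
    speaker LowCardinality(String),
    ts DateTime64(3),
    text String,
    sentiment Float32,
    lead_score UInt8,
    pii_redacted UInt8 DEFAULT 0
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)
ORDER BY (tenant_id, call_id, ts);

-- Hypothetical role for one tenant's read-only access.
CREATE ROLE IF NOT EXISTS acme_reader;
GRANT SELECT ON call_transcripts_mt TO acme_reader;

-- Members of acme_reader only see rows where tenant_id = 'acme'.
CREATE ROW POLICY IF NOT EXISTS acme_only ON call_transcripts_mt
    FOR SELECT USING tenant_id = 'acme' TO acme_reader;
```

How users without any matching policy are treated depends on server configuration, so it is worth checking that behaviour before relying on policies alone for isolation.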
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.