Streaming Call Transcripts Into ClickHouse for Sub-Second AI Voice Analytics in 2026
ClickHouse 26.2 added time-based block flushing and ClickPipes covers Kafka, S3, and Postgres CDC natively. Here's how CallSphere streams 50k voice agent transcripts a day into a ClickHouse cluster with sub-second p95 query latency.
TL;DR — ClickHouse 26.2 ships time-based block flushing (`input_format_max_block_wait_ms`), ClickPipes ingests Kafka and S3 natively, and BigLake-style federation lets you join cold archive with hot transcripts. For AI voice analytics, that means a single SQL surface from "this call ended 800 ms ago" to "all calls last quarter" — at sub-second latency.
Why this pipeline
A voice agent stack that handles 10k–100k calls a day generates structured events that an OLTP database (Postgres) gets crushed by once you start running aggregations: average sentiment by hour, top-10 intents in the last 5 minutes, lead-score histogram by vertical, talk/listen ratio by agent. ClickHouse is the canonical OLAP fit. The 2026 question is no longer "ClickHouse or not" — it's how you stream into it without batching delays or async-insert footguns.
The 26.2 release closed the last gap: low-throughput feeds (< 100 rows/sec) used to wait minutes for the default 1M-row block to flush. Time-based flushing fixes that, so a regional pod with 30 calls per minute still gets 3-second visibility.
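To make the flush rule concrete, here's a toy sketch in plain Python — this models the time-or-size semantics, not ClickHouse internals. `BlockBuffer` and its parameter names are invented for illustration; the defaults mirror the 1M-row block and a 3-second wait.

```python
import time

class BlockBuffer:
    """Toy model of time-or-size block flushing: a block ships when it
    reaches max_rows OR max_wait_ms has elapsed, whichever comes first."""

    def __init__(self, max_rows=1_000_000, max_wait_ms=3000, clock=time.monotonic):
        self.max_rows = max_rows
        self.max_wait_s = max_wait_ms / 1000
        self.clock = clock          # injectable for testing
        self.rows = []
        self.opened_at = None       # when the current block started filling

    def add(self, row):
        if self.opened_at is None:
            self.opened_at = self.clock()
        self.rows.append(row)
        return self.maybe_flush()

    def maybe_flush(self):
        full = len(self.rows) >= self.max_rows
        stale = (self.opened_at is not None
                 and self.clock() - self.opened_at >= self.max_wait_s)
        if self.rows and (full or stale):
            block, self.rows, self.opened_at = self.rows, [], None
            return block            # block is ready to insert
        return None                 # keep buffering
```

With size-only flushing, a 30-calls-per-minute pod would sit far below the row cap and wait indefinitely; the wait timer caps visibility at 3 seconds regardless of throughput.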
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
Architecture
flowchart LR
Voice[Voice agent<br/>OpenAI Realtime API] -->|partial transcripts| Kafka[(Kafka topic<br/>call.transcript.partial)]
Voice -->|final transcript| Kafka2[(Kafka topic<br/>call.completed)]
Kafka -->|ClickPipes| CH[(ClickHouse Cloud<br/>transcripts table<br/>MergeTree, ORDER BY call_id, ts)]
Kafka2 -->|ClickPipes| CH
CH --> MV[Materialized view:<br/>sentiment_5min_rollup]
CH --> Dash[Grafana / Metabase]
CH --> AGT[Internal AI agent<br/>read-only ClickHouse user]
Partial transcripts land within 800 ms of the speech token; final transcripts land within 2s of call end. A materialized view keeps a 5-minute sentiment rollup hot for the supervisor dashboard.
CallSphere implementation
CallSphere runs 37 specialist agents across 6 verticals with 90+ tools and 115+ DB tables. Pricing is $149 Starter / $499 Growth / $1499 Scale with a 14-day trial and 22% affiliate program. Healthcare post-call analytics uses GPT-4o-mini to compute a sentiment score from -1.0 to 1.0 and a lead score from 0 to 100, written into the call_analytics table — all of it queryable from ClickHouse alongside transcripts. Browse plans at /pricing or take a /demo. Healthcare specifics live at /industries/healthcare.
Build steps with code
- Provision ClickHouse Cloud with a dedicated service for analytics; pick a region close to your agent pod.
- Create the table with a sane order key.
- Set up a Kafka topic `call.transcript.partial` with 12 partitions and 7-day retention.
- Wire ClickPipes in the ClickHouse Cloud UI: pick the Kafka source, select the topic, map columns.
- Tune `input_format_max_block_wait_ms=3000` for low-throughput regional pods.
- Add a materialized view for the supervisor rollup.
- Grant a read-only user to your internal AI agent for ad-hoc analytics.
CREATE TABLE call_transcripts (
call_id UUID,
vertical LowCardinality(String),
speaker LowCardinality(String), -- 'agent' | 'caller'
ts DateTime64(3),
text String,
sentiment Float32,
lead_score UInt8,
pii_redacted UInt8 DEFAULT 0
)
ENGINE = MergeTree
ORDER BY (call_id, ts)
PARTITION BY toYYYYMM(ts)
TTL ts + INTERVAL 365 DAY;
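If any producer bypasses ClickPipes and inserts directly, the server can still do the batching. A hedged sketch — the settings are standard ClickHouse, but the row values are placeholders:

```sql
-- Direct insert with server-side batching: async_insert buffers rows,
-- and busy_timeout caps how long they sit before a part is written.
INSERT INTO call_transcripts
SETTINGS async_insert = 1,
         wait_for_async_insert = 0,
         async_insert_busy_timeout_ms = 1000
VALUES (generateUUIDv4(), 'healthcare', 'caller', now64(3),
        'I need to reschedule my appointment', 0.1, 55, 0);
```

Note that `wait_for_async_insert = 0` acknowledges before the data is durable in a part; set it to 1 if you need a confirmed write at the cost of latency.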
CREATE MATERIALIZED VIEW sentiment_5min_rollup
ENGINE = AggregatingMergeTree
ORDER BY (vertical, bucket)
AS SELECT
vertical,
toStartOfFiveMinutes(ts) AS bucket,
avgState(sentiment) AS avg_sent,
countState() AS n
FROM call_transcripts
GROUP BY vertical, bucket;
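Because the rollup stores aggregate states (`avgState`, `countState`), dashboard queries must finalize them with the `-Merge` combinators. A sketch of the supervisor query against the view defined above:

```sql
-- Finalize the aggregate states for the last hour of 5-minute buckets.
SELECT
    vertical,
    bucket,
    avgMerge(avg_sent) AS avg_sentiment,
    countMerge(n)      AS transcript_rows
FROM sentiment_5min_rollup
WHERE bucket >= now() - INTERVAL 1 HOUR
GROUP BY vertical, bucket
ORDER BY bucket DESC;
```

Selecting the state columns directly (without `-Merge`) returns opaque intermediate states, which is the most common first-query surprise with `AggregatingMergeTree`.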
Pitfalls
- Naive `INSERT` per transcript chunk — you'll generate millions of tiny parts and hit the merge backlog. Always batch via ClickPipes or `async_insert`.
- `ORDER BY ts` only — the sparse index becomes useless for per-call lookups; lead with `call_id`.
- No TTL — voice analytics balloons fast; set `TTL ts + INTERVAL 365 DAY` or your storage bill will.
- Forgetting `LowCardinality` on `vertical`/`speaker` — 10x bigger storage on string columns with low cardinality.
- Querying raw partials for dashboards — always go through a materialized view; the rollup is 100x cheaper.
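The per-call lookup pitfall is worth seeing as a query. With `call_id` leading the `ORDER BY` key, this prunes down to a handful of granules instead of scanning the table (the UUID is a placeholder):

```sql
-- Point lookup on the primary key prefix: cheap, granule-pruned.
SELECT speaker, ts, text
FROM call_transcripts
WHERE call_id = toUUID('9f8c1e2a-0b3d-4c5e-8f6a-7b8c9d0e1f2a')
ORDER BY ts;
```

With `ORDER BY ts` alone, the same query degrades to a full scan of every partition the TTL hasn't expired yet.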
FAQ
Why ClickHouse over Postgres + TimescaleDB? Once you cross 50M rows of transcripts, ClickHouse is 10–50x faster for analytical scans. Timescale holds up into the tens of millions of rows; past that, the columnar format wins.
Can we query the call recording itself? No — store the audio in S3 and put the S3 URL in ClickHouse. Use ClickHouse only for the structured transcript and metrics.
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
How do we redact PII before it hits ClickHouse? Pipeline post #6 covers this. The short version: redact in Flink between Kafka and ClickHouse; never rely on ClickHouse to do it.
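As a flavor of what that map step looks like, here's a minimal Python sketch of pattern-based redaction. The patterns are illustrative, not production-grade — a real pipeline would use a vetted PII library and the Flink DataStream API rather than this standalone function.

```python
import re

# Order matters: the more specific SSN pattern would otherwise be
# shadowed if a broader digit pattern ran first and consumed it.
PATTERNS = [
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
]

def redact(text: str) -> str:
    """Replace each PII match with a typed placeholder token."""
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text
```

Running this before the ClickHouse sink means the `pii_redacted` flag in the schema can be set honestly at ingest time instead of retroactively.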
Latency goal? Sub-second p95 for dashboard queries; 3s ingest visibility for streaming feeds.
Multi-tenant? Yes — add a tenant_id column at the front of the order key and use row-policies for per-tenant isolation.
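The row-policy half of that answer looks like this — a hedged sketch where the tenant value `'acme'` and the user `agent_acme` are placeholder names, and `tenant_id` is assumed to have been added to the schema:

```sql
-- Each read-only agent user sees only its own tenant's rows.
CREATE ROW POLICY acme_only ON call_transcripts
FOR SELECT
USING tenant_id = 'acme'
TO agent_acme;
```

The `ORDER BY` change (putting `tenant_id` first) has to happen at table creation; row policies can be added and dropped at any time.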
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.