TL;DR — ClickHouse 26.2 ships time-based block flushing (input_format_max_block_wait_ms), ClickPipes ingests Kafka and S3 natively, and BigLake-style federation lets you join cold archive with hot transcripts. For AI voice analytics, that means a single SQL surface from "this call ended 800 ms ago" to "all calls last quarter" — at sub-second latency.

Why this pipeline

A voice agent stack that handles 10k–100k calls a day generates structured events that overwhelm an OLTP database such as Postgres once you start running aggregations over them: average sentiment by hour, top-10 intents in the last 5 minutes, lead-score histogram by vertical, talk/listen ratio by agent. ClickHouse is the canonical OLAP fit. The 2026 question is no longer "ClickHouse or not"; it is how you stream into it without batching delays or async-insert footguns.

The 26.2 release closed the last gap: low-throughput feeds (< 100 rows/sec) used to wait minutes for the default 1M-row block to flush. Time-based flushing fixes that, so a regional pod with 30 calls per minute still gets 3-second visibility.

Architecture

flowchart LR
  Voice[Voice agent<br/>OpenAI Realtime API] -->|partial transcripts| Kafka[(Kafka topic<br/>call.transcript.partial)]
  Voice -->|final transcript| Kafka2[(Kafka topic<br/>call.completed)]
  Kafka -->|ClickPipes| CH[(ClickHouse Cloud<br/>call_transcripts table<br/>MergeTree, ORDER BY call_id, ts)]
  Kafka2 -->|ClickPipes| CH
  CH --> MV[Materialized view:<br/>sentiment_5min_rollup]
  CH --> Dash[Grafana / Metabase]
  CH --> AGT[Internal AI agent<br/>read-only ClickHouse user]

Partial transcripts land within 800 ms of the speech token; final transcripts land within 2 s of call end. A materialized view keeps a 5-minute sentiment rollup hot for the supervisor dashboard.

CallSphere implementation

CallSphere runs 37 specialist agents across 6 verticals with 90+ tools and 115+ DB tables. Pricing is $149 Starter / $499 Growth / $1499 Scale with a 14-day trial and 22% affiliate program. Healthcare post-call analytics uses GPT-4o-mini to compute a sentiment score from -1.0 to 1.0 and a lead score from 0 to 100, written into the call_analytics table — all of it queryable from ClickHouse alongside transcripts. Browse plans at /pricing or take a /demo. Healthcare specifics live at /industries/healthcare.

Build steps with code

1. Provision ClickHouse Cloud with a dedicated service for analytics; pick a region close to your agent pod.
2. Create the table with a sane order key.
3. Set up a Kafka topic call.transcript.partial with 12 partitions and 7-day retention.
4. Wire ClickPipes in the ClickHouse Cloud UI: pick the Kafka source, select topic, map columns.
5. Tune input_format_max_block_wait_ms=3000 for low-throughput regional pods.
6. Add a materialized view for the supervisor rollup.
7. Grant a read-only user to your internal AI agent for ad-hoc analytics.

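CREATE TABLE call_transcripts (
  call_id      UUID,
  vertical     LowCardinality(String),  -- few distinct values, so dictionary-encode
  speaker      LowCardinality(String),  -- 'agent' | 'caller'
  ts           DateTime64(3),           -- millisecond precision
  text         String,
  sentiment    Float32,                 -- -1.0 to 1.0
  lead_score   UInt8,                   -- 0 to 100
  pii_redacted UInt8 DEFAULT 0
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)
ORDER BY (call_id, ts)                  -- lead with call_id for per-call lookups
-- toDateTime() keeps the TTL valid on versions that reject DateTime64 in TTL expressions
TTL toDateTime(ts) + INTERVAL 365 DAY;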
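
As a quick check of the order key, a per-call lookup should be served by the primary index because call_id leads the ORDER BY. The sketch below is illustrative only: the UUID is a made-up placeholder, and the exact EXPLAIN output varies by ClickHouse version.

```sql
-- Hypothetical call_id for illustration; substitute a real one.
SELECT ts, speaker, text, sentiment
FROM call_transcripts
WHERE call_id = toUUID('00000000-0000-0000-0000-000000000001')
ORDER BY ts;

-- Show how many parts and granules the primary key lets ClickHouse skip.
EXPLAIN indexes = 1
SELECT count()
FROM call_transcripts
WHERE call_id = toUUID('00000000-0000-0000-0000-000000000001');
```

If the EXPLAIN output shows nearly all granules selected for a single call_id, the order key is probably not what you intended.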

CREATE MATERIALIZED VIEW sentiment_5min_rollup
ENGINE = AggregatingMergeTree
ORDER BY (vertical, bucket)
AS SELECT
  vertical,
  toStartOfFiveMinute(ts) AS bucket,
  avgState(sentiment)     AS avg_sent,
  countState()            AS n
FROM call_transcripts
GROUP BY vertical, bucket;
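
Because the view stores partial aggregate states (avgState, countState), readers have to finalize them with the matching -Merge combinators. Here is a minimal sketch of what the supervisor dashboard or the internal agent might run, followed by a SELECT-only user for the last build step. The user name analytics_agent, the placeholder password, and the one-hour window are assumptions for illustration, not part of the original setup.

```sql
-- Finalize the aggregate states written by the materialized view.
SELECT
  vertical,
  bucket,
  avgMerge(avg_sent) AS avg_sentiment,
  countMerge(n)      AS chunks
FROM sentiment_5min_rollup
WHERE bucket >= now() - INTERVAL 1 HOUR
GROUP BY vertical, bucket
ORDER BY bucket DESC, vertical;

-- Hypothetical read-only user for the internal AI agent.
-- Replace the placeholder password before running this anywhere real.
CREATE USER IF NOT EXISTS analytics_agent
  IDENTIFIED WITH sha256_password BY 'change-me';
GRANT SELECT ON call_transcripts TO analytics_agent;
GRANT SELECT ON sentiment_5min_rollup TO analytics_agent;
```

The GROUP BY is still needed at read time: parts that have not merged yet can hold several states for the same (vertical, bucket) pair.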

Pitfalls

Naive INSERT per transcript chunk: every small insert creates a new part, so you generate millions of tiny parts and hit the merge backlog. Always batch via ClickPipes or async_insert.
ORDER BY ts only: the sparse index becomes useless for per-call lookups; lead with call_id.
No TTL: voice analytics data balloons fast, and so does your storage bill; set a TTL such as the 365-day one in the table definition.
Forgetting LowCardinality on vertical/speaker: roughly 10x bigger storage on string columns with low cardinality.
Querying raw partials for dashboards: always go through a materialized view; the rollup is roughly 100x cheaper.
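
For producers that cannot go through ClickPipes, the first pitfall can be sidestepped with server-side batching. This is a minimal sketch of an asynchronous insert, assuming the call_transcripts table from this post; the row values are invented for illustration.

```sql
-- async_insert = 1 lets the server buffer small inserts and write them as one part.
-- wait_for_async_insert = 1 makes the client wait until the buffer is flushed,
-- so an acknowledged insert is actually on disk.
INSERT INTO call_transcripts
  (call_id, vertical, speaker, ts, text, sentiment, lead_score)
SETTINGS async_insert = 1, wait_for_async_insert = 1
VALUES (
  '00000000-0000-0000-0000-000000000001',  -- hypothetical call_id
  'healthcare',
  'caller',
  now64(3),
  'I need to reschedule my appointment.',
  -0.2,
  40
);
```

Setting wait_for_async_insert = 0 lowers client latency but acknowledges rows before they are durable, which is one of the footguns mentioned earlier.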

FAQ

Why ClickHouse over Postgres + TimescaleDB? Once you cross 50M rows of transcripts, ClickHouse is 10–50x faster for analytical scans. Timescale is great up to 10M rows; past that, the columnar format wins.

Can we query the call recording itself? No — store the audio in S3 and put the S3 URL in ClickHouse. Use ClickHouse only for the structured transcript and metrics.
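
One way to model that is a small side table keyed by call_id. This is only a sketch: the call_recordings table, its columns, and the UUID are hypothetical names introduced for illustration.

```sql
-- Hypothetical side table: one row per call, pointing at the audio object in S3.
CREATE TABLE call_recordings (
  call_id       UUID,
  recording_url String,   -- the audio itself stays in S3
  duration_ms   UInt32
)
ENGINE = MergeTree
ORDER BY call_id;

-- Fetch the recording link together with transcript metrics for one call.
SELECT
  r.recording_url,
  count()          AS chunks,
  avg(t.sentiment) AS avg_sentiment
FROM call_transcripts AS t
INNER JOIN call_recordings AS r ON r.call_id = t.call_id
WHERE t.call_id = toUUID('00000000-0000-0000-0000-000000000001')
GROUP BY r.recording_url;
```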

How do we redact PII before it hits ClickHouse? Pipeline post #6 covers this. The short version: redact in Flink between Kafka and ClickHouse, and never rely on ClickHouse to do it.

Latency goal? Sub-second p95 for dashboard queries; 3s ingest visibility for streaming feeds.

Multi-tenant? Yes. Add a tenant_id column at the front of the order key and use row policies for per-tenant isolation.
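
A minimal sketch of that multi-tenant layout, using a trimmed-down table so the example stays short. The table name call_transcripts_mt, the tenant value 'acme', and the user acme_reader are all hypothetical.

```sql
-- Assumed variant of the transcripts table with tenant_id leading the sorting key.
CREATE TABLE call_transcripts_mt (
  tenant_id LowCardinality(String),
  call_id   UUID,
  ts        DateTime64(3),
  speaker   LowCardinality(String),
  text      String
)
ENGINE = MergeTree
ORDER BY (tenant_id, call_id, ts);

-- Hypothetical per-tenant reader; replace the placeholder password.
CREATE USER IF NOT EXISTS acme_reader
  IDENTIFIED WITH sha256_password BY 'change-me';
GRANT SELECT ON call_transcripts_mt TO acme_reader;

-- acme_reader only ever sees rows where tenant_id = 'acme'.
CREATE ROW POLICY acme_only ON call_transcripts_mt
  FOR SELECT USING tenant_id = 'acme' TO acme_reader;
```

Once any row policy exists on a table, users that no policy covers may see no rows at all, so check your admin and ingest users after adding the first one.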

Streaming Call Transcripts Into ClickHouse for Sub-Second AI Voice Analytics in 2026
