By Sagar Shankaran, Founder of CallSphere
Kafka 4.0 ditches ZooKeeper, ships share groups (KIP-932), and previews Eligible Leader Replicas. Here's how we fan out 50k AI call events per minute to analytics, billing, CRM, and ML training.
Key takeaways
TL;DR — Kafka 4.0 is the first major release with zero ZooKeeper (KRaft default), KIP-848 next-gen consumer rebalance GA, KIP-932 share groups (queue semantics on a log), and KIP-966 Eligible Leader Replicas in preview. For AI call analytics, this is the difference between a 30-second rebalance stall and an invisible reassignment.
When an AI voice agent finishes a call, ten downstream systems want to know: billing, CRM, the embedding pipeline, the QA dashboard, the recording archiver, the ML training set, the affiliate attribution job, the SLA monitor, the fraud detector, and the founder's Slack. Kafka is the canonical fan-out backbone: produce once, consume independently, replay on demand.
Kafka 4.0 (March 2025, mainline through 2026) closed the last operator-pain gaps: KRaft is the only metadata mode, KIP-848 makes consumer rebalances incremental rather than stop-the-world, and ELR tracks replicas that have left the ISR but still hold every committed record, so a safe leader can be elected even when the ISR shrinks below min.insync.replicas.
flowchart LR
Voice[Voice agent] -->|produce call.completed| K[(Kafka 4.0<br/>KRaft cluster<br/>3 brokers)]
K --> CG1[Consumer group: billing]
K --> CG2[Consumer group: crm-sync]
K --> CG3[Consumer group: embeddings]
K --> CG4[Consumer group: qa-dashboard]
K --> CG5[Share group: ml-training]
CG3 --> Vec[(Vector DB)]
CG2 --> CRM[(HubSpot/Salesforce)]
CG5 --> Train[Training pipeline]
Each consumer group reads independently with its own offset. Share groups (KIP-932) layer queue semantics on top of the log so workloads that don't want partition-pinned ordering get cooperative consumption.
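To make "its own offset" concrete, here is a minimal in-memory sketch of one partition with per-group offsets. `ToyLog` and its methods are hypothetical names for illustration only; they model the idea and are not the Kafka client API or the broker's implementation.

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for one topic partition: an append-only list."""

    def __init__(self):
        self.records = []
        # Committed offset per consumer group; unseen groups start at 0.
        self.offsets = defaultdict(int)

    def produce(self, value):
        self.records.append(value)
        return len(self.records) - 1  # offset of the new record

    def poll(self, group, max_records=10):
        """Return the next records for `group` and advance only its offset."""
        start = self.offsets[group]
        batch = self.records[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch

    def seek(self, group, offset):
        """Rewind (or skip) one group without touching the others."""
        self.offsets[group] = offset


log = ToyLog()
for call_id in ("c1", "c2", "c3"):
    log.produce({"type": "call.completed", "call_id": call_id})

billing_batch = log.poll("billing")               # reads all three records
crm_batch = log.poll("crm-sync", max_records=1)   # reads only the first
log.seek("billing", 0)                            # replay for billing only
replayed = log.poll("billing")
```

Rewinding `billing` leaves the `crm-sync` position untouched, which is the property that lets one slow or replaying consumer coexist with the rest of the fan-out.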
CallSphere fans out call.completed events from all 6 verticals (Real Estate, Healthcare, IT Services, Salon, After-hours, Sales) into a single Kafka topic with 24 partitions. The Real Estate OneRoof pod uses NATS internally for the agent bus, then the orchestrator emits a final call.completed event into Kafka for analytics. Other verticals follow the same pattern. Pricing $149/$499/$1499; 14-day trial, 22% affiliate. 37 agents · 90+ tools · 115+ DB tables.
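As a rough sizing check, the arithmetic below combines the 50k events per minute headline figure, the 24 partitions, and the `retention.ms` value used in the topic setup. It assumes the rate is sustained and that keys spread evenly across partitions; neither assumption is stated in the article, so treat the results as an upper-bound sketch.

```python
EVENTS_PER_MINUTE = 50_000      # headline figure from the article
PARTITIONS = 24                 # call.completed topic
RETENTION_MS = 604_800_000      # retention.ms from the topic config

events_per_second = EVENTS_PER_MINUTE / 60
# Assumes an even key distribution across partitions.
per_partition_per_second = events_per_second / PARTITIONS
retention_days = RETENTION_MS / (1000 * 60 * 60 * 24)
# Assumes the per-minute rate holds for the whole retention window.
events_retained = EVENTS_PER_MINUTE * 60 * 24 * int(retention_days)

print(f"{events_per_second:.1f} events/s cluster-wide")
print(f"{per_partition_per_second:.1f} events/s per partition")
print(f"{retention_days:.0f} days of retention")
print(f"{events_retained:,} events retained at steady state")
```

Under those assumptions this works out to about 833 events per second overall, about 35 per second per partition, a 7-day retention window, and 504,000,000 retained events.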
bin/kafka-storage.sh format -t $UUID -c kraft.properties
bin/kafka-topics.sh --create --bootstrap-server kafka1:9092 --topic call.completed --partitions 24 --replication-factor 3 --config retention.ms=604800000
Set acks=all on producers for ELR safety.
Set group.protocol=consumer on consumers (KIP-848).
from confluent_kafka import Producer, Consumer
p = Producer({
    "bootstrap.servers": "kafka1:9092,kafka2:9092,kafka3:9092",
    "acks": "all",
    "enable.idempotence": True,
    "compression.type": "zstd",
})
# call_id and cloudevent_envelope_json are supplied by the caller.
p.produce(
    topic="call.completed",
    key=call_id.encode(),  # same call_id always lands on the same partition
    value=cloudevent_envelope_json,
    headers=[("ce-type", b"com.callsphere.call.completed.v1")],
)
p.flush()
c = Consumer({
    "bootstrap.servers": "kafka1:9092",
    "group.id": "embeddings",
    "group.protocol": "consumer",  # KIP-848 next-gen
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,  # commit only after the upsert succeeds
})
c.subscribe(["call.completed"])
while True:
    msg = c.poll(1.0)
    # poll() returns None on timeout; skip those and any error events.
    if msg is None or msg.error():
        continue
    upsert_embedding(msg.value())
    c.commit(message=msg)
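The producer snippet passes `cloudevent_envelope_json` without showing how it is built. Below is a minimal sketch using only the standard library. The attribute names follow the CloudEvents 1.0 JSON format; the `source` value, the `data` payload shape, and the helper names `build_envelope` and `header_type` are hypothetical, not CallSphere's actual schema.

```python
import json
import uuid
from datetime import datetime, timezone

CE_TYPE = "com.callsphere.call.completed.v1"  # value from the article's header


def build_envelope(call_id, data, source="/voice-agent"):
    """Return (key, value, headers) ready to pass to Producer.produce().

    `source` and the shape of `data` are hypothetical placeholders.
    """
    envelope = {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": CE_TYPE,
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "subject": call_id,
        "data": data,
    }
    value = json.dumps(envelope).encode("utf-8")
    headers = [("ce-type", CE_TYPE.encode("utf-8"))]
    return call_id.encode("utf-8"), value, headers


def header_type(headers):
    """Read the ce-type header from a list of (name, bytes) pairs, or None."""
    for name, raw in headers or []:
        if name == "ce-type":
            return raw.decode("utf-8")
    return None


key, value, headers = build_envelope("call-123", {"duration_sec": 94})
```

A downstream consumer could call `header_type(msg.headers())` and skip records whose type it does not handle, without deserializing the body first.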
call.* topics with header-based filtering downstream.
Do we still need ZooKeeper? No. Kafka 4.0 runs only in KRaft mode. Clusters still on ZooKeeper must complete the KRaft migration on a 3.x release first, then upgrade to 4.0.
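Share groups vs consumer groups? Share groups (KIP-932) are queue-like: multiple consumers acknowledge individual messages, with no partition pinning. KIP-932 is early access in 4.0, so validate it before relying on it in production. Use share groups for unordered work like ML inference.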
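The difference from offset-based consumption is easier to see in a toy model. The sketch below simulates per-record acquire, accept, and release in memory. `ToyShareGroup`, its method names, and the `max_deliveries` parameter are illustrative assumptions; this is not the share-group client API or the broker protocol.

```python
from collections import deque


class ToyShareGroup:
    """Toy model of queue-style delivery: per-record acquire and acknowledge."""

    def __init__(self, records, max_deliveries=3):
        self.available = deque(enumerate(records))  # (offset, value) pairs
        self.in_flight = {}    # offset -> (consumer, value)
        self.deliveries = {}   # offset -> times handed out
        self.done = []         # offsets acknowledged as processed
        self.dead = []         # offsets that exhausted their deliveries
        self.max_deliveries = max_deliveries

    def acquire(self, consumer):
        """Hand the next available record to any consumer, or None."""
        if not self.available:
            return None
        offset, value = self.available.popleft()
        self.in_flight[offset] = (consumer, value)
        self.deliveries[offset] = self.deliveries.get(offset, 0) + 1
        return offset, value

    def accept(self, offset):
        """Acknowledge one record as processed."""
        del self.in_flight[offset]
        self.done.append(offset)

    def release(self, offset):
        """Give one record back for redelivery, up to the delivery limit."""
        _, value = self.in_flight.pop(offset)
        if self.deliveries[offset] >= self.max_deliveries:
            self.dead.append(offset)
        else:
            self.available.appendleft((offset, value))


group = ToyShareGroup(["call-a", "call-b", "call-c"])
off_a, _ = group.acquire("worker-1")   # worker-1 gets call-a
off_b, _ = group.acquire("worker-2")   # worker-2 gets call-b concurrently
group.release(off_a)                   # worker-1 fails; call-a is available again
group.accept(off_b)                    # worker-2 succeeds
off_retry, value_retry = group.acquire("worker-2")  # call-a is redelivered
```

Two workers hold adjacent records at the same time, and a failed record returns to the pool without blocking the ones behind it. A classic consumer group cannot do this within a single partition, which is why the pattern suits unordered work.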
When are ELRs GA? Preview in 4.0, expected GA in a 4.x release. Until then, treat them as a safety preview.
How does CallSphere price this? Kafka costs land in our infra, not customer pricing — see /pricing for $149/$499/$1499 plans.
Can I demo a Kafka-fed analytics dashboard? Yes — book a demo.
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.