AI Infrastructure · 12 min read

Kafka 4.0 for AI Call Analytics Fan-Out: KRaft, Share Groups, Eligible Leader Replicas

Kafka 4.0 ditches ZooKeeper, ships share groups (KIP-932), and previews Eligible Leader Replicas. Here's how we fan out 50k AI call events per minute to analytics, billing, CRM, and ML training.

TL;DR — Kafka 4.0 is the first major release with zero ZooKeeper (KRaft default), KIP-848 next-gen consumer rebalance GA, KIP-932 share groups (queue semantics on a log), and KIP-966 Eligible Leader Replicas in preview. For AI call analytics, this is the difference between a 30-second rebalance stall and an invisible reassignment.

The pattern

When an AI voice agent finishes a call, ten downstream systems want to know: billing, CRM, the embedding pipeline, the QA dashboard, the recording archiver, the ML training set, the affiliate attribution job, the SLA monitor, the fraud detector, and the founder's Slack. Kafka is the canonical fan-out backbone: produce once, consume independently, replay on demand.

Kafka 4.0 (released March 2025, mainline through 2026) closed the last operator-pain gaps: KRaft is the default and only mode, KIP-848 makes consumer rebalances incremental rather than stop-the-world, and ELR (KIP-966) tracks replicas that still hold all committed data even after falling out of the ISR, so they remain safe to elect as leader when the ISR shrinks.

Hear it before you finish reading

Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.

Try Live Demo →

How it works (architecture)

flowchart LR
  Voice[Voice agent] -->|produce call.completed| K[(Kafka 4.0<br/>KRaft cluster<br/>3 brokers)]
  K --> CG1[Consumer group: billing]
  K --> CG2[Consumer group: crm-sync]
  K --> CG3[Consumer group: embeddings]
  K --> CG4[Consumer group: qa-dashboard]
  K --> CG5[Share group: ml-training]
  CG3 --> Vec[(Vector DB)]
  CG2 --> CRM[(HubSpot/Salesforce)]
  CG5 --> Train[Training pipeline]

Each consumer group reads independently with its own offset. Share groups (KIP-932) layer queue semantics on top of the log so workloads that don't want partition-pinned ordering get cooperative consumption.
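To make the independent-offset point concrete, here is a minimal sketch (group IDs match the diagram; the broker address is this post's cluster, and the function names are ours) of two groups reading the same topic:

```python
def group_config(group_id, bootstrap="kafka1:9092"):
    # Every consumer group tracks its own committed offsets on
    # call.completed, so billing and embeddings each see every event
    # and can replay independently by resetting only their own offsets.
    return {
        "bootstrap.servers": bootstrap,
        "group.id": group_id,
        "auto.offset.reset": "earliest",
    }

def run_group(group_id):
    # Requires confluent-kafka; call this from each service's entrypoint.
    from confluent_kafka import Consumer
    c = Consumer(group_config(group_id))
    c.subscribe(["call.completed"])
    return c
```

Resetting the embeddings group's offsets to re-vectorize history touches nothing in billing or crm-sync; that isolation is the whole point of the fan-out.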

CallSphere implementation

CallSphere fans out call.completed events from all 6 verticals (Real Estate, Healthcare, IT Services, Salon, After-hours, Sales) into a single Kafka topic with 24 partitions. The Real Estate OneRoof pod uses NATS internally for the agent bus, then the orchestrator emits a final call.completed event into Kafka for analytics. Other verticals follow the same pattern. Pricing $149/$499/$1499; 14-day trial, 22% affiliate. 37 agents · 90+ tools · 115+ DB tables.

Build steps with code

  1. Bootstrap KRaft cluster: bin/kafka-storage.sh format -t $UUID -c kraft.properties.
  2. Create the topic with proper retention: bin/kafka-topics.sh --create --topic call.completed --partitions 24 --replication-factor 3 --config retention.ms=604800000.
  3. Set min.insync.replicas=2 and acks=all on producers for ELR safety.
  4. Use the new consumer group protocol: group.protocol=consumer (KIP-848).
  5. Schema-registry the payload (CloudEvents 1.0 envelope, see post #12).
  6. Wire a Kafka Connect sink for ClickHouse and S3 archival.
  7. Monitor under-replicated partitions and consumer lag in Grafana.
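Steps 2 and 3 can also be applied programmatically; a sketch using confluent-kafka's AdminClient (topic settings are the ones from the steps above, helper names are ours):

```python
TOPIC_CONFIG = {
    "retention.ms": "604800000",     # 7 days, as in step 2
    "min.insync.replicas": "2",      # step 3: pairs with acks=all producers
}

def retention_days(config):
    # Sanity-check helper: retention.ms expressed in days.
    return int(config["retention.ms"]) / 86_400_000

def create_call_topic(bootstrap="kafka1:9092"):
    # Requires confluent-kafka; equivalent to the kafka-topics.sh command above.
    from confluent_kafka.admin import AdminClient, NewTopic
    admin = AdminClient({"bootstrap.servers": bootstrap})
    topic = NewTopic(
        "call.completed",
        num_partitions=24,
        replication_factor=3,
        config=TOPIC_CONFIG,
    )
    futures = admin.create_topics([topic])
    futures["call.completed"].result()  # raises on failure
```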
from confluent_kafka import Producer, Consumer

p = Producer({
    "bootstrap.servers": "kafka1:9092,kafka2:9092,kafka3:9092",
    "acks": "all",                  # wait for all in-sync replicas
    "enable.idempotence": True,     # no duplicates on broker retry
    "compression.type": "zstd",
})

# call_id and cloudevent_envelope_json come from the orchestrator's
# call.completed event (CloudEvents 1.0 envelope, see post #12).
p.produce(
    topic="call.completed",
    key=call_id.encode(),           # same call -> same partition -> ordered
    value=cloudevent_envelope_json,
    headers=[("ce-type", b"com.callsphere.call.completed.v1")],
)
p.flush()

c = Consumer({
    "bootstrap.servers": "kafka1:9092",
    "group.id": "embeddings",
    "group.protocol": "consumer",   # KIP-848 next-gen rebalance
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,    # commit only after the upsert succeeds
})
c.subscribe(["call.completed"])
while True:
    msg = c.poll(1.0)
    if msg is None or msg.error():
        continue
    upsert_embedding(msg.value())   # idempotent write into the vector DB
    c.commit(message=msg)           # commits msg.offset() + 1
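For step 7, consumer lag per partition is the high watermark minus the group's committed offset; a sketch (function names are ours) that could feed a Grafana exporter:

```python
def partition_lag(committed, low, high):
    # committed is the group's committed offset, or a negative sentinel
    # (confluent-kafka uses OFFSET_INVALID) when nothing is committed yet,
    # in which case the whole retained range counts as lag.
    if committed < 0:
        return high - low
    return high - committed

def embeddings_lag(bootstrap="kafka1:9092", partitions=24):
    # Requires confluent-kafka; queries lag for the embeddings group.
    from confluent_kafka import Consumer, TopicPartition
    c = Consumer({"bootstrap.servers": bootstrap, "group.id": "embeddings"})
    tps = [TopicPartition("call.completed", p) for p in range(partitions)]
    lag = {}
    for tp in c.committed(tps, timeout=10):
        low, high = c.get_watermark_offsets(tp, timeout=10)
        lag[tp.partition] = partition_lag(tp.offset, low, high)
    c.close()
    return lag
```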

Common pitfalls

  • Partition count too low — 6 partitions caps you at 6 parallel consumers per group; pick partitions for your peak parallelism, not your current load.
  • acks=1 with enable.idempotence=false — risks silent data loss on broker failover.
  • One topic per event type explosion — group related events under call.* topics with header-based filtering downstream.
  • Skipping the new consumer protocol — old protocol still works but rebalances stall on every deploy.
  • Treating retention.ms as backup — Kafka is a buffer, not a database; tier to S3 with KIP-405.
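The header-based filtering mentioned above needs only a tiny predicate in each consumer; a sketch (the accepted type and ce-type header match this post's producer, function names are ours):

```python
ACCEPTED = {b"com.callsphere.call.completed.v1"}

def wanted(headers, accepted=ACCEPTED):
    # headers comes from msg.headers(): a list of (str, bytes) tuples,
    # or None when the producer set no headers at all.
    return dict(headers or []).get("ce-type") in accepted

# In the poll loop: skip events of other types sharing a call.* topic.
# if msg is not None and not msg.error() and wanted(msg.headers()):
#     handle(msg)
```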

FAQ

Do we still need ZooKeeper? No. Kafka 4.0 runs only in KRaft mode. To migrate an existing ZooKeeper cluster, complete the KRaft migration on a 3.x bridge release first, then upgrade to 4.0.

Share groups vs consumer groups? Share groups (KIP-932, early access in 4.0) are queue-like: multiple consumers acknowledge individual messages, with no partition pinning. Use them for unordered work like ML inference.

Still reading? Stop comparing — try CallSphere live.

CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.

When are ELRs GA? Preview in 4.0, expected GA in a 4.x release. Until then, treat them as a safety preview.

How does CallSphere price this? Kafka costs land in our infra, not customer pricing — see /pricing for $149/$499/$1499 plans.

Can I demo a Kafka-fed analytics dashboard? Yes — book a demo.


Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available -- no signup required.
