TL;DR — Helm 4 finally fixes three-way merge hell with server-side apply and uses kstatus for accurate readiness. Author your voice-agent chart around a library subchart, never use random naming, and ship the LiveKit + agent + worker triple as one release.

What you'll set up

A Helm 4 chart voice-agent with three Deployments (livekit, agent, worker), a HorizontalPodAutoscaler, a PodDisruptionBudget, and a values schema strict enough that bad merges fail helm install instead of producing weird pods.

Architecture

```mermaid
flowchart TD
  VAL[values.yaml] --> CHART[voice-agent chart]
  CHART --> LK[LiveKit Deployment]
  CHART --> AG[agent Deployment]
  CHART --> WK[worker Deployment]
  CHART --> HPA[HPA]
  CHART --> PDB[PDB]
  LIB[lib subchart] -.->|tpl helpers| CHART
  HELM[Helm 4 SSA] --> APIS[K8s API server]
```

Step 1 — Bootstrap the chart and library

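```bash
helm create voice-agent
helm create voice-agent/charts/voice-lib
# helm create has no library option; flip the chart type by hand (GNU sed syntax)
sed -i 's/^type: application/type: library/' voice-agent/charts/voice-lib/Chart.yaml
```

A library chart contributes only named templates and its own manifests are never rendered, so delete the scaffolded manifests under voice-lib/templates and keep _helpers.tpl. Move repeated helpers (labels, env injection, fully qualified image name) into voice-lib/templates/_helpers.tpl. Subcharts in Helm 4 still work the same way; you just stop copy-pasting include "common.labels" between charts.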
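As a rough sketch of what marks the subchart as a library, its Chart.yaml might look like the following. The description and version are placeholder assumptions; the `type: library` line is the part that matters.

```yaml
# voice-agent/charts/voice-lib/Chart.yaml
apiVersion: v2
name: voice-lib
description: Shared named templates for the voice-agent chart  # placeholder text
type: library      # library charts cannot be installed on their own
version: 0.1.0     # placeholder version
```

With this type set, the parent chart can call the library's named templates through include, and the library cannot be installed as a release by accident.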

Step 2 — Lock the values schema

```yaml
# voice-agent/values.schema.json (excerpt)
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["image", "openai", "livekit"],
  "properties": {
    "image": {
      "type": "object",
      "required": ["repository", "tag"],
      "properties": {
        "repository": { "type": "string" },
        "tag": { "type": "string", "minLength": 1 },
        "pullPolicy": { "enum": ["Always", "IfNotPresent", "Never"] }
      }
    },
    "openai": {
      "type": "object",
      "required": ["model"],
      "properties": {
        "model": { "enum": ["gpt-realtime", "gpt-realtime-mini"] },
        "voice": { "enum": ["alloy", "echo", "shimmer", "marin"] }
      }
    }
  }
}
```

helm install now refuses values where image.tag is missing or empty, a common route to pods stuck in ImagePullBackOff or silently running latest.
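For orientation, here is a minimal values.yaml that should pass the schema excerpt above. The registry path, tag, replica counts, and resource numbers are illustrative assumptions, and the agent keys are inferred from the templates in the later steps rather than taken from a published values file.

```yaml
# voice-agent/values.yaml (illustrative defaults, not the author's actual file)
image:
  repository: ghcr.io/acme/voice-agent   # hypothetical registry path
  tag: "a1b2c3d"                          # quoted so YAML keeps it a string
  pullPolicy: IfNotPresent
openai:
  model: gpt-realtime
  voice: marin
livekit: {}   # required by the schema; its fields are not shown in the excerpt
agent:
  replicas: 2
  minReplicas: 2
  maxReplicas: 10
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      memory: 1Gi
```

Setting openai.model to a value outside the enum, or clearing image.tag, should make helm install or helm lint fail validation before anything reaches the cluster.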

Step 3 — Author the agent Deployment template

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "voice-agent.fullname" . }}-agent
  labels:
    {{- include "voice-agent.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.agent.replicas }}
  selector:
    matchLabels:
      {{- include "voice-agent.selectorLabels" . | nindent 6 }}
      app.kubernetes.io/component: agent
  template:
    metadata:
      labels:
        {{- include "voice-agent.selectorLabels" . | nindent 8 }}
        app.kubernetes.io/component: agent
    spec:
      containers:
        - name: agent
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          env:
            - name: OPENAI_API_KEY
              valueFrom: { secretKeyRef: { name: voice-secrets, key: openai } }
            - name: LIVEKIT_URL
              value: "ws://{{ include "voice-agent.fullname" . }}-livekit:7880"
          readinessProbe:
            httpGet: { path: /healthz/realtime, port: 8080 }
            initialDelaySeconds: 5
            periodSeconds: 10
          resources:
            {{- toYaml .Values.agent.resources | nindent 12 }}
```

/healthz/realtime should actually attempt a Realtime handshake — readiness on a TCP port is meaningless when the OpenAI key has rotated.
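A probe that performs a real handshake takes longer than a TCP check, and Kubernetes probes time out after 1 second by default. A possible tuning of the readiness probe is sketched below; the timeout and threshold values are assumptions to adjust against your own handshake latency.

```yaml
# Fragment of the agent container spec; numbers are illustrative
readinessProbe:
  httpGet:
    path: /healthz/realtime
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 5      # the 1s default is tight for an outbound handshake
  failureThreshold: 3    # tolerate brief upstream blips before marking the pod unready
```

If the handshake is expensive, the endpoint could cache its last result for a few seconds so that the probe interval does not translate directly into upstream requests.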

Step 4 — HPA + PDB for graceful scale

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "voice-agent.fullname" . }}-agent
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "voice-agent.fullname" . }}-agent
  minReplicas: {{ .Values.agent.minReplicas }}
  maxReplicas: {{ .Values.agent.maxReplicas }}
  metrics:
    # Pods metrics need a custom metrics adapter that exposes active_calls per pod
    - type: Pods
      pods:
        metric:
          name: active_calls
        target:
          type: AverageValue
          averageValue: "5"
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: {{ include "voice-agent.fullname" . }}-agent
spec:
  minAvailable: 1
  selector:
    matchLabels:
      # Match the Deployment's selector so other releases' agent pods are not counted
      {{- include "voice-agent.selectorLabels" . | nindent 6 }}
      app.kubernetes.io/component: agent
```

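A PDB is non-negotiable for voice: without one, a node drain can evict every agent pod at once. The PDB only limits how many pods are evicted together, so in-flight calls on an evicted pod still depend on a termination grace period long enough to finish them.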
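To give calls on a draining pod time to finish, the agent pod spec can carry a longer grace period and a preStop hook. This is a sketch under stated assumptions: the 600 second grace period and the 15 second sleep are guesses, and the agent process is assumed to stop accepting new calls when it receives SIGTERM.

```yaml
# Fragment to merge into the agent Deployment from Step 3; values are illustrative
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 600   # upper bound on how long a call may keep running
      containers:
        - name: agent
          lifecycle:
            preStop:
              exec:
                # Short pause so the pod leaves Service endpoints before SIGTERM arrives
                command: ["/bin/sh", "-c", "sleep 15"]
```

The grace period is a ceiling, not a delay: a pod that finishes its calls and exits early terminates immediately.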

Step 5 — Install with server-side apply (Helm 4)

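```bash
helm install voice ./voice-agent \
  --namespace voice --create-namespace \
  --set-string image.tag="$(git rev-parse --short HEAD)" \
  --server-side=true
```

Server-side apply is already the default for fresh installs in Helm 4; --server-side=true only makes the choice explicit. The API server performs the merge and tracks field ownership, which avoids the client-side three-way-merge conflicts with controllers and manual edits. Immutable fields stay immutable, so a refactor that changes one (a Deployment selector, for example) still needs a delete and recreate. --set-string keeps an all-digit short SHA from being parsed as a number and failing the schema's string check.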

Step 6 — Use kstatus for accurate readiness

```bash
helm status voice -n voice -o json | jq '.info.status'
```

This prints the release status (for example "deployed"). The kstatus states apply per resource: with --wait, Helm 4 treats a resource as InProgress while it is still reconciling and as Current only once kstatus reports that it has reached steady state, so helm install --wait finally means what users always thought it meant.

Step 7 — Package, sign, and push to OCI

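```bash
# --version overrides the version in Chart.yaml so the archive name matches the push
helm package voice-agent --version 1.0.0
helm push voice-agent-1.0.0.tgz oci://ghcr.io/acme/charts
cosign sign ghcr.io/acme/charts/voice-agent:1.0.0
```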

Helm OCI is GA; cosign signs the chart manifest just like any container.
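Once the chart is in the registry, another chart can pull it as a dependency. The umbrella chart below is hypothetical (its name and version are made up for illustration); only the repository path and chart version come from the push command above.

```yaml
# Chart.yaml of a hypothetical umbrella chart that consumes the pushed chart
apiVersion: v2
name: voice-platform        # hypothetical name
type: application
version: 0.1.0
dependencies:
  - name: voice-agent
    version: 1.0.0
    repository: oci://ghcr.io/acme/charts
```

Running helm dependency update in the umbrella chart should then fetch voice-agent 1.0.0 from the registry into its charts/ directory.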

Pitfalls

fullname collisions — two releases of the same chart in one namespace will fight over Service names. Always use {{ .Release.Name }} somewhere in fullname.
Persistent volumes in voice charts: unless you absolutely need them (you don't, for a stateless realtime agent), don't ship them. They complicate drains and rescheduling, since a pod bound to a zonal or node-local volume can only restart where that volume lives.
Library subchart versions — Helm caches resolved deps; bumping the library doesn't re-render unless you helm dependency update.
--wait without --wait-for-jobs doesn't block on Job completion. Migrations may not have run when helm install returns 0.
values.schema.json drift: if you add a value but forget to add it to the schema, helm install accepts garbage for it. Run helm lint in CI, which validates the chart's values against the schema.
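One way to wire the lint step into CI is sketched below as a GitHub Actions workflow. GitHub Actions is an assumption here (any CI runner works), the file name is hypothetical, and the image.tag override is a dummy value that exists only to satisfy the schema.

```yaml
# .github/workflows/chart-lint.yaml (hypothetical file name)
name: chart-lint
on:
  pull_request:
    paths:
      - "voice-agent/**"
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/setup-helm@v4
      - name: Lint chart and validate values against the schema
        run: helm lint ./voice-agent --strict --set-string image.tag=ci
      - name: Render templates to catch errors that lint misses
        run: helm template voice ./voice-agent --set-string image.tag=ci > /dev/null
```

Rendering with helm template in the same job also exercises the library helpers, so a broken include fails the pull request instead of a deploy.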

How CallSphere does this in production

CallSphere ships one Helm chart, callsphere-voice, with subcharts per vertical (healthcare, salon, behavioral-health, multi-family, contractors, dental). Each subchart enables or disables the right tools out of our 90+ tool catalog. Helm 4 server-side apply has cut our drift incidents in half. The platform runs 37 agents and 115+ DB tables on k3s with Cloudflare Tunnel.

FAQ

Q: Helm 4 vs Kustomize for AI agents? Helm for templated multi-tenancy (different vertical = different values). Kustomize for environment-only deltas. Many teams use both.
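As an illustration of "different vertical = different values", a per-vertical overlay might look like this. The file name and every value in it are hypothetical; the keys reuse the ones from the schema and templates in the steps above.

```yaml
# values-salon.yaml (hypothetical per-vertical overlay)
openai:
  model: gpt-realtime-mini   # smaller model for a lower-volume vertical
  voice: shimmer
agent:
  minReplicas: 1
  maxReplicas: 4
```

Passing it after the base file, as in helm install -f values.yaml -f values-salon.yaml, lets the later file override only the keys it sets.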

Q: How do I roll back a bad chart? helm rollback voice <revision> — but always test rollback in staging; a chart that adds an immutable field can't roll back without manual deletes.

Q: Should LiveKit be in the same chart? Yes if you self-host. Co-located = lowest WebRTC latency; separation pays off only at multi-tenant scale.

Q: What about MCP servers? Treat each MCP server as another Deployment in the chart with its own probe and PDB. The agent talks to them via in-cluster Service.

Helm Chart for AI Voice Agents: Helm 4 + Server-Side Apply (2026)
