By Sagar Shankaran, Founder of CallSphere
Order schema migrations, model warm-up, and traffic cutover for an AI voice agent using ArgoCD sync waves and ApplicationSet progressive syncs. Real YAML and gotchas.
Key takeaways
TL;DR — Sync waves order resources within an Application; progressive syncs order Applications within an ApplicationSet. AI agents need both: DB migration first, model preload second, traffic cutover last.
This walkthrough builds an ArgoCD ApplicationSet that fans out the same voice-agent chart to three k3s clusters (dev → staging → prod) with progressive sync, plus a single Application that uses sync waves so that Postgres migrations run before the agent Deployment, and the Deployment becomes Ready before a PostSync warm-up Job hits the model.
```mermaid
flowchart TD
  GIT[deploy repo] --> APPSET[ApplicationSet progressive]
  APPSET --> DEV[App dev]
  APPSET --> STG[App staging]
  APPSET --> PRD[App prod]
  PRD --> W0[Wave 0: PG migrations]
  W0 --> W1[Wave 1: Secrets + ConfigMap]
  W1 --> W2[Wave 2: Deployment]
  W2 --> W3[Wave 3: Service + Ingress]
  W3 --> W4[Wave 4: PostSync warm-up Job]
```
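```bash
# The key lives in argocd-cmd-params-cm, which the argo-cd chart fills from configs.params.
# The dots inside the key are escaped so Helm treats it as a single key.
helm install argocd argo/argo-cd -n argocd --create-namespace \
  --set-string 'configs.params.applicationsetcontroller\.enable\.progressive\.syncs=true'
```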
This flag is off by default; without it, the strategy: block on ApplicationSet is silently ignored.
```yaml
# Wave 0: run Alembic migrations before anything else is applied.
apiVersion: batch/v1
kind: Job
metadata:
  name: pg-migrate
  annotations:
    argocd.argoproj.io/sync-wave: "0"
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: ghcr.io/acme/voice-agent:${IMAGE_TAG}
          command: ["alembic", "upgrade", "head"]
---
# Wave 4: warm the model once the Deployment, Service, and Ingress are in place.
apiVersion: batch/v1
kind: Job
metadata:
  name: warmup
  annotations:
    argocd.argoproj.io/sync-wave: "4"
    argocd.argoproj.io/hook: PostSync
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: warm
          image: ghcr.io/acme/voice-agent:${IMAGE_TAG}
          command: ["python", "scripts/warmup.py"]
```
ArgoCD waits for every resource in wave N to be Healthy before starting wave N+1 — so the Deployment never starts until migration succeeds, and Service traffic only lights up after the Deployment is Ready.
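
The ordering rule above can be modelled in a few lines. This is a simplified sketch, not ArgoCD's implementation: it assumes each manifest carries at most one hook phase out of PreSync, Sync, and PostSync, and it ignores the per-kind ordering ArgoCD applies inside a wave. The resource names in the example (voice-config, voice-agent, voice-svc) are hypothetical.

```python
from collections import defaultdict

WAVE_KEY = "argocd.argoproj.io/sync-wave"
HOOK_KEY = "argocd.argoproj.io/hook"
# Simplification: only these three phases are modelled.
PHASES = ["PreSync", "Sync", "PostSync"]


def sync_plan(manifests):
    """Return [(phase, wave, [names])] in the order the groups would be applied."""
    buckets = defaultdict(list)
    for manifest in manifests:
        meta = manifest.get("metadata", {})
        annotations = meta.get("annotations") or {}
        phase = annotations.get(HOOK_KEY, "Sync")    # no hook annotation -> Sync phase
        wave = int(annotations.get(WAVE_KEY, "0"))   # missing wave defaults to 0
        buckets[(PHASES.index(phase), wave)].append(meta["name"])
    # Sort by phase first, then by wave; each group must be Healthy before the next.
    return [(PHASES[p], w, sorted(names)) for (p, w), names in sorted(buckets.items())]


def _manifest(name, wave, hook=None):
    """Build a minimal manifest dict for the example."""
    annotations = {WAVE_KEY: str(wave)}
    if hook:
        annotations[HOOK_KEY] = hook
    return {"metadata": {"name": name, "annotations": annotations}}


example = [
    _manifest("warmup", 4, "PostSync"),
    _manifest("voice-svc", 3),
    _manifest("voice-agent", 2),
    _manifest("voice-config", 1),
    _manifest("pg-migrate", 0, "PreSync"),
]

if __name__ == "__main__":
    for phase, wave, names in sync_plan(example):
        print(f"{phase:8} wave {wave}: {', '.join(names)}")
```

Whatever order the manifests appear in Git, the plan comes out with the migration first and the warm-up last, which is the property the annotations are there to guarantee.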
```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: voice-agent
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          - cluster: dev
          - cluster: staging
          - cluster: prod
  strategy:
    type: RollingSync
    rollingSync:
      steps:
        # Each step selects generated Applications by their labels.
        - matchExpressions: [{ key: cluster, operator: In, values: [dev] }]
        - matchExpressions: [{ key: cluster, operator: In, values: [staging] }]
          maxUpdate: "100%"
        - matchExpressions: [{ key: cluster, operator: In, values: [prod] }]
          maxUpdate: "20%"
  template:
    metadata:
      name: 'voice-{{cluster}}'
      # Without this label the matchExpressions above select nothing.
      labels:
        cluster: '{{cluster}}'
    spec:
      project: default
      source:
        repoURL: https://gitlab.com/acme/deploy.git
        targetRevision: main
        path: voice
        helm:
          valueFiles: ['values.yaml', 'values-{{cluster}}.yaml']
      destination:
        # In-cluster API server; point this at each registered cluster to fan out.
        server: https://kubernetes.default.svc
        namespace: voice
      syncPolicy:
        automated: { prune: true, selfHeal: true }
```
maxUpdate: 20% on prod means a 5-cluster fleet rolls one cluster at a time and waits for each to go Healthy — bad model versions get caught on cluster #1, not #5.
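
The arithmetic behind that claim, as a small sketch. It assumes the rounding rule given in the ApplicationSet documentation (percentages round down, with a floor of one Application for any nonzero percentage); the function name max_update_batch is hypothetical and not part of ArgoCD.

```python
def max_update_batch(max_update, matched_apps):
    """How many Applications a RollingSync step may update at once.

    max_update is either an int or a percentage string such as "20%".
    Assumed rule: percentages round down, but never below 1 when nonzero.
    """
    if isinstance(max_update, str) and max_update.endswith("%"):
        pct = int(max_update[:-1])
        if pct <= 0:
            return 0
        batch = max(1, matched_apps * pct // 100)  # integer math, rounds down
    else:
        batch = int(max_update)
    # A step can never update more Applications than it matched.
    return min(batch, matched_apps)


if __name__ == "__main__":
    for apps in (1, 5, 10):
        print(apps, "prod apps ->", max_update_batch("20%", apps), "at a time")
```

With five prod Applications and maxUpdate: "20%" the batch size is one, so a bad release stops at the first cluster. With a single prod Application, as in the list generator above, the floor of one means the setting has no practical effect until the fleet grows.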
ArgoCD's default health for a Deployment is "Ready replicas == desired". For an AI agent, that's not enough — the model could 500 on every request. Add a Lua health check:
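```yaml
# Fragment of the argocd-cm ConfigMap.
data:
  resource.customizations.health.apps_Deployment: |
    -- This overrides the health check for every apps/Deployment ArgoCD manages.
    hs = {}
    -- Guard against Deployments that have no annotations at all.
    local annotations = obj.metadata.annotations or {}
    if annotations["voice.agent/health"] == "ok" then
      hs.status = "Healthy"
      hs.message = "Voice probe ok"
    else
      hs.status = "Progressing"
      hs.message = "Awaiting voice probe"
    end
    return hs
```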
Have a separate cron Job that pings /realtime and PATCHes the annotation. ArgoCD now refuses to roll waves forward until a real WebRTC handshake works.
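
A minimal sketch of what that probe Job could run. Everything named here is an assumption for illustration: the Service URL, the Deployment name voice-agent, the voice namespace, and the "failing" annotation value. It also only checks that /realtime answers with a 2xx over HTTP; a real WebRTC handshake would need a proper client. It shells out to kubectl patch, so the Job's ServiceAccount needs permission to patch Deployments.

```python
import json
import subprocess
import urllib.request

ANNOTATION = "voice.agent/health"  # the key the Lua health check reads


def build_patch(ok):
    """JSON merge patch that sets the health annotation on a Deployment."""
    value = "ok" if ok else "failing"  # "failing" is an arbitrary non-ok value
    return {"metadata": {"annotations": {ANNOTATION: value}}}


def probe(url, timeout=5.0):
    """True if the endpoint answers with a 2xx status, False on network/HTTP errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # URLError and HTTPError are both OSError subclasses
        return False


def patch_deployment(name, namespace, ok):
    """Apply the annotation with kubectl; raises if kubectl exits nonzero."""
    subprocess.run(
        ["kubectl", "patch", "deployment", name, "-n", namespace,
         "--type", "merge", "-p", json.dumps(build_patch(ok))],
        check=True,
    )


if __name__ == "__main__":
    # Hypothetical in-cluster Service URL for the agent.
    healthy = probe("http://voice-agent.voice.svc/realtime")
    patch_deployment("voice-agent", "voice", healthy)
```

Writing a non-ok value on failure, rather than leaving the old annotation in place, matters: otherwise a Deployment that was healthy yesterday keeps reporting Healthy after the model starts failing.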
```yaml
# Fragment of the argocd-notifications-cm ConfigMap.
data:
  trigger.on-degraded: |
    - when: app.status.health.status == 'Degraded'
      send: [slack-degraded]
  template.slack-degraded: |
    message: |
      :rotating_light: {{.app.metadata.name}} degraded on {{.app.spec.destination.name}}
      Sync: {{.app.status.sync.revision}}
```
```bash
# List the deployment history first: argocd app history voice-prod
argocd app rollback voice-prod --revision
```
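Or, in pure GitOps style, git revert the deploy repo; ArgoCD picks it up on its next poll (every 3 minutes by default), or within seconds if a Git webhook is configured. Note that argocd app rollback is rejected while automated sync is enabled on the Application.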
automated: selfHeal: true in prod can re-create resources you intentionally deleted. Disable it on prod ApplicationSets and leave it on in dev.
If the progressive syncs flag is not enabled, the strategy block is parsed but ignored. Verify with kubectl logs -n argocd deploy/argocd-applicationset-controller | grep progressive.
CallSphere runs ArgoCD on each tenant's k3s for the behavioral-health and healthcare HIPAA SKUs. We use sync waves to migrate 115+ Postgres tables before any of the 37 voice agents come up, and progressive sync to cut over from one model version to the next 20% at a time across our edge fleet. The probe behind the Lua health check exercises the actual OpenAI Realtime endpoint per agent, so a partially broken release fails the wave instead of silently degrading.
Q: Sync waves vs Argo Rollouts?
Sync waves order resources within an Application; Argo Rollouts orders traffic within a Deployment (canary/blue-green). Use both.
Q: How do I gate prod on staging soak time?
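Define a sync window on the prod AppProject that blocks automated syncs but allows manual ones, so prod is only synced after staging has soaked, or use a CronJob that PATCHes syncPolicy: { automated: null } after hours. Note that the argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true annotation does not delay anything; it only skips the dry run for resources whose CRD is not installed yet.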
Q: Can ApplicationSets target external clusters?
Yes — register them with argocd cluster add. Each gets a Secret in argocd ns; the controller loads it as a destination.
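
For reference, the shape of that Secret, as a sketch built in Python. The label and the name, server, and config fields follow ArgoCD's declarative cluster format; the function name, the cluster-<name> Secret name, and the example values are hypothetical, and bearer-token authentication is only one of the supported options.

```python
import json


def cluster_secret(name, server, bearer_token, ca_data_b64):
    """Build an ArgoCD cluster Secret manifest as a plain dict."""
    config = {
        "bearerToken": bearer_token,
        "tlsClientConfig": {"insecure": False, "caData": ca_data_b64},
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {
            "name": f"cluster-{name}",  # hypothetical naming scheme
            "namespace": "argocd",
            # This label is what makes ArgoCD treat the Secret as a cluster.
            "labels": {"argocd.argoproj.io/secret-type": "cluster"},
        },
        "type": "Opaque",
        "stringData": {
            "name": name,                  # usable as destination.name
            "server": server,              # usable as destination.server
            "config": json.dumps(config),  # connection settings as a JSON string
        },
    }


if __name__ == "__main__":
    secret = cluster_secret("staging", "https://staging.example.internal:6443",
                            "<token>", "<base64 CA bundle>")
    print(json.dumps(secret, indent=2))
```

Because the Secret carries live credentials, it belongs in the same secret-management path as the rest of the deployment rather than in the Git repo as plain YAML.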
Q: What about secrets?
Pair ArgoCD with External Secrets Operator or Sealed Secrets. Never commit plain Secrets, even with sync waves.
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.