AI Engineering

GitLab CI/CD for AI Voice Deployments: Runner, Agent, GitOps (2026)

Wire GitLab Runner on Kubernetes, the GitLab Agent for cluster sync, and a Flux-driven CD path for an AI voice agent. Real .gitlab-ci.yml plus runner Helm values.

TL;DR — Install GitLab Runner with the Kubernetes executor (so each job is a pod), bind a GitLab Agent for cluster context, and let Flux watch a separate deploy repo for image bumps. Three moving parts, one auditable trail.

What you'll set up

A GitLab project for an AI voice agent that builds + tests in CI, pushes a signed image to the GitLab container registry, and triggers a Flux reconcile in a paired deploy repo. The runner itself runs on the same k3s where the voice agent runs — no public IP needed for the cluster.

Architecture

```mermaid
flowchart LR
  DEV[Push to GitLab] --> CI[GitLab CI]
  CI --> RUNNER[K8s Runner pods]
  RUNNER --> REG[GitLab Registry]
  CI -->|bot commit| DEPLOY[(deploy repo)]
  DEPLOY --> FLUX[Flux on k3s]
  FLUX --> AGENT[voice-agent Deployment]
  AGENT --> LK[LiveKit + OpenAI Realtime]
```

Step 1 — Install GitLab Runner with the Kubernetes executor

```yaml
# values.yaml
gitlabUrl: https://gitlab.com/
runnerToken: ${RUNNER_TOKEN}
runners:
  config: |
    [[runners]]
      name = "k3s-voice"
      executor = "kubernetes"
      [runners.kubernetes]
        namespace = "gitlab-runner"
        cpu_request = "200m"
        memory_request = "512Mi"
        helper_image = "gitlab/gitlab-runner-helper:latest"
        privileged = true  # needed for buildx in DinD
```

```bash
helm upgrade --install gitlab-runner -n gitlab-runner --create-namespace \
  -f values.yaml gitlab/gitlab-runner
```

Each CI job becomes a pod. Builds finish, pod evicts, no orphan state — the cleanest CI primitive on Kubernetes.
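If the runner shares nodes with the voice agent, noisy CI builds can contend with real-time audio. One option is pinning job pods to dedicated nodes — a sketch; the `workload=ci` label is an assumption, not part of the setup above:

```yaml
# values.yaml (addition) — schedule CI job pods onto labeled nodes only
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        [runners.kubernetes.node_selector]
          # hypothetical label; apply with: kubectl label node <node> workload=ci
          "workload" = "ci"
```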

Step 2 — Wire the GitLab Agent (KAS) for declarative cluster context

The GitLab Agent gives CI jobs scoped KUBECONFIG without storing service-account tokens in CI variables.


```yaml
# .gitlab/agents/voice/config.yaml
ci_access:
  projects:
    - id: acme/voice-agent
```

In CI, a job selects the agent's context — named `<agent-project-path>:<agent-name>`, e.g. `kubectl config use-context acme/voice-agent:voice` — and GitLab injects a $KUBECONFIG scoped to that agent only.
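A minimal job that exercises the agent context — a sketch assuming the agent is registered in the acme/voice-agent project and the workload runs as a voice-agent Deployment in a voice namespace:

```yaml
# contexts granted via ci_access are named <agent-project-path>:<agent-name>
smoke-check:
  stage: deploy
  image: bitnami/kubectl:latest
  script:
    - kubectl config use-context acme/voice-agent:voice
    - kubectl -n voice get deploy voice-agent   # namespace and name assumed
```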

Step 3 — Build, eval, and push from .gitlab-ci.yml

```yaml
stages: [test, eval, build, deploy]

variables:
  PYTHON_VERSION: "3.12"
  IMAGE: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

test:
  stage: test
  image: python:3.12-slim
  script:
    - pip install uv && uv sync --frozen
    - uv run pytest tests/unit

llm-eval:
  stage: eval
  image: python:3.12-slim
  variables:
    OPENAI_API_KEY: $OPENAI_API_KEY  # masked + protected
  script:
    - pip install uv && uv sync --frozen
    - uv run python evals/run.py --suite voice --threshold 0.92

build:
  stage: build
  image: docker:26
  services: [docker:26-dind]
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker buildx build --push --platform linux/amd64,linux/arm64 -t $IMAGE .
```

Step 4 — Bump the deploy repo from CI

```yaml
deploy:
  stage: deploy
  image: alpine/git:latest
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  script:
    - git config --global user.email "ci-bot@example.com"  # commit identity for the bot
    - git config --global user.name "ci-bot"
    # DEPLOY_TOKEN: a project access token with write_repository scope
    - git clone https://oauth2:${DEPLOY_TOKEN}@gitlab.com/acme/deploy.git
    - cd deploy
    - sed -i "s|image: .*|image: $IMAGE|" voice/values.yaml
    - git commit -am "voice: bump to $CI_COMMIT_SHA"
    - git push
```


The deploy repo is what Flux watches. CI never runs kubectl apply — the flow stays one-way GitOps.

Step 5 — Flux on the cluster picks it up

```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata: { name: deploy, namespace: flux-system }
spec:
  interval: 30s
  url: https://gitlab.com/acme/deploy.git
  ref: { branch: main }
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata: { name: voice, namespace: flux-system }
spec:
  interval: 1m
  path: ./voice
  sourceRef: { kind: GitRepository, name: deploy }
  prune: true
```

Push to main → CI bumps deploy repo → Flux reconciles in 30-60s → new pods rolling.
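If the 30-second poll is too slow, Flux's notification-controller can expose a webhook that GitLab calls on push — a sketch, assuming a webhook-token Secret already exists in flux-system:

```yaml
apiVersion: notification.toolkit.fluxcd.io/v1
kind: Receiver
metadata:
  name: deploy-webhook
  namespace: flux-system
spec:
  type: gitlab
  secretRef:
    name: webhook-token   # assumed Secret holding the GitLab webhook secret token
  resources:
    - kind: GitRepository
      name: deploy
```

Point a GitLab webhook at the URL the Receiver exposes, and reconciles fire on push instead of on poll.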

Step 6 — Add merge-request review apps

```yaml
review:
  stage: deploy
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  environment:
    name: review/$CI_MERGE_REQUEST_IID
    url: https://mr-$CI_MERGE_REQUEST_IID.preview.example.com
    on_stop: stop-review
  script:
    - >-
      helm upgrade --install voice-mr-$CI_MERGE_REQUEST_IID ./chart
      --set image.tag=$CI_COMMIT_SHA
      --set ingress.host=mr-$CI_MERGE_REQUEST_IID.preview.example.com
```

A real WebRTC URL per MR — QA can dial in and test prompt changes before merge.
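The on_stop hook above references a stop-review job that isn't shown. A minimal sketch, matching the Helm release naming in the review job:

```yaml
stop-review:
  stage: deploy
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: manual
  environment:
    name: review/$CI_MERGE_REQUEST_IID
    action: stop
  variables:
    GIT_STRATEGY: none   # no checkout needed to uninstall
  script:
    - helm uninstall voice-mr-$CI_MERGE_REQUEST_IID
```

GitLab triggers it automatically when the MR merges or closes.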

Pitfalls

  • DinD vs buildx with cache — DinD ephemeral storage kills layer cache. Use docker buildx create --driver kubernetes for persistent BuildKit pods.
  • Runner privileged: true — required for DinD but expands the blast radius. Pin the runner to a dedicated namespace with Pod Security Standards enforced.
  • Masked variables that aren't masked — values under 8 characters or containing newlines silently won't mask. Verify with a test job that echoes the variable and confirms the log shows [MASKED].
  • GitLab Agent connectivity — KAS needs WebSocket egress on port 443. Some egress firewalls drop long-lived WS; whitelist explicitly.
  • Review-app DNS races — wildcard cert + 30s DNS propagation = tests fail. Pre-warm DNS with a check job.

How CallSphere does this in production

CallSphere uses GitHub Actions for the public monorepo and a private GitLab instance for tenant-specific behavioral-health forks (HIPAA isolation). The pattern is identical: build, eval, sign, push to a private registry, bump a deploy repo. Flux on each tenant's k3s pulls only that tenant's image — 37 voice agents, 90+ tools, and 115+ DB tables running behind Cloudflare Tunnel.

FAQ

Q: Why a separate deploy repo? Auditability and blast radius. The deploy repo is the single source of truth for what's running; rolling back is a git revert.

Q: Can I skip Flux and just kubectl apply from CI? You can, but you lose drift detection. Flux re-applies if anything diverges; CI fires once and forgets.

Q: How do I run an LLM eval without leaking the API key to MR pipelines? Mark the variable as Protected — only main and protected tags get it. Forks in MRs run a stub eval.
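One way to express that fallback in the eval job — the --stub flag is hypothetical, standing in for whatever offline mode the eval harness provides:

```yaml
llm-eval:
  stage: eval
  image: python:3.12-slim
  script:
    - pip install uv && uv sync --frozen
    - |
      if [ -n "$OPENAI_API_KEY" ]; then
        uv run python evals/run.py --suite voice --threshold 0.92
      else
        uv run python evals/run.py --suite voice --stub   # hypothetical offline mode
      fi
```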

Q: GitLab vs GitHub for AI work? GitHub has more SLSA tooling (actions/attest-build-provenance); GitLab has tighter merge-request review apps. Pick on workflow fit, not features.

