---
title: "GitHub Actions Pipeline for AI Voice Agents: Build, Sign, Deploy (2026)"
description: "End-to-end GitHub Actions workflow for an OpenAI Realtime + LiveKit voice agent: matrix build, eval gate, cosign + SLSA provenance, and a kubectl rollout to k3s. Real YAML included."
canonical: https://callsphere.ai/blog/vw6h-github-actions-pipeline-ai-voice-agents-2026
category: "AI Engineering"
tags: ["GitHub Actions", "CI/CD", "Voice AI", "LiveKit", "Tutorial"]
author: "CallSphere Team"
published: 2026-03-15T00:00:00.000Z
updated: 2026-05-07T16:46:13.898Z
---

# GitHub Actions Pipeline for AI Voice Agents: Build, Sign, Deploy (2026)

> End-to-end GitHub Actions workflow for an OpenAI Realtime + LiveKit voice agent: matrix build, eval gate, cosign + SLSA provenance, and a kubectl rollout to k3s. Real YAML included.

> **TL;DR** — A solid voice-agent pipeline has five gates: lint, unit tests, an LLM eval suite, a signed container build with SLSA provenance, and a kubectl-driven progressive rollout. GitHub Actions does all five in one workflow with `actions/attest-build-provenance@v2`, `sigstore/cosign-installer`, and a self-hosted ARC runner.

## What you'll set up

A GitHub Actions workflow (`.github/workflows/voice-agent.yml`) that runs on every PR and on `main`: it lints, unit-tests, runs an OpenAI-Evals based regression suite, builds a multi-arch image, signs it with cosign keyless OIDC, attests SLSA build provenance, and deploys to a k3s cluster via Cloudflare Tunnel.

## Architecture

```mermaid
flowchart LR
  PR[PR push] --> LINT[lint + ruff + mypy]
  LINT --> UNIT[pytest unit]
  UNIT --> EVAL[LLM eval gate]
  EVAL --> BUILD[buildx multi-arch]
  BUILD --> SIGN[cosign keyless]
  SIGN --> PROV[SLSA provenance]
  PROV --> PUSH[ghcr.io push]
  PUSH --> ROLL[kubectl rollout]
  ROLL --> K3S[k3s edge cluster]
```

## Step 1 — Define the matrix and concurrency guards

```yaml
name: voice-agent
on:
  pull_request:
  push: { branches: [main] }
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
permissions:
  contents: read
  id-token: write   # OIDC for cosign keyless
  packages: write   # ghcr push
  attestations: write
jobs:
  test:
    runs-on: ubuntu-24.04
    strategy:
      matrix: { python: ["3.11", "3.12"] }
```

The `id-token: write` permission is non-negotiable for keyless cosign signing: without it, the runner never exposes the OIDC token request endpoint (`ACTIONS_ID_TOKEN_REQUEST_TOKEN` is unset), so cosign fails before it can even request an identity token.
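What keyless signing actually binds to is the set of claims inside that OIDC token. A sketch that decodes a fabricated token's payload the way a verifier would inspect it — all claim values here are made up, and in the real flow Fulcio verifies the token's signature against GitHub's issuer before minting a certificate:

```python
import base64
import json

def decode_jwt_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying its signature.

    Illustrative only: Fulcio verifies the signature against the
    GitHub OIDC issuer before trusting any of these claims.
    """
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# A fabricated payload shaped like GitHub's OIDC claims.
claims = {
    "iss": "https://token.actions.githubusercontent.com",
    "sub": "repo:acme/voice-agent:ref:refs/heads/main",
    "job_workflow_ref": "acme/voice-agent/.github/workflows/voice-agent.yml@refs/heads/main",
}
fake_token = ".".join([
    base64.urlsafe_b64encode(b'{"alg":"none"}').decode().rstrip("="),
    base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("="),
    "",  # empty signature segment
])
decoded = decode_jwt_claims(fake_token)
print(decoded["sub"])  # → repo:acme/voice-agent:ref:refs/heads/main
```

The `sub` and `job_workflow_ref` claims are what end up baked into the short-lived signing certificate, which is why the signature is tied to a specific repo and workflow rather than a long-lived key.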

## Step 2 — Lint, type-check, and run an LLM eval gate

```yaml
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v4
        with: { python-version: ${{ matrix.python }} }
      - run: uv sync --frozen
      - run: uv run ruff check .
      - run: uv run mypy src/
      - run: uv run pytest -q tests/unit
      - name: LLM eval regression
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          uv run python evals/run.py --suite voice-regression \
            --threshold 0.92 --max-cost-usd 1.50
```

The eval gate runs ~30 prompts against a frozen test set of `(input, expected_intent, expected_tool_calls)` tuples and fails the build if the pass rate drops below 92%. We cap spend at $1.50 per CI run via the `--max-cost-usd` flag in our eval harness; past that, the run aborts.
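The gate logic itself is small. A minimal sketch of the threshold-plus-cost-cap loop, where `EvalResult` and `run_gate` are illustrative stand-ins (not our harness's real API) and scored results stand in for real model calls:

```python
from dataclasses import dataclass

@dataclass
class EvalResult:
    passed: bool      # did this case match expected intent + tool calls?
    cost_usd: float   # API spend attributed to this case

def run_gate(results: list[EvalResult],
             threshold: float = 0.92,
             max_cost_usd: float = 1.50) -> bool:
    """Fail if the pass rate drops below threshold; abort hard if
    cumulative spend ever exceeds the cap mid-run."""
    spent = 0.0
    passed = 0
    for i, r in enumerate(results, start=1):
        spent += r.cost_usd
        if spent > max_cost_usd:
            raise RuntimeError(f"aborting: ${spent:.2f} spent after {i} cases")
        passed += r.passed
    return passed / len(results) >= threshold

# 29 of 30 cases pass → 96.7% pass rate, well under the spend cap.
results = [EvalResult(True, 0.03)] * 29 + [EvalResult(False, 0.03)]
print(run_gate(results))  # → True
```

The important design point is that the cost cap is checked inside the loop, not after it, so a runaway prompt template gets cut off mid-suite rather than billed in full.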

## Step 3 — Build, sign, and attest the image

```yaml
  build:
    needs: test
    runs-on: ubuntu-24.04
    outputs: { digest: ${{ steps.push.outputs.digest }} }
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - id: push
        uses: docker/build-push-action@v6
        with:
          push: true
          platforms: linux/amd64,linux/arm64
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - uses: sigstore/cosign-installer@v3
      - run: cosign sign --yes ghcr.io/${{ github.repository }}@${{ steps.push.outputs.digest }}
      - uses: actions/attest-build-provenance@v2
        with:
          subject-name: ghcr.io/${{ github.repository }}
          subject-digest: ${{ steps.push.outputs.digest }}
          push-to-registry: true
```

`actions/attest-build-provenance@v2` records the SLSA v1.0 provenance in the Sigstore transparency log and pushes it to the registry as an OCI 1.1 referrer artifact, so anyone can verify it with `cosign verify-attestation --type slsaprovenance ...`.
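Under the hood the attestation is just an in-toto statement. A sketch of the subject check a verifier performs, against a hand-written statement in the SLSA v1.0 shape — the repo name and digest below are made up:

```python
# A hand-written in-toto statement in the SLSA v1.0 provenance shape.
statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "predicateType": "https://slsa.dev/provenance/v1",
    "subject": [{
        "name": "ghcr.io/acme/voice-agent",
        "digest": {"sha256": "c0ffee" + "0" * 58},
    }],
    "predicate": {
        "buildDefinition": {
            "externalParameters": {
                "workflow": {"repository": "https://github.com/acme/voice-agent"}
            }
        }
    },
}

def subject_matches(stmt: dict, image: str, digest: str) -> bool:
    """True iff the provenance covers the exact image digest we deploy."""
    if stmt.get("predicateType") != "https://slsa.dev/provenance/v1":
        return False
    return any(
        s["name"] == image and s["digest"].get("sha256") == digest
        for s in stmt.get("subject", [])
    )

print(subject_matches(statement, "ghcr.io/acme/voice-agent",
                      "c0ffee" + "0" * 58))  # → True
```

Tools like cosign do this subject/digest comparison for you (after verifying the signature and log entry); the sketch only shows which fields the check reads.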

## Step 4 — Deploy to k3s via Cloudflare Tunnel

```yaml
  deploy:
    needs: build
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-24.04
    steps:
      - uses: cloudflare/cloudflared-action@v1
        with:
          tunnel-token: ${{ secrets.CF_TUNNEL_TOKEN }}
      - run: |
          echo "${{ secrets.K3S_KUBECONFIG_B64 }}" | base64 -d > kubeconfig
          export KUBECONFIG=$PWD/kubeconfig
          kubectl set image deploy/voice-agent \
            agent=ghcr.io/${{ github.repository }}@${{ needs.build.outputs.digest }} -n voice
          kubectl rollout status deploy/voice-agent -n voice --timeout=180s
```

Pinning by digest (not tag) means the image cosign signed is exactly the one running. `kubectl rollout status` blocks until the new ReplicaSet is healthy or fails, so the workflow turns red on bad rollouts.
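A deploy script can enforce the digest-pinning rule before it ever calls kubectl. A minimal sketch, with a hypothetical image reference:

```python
import re

# An image reference is digest-pinned iff it ends in "@sha256:<64 hex chars>".
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")

def assert_digest_pinned(image_ref: str) -> str:
    """Refuse mutable tags; only digest references are immutable,
    so only they guarantee we run exactly what cosign signed."""
    if not DIGEST_RE.search(image_ref):
        raise ValueError(f"not digest-pinned: {image_ref}")
    return image_ref

pinned = "ghcr.io/acme/voice-agent@sha256:" + "ab" * 32
print(assert_digest_pinned(pinned) == pinned)  # → True
```

Wiring this in as the first line of the deploy step turns an accidental `:latest` deploy into a loud CI failure instead of a silent drift.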

## Step 5 — Add a smoke test that hits the live agent

```yaml
      # These steps continue the deploy job, which needs the repo
      # and uv installed before it can run the smoke script.
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v4
      - run: uv sync --frozen
      - name: Voice smoke test
        run: |
          uv run python smoke/realtime_ping.py \
            --url wss://agent.example.com/realtime \
            --prompt "What's your name?" \
            --expect-keyword "voice agent"
```

The smoke test opens a real WebRTC session (we use `aiortc`), speaks a synthesized prompt, and asserts that the transcript of the agent's reply contains an expected keyword. It catches DNS, certificate, and Realtime-key rotation failures that unit tests can't.
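The final assertion reduces to a keyword check over the transcript. A sketch with the WebRTC and STT plumbing elided; `transcript` stands in for whatever the real script captures:

```python
def keyword_hit(transcript: str, expect_keyword: str) -> bool:
    """Case- and whitespace-insensitive containment check, since STT
    engines vary in casing and spacing of their output."""
    norm = " ".join(transcript.lower().split())
    return " ".join(expect_keyword.lower().split()) in norm

# Hypothetical STT output from the agent's spoken reply.
transcript = "Hi!  I'm the CallSphere   Voice Agent. How can I help?"
print(keyword_hit(transcript, "voice agent"))  # → True
```

Normalizing both sides keeps the check robust to the STT engine's formatting quirks without making it so loose that any reply passes.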

## Step 6 — Branch protection rules

In repo settings: require the `test`, `build`, and `deploy` checks; require code-owner review on `evals/`; require signed commits. Combined with the cosign attestation, `main` becomes provably "this image came from this commit reviewed by these humans".

## Step 7 — Self-hosted runners for speed

For voice work, GitHub-hosted runners hit egress limits and have ~3-min cold start on Buildx. We run `actions/actions-runner-controller` (ARC) on the same k3s cluster — runners spin up in ~5s, share the buildx cache, and never hit GitHub's bandwidth meter.

## Pitfalls

- **OIDC token without `id-token: write`** — cosign will fail with a cryptic "no token found" error. Set the permission per-job, not just at workflow level.
- **Concurrency `cancel-in-progress: true`** kills running deploys. Use a separate `deploy` concurrency group that doesn't cancel.
- **Eval cost runaway** — never let an LLM eval suite call the API in a loop without a hard `--max-cost-usd` cap; one bad prompt template can rack up $50 in 5 minutes.
- **Cache poisoning on PRs from forks** — `cache-to: type=gha,mode=max` will not write from fork PRs (good), but you must verify the cache was actually used on `main` builds, not just rebuilt blindly.
- **kubectl rollout status timeout** — default is 0 (forever). Always set `--timeout` or your workflow hangs for hours.

## How CallSphere does this in production

CallSphere ships 37 voice agents across 6 verticals through a single GitHub Actions monorepo workflow. We run an OpenAI-Evals based gate over a frozen 200-prompt test suite per vertical (healthcare, salon, behavioral health, multi-family, contractors, dental). Images are signed with cosign keyless and pushed to GHCR, then deployed to a k3s edge cluster fronted by Cloudflare Tunnel — no public ingress on the Postgres at 72.62.162.83. 90+ tools and 115+ DB tables are migrated via a separate `db-migrate` job that runs before `deploy` so schema drift never reaches production. Pricing is $149 / $499 / $1499 with a 14-day [trial](/trial), 22% lifetime [affiliate](/affiliate) — try a [demo](/demo) or read the [healthcare](/industries/healthcare) build.

## FAQ

**Q: Should the eval suite block PRs or only `main`?**
Block PRs. A bad prompt change shouldn't sit in `main` for 30 minutes before someone notices in Datadog.

**Q: Why cosign keyless instead of a KMS key?**
Keyless ties signatures to the GitHub OIDC identity (workflow + repo + ref). Rotating a KMS key is painful; rotating an OIDC identity is automatic.

**Q: How do I keep secrets out of logs?**
Use `secrets.OPENAI_API_KEY` (auto-masked), and set `ACTIONS_STEP_DEBUG` only on private repos. Never `echo` a secret.

**Q: ARC runners or GitHub-hosted?**
ARC for monorepos with frequent buildx work; GitHub-hosted for everything else. The crossover point is around 50 builds/day.

## Sources

- [OpenAI Agents SDK and GitHub Actions skills](https://developers.openai.com/blog/skills-agents-sdk)
- [Achieving SLSA 3 Compliance with GitHub Actions and Sigstore](https://github.blog/security/supply-chain-security/slsa-3-compliance-with-github-actions/)
- [actions/attest-build-provenance](https://github.com/actions/attest-build-provenance)
- [sigstore/cosign documentation](https://docs.sigstore.dev/quickstart/quickstart-cosign/)
- [LiveKit Agents framework](https://github.com/livekit/agents)

