SBOM + SLSA Provenance for AI Builds: CycloneDX + ML-BOM (2026)
Generate a CycloneDX SBOM and ML-BOM for an AI voice agent, attest SLSA provenance with cosign, and enforce verification in Kubernetes with a Kyverno policy. Includes real CI YAML and policy.
TL;DR — In 2026, an AI build needs both a code SBOM (CycloneDX 1.7) and an ML-BOM (model weights, training data lineage). Sign both with cosign, attest SLSA v1.0 provenance, and let Kyverno block any unsigned image from the cluster.
What you'll set up
A CI pipeline that builds the voice agent image, generates a CycloneDX SBOM with Syft, generates a CycloneDX ML-BOM for the model assets used, signs everything with cosign keyless, and enforces verification at admission with Kyverno.
Architecture
flowchart LR
SRC[Source] --> BUILD[Build image]
BUILD --> SYFT[Syft → CycloneDX SBOM]
BUILD --> MLBOM[CycloneDX ML-BOM tool]
SYFT --> SIGN[cosign attest]
MLBOM --> SIGN
BUILD --> PROV[SLSA provenance]
PROV --> SIGN
SIGN --> REG[OCI registry]
REG --> ADM[Kyverno admission]
ADM --> POD[Pod allowed]
Step 1 — Generate SBOM with Syft
```yaml
- uses: anchore/sbom-action@v0
  with:
    image: ghcr.io/acme/voice-agent@${{ steps.push.outputs.digest }}
    format: cyclonedx-json
    output-file: voice-agent.cdx.json
```
Syft scans the image: every Python wheel, every OS package, every Go binary embedded — listed with version, license, and CPE.
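To get a feel for what the SBOM contains, a short script can count components per package ecosystem. This is a minimal sketch, assuming the components carry a `purl` field (CycloneDX components usually do, but not always); the helper names `load_sbom` and `summarize_sbom` and the sample data are hypothetical, not part of Syft.

```python
import json
from collections import Counter


def load_sbom(path: str) -> dict:
    """Read a CycloneDX JSON document such as voice-agent.cdx.json."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_sbom(bom: dict) -> Counter:
    """Count components per package ecosystem, taken from each purl."""
    counts = Counter()
    for component in bom.get("components", []):
        purl = component.get("purl", "")
        # A purl looks like "pkg:pypi/requests@2.32.3"; the ecosystem follows "pkg:".
        if purl.startswith("pkg:"):
            ecosystem = purl[4:].split("/", 1)[0]
        else:
            ecosystem = "unknown"
        counts[ecosystem] += 1
    return counts


# Hypothetical miniature SBOM; a real one would come from load_sbom("voice-agent.cdx.json").
sample = {
    "bomFormat": "CycloneDX",
    "components": [
        {"type": "library", "name": "requests", "version": "2.32.3",
         "purl": "pkg:pypi/requests@2.32.3"},
        {"type": "library", "name": "openssl", "version": "3.0.13",
         "purl": "pkg:deb/debian/openssl@3.0.13"},
        {"type": "library", "name": "numpy", "version": "2.1.0",
         "purl": "pkg:pypi/numpy@2.1.0"},
        {"type": "file", "name": "agent-binary"},
    ],
}

summary = summarize_sbom(sample)
print(dict(summary))  # {'pypi': 2, 'deb': 1, 'unknown': 1}
```

A sudden jump in one ecosystem between two builds is often the first hint that a base image or lockfile changed.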
Step 2 — Generate the ML-BOM
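CycloneDX has supported ML-BOM as a first-class use case since version 1.5 (2023), through the `machine-learning-model` component type and the `modelCard` object. The same fields carry through to 1.7: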
```yaml
- name: ML-BOM
  run: |
    pip install cyclonedx-bom
    # Python dependencies of the current environment as a CycloneDX document
    cyclonedx-py environment --output-format JSON --output-file ml-deps.cdx.json
    # AI-specific entries
    jq '.components += [
      {
        "type": "machine-learning-model",
        "name": "openai-gpt-realtime",
        "version": "2026-04-01",
        "supplier": {"name": "OpenAI"},
        "modelCard": {
          "modelParameters": {
            "approach": {"type": "supervised"},
            "task": "speech-to-speech"
          }
        }
      }
    ]' ml-deps.cdx.json > voice-agent-ml.cdx.json
```
The ML-BOM lists every model the agent calls. Auditors love this.
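The same model entry can be added from Python instead of `jq`, which is easier to extend once the agent calls more than one model. This is a sketch under assumptions: `add_model` and `list_models` are hypothetical helpers written for this example, and the values simply mirror the entry used in this step.

```python
import copy


def add_model(bom: dict, name: str, version: str, supplier: str,
              task: str, approach: str = "supervised") -> dict:
    """Return a copy of the BOM with one machine-learning-model component appended."""
    result = copy.deepcopy(bom)
    result.setdefault("components", []).append({
        "type": "machine-learning-model",
        "name": name,
        "version": version,
        "supplier": {"name": supplier},
        "modelCard": {
            "modelParameters": {
                "approach": {"type": approach},
                "task": task,
            }
        },
    })
    return result


def list_models(bom: dict) -> list[str]:
    """List every model in the BOM as name@version."""
    return [
        f'{c["name"]}@{c["version"]}'
        for c in bom.get("components", [])
        if c.get("type") == "machine-learning-model"
    ]


# Hypothetical dependency BOM standing in for ml-deps.cdx.json.
base = {
    "bomFormat": "CycloneDX",
    "components": [{"type": "library", "name": "numpy", "version": "2.1.0"}],
}

ml_bom = add_model(base, "openai-gpt-realtime", "2026-04-01", "OpenAI", "speech-to-speech")
print(list_models(ml_bom))  # ['openai-gpt-realtime@2026-04-01']
```

`list_models` doubles as the one-line answer to an auditor asking which models a given build can call.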
Step 3 — Sign the image and attach SBOM
```yaml
- uses: sigstore/cosign-installer@v3
- run: |
    DIGEST=ghcr.io/acme/voice-agent@${{ steps.push.outputs.digest }}
    # Keyless signing needs the id-token: write permission on the job
    cosign sign --yes "$DIGEST"
    cosign attest --yes --predicate voice-agent.cdx.json --type cyclonedx "$DIGEST"
    cosign attest --yes --predicate voice-agent-ml.cdx.json --type https://cyclonedx.org/schema/v1.7/ml-bom "$DIGEST"
```
Step 4 — SLSA build provenance
```yaml
- uses: actions/attest-build-provenance@v2
  with:
    subject-name: ghcr.io/acme/voice-agent
    subject-digest: ${{ steps.push.outputs.digest }}
    push-to-registry: true
```
Provenance is signed via GitHub OIDC + Sigstore Fulcio; verifiable with no key distribution.
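Once a provenance statement has been verified and decoded, it helps to sanity-check a few fields before trusting it. The sketch below assumes a decoded in-toto statement carrying SLSA v1 provenance (`predicateType` of `https://slsa.dev/provenance/v1`, builder identity under `predicate.runDetails.builder.id`). The function `check_provenance`, the placeholder digest, and the builder id value are all hypothetical. It is not a substitute for signature verification.

```python
SLSA_V1_PREDICATE = "https://slsa.dev/provenance/v1"


def check_provenance(statement: dict, image_digest: str, builder_prefix: str) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []

    if statement.get("predicateType") != SLSA_V1_PREDICATE:
        problems.append("predicateType is not SLSA v1 provenance")

    # Subjects store the bare hex digest, while CI outputs usually carry a "sha256:" prefix.
    wanted = image_digest.removeprefix("sha256:")
    subject_digests = {
        subject.get("digest", {}).get("sha256")
        for subject in statement.get("subject", [])
    }
    if wanted not in subject_digests:
        problems.append("image digest is not listed as a subject")

    builder_id = (
        statement.get("predicate", {})
        .get("runDetails", {})
        .get("builder", {})
        .get("id", "")
    )
    if not builder_id.startswith(builder_prefix):
        problems.append("builder id does not match the expected prefix")

    return problems


# Hypothetical statement with a placeholder digest and builder id.
statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [{"name": "ghcr.io/acme/voice-agent", "digest": {"sha256": "a" * 64}}],
    "predicateType": SLSA_V1_PREDICATE,
    "predicate": {
        "runDetails": {
            "builder": {
                "id": "https://github.com/acme/voice-agent/.github/workflows/build.yml@refs/heads/main"
            }
        }
    },
}

print(check_provenance(statement, "sha256:" + "a" * 64, "https://github.com/acme/voice-agent/"))
```

Returning a list of problems rather than a boolean keeps the failure reason visible in CI logs.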
Step 5 — Kyverno admission policy
```yaml
apiVersion: kyverno.io/v2beta1
kind: ClusterPolicy
metadata: { name: verify-voice-agent }
spec:
  validationFailureAction: Enforce
  rules:
    - name: verify-signed-and-sbom
      match: { any: [{ resources: { kinds: [Pod], namespaces: [voice] }}]}
      verifyImages:
        - imageReferences: ["ghcr.io/acme/voice-agent*"]
          attestors:
            - entries:
                - keyless:
                    # Wildcard after "@" so any ref of this workflow matches
                    subject: "https://github.com/acme/voice-agent/.github/workflows/build.yml@*"
                    issuer: "https://token.actions.githubusercontent.com"
          attestations:
            # Predicate type URI that cosign's "--type cyclonedx" expands to
            - type: https://cyclonedx.org/bom
              attestors:
                - entries:
                    - keyless:
                        subject: "https://github.com/acme/voice-agent/.github/workflows/build.yml@*"
                        issuer: "https://token.actions.githubusercontent.com"
            # SLSA v1.0 provenance uses the predicate type ".../provenance/v1"
            - type: https://slsa.dev/provenance/v1
```
Now any pod referencing an unsigned, unattested, or wrong-builder image is rejected at admission. No exceptions.
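A wildcard subject such as `...build.yml@*` is easy to get subtly wrong, so it is worth testing candidate identities offline. The sketch below approximates the matching with Python's `fnmatchcase` for the glob form and an anchored regular expression for the regex form. This is an assumption-laden illustration: Kyverno and cosign use their own matching code, and the identities listed (including the `mallory` fork) are hypothetical.

```python
import re
from fnmatch import fnmatchcase

# Glob form, as used in a policy subject.
SUBJECT_GLOB = "https://github.com/acme/voice-agent/.github/workflows/build.yml@*"

# Regex form: dots escaped and ".*" after "@" (a bare "@*" would mean "zero or more @").
SUBJECT_REGEX = re.compile(
    r"https://github\.com/acme/voice-agent/\.github/workflows/build\.yml@.*"
)


def matches_glob(identity: str) -> bool:
    """Case-sensitive glob match of a certificate identity."""
    return fnmatchcase(identity, SUBJECT_GLOB)


def matches_regex(identity: str) -> bool:
    """Anchored regex match of a certificate identity."""
    return SUBJECT_REGEX.fullmatch(identity) is not None


# Hypothetical certificate identities.
candidates = [
    "https://github.com/acme/voice-agent/.github/workflows/build.yml@refs/heads/main",
    "https://github.com/acme/voice-agent/.github/workflows/build.yml@refs/tags/v1.2.0",
    "https://github.com/mallory/voice-agent/.github/workflows/build.yml@refs/heads/main",
]

for identity in candidates:
    print(matches_glob(identity), matches_regex(identity), identity)
```

The first two identities match both forms and the fork does not, which is the behaviour the policy is meant to enforce.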
Step 6 — Verify locally before push
```bash
# Verify the image signature against the CI workflow identity
cosign verify "ghcr.io/acme/voice-agent@${DIGEST}" \
  --certificate-identity-regexp '^https://github\.com/acme/voice-agent/\.github/workflows/build\.yml@.*$' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com
# Inspect the CycloneDX attestation (identity and issuer flags as in the previous command)
cosign verify-attestation "ghcr.io/acme/voice-agent@${DIGEST}" --type cyclonedx \
  --certificate-identity-regexp ... | jq '.payload | @base64d | fromjson'
```
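The `jq` filter above base64-decodes the envelope's `payload` field into the in-toto statement. The same decoding can be sketched in Python for use in scripts. `decode_attestation` is a hypothetical helper, and the envelope built here is synthetic test data rather than real cosign output.

```python
import base64
import json


def decode_attestation(line: str) -> dict:
    """Decode one JSON line of verify-attestation output into its in-toto statement."""
    envelope = json.loads(line)
    return json.loads(base64.b64decode(envelope["payload"]))


# Synthetic envelope: a minimal statement, base64-encoded the way the payload is.
fake_statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "predicateType": "https://cyclonedx.org/bom",
    "subject": [{"name": "ghcr.io/acme/voice-agent", "digest": {"sha256": "a" * 64}}],
    "predicate": {"bomFormat": "CycloneDX", "components": []},
}
fake_line = json.dumps({
    "payloadType": "application/vnd.in-toto+json",
    "payload": base64.b64encode(json.dumps(fake_statement).encode()).decode(),
})

statement = decode_attestation(fake_line)
print(statement["predicateType"])  # https://cyclonedx.org/bom
```

Decoding only makes the content readable; trust still comes from the `cosign verify-attestation` call that produced the line.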
Step 7 — Continuous vulnerability scan against the SBOM
```yaml
- uses: anchore/scan-action@v3
  with:
    sbom: voice-agent.cdx.json
    fail-build: true
    severity-cutoff: high
```
Re-scan SBOMs nightly — a new CVE on a pinned wheel will flag without a rebuild.
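The `severity-cutoff: high` gate boils down to an ordered comparison of severities. The sketch below illustrates that logic only. It is not the scan action's implementation, and the severity ordering, the `should_fail` helper, and the findings with placeholder IDs are assumptions made for the example.

```python
# Assumed ordering from least to most severe.
SEVERITY_RANK = {
    "negligible": 0,
    "low": 1,
    "medium": 2,
    "high": 3,
    "critical": 4,
}


def should_fail(findings: list[dict], cutoff: str = "high") -> bool:
    """Return True if any finding is at or above the cutoff severity."""
    threshold = SEVERITY_RANK[cutoff]
    # Unrecognised severities rank below everything and never trip the gate.
    return any(
        SEVERITY_RANK.get(finding["severity"].lower(), -1) >= threshold
        for finding in findings
    )


# Hypothetical nightly scan results with placeholder identifiers.
nightly = [
    {"id": "EXAMPLE-0001", "package": "requests", "severity": "Medium"},
    {"id": "EXAMPLE-0002", "package": "openssl", "severity": "Critical"},
]

print(should_fail(nightly))                       # True
print(should_fail(nightly[:1]))                   # False
print(should_fail(nightly[:1], cutoff="medium"))  # True
```

Running the gate against the stored SBOM each night is what lets a newly published CVE fail the job without rebuilding the image.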
Pitfalls
- Forgetting `id-token: write` in CI breaks cosign keyless silently.
- CycloneDX vs SPDX: most regulators accept either. Pick one and stick to it; mixing creates dual-maintenance.
- ML-BOM tooling is young: generate by hand or with the `cyclonedx-py` 5.x prerelease for AI-specific fields. Stable APIs in CycloneDX 1.7+.
- Kyverno policy strict regex rejects valid PRs from forks. Use `@*` not exact ref.
- Image registry rate limits: verifying 100 pods per minute on Docker Hub will rate-limit. Use a private registry or a Kyverno verification cache.
How CallSphere does this in production
CallSphere generates a CycloneDX SBOM and ML-BOM per voice-agent build, attests SLSA v1.0 provenance, and Kyverno rejects unsigned images at admission across our k3s edge fleet. Healthcare and behavioral-health tenants get a per-vertical attestation report monthly.
FAQ
Q: Do I really need ML-BOM if I only use OpenAI? Yes — auditors want lineage. Even "we call OpenAI gpt-realtime version X" is a single-row ML-BOM and worth having.
Q: Cosign keyless vs key-based? Keyless ties signatures to your CI identity. Key-based requires a KMS and rotation. Keyless wins for almost everyone.
Q: How big are SBOMs? A typical Python AI image: 200-500 KB CycloneDX JSON. Negligible.
Q: Kyverno vs Gatekeeper? Kyverno's `verifyImages` is purpose-built for cosign; Gatekeeper needs the Cosign Provider. Use Kyverno for image policy.