AI Infrastructure

Docker Multi-Stage AI Agent Images: uv + Distroless = 80MB (2026)

Shrink an AI voice agent image from 950MB to 80MB with a Python 3.13 multi-stage build, uv for deps, and gcr.io/distroless/python3 nonroot. Real Dockerfile + benchmarks.

TL;DR — Build with python:3.13-slim + uv for deps, copy a virtualenv into gcr.io/distroless/python3-debian12:nonroot, and your AI voice agent ships at ~80 MB with no shell, no apt, no pip. Smaller surface, faster pulls, fewer CVEs.

What you'll set up

A two-stage Dockerfile that builds an OpenAI Realtime + LiveKit agent with uv, then copies the resolved virtualenv into a distroless runtime. The final image is non-root, has zero package managers, and starts in <300 ms.

Architecture

```mermaid
flowchart LR
  SRC[src + pyproject.toml] --> S1[Stage 1: python:3.13-slim + uv]
  S1 -->|uv sync --frozen| VENV["/opt/venv"]
  VENV --> S2[Stage 2: distroless/python3 nonroot]
  S2 --> IMG[80MB image]
  IMG --> K8S[k3s pod]
```

Step 1 — Pin Python and add uv

```dockerfile
# syntax=docker/dockerfile:1.7
ARG PY=3.13

FROM python:${PY}-slim AS builder
ENV UV_LINK_MODE=copy \
    UV_COMPILE_BYTECODE=1 \
    UV_PROJECT_ENVIRONMENT=/opt/venv
RUN --mount=type=cache,target=/root/.cache \
    pip install --no-cache-dir uv==0.5.4
WORKDIR /app
```

UV_LINK_MODE=copy is required when the venv will be copied across stages: hardlinks into uv's cache don't survive the COPY. UV_COMPILE_BYTECODE=1 precompiles .pyc at build time, which gives a measurable cold-start improvement on distroless, where the runtime can't write .pyc itself.
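What UV_COMPILE_BYTECODE=1 does can be reproduced with the stdlib. A sketch using a throwaway package (names are illustrative): this is the build-time precompilation step that the read-only distroless runtime then serves as-is.

```python
import compileall
import pathlib
import tempfile

# Hypothetical mini-package standing in for the agent's venv. At build time,
# UV_COMPILE_BYTECODE=1 runs the equivalent of this over /opt/venv, so the
# distroless stage never needs to write .pyc at import time.
with tempfile.TemporaryDirectory() as tmp:
    pkg = pathlib.Path(tmp) / "demo"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("ANSWER = 42\n")

    compileall.compile_dir(tmp, quiet=1)        # build-time precompilation
    pycs = list(pkg.glob("__pycache__/*.pyc"))  # one .pyc per source file
    print(len(pycs))  # → 1
```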

Step 2 — Resolve dependencies with the lockfile

```dockerfile
COPY pyproject.toml uv.lock ./
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-dev --no-install-project

COPY src/ ./src/
RUN uv sync --frozen --no-dev
```

The split — install deps first, then copy code, then install project — gives proper Docker layer caching: code changes don't bust dependency resolution.


Step 3 — Switch to distroless and copy the venv

```dockerfile
FROM gcr.io/distroless/python3-debian12:nonroot

COPY --from=builder /opt/venv /opt/venv
COPY --from=builder /app/src /app/src

ENV PATH=/opt/venv/bin:$PATH \
    PYTHONPATH=/app \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

WORKDIR /app
USER nonroot
ENTRYPOINT ["python", "-m", "src.agent"]
```

The final image: ~80 MB, no shell, no apt, no curl. USER nonroot (uid 65532) is built into the image — you can't accidentally run as root.
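Because uid 65532 is baked into the image, an in-process guard is mostly redundant, but a cheap startup assertion documents the assumption. A sketch; assert_nonroot is a hypothetical helper, not part of the distroless image:

```python
def assert_nonroot(uid: int) -> int:
    """Refuse to start when running as root (uid 0).

    distroless :nonroot hardcodes uid 65532, so this guard only matters if
    the image is ever rebuilt or run with a different --user override.
    """
    if uid == 0:
        raise RuntimeError("refusing to run as root; use --user 65532:65532")
    return uid

# In the agent you would pass os.getuid(); 65532 here mirrors the nonroot uid.
print(assert_nonroot(65532))  # → 65532
```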

Step 4 — Add a healthcheck that doesn't need a shell

Distroless has no curl. Use Python:

```dockerfile
HEALTHCHECK --interval=10s --timeout=2s \
  CMD ["python", "-c", "import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://127.0.0.1:8080/healthz', timeout=1).status==200 else 1)"]
```

For Kubernetes, drop the Docker HEALTHCHECK and use a real probe in the Pod spec — but include both for local docker run debugging.
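The same check, pulled out of the one-liner into a testable function. The local HTTPServer below is only a stand-in for the agent's /healthz endpoint so the sketch runs anywhere:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def healthy(url: str, timeout: float = 1.0) -> bool:
    """Same logic as the HEALTHCHECK one-liner: HTTP 200 means healthy."""
    try:
        return urllib.request.urlopen(url, timeout=timeout).status == 200
    except OSError:  # connection refused, timeout, or non-2xx (HTTPError)
        return False

# Stand-in for the agent's health endpoint.
class _Healthz(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200 if self.path == "/healthz" else 404)
        self.end_headers()
    def log_message(self, *args):  # silence per-request logging
        pass

srv = HTTPServer(("127.0.0.1", 0), _Healthz)
threading.Thread(target=srv.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{srv.server_address[1]}"

up = healthy(f"{base}/healthz")    # → True
down = healthy(f"{base}/missing")  # → False
srv.shutdown()
print(up, down)
```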

Step 5 — Build with buildx + provenance

```bash
docker buildx create --use --name voicebuilder
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --provenance=mode=max \
  --sbom=true \
  -t ghcr.io/acme/voice-agent:$(git rev-parse --short HEAD) \
  --push .
```

--provenance=mode=max writes a SLSA-compatible provenance attestation; --sbom=true emits SPDX. Both are stored as OCI referrers — invisible to old clients but verifiable by cosign.


Step 6 — Verify the image really is small + non-root

```bash
docker inspect ghcr.io/acme/voice-agent:abc123 \
  --format '{{ .Size }} bytes user={{ .Config.User }}'
# 81 MB  user=nonroot

docker run --rm --read-only --user 65532:65532 \
  ghcr.io/acme/voice-agent:abc123 python -c "import sys;print(sys.version)"
```

--read-only confirms the agent doesn't need to write to /tmp (set up an in-memory volume in K8s if it does).

Step 7 — Bench cold start vs slim

```bash
hyperfine --warmup 1 \
  'docker run --rm voice-agent:slim python -c "import livekit"' \
  'docker run --rm voice-agent:distroless python -c "import livekit"'
# slim:       540 ms ± 30 ms
# distroless: 290 ms ± 18 ms
```

Roughly 45% faster cold start: the bytecode is already compiled (UV_COMPILE_BYTECODE=1), there is far less filesystem to page in, and the whole image fits comfortably in page cache.

Pitfalls

  • Native deps with no manylinux wheel (e.g. some grpc-tools builds) need build-essential in stage 1; without it, uv sync fails when it falls back to compiling the sdist.
  • uv sync without --frozen can re-resolve and drift from the lockfile you tested in dev. Always pass --frozen in the build.
  • With --read-only (or a restrictive Pod securityContext), /tmp isn't writable. Mount an emptyDir with medium: Memory if libraries need scratch space (matplotlib's cache, for example).
  • No apt means no in-place OS CVE patching. Distroless is rebuilt upstream regularly, so pin gcr.io/distroless/python3-debian12:nonroot by digest rather than by tag and bump the digest on a schedule.
  • Glibc mismatch: build on Alpine and copy into debian12 and compiled extensions fail with ImportError. Stay on debian-based slim → debian-based distroless.
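A quick sanity check for that last pitfall: ask the interpreter which libc it was linked against. Run this inside the builder image; "glibc" is what you want to see, and a blank name is the musl warning sign.

```python
import platform

# platform.libc_ver() probes the running interpreter's linked libc.
# Debian-based images report ("glibc", "<version>"); musl-based Alpine
# builds return an empty name, which flags the mismatch pitfall.
name, version = platform.libc_ver()
print(name or "no glibc detected (musl or non-Linux)")
```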

How CallSphere does this in production

CallSphere ships every voice-agent image at ~85 MB on gcr.io/distroless/python3-debian12:nonroot. With 37 agents pulling on every pod restart across a fleet of k3s nodes, going from 950 MB to 85 MB cut our image-pull p95 from 12 s to 0.9 s, and freed roughly 12 GB of container-host disk per node.

FAQ

Q: Why not Alpine? musl libc breaks some Python wheels (notably grpcio, numpy on older releases). Distroless uses glibc — safer for AI stacks.

Q: How do I debug a distroless container? Use gcr.io/distroless/python3-debian12:debug-nonroot for the same image with busybox. Switch only in dev.

Q: Can I add tini for signal handling? The exec-form ENTRYPOINT makes Python PID 1, so it receives SIGTERM directly; unless your agent spawns subprocesses that need zombie reaping, you don't need tini.

Q: SBOM where? docker buildx imagetools inspect <image> --format '{{ json .SBOM }}' shows the SPDX SBOM stored as referrer.
