---
title: "Docker Multi-Stage AI Agent Images: uv + Distroless = 80MB (2026)"
description: "Shrink an AI voice agent image from 950MB to 80MB with a Python 3.13 multi-stage build, uv for deps, and gcr.io/distroless/python3 nonroot. Real Dockerfile + benchmarks."
canonical: https://callsphere.ai/blog/vw6h-docker-multi-stage-ai-agent-distroless-uv-2026
category: "AI Infrastructure"
tags: ["Docker", "Distroless", "uv", "Python", "Tutorial"]
author: "CallSphere Team"
published: 2026-03-27T00:00:00.000Z
updated: 2026-05-07T16:46:15.343Z
---

# Docker Multi-Stage AI Agent Images: uv + Distroless = 80MB (2026)

> **TL;DR** — Build with `python:3.13-slim` + `uv` for deps, copy a virtualenv into `gcr.io/distroless/python3-debian12:nonroot`, and your AI voice agent ships at ~80 MB with no shell, no apt, no `pip`. Smaller surface, faster pulls, fewer CVEs.

## What you'll set up

A two-stage Dockerfile that builds an OpenAI Realtime + LiveKit agent with uv, then copies the resolved virtualenv into a distroless runtime. The final image is non-root, has zero package managers, and pulls and starts fast (benchmarks in Step 7).

```mermaid
flowchart LR
  S1[Stage 1: python:3.13-slim + uv]
  S1 -->|uv sync --frozen| VENV[/.venv/]
  VENV --> S2[Stage 2: distroless/python3 nonroot]
  S2 --> IMG[80MB image]
  IMG --> K8S[k3s pod]
```

## Step 1 — Pin Python and add uv

```dockerfile
# syntax=docker/dockerfile:1.7
ARG PY=3.13

FROM python:${PY}-slim AS builder
ENV UV_LINK_MODE=copy \
    UV_COMPILE_BYTECODE=1 \
    UV_PROJECT_ENVIRONMENT=/opt/venv
RUN --mount=type=cache,target=/root/.cache \
    pip install --no-cache-dir uv==0.5.4
WORKDIR /app
```

`UV_LINK_MODE=copy` makes uv copy files out of its cache instead of hardlinking them — essential here, because the cache lives in a build mount and the venv gets copied into another stage. `UV_COMPILE_BYTECODE=1` precompiles `.pyc`, which gives a measurable cold-start improvement on distroless (where you can't write bytecode at runtime).
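To see why precompiling matters, here's a toy sketch (module name and contents are made up): compile a file ahead of time with the stdlib, then import the `.pyc` with bytecode writing disabled — mimicking a runtime that can't write `.pyc` files:

```python
import importlib.util
import pathlib
import py_compile
import sys
import tempfile

# Fake "source" module -- stands in for a dependency in the venv.
src_dir = pathlib.Path(tempfile.mkdtemp())
(src_dir / "agent_cfg.py").write_text("GREETING = 'hello'\n")

# Ahead-of-time compile, as UV_COMPILE_BYTECODE=1 does during the build.
pyc = py_compile.compile(str(src_dir / "agent_cfg.py"),
                         cfile=str(src_dir / "agent_cfg.pyc"))

# At "runtime", forbid writing bytecode (PYTHONDONTWRITEBYTECODE=1) and
# import straight from the precompiled .pyc.
sys.dont_write_bytecode = True
spec = importlib.util.spec_from_file_location("agent_cfg", pyc)
agent_cfg = importlib.util.module_from_spec(spec)
spec.loader.exec_module(agent_cfg)
print(agent_cfg.GREETING)  # hello
```

Without the precompiled `.pyc`, every cold start pays the compile cost on first import — and on a read-only filesystem it pays it on every start.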

## Step 2 — Resolve dependencies with the lockfile

```dockerfile
COPY pyproject.toml uv.lock ./
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-dev --no-install-project
COPY src/ ./src/
RUN uv sync --frozen --no-dev
```

The split — install deps first, then copy code, then install project — gives proper Docker layer caching: code changes don't bust dependency resolution.

## Step 3 — Switch to distroless and copy the venv

```dockerfile
FROM gcr.io/distroless/python3-debian12:nonroot

COPY --from=builder /opt/venv /opt/venv
COPY --from=builder /app/src /app/src

ENV PATH=/opt/venv/bin:$PATH \
    PYTHONPATH=/app \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

WORKDIR /app
USER nonroot
ENTRYPOINT ["python", "-m", "src.agent"]
```

The final image: ~80 MB, no shell, no apt, no curl. `USER nonroot` (uid 65532) is built into the image — you can't accidentally run as root.

## Step 4 — Add a healthcheck that doesn't need a shell

Distroless has no curl. Use Python:

```dockerfile
HEALTHCHECK --interval=10s --timeout=2s \
  CMD ["python", "-c", "import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://127.0.0.1:8080/healthz', timeout=1).status==200 else 1)"]
```

For Kubernetes, drop the Docker HEALTHCHECK and use a real probe in the Pod spec — but include both for local `docker run` debugging.
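The one-liner above is hard to read; the same check as a standalone function (same hypothetical `/healthz` endpoint) could ship as, say, `src/healthcheck.py` and be called from the HEALTHCHECK instead:

```python
import urllib.request

def healthy(url: str, timeout: float = 1.0) -> bool:
    """True iff `url` answers HTTP 200 within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError (refused, DNS, timeout) subclasses OSError
        return False

# As a script entrypoint:
#   import sys
#   sys.exit(0 if healthy("http://127.0.0.1:8080/healthz") else 1)
```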

## Step 5 — Build with buildx + provenance

```bash
docker buildx create --use --name voicebuilder
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --provenance=mode=max \
  --sbom=true \
  -t ghcr.io/acme/voice-agent:$(git rev-parse --short HEAD) \
  --push .
```

`--provenance=mode=max` writes a SLSA-compatible provenance attestation; `--sbom=true` emits SPDX. Both are stored as OCI referrers — invisible to old clients but verifiable by cosign.

## Step 6 — Verify the image really is small + non-root

```bash
docker inspect ghcr.io/acme/voice-agent:abc123 \
  --format '{{ .Size }} bytes user={{ .Config.User }}'
# 81 MB user=nonroot

docker run --rm --read-only --user 65532:65532 \
  ghcr.io/acme/voice-agent:abc123 python -c "import sys;print(sys.version)"
```

`--read-only` confirms the agent doesn't need to write to `/tmp` (set up an in-memory volume in K8s if it does).
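A startup probe the agent can run to fail fast (or fall back to an in-memory path) when the filesystem is locked down — a sketch, not part of any framework:

```python
import os
import tempfile

def dir_writable(path: str) -> bool:
    """Try to create and delete a temp file in `path`."""
    try:
        fd, name = tempfile.mkstemp(dir=path)
        os.close(fd)
        os.unlink(name)
        return True
    except OSError:
        return False
```

Call `dir_writable("/tmp")` at boot and log a warning before the first library call that needs scratch space, rather than crashing mid-call.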

## Step 7 — Bench cold start vs slim

```bash
hyperfine --warmup 1 \
  'docker run --rm voice-agent:slim python -c "import livekit"' \
  'docker run --rm voice-agent:distroless python -c "import livekit"'
# slim:        540ms ± 30ms
# distroless:  290ms ± 18ms
```

About 45% faster cold start because there's no shell init, no `/etc/profile`, and the image fits in page cache.

## Pitfalls

- **Native deps with no manylinux wheel** (e.g. some `grpc-tools`) need a compiler in stage 1: without `build-essential` in the builder image, `uv sync` fails on the sdist build. Installing it there is free — the toolchain never reaches the final image.
- **`uv sync` without `--frozen`** in CI can produce different lock results than dev. Always `--frozen` in the build.
- **No writable `/tmp` under `--read-only`** (or K8s `readOnlyRootFilesystem`). Mount an `emptyDir: { medium: Memory }` at `/tmp` if libs (matplotlib cache!) need it.
- **No `apt` for OS CVE patches** — fixes arrive via upstream rebuilds of the base image. Pin `gcr.io/distroless/python3-debian12:nonroot` by digest for reproducible builds, and bump that digest on a schedule to pick up patched rebuilds.
- **Glibc mismatch** — if you build on `alpine` then copy to `debian12` you'll get `ImportError`. Stay on debian-based slim → debian-based distroless.
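The glibc pitfall can be caught at startup instead of at first native import: `platform.libc_ver()` reports the libc the interpreter was linked against. A defensive sketch (the guard itself is our own convention):

```python
import platform

def runtime_libc() -> str:
    """Name of the C library backing this interpreter ('' if unknown)."""
    name, _version = platform.libc_ver()
    return name

# At agent startup you might warn or abort when the name isn't "glibc",
# since the venv's wheels were built against manylinux/glibc.
```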

## How CallSphere does this in production

CallSphere ships every voice-agent image at ~85 MB on `gcr.io/distroless/python3-debian12:nonroot`. With 37 agents pulling on every pod restart across a fleet of k3s nodes, going from 950 MB to 85 MB cut our image-pull p95 from 12 s to 0.9 s and freed ~12 GB of container-host disk per node. The platform spans 90+ tools and 115+ DB tables, with plans at $149/$499/$1499, a 14-day [trial](/trial), a 22% [affiliate](/affiliate) program, and a [demo](/demo).

## FAQ

**Q: Why not Alpine?**
musl libc breaks some Python wheels (notably `grpcio`, `numpy` on older releases). Distroless uses glibc — safer for AI stacks.

**Q: How do I debug a distroless container?**
Use `gcr.io/distroless/python3-debian12:debug-nonroot` for the same image with `busybox`. Switch only in dev.

**Q: Can I add tini for signal handling?**
The exec-form `ENTRYPOINT` runs `python` directly as PID 1 with no shell in the signal path, so tini is usually unnecessary — as long as your agent installs its own `SIGTERM` handler for graceful shutdown.
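tini or not, the agent should react to `SIGTERM` so k8s rollouts drain cleanly — a minimal sketch using a shutdown event (the commented loop body stands in for your agent's work):

```python
import signal
import threading

shutdown = threading.Event()

def _handle_sigterm(signum, frame):
    # Ask the main loop to finish in-flight calls and exit cleanly.
    shutdown.set()

signal.signal(signal.SIGTERM, _handle_sigterm)

# Main loop sketch:
# while not shutdown.is_set():
#     serve_one_turn()
```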

**Q: SBOM where?**
`docker buildx imagetools inspect ghcr.io/acme/voice-agent:abc123 --format '{{ json .SBOM }}'` shows the SPDX SBOM stored as a referrer.

## Sources

- [Distroless Python Containers with uv — Nerd Level Tech](https://nerdleveltech.com/distroless-python-containers-with-uv-tutorial)
- [Multi-Stage Docker Builds for Python AI APIs](https://dasroot.net/posts/2026/02/multi-stage-docker-builds-python-ai-apis/)
- [Docker Multi-Stage Builds: The Complete Guide for 2026](https://devtoolbox.dedyn.io/blog/docker-multi-stage-builds-guide)
- [Using Alpine, Distroless, and Multi-Stage Builds — OneUptime](https://oneuptime.com/blog/post/2026-01-16-docker-reduce-image-size/view)
- [Distroless Python with uv — OneUptime](https://oneuptime.com/blog/post/2026-02-17-how-to-build-a-distroless-python-container-image-with-a-virtual-environment-for-cloud-run/view)

---

Source: https://callsphere.ai/blog/vw6h-docker-multi-stage-ai-agent-distroless-uv-2026
