AI Infrastructure

How to Deploy a Voice Agent to k3s with Twilio + Cloudflare Tunnel

Containerize your Node.js Twilio bridge, deploy to k3s with a single Helm-less manifest, and expose the WebSocket via Cloudflare Tunnel — no public IP, no LoadBalancer fees.

TL;DR — A voice agent on k3s + Cloudflare Tunnel costs less than $10/mo and gives you a stable HTTPS hostname for Twilio webhooks. No LoadBalancer, no public IP, no NAT traversal headaches.

What you'll build

A Dockerized Node.js Twilio + OpenAI Realtime bridge running on a single k3s node, exposed to the internet through a named Cloudflare Tunnel. Twilio hits https://voice.example.com/incoming, your agent answers, and you have zero open inbound ports on the host firewall.
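The post assumes a `server.js` already exists. As a rough sketch of its shape — names and routes here are illustrative, but `/health` on port 8080 is what the manifest below probes — the HTTP side looks like this (plain Node builtins, no framework; the actual Twilio↔OpenAI audio proxying is elided):

```typescript
import http from "node:http";

// Minimal bridge skeleton: HTTP for Twilio's webhook and the k8s
// readinessProbe, plus a WebSocket upgrade path for Twilio Media Streams.
export function handle(req: http.IncomingMessage, res: http.ServerResponse): void {
  if (req.url === "/health") {
    // readinessProbe target from the Deployment manifest
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end("ok");
    return;
  }
  if (req.url === "/incoming" && req.method === "POST") {
    // Twilio voice webhook — a real bridge answers with TwiML that
    // connects a media <Stream> (see Step 6).
    res.writeHead(200, { "Content-Type": "text/xml" });
    res.end("<Response/>");
    return;
  }
  res.writeHead(404);
  res.end();
}

export const server = http.createServer(handle);
// server.on("upgrade", ...) — attach the Media Streams WebSocket handler here
// server.listen(8080);      // matches containerPort in the manifest
```

The `/health` route is the one hard requirement: without it the readinessProbe fails and the Service never routes traffic to the pod.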

Prerequisites

  1. A VPS (Hetzner CX22 or similar, ~$5/mo) with Ubuntu 22.04+.
  2. A domain on Cloudflare (free plan is fine).
  3. Docker and k3sup or k3s install script.
  4. Twilio number and OpenAI API key.
  5. ~45 minutes start to finish.

Architecture

```mermaid
flowchart LR
  TW[Twilio] -->|HTTPS| CF[Cloudflare Edge]
  CF -- QUIC tunnel --> CFD[cloudflared in cluster]
  CFD --> SVC[Service voice-bridge]
  SVC --> POD[Pod node bridge]
  POD --> OAI[OpenAI Realtime]
```

Step 1 — Install k3s

```bash
curl -sfL https://get.k3s.io | sh -
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown $USER ~/.kube/config
kubectl get nodes   # should show Ready
```

Step 2 — Containerize the bridge

```dockerfile
# Dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 8080
CMD ["node", "server.js"]
```

```bash
docker build -t voice-bridge:0.1.0 .
docker save voice-bridge:0.1.0 | sudo k3s ctr images import -
```

Step 3 — Kubernetes manifest

```yaml
# voice-bridge.yaml
apiVersion: v1
kind: Secret
metadata:
  name: voice-secrets
type: Opaque
stringData:
  OPENAI_API_KEY: sk-...
  HOST: voice.example.com
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: voice-bridge
spec:
  replicas: 1
  selector:
    matchLabels:
      app: voice-bridge
  template:
    metadata:
      labels:
        app: voice-bridge
    spec:
      containers:
        - name: bridge
          image: voice-bridge:0.1.0
          imagePullPolicy: Never   # image was imported into containerd, not pulled
          ports:
            - containerPort: 8080
          envFrom:
            - secretRef:
                name: voice-secrets
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 1000m
              memory: 1Gi
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: voice-bridge
spec:
  selector:
    app: voice-bridge
  ports:
    - port: 80
      targetPort: 8080
```

```bash
kubectl apply -f voice-bridge.yaml
```

Step 4 — Cloudflare Tunnel

Create the tunnel from the Cloudflare Zero Trust dashboard, or with the CLI:

```bash
cloudflared tunnel login
cloudflared tunnel create voice-bridge
cloudflared tunnel route dns voice-bridge voice.example.com
```

Save the tunnel credentials JSON file as a k8s secret:

```bash
kubectl create secret generic cloudflared-creds \
  --from-file=credentials.json=$HOME/.cloudflared/<TUNNEL_ID>.json
```

Step 5 — Deploy cloudflared as its own Deployment

```yaml
# cloudflared.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cloudflared-config
data:
  config.yaml: |
    tunnel: <TUNNEL_ID>   # the UUID printed by `cloudflared tunnel create`
    credentials-file: /etc/cloudflared/credentials.json
    ingress:
      - hostname: voice.example.com
        service: http://voice-bridge.default.svc.cluster.local:80
      - service: http_status:404
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudflared
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cloudflared
  template:
    metadata:
      labels:
        app: cloudflared
    spec:
      containers:
        - name: cloudflared
          image: cloudflare/cloudflared:2026.3.0
          args: ["tunnel", "--config", "/etc/cloudflared/config.yaml", "run"]
          volumeMounts:
            - name: cfg
              mountPath: /etc/cloudflared/config.yaml
              subPath: config.yaml
            - name: creds
              mountPath: /etc/cloudflared/credentials.json
              subPath: credentials.json
      volumes:
        - name: cfg
          configMap:
            name: cloudflared-config
        - name: creds
          secret:
            secretName: cloudflared-creds
```


```bash
kubectl apply -f cloudflared.yaml
```

Step 6 — Point Twilio at the public hostname

In the Twilio console, set the number's voice webhook to https://voice.example.com/incoming (HTTP POST). Test by dialing.
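For the call to actually reach the agent, the `/incoming` webhook must answer with TwiML that opens a media WebSocket back through the tunnel. A minimal sketch — the `/media` path is an assumption of this post's setup, and the structure follows Twilio's `<Connect><Stream>` verb for Media Streams:

```typescript
// Build the TwiML reply for Twilio's voice webhook. <Connect><Stream>
// tells Twilio to open a bidirectional audio WebSocket to the given URL,
// which here is the same Cloudflare Tunnel hostname.
export function incomingTwiml(host: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="wss://${host}/media" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}
```

The `host` argument would come from the `HOST` value in the `voice-secrets` Secret, so the same image works across environments.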

Step 7 — Twilio request signature with forwarded headers

Cloudflare terminates TLS and rewrites the source IP and some headers, so Twilio's signature must be validated against the original public URL rather than what the bridge sees behind the tunnel:

```ts
import twilio from "twilio";

// Validate X-Twilio-Signature against the public URL that Twilio signed,
// not the internal URL behind the tunnel. Assumes the request body has
// been parsed with express.urlencoded(), which Twilio's webhook format uses.
app.use((req, res, next) => {
  const sig = req.header("X-Twilio-Signature")!;
  const url = "https://voice.example.com" + req.originalUrl;
  if (twilio.validateRequest(process.env.TWILIO_AUTH!, sig, url, req.body)) {
    return next();
  }
  res.status(403).end();
});
```

Common pitfalls

  • imagePullPolicy: Always with a local image — k3s tries Docker Hub. Use Never after ctr images import.
  • WebSocket upgrade headers stripped: Cloudflare passes them by default; verify with wscat.
  • No HPA on a stateful WebSocket: scale by adding replicas behind a sticky session, not by autoscaling mid-call.
  • Forgetting cluster DNS: http://voice-bridge.default.svc.cluster.local:80 is the right ingress target.

How CallSphere does this in production

CallSphere runs all 6 verticals on a k3s cluster (Hetzner) with Cloudflare Tunnel for ingress. Push to main does NOT auto-deploy — we rebuild via docker build, k3s ctr images import, then kubectl set image. This pattern handles ~50 concurrent calls per pod; horizontal scale comes from more pods, not bigger ones.
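One consequence of rolling out with kubectl set image: Kubernetes sends SIGTERM to the old pod, which would cut off in-flight calls unless the bridge drains first. A hedged sketch of that drain logic — not CallSphere's actual implementation; `activeCalls` and the grace period are illustrative:

```typescript
import { setTimeout as delay } from "node:timers/promises";

// Track live call sessions so the pod can drain before exiting.
export const activeCalls = new Set<{ close: () => void }>();

// On SIGTERM (sent by Kubernetes during a rollout), stop accepting new
// calls and wait for in-flight ones to finish, up to a grace period.
// Returns how many calls had to be force-closed at the deadline.
export async function drain(graceMs: number): Promise<number> {
  const deadline = Date.now() + graceMs;
  while (activeCalls.size > 0 && Date.now() < deadline) {
    await delay(100); // poll until calls hang up naturally
  }
  for (const call of activeCalls) call.close();
  const forced = activeCalls.size;
  activeCalls.clear();
  return forced;
}

// process.on("SIGTERM", () => drain(25_000).then(() => process.exit(0)));
```

The grace period should stay under the pod's `terminationGracePeriodSeconds` (30s by default), or Kubernetes will SIGKILL the process mid-drain anyway.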

FAQ

Why not LoadBalancer + EIP? A static EIP is $3.60/mo on AWS plus data transfer. Cloudflare Tunnel is free.

Latency cost of going through Cloudflare? ~20–40ms typically. Lower if your origin is near a Cloudflare PoP.

Can I run multiple pods? Yes — Twilio sticks to one WebSocket per call, but new calls are load-balanced across pods.

TLS termination? Cloudflare handles it. Traffic inside the cluster is plain HTTP — fine, because the tunnel is the only ingress path.
