By Sagar Shankaran, Founder of CallSphere
Containerize your Node.js Twilio bridge, deploy to k3s with a single Helm-less manifest, and expose the WebSocket via Cloudflare Tunnel — no public IP, no LoadBalancer fees.
Key takeaways
TL;DR — A voice agent on k3s + Cloudflare Tunnel costs less than $10/mo and gives you a stable HTTPS hostname for Twilio webhooks. No LoadBalancer, no public IP, no NAT traversal headaches.
A Dockerized Node.js Twilio + OpenAI Realtime bridge running on a single k3s node, exposed to the internet through a named Cloudflare Tunnel. Twilio hits https://voice.example.com/incoming, your agent answers, and you have zero open inbound ports on the host firewall.
k3sup or k3s install script.

```mermaid
flowchart LR
  TW[Twilio] -->|HTTPS| CF[Cloudflare Edge]
  CF -- QUIC tunnel --> CFD[cloudflared in cluster]
  CFD --> SVC[Service voice-bridge]
  SVC --> POD[Pod node bridge]
  POD --> OAI[OpenAI Realtime]
```
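The last hop in the diagram is the pod relaying audio between Twilio and OpenAI. A minimal sketch of that translation step follows. It assumes Twilio Media Streams JSON messages (`start`, `media`, `stop`) and the Realtime event names `input_audio_buffer.append` and `response.audio.delta`. The function and type names are hypothetical, and the event names should be checked against the current Realtime API reference, since they have changed between API versions.

```ts
// Hypothetical helper names; message shapes are assumptions based on the
// Twilio Media Streams and OpenAI Realtime event formats.

type TwilioInbound =
  | { event: "start"; streamSid: string; start: { callSid: string } }
  | { event: "media"; streamSid: string; media: { payload: string } }
  | { event: "stop"; streamSid: string };

type OpenAIOutbound = { type: "input_audio_buffer.append"; audio: string };

type TwilioOutbound = {
  event: "media";
  streamSid: string;
  media: { payload: string };
};

// Twilio -> OpenAI: only media frames carry audio. Both sides use base64,
// so the payload passes through unchanged when the session uses g711_ulaw.
function toOpenAI(msg: TwilioInbound): OpenAIOutbound | null {
  if (msg.event !== "media") return null;
  return { type: "input_audio_buffer.append", audio: msg.media.payload };
}

// OpenAI -> Twilio: wrap an audio delta in a media frame for the same stream.
function toTwilio(
  streamSid: string,
  evt: { type: string; delta?: string }
): TwilioOutbound | null {
  if (evt.type !== "response.audio.delta" || !evt.delta) return null;
  return { event: "media", streamSid, media: { payload: evt.delta } };
}
```

In a real bridge these two functions would sit inside the WebSocket `message` handlers on each side, with `JSON.parse` and `JSON.stringify` around them.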
```bash
curl -sfL https://get.k3s.io | sh -
mkdir -p ~/.kube   # the target directory may not exist on a fresh host
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown $USER ~/.kube/config
kubectl get nodes   # should show Ready
```
```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 8080
CMD ["node", "server.js"]
```
```bash
docker build -t voice-bridge:0.1.0 .
docker save voice-bridge:0.1.0 | sudo k3s ctr images import -
```
```yaml
apiVersion: apps/v1
kind: Deployment
metadata: { name: voice-bridge }
spec:
  replicas: 1
  selector: { matchLabels: { app: voice-bridge }}
  template:
    metadata: { labels: { app: voice-bridge }}
    spec:
      containers:
        - name: bridge
          image: voice-bridge:0.1.0
          imagePullPolicy: Never
          ports: [{ containerPort: 8080 }]
          envFrom: [{ secretRef: { name: voice-secrets }}]
          resources:
            requests: { cpu: 100m, memory: 256Mi }
            limits: { cpu: 1000m, memory: 1Gi }
          readinessProbe:
            httpGet: { path: /health, port: 8080 }
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata: { name: voice-bridge }
spec:
  selector: { app: voice-bridge }
  ports: [{ port: 80, targetPort: 8080 }]
```
```bash
kubectl apply -f voice-bridge.yaml
```
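Once applied, the Service is reachable inside the cluster under the standard Kubernetes DNS form `<service>.<namespace>.svc.cluster.local`. A small sketch of how that name is assembled, assuming the `default` namespace (the manifest sets none) and the default `cluster.local` cluster domain; the helper name is hypothetical.

```ts
// Builds the in-cluster URL for a Kubernetes Service.
// Assumes the default cluster domain "cluster.local".
function serviceUrl(name: string, namespace = "default", port = 80): string {
  return `http://${name}.${namespace}.svc.cluster.local:${port}`;
}

console.log(serviceUrl("voice-bridge"));
// prints "http://voice-bridge.default.svc.cluster.local:80"
```

This is the address cloudflared uses as its origin, so a typo in the Service name or namespace shows up as 502 errors at the tunnel rather than as a Kubernetes error.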
Create the tunnel from the Cloudflare Zero Trust dashboard, or with the CLI:
```bash
cloudflared tunnel login
cloudflared tunnel create voice-bridge
cloudflared tunnel route dns voice-bridge voice.example.com
```
Save the tunnel credentials JSON file as a k8s secret:
```bash
kubectl create secret generic cloudflared-creds \
  --from-file=credentials.json=$HOME/.cloudflared/
```
```yaml
apiVersion: v1
kind: ConfigMap
metadata: { name: cloudflared-config }
data:
  config.yaml: |
    tunnel:
---
apiVersion: apps/v1
kind: Deployment
metadata: { name: cloudflared }
spec:
  replicas: 1
  selector: { matchLabels: { app: cloudflared }}
  template:
    metadata: { labels: { app: cloudflared }}
    spec:
      containers:
        - name: cloudflared
          image: cloudflare/cloudflared:2026.3.0
          args: ["tunnel", "--config", "/etc/cloudflared/config.yaml", "run"]
          volumeMounts:
            - { name: cfg, mountPath: /etc/cloudflared/config.yaml, subPath: config.yaml }
            - { name: creds, mountPath: /etc/cloudflared/credentials.json, subPath: credentials.json }
      volumes:
        - { name: cfg, configMap: { name: cloudflared-config }}
        - { name: creds, secret: { secretName: cloudflared-creds }}
```
```bash
kubectl apply -f cloudflared.yaml
```
In the Twilio console, set the number's voice webhook to https://voice.example.com/incoming (HTTP POST). Test by dialing.
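For context, here is a sketch of what the `/incoming` handler might return: TwiML that tells Twilio to open a bidirectional media stream back to the bridge over the same hostname. The `/media` path and the function names are assumptions for illustration, not taken from the original bridge code.

```ts
// Escapes characters that are not allowed inside an XML attribute value.
function xmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Returns TwiML that connects the call to a WebSocket media stream.
// "/media" is a hypothetical path; use whatever your bridge listens on.
function incomingTwiml(host: string, path = "/media"): string {
  const url = xmlAttr(`wss://${host}${path}`);
  return (
    `<?xml version="1.0" encoding="UTF-8"?>` +
    `<Response><Connect><Stream url="${url}" /></Connect></Response>`
  );
}

console.log(incomingTwiml("voice.example.com"));
```

The handler would send this string with a `text/xml` content type. Using `wss://` on the tunnel hostname means the media WebSocket goes through the same Cloudflare Tunnel as the webhook.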
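Cloudflare sits between Twilio and the pod, so requests arrive with a different source IP and modified headers. Twilio signature validation needs the exact public URL Twilio called, so build it from the tunnel hostname rather than from what the pod sees: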
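```ts
import express from "express";
import twilio from "twilio";

const app = express();
// Twilio posts form-encoded bodies; validateRequest needs the parsed params.
app.use(express.urlencoded({ extended: false }));

// Register this after /health so readiness probes are not rejected.
app.use((req, res, next) => {
  const sig = req.header("X-Twilio-Signature") ?? "";
  // Validate against the public URL Twilio called, not the in-cluster one.
  const url = "https://voice.example.com" + req.originalUrl;
  if (twilio.validateRequest(process.env.TWILIO_AUTH!, sig, url, req.body)) return next();
  res.status(403).end();
});
```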
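The URL matters because it is part of what Twilio signs. The sketch below reproduces the signing scheme for form-encoded POSTs as Twilio documents it (HMAC-SHA1 over the URL plus the sorted parameters, base64-encoded) using Node's built-in crypto module. It is for illustration only, and the function name is hypothetical; real code should keep using `twilio.validateRequest`.

```ts
import { createHmac } from "node:crypto";

// Sketch of Twilio's request-signing scheme for form-encoded POSTs:
// HMAC-SHA1 over the URL followed by each key and value, sorted by key,
// then base64-encoded. Illustration only.
function twilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>
): string {
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return createHmac("sha1", authToken).update(data, "utf8").digest("base64");
}

const formParams = { CallSid: "CA123", From: "+15550100" };
const publicSig = twilioSignature("token", "https://voice.example.com/incoming", formParams);
const internalSig = twilioSignature("token", "http://voice-bridge/incoming", formParams);
console.log(publicSig === internalSig); // false: the URL is part of the signed data
```

A signature computed over the in-cluster URL can never match the one Twilio sent, which is why the middleware hardcodes the public hostname.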
imagePullPolicy: Always with a local image: k3s tries to pull from Docker Hub. Use Never after ctr images import.

wscat.

http://voice-bridge.default.svc.cluster.local:80 is the right ingress target.

CallSphere runs all 6 verticals on a k3s cluster (Hetzner) with Cloudflare Tunnel for ingress. Push to main does NOT auto-deploy; we rebuild via docker build, k3s ctr images import, then kubectl set image. This pattern handles ~50 concurrent calls per pod; horizontal scale comes from more pods, not bigger ones. Pricing covers infra; demo shows it live.
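As a rough sizing aid, the per-pod figure translates into a replica count as sketched below. The ~50 calls per pod comes from the paragraph above; the 20% headroom default and the function name are assumptions, so adjust them to your own load tests.

```ts
// Pods needed for a given peak, assuming ~50 concurrent calls per pod
// (the figure quoted above) and a hypothetical headroom fraction.
function podsNeeded(peakCalls: number, perPod = 50, headroom = 0.2): number {
  if (peakCalls <= 0) return 1; // keep one pod warm
  return Math.ceil((peakCalls * (1 + headroom)) / perPod);
}

console.log(podsNeeded(120)); // 120 * 1.2 = 144, and ceil(144 / 50) = 3
```

The result is the number to put in `replicas` on the voice-bridge Deployment.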
Why not LoadBalancer + EIP? A static EIP is $3.60/mo on AWS plus data transfer. Cloudflare Tunnel is free.
Latency cost of going through Cloudflare? ~20–40ms typically. Lower if your origin is near a Cloudflare PoP.
Can I run multiple pods? Yes — Twilio sticks to one WebSocket per call, but new calls are load-balanced across pods.
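Because a call keeps one WebSocket to one pod for its whole duration, per-call state can live in that pod's memory. A minimal sketch, assuming the Twilio Media Streams `start` and `stop` events identify the stream by `streamSid`; the class and field names are hypothetical.

```ts
interface CallSession {
  callSid: string;
  startedAt: number;
}

// In-memory registry: workable only because a call never moves between pods.
class SessionRegistry {
  private sessions = new Map<string, CallSession>();

  // Called on the Media Streams "start" event.
  open(streamSid: string, callSid: string, now = Date.now()): void {
    this.sessions.set(streamSid, { callSid, startedAt: now });
  }

  // Called on "stop" or when the socket closes; returns true if it existed.
  close(streamSid: string): boolean {
    return this.sessions.delete(streamSid);
  }

  // Current concurrent calls on this pod, e.g. for a metrics endpoint.
  get active(): number {
    return this.sessions.size;
  }
}
```

State held this way is lost if the pod restarts, so anything that must survive a restart belongs in an external store.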
TLS termination? Cloudflare handles it. Inside the cluster runs HTTP — fine because the only ingress is the tunnel.
Written by
Sagar Shankaran· Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.