TL;DR: Fly.io is one of the simplest ways to run a voice agent within roughly 50ms of users near each region you deploy to. Add a Dockerfile, describe the service in fly.toml, run fly deploy, and scale your Pipecat bot into the regions you need.

What you'll build

A Pipecat-based voice agent containerized with Docker, deployed across iad, fra, syd. Fly's Anycast routes each user to the nearest healthy machine; the fly-replay header keeps WebRTC sessions sticky to one region for the duration of a call.

Prerequisites

flyctl CLI installed and authenticated (fly auth login).
Pipecat 0.0.50+ (pip install "pipecat-ai[daily,openai,deepgram,cartesia]").
OPENAI_API_KEY, DEEPGRAM_KEY, CARTESIA_KEY, and DAILY_API_KEY (Daily is Pipecat's default transport) saved as Fly secrets.
Docker for local builds.
A fly.toml and a Dockerfile.

Architecture

flowchart LR
  Ucs[US user] --> A[Anycast]
  Ufr[EU user] --> A
  Uau[AU user] --> A
  A --> RIad[Machine in iad]
  A --> RFra[Machine in fra]
  A --> RSyd[Machine in syd]
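As a rough mental model of the routing in the diagram, the sketch below picks the lowest-latency region among those that are still healthy. It is only an illustration: `pick_region`, the RTT figures, and the health flags are made up for this example, and on Fly the proxy makes this decision for you.

```python
from typing import Mapping, Optional


def pick_region(rtt_ms: Mapping[str, float], healthy: Mapping[str, bool]) -> Optional[str]:
    """Return the healthy region with the lowest round-trip time, or None."""
    candidates = {r: ms for r, ms in rtt_ms.items() if healthy.get(r, False)}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Illustrative numbers for a user in Europe (not measurements).
eu_user_rtt = {"iad": 95.0, "fra": 12.0, "syd": 280.0}

print(pick_region(eu_user_rtt, {"iad": True, "fra": True, "syd": True}))   # fra
print(pick_region(eu_user_rtt, {"iad": True, "fra": False, "syd": True}))  # iad
```

When fra is marked unhealthy, the same user falls back to the next-closest region rather than failing.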

Step 1 — Pipecat bot

bot.py:

```python
import asyncio
import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia import CartesiaTTSService
from pipecat.services.deepgram import DeepgramSTTService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport


async def main(room_url, token):
    # Join the Daily room as a participant named "CallSphere".
    transport = DailyTransport(
        room_url,
        token,
        "CallSphere",
        DailyParams(audio_in_enabled=True, audio_out_enabled=True),
    )

    # Speech-to-text, LLM, and text-to-speech services, keyed from the environment.
    stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_KEY"))
    llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o-mini")
    # Note: CartesiaTTSService usually also needs a voice_id argument for the voice you choose.
    tts = CartesiaTTSService(api_key=os.getenv("CARTESIA_KEY"))

    # Audio in -> transcript -> LLM reply -> synthesized audio out.
    pipeline = Pipeline([transport.input(), stt, llm, tts, transport.output()])
    await PipelineRunner().run(PipelineTask(pipeline))


if __name__ == "__main__":
    asyncio.run(main(os.environ["ROOM_URL"], os.environ["TOKEN"]))
```
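Because bot.py reads every key and the room details from the environment, a missing variable only shows up once a call is already in progress. A minimal sketch of a startup check follows; `missing_vars` and `check_env` are hypothetical helpers rather than part of Pipecat, and the variable names are the ones bot.py reads.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from bot.py.
REQUIRED_VARS = ("DEEPGRAM_KEY", "OPENAI_API_KEY", "CARTESIA_KEY", "ROOM_URL", "TOKEN")


def missing_vars(required: Iterable[str], env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or empty, in order."""
    return [name for name in required if not env.get(name)]


def check_env(env: Mapping[str, str] = os.environ) -> None:
    """Raise a readable error instead of failing mid-call on a missing key."""
    missing = missing_vars(REQUIRED_VARS, env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `check_env()` as the first line of the `__main__` block would make a misconfigured machine fail fast and show up in the logs.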

Step 2 — Tiny HTTP front

server.py exposes a route that spawns a bot subprocess per call (Fly machines can fork):

```python
import os
import subprocess
import uuid

from fastapi import FastAPI, Request

app = FastAPI()

@app.post("/start")
async def start(req: Request):
    body = await req.json()
    # Pass the room details to the bot through its environment.
    env = os.environ | {"ROOM_URL": body["room_url"], "TOKEN": body["token"]}
    subprocess.Popen(["python", "bot.py"], env=env)
    return {"id": str(uuid.uuid4())}


@app.get("/healthz")
def healthz():
    return {"ok": True}
```
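The /start route returns an id that is not tied to the process it launched, so finished bots are never reaped or counted. One possible way to track them is sketched below; `BotRegistry` is a hypothetical helper, not part of FastAPI or Pipecat, and the demo spawns a trivial child process in place of bot.py.

```python
import subprocess
import sys
import uuid
from typing import Dict, List, Mapping, Optional, Sequence


class BotRegistry:
    """Tracks one child process per call so it can be looked up and reaped."""

    def __init__(self) -> None:
        self._procs: Dict[str, subprocess.Popen] = {}

    def spawn(self, cmd: Sequence[str], env: Optional[Mapping[str, str]] = None) -> str:
        """Start a child process and return the call id it is stored under."""
        call_id = str(uuid.uuid4())
        self._procs[call_id] = subprocess.Popen(list(cmd), env=env)
        return call_id

    def reap(self) -> List[str]:
        """Drop finished processes and return their call ids."""
        done = [cid for cid, proc in self._procs.items() if proc.poll() is not None]
        for cid in done:
            del self._procs[cid]
        return done

    def active(self) -> int:
        """Number of calls whose process has not been reaped yet."""
        return len(self._procs)


registry = BotRegistry()
# Stand-in for ["python", "bot.py"]: a child that exits immediately.
call_id = registry.spawn([sys.executable, "-c", "pass"])
```

In the server, `start` could return the id from `registry.spawn(...)`, and a periodic task could call `registry.reap()` so `active()` reflects live calls on that machine.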

Step 3 — Dockerfile

```dockerfile
FROM python:3.12-slim

RUN apt-get update && apt-get install -y ffmpeg && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 7860
CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "7860"]
```

Step 4 — fly.toml

```toml
app = "callsphere-voice-fly"
primary_region = "iad"

[build]
dockerfile = "Dockerfile"

[http_service]
internal_port = 7860
force_https = true
auto_stop_machines = "stop"
auto_start_machines = true
min_machines_running = 1
processes = ["app"]

[[vm]]
cpu_kind = "shared"
cpus = 2
memory = "2gb"

[deploy]
strategy = "rolling"
```

Deploy, then add machines in each region:

```bash
fly deploy
fly scale count 2 --region iad
fly scale count 1 --region fra
fly scale count 1 --region syd
```
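If the region list grows, it can be easier to keep the counts in one place and generate the commands. A small sketch, assuming the same regions and counts as above; `scale_commands` is a hypothetical helper that only builds argument lists and does not call flyctl itself.

```python
import shlex
from typing import Dict, List

# Machine counts per region, matching the commands above.
REGION_COUNTS: Dict[str, int] = {"iad": 2, "fra": 1, "syd": 1}


def scale_commands(counts: Dict[str, int]) -> List[List[str]]:
    """Build one `fly scale count` argument list per region."""
    commands = []
    for region, count in counts.items():
        if count < 0:
            raise ValueError(f"negative machine count for {region}")
        commands.append(["fly", "scale", "count", str(count), "--region", region])
    return commands


for cmd in scale_commands(REGION_COUNTS):
    # Prints the command; pass cmd to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```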

Step 5 — Sticky sessions with fly-replay

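The WebRTC SDP exchange must reach the machine that holds the session. If a request lands in iad but the session lives in syd, return the fly-replay header so Fly's proxy replays the request in the right region. With more than one machine in a region, region= alone does not pin a specific machine; fly-replay also accepts instance=<machine id> for that. The handler below replays by region: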

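```python
from fastapi import Response


@app.post("/sdp")
async def sdp(req: Request):
    sid = req.headers.get("X-Session-Id")
    if sid is None:
        return Response(status_code=400)  # no session id, nothing to route
    region = lookup_region(sid)  # your KV lookup: session id -> owning region
    if region != os.environ["FLY_REGION"]:
        # Ask Fly's proxy to replay this request in the owning region.
        return Response(status_code=204, headers={"fly-replay": f"region={region}"})
    return handle_sdp(await req.json())  # your SDP offer/answer handler
```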
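The handler leaves `lookup_region` to you. A minimal sketch of that lookup and the replay decision follows, using an in-memory dict as a stand-in for the KV store. `remember_session` and `replay_headers` are hypothetical names, and a dict only works for a single machine: a real multi-region deployment needs a store that every region can read.

```python
from typing import Dict, Optional

# In-memory stand-in for the shared KV store. Each machine has its own memory,
# so this is for illustration only.
_session_regions: Dict[str, str] = {}


def remember_session(session_id: str, region: str) -> None:
    """Record which region owns a session (call this when the call starts)."""
    _session_regions[session_id] = region


def lookup_region(session_id: Optional[str]) -> Optional[str]:
    """Return the owning region, or None for unknown or missing session ids."""
    if session_id is None:
        return None
    return _session_regions.get(session_id)


def replay_headers(session_id: Optional[str], current_region: str) -> Optional[Dict[str, str]]:
    """Return fly-replay headers if the session lives elsewhere, else None."""
    owner = lookup_region(session_id)
    if owner is None or owner == current_region:
        return None  # handle locally, or reject unknown sessions upstream
    return {"fly-replay": f"region={owner}"}


remember_session("abc123", "syd")
print(replay_headers("abc123", "iad"))  # {'fly-replay': 'region=syd'}
print(replay_headers("abc123", "syd"))  # None
```

Returning None for unknown sessions avoids emitting a header such as region=None, which the proxy could not act on.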

Step 6 — Set secrets

```bash
fly secrets set OPENAI_API_KEY=sk-...
fly secrets set DEEPGRAM_KEY=...
fly secrets set CARTESIA_KEY=...
fly secrets set DAILY_API_KEY=...
```

Common pitfalls

Forgetting auto_stop_machines = "stop": idle machines keep running and keep costing money.
Deploying without min_machines_running: the first call cold-starts in around 8s.
No fly-replay: WebRTC reconnects fail when a request is routed to a different region than the session.
Shared vs performance CPUs: voice with VAD wants cpu_kind = "performance", not "shared" as in the sample fly.toml.

How CallSphere does this in production

CallSphere's voice plane runs on a dedicated 72.62.162.83 box (k3s) for predictable latency, but we ship our affiliate dashboards (/affiliate, 22% commission) on Fly across 4 regions for low-latency partner UX. 37 agents, 90+ tools, 6 verticals — pricing $149/$499/$1499 with a 14-day trial.

FAQ

Why not just one region? EU users get 200ms RTT to US-east; voice falls apart over 250ms.

Cost for 3-region voice? ~$45/mo for the warm pool + outbound bandwidth.

Volume scaling? Use fly scale count per region, or auto_start_machines for traffic-driven scaling.

Can I use LiveKit? Yes — Daily and LiveKit both work on Fly.

Logs? fly logs streams from all regions.

Deploy a Voice Agent on fly.io with Multi-Region Routing
