By Sagar Shankaran, Founder of CallSphere
Ship a production voice agent in 5 minutes on Railway: FastAPI bridge, OpenAI Realtime, Postgres for sessions, and a one-click template. No Docker knowledge required.
Key takeaways
TL;DR: Railway gives you Postgres, a Python service, environment variables, and a public HTTPS URL with two clicks. Drop in a FastAPI WebSocket bridge between Twilio and OpenAI Realtime, push to GitHub, and Railway redeploys on every commit. Total time from git init to live voice agent: under 5 minutes.
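A FastAPI service hosted on Railway that:
- answers Twilio's voice webhook at /incoming
- bridges the /media WebSocket to OpenAI Realtime
- redeploys on every git push

Prerequisites: a Railway account (CLI: railway login), an OPENAI_API_KEY, and a Twilio number.

flowchart LR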
C[Caller] --> T[Twilio]
T -->|HTTP TwiML| RW[Railway FastAPI]
T -->|wss media| RW
RW <-->|wss| OAI[OpenAI Realtime]
RW -->|asyncpg| PG[(Railway Postgres)]
GH[GitHub repo] -->|push| RW
```python
import os, json, asyncio
import asyncpg
from fastapi import FastAPI, WebSocket, Request
from fastapi.responses import Response
# The asyncio client (websockets 13.0+) is the one that accepts additional_headers
from websockets.asyncio.client import connect

app = FastAPI()
pool: asyncpg.Pool


@app.on_event("startup")
async def startup():
    # Create the connection pool once and make sure the turns table exists
    global pool
    pool = await asyncpg.create_pool(os.environ["DATABASE_URL"])
    async with pool.acquire() as c:
        await c.execute("""create table if not exists turns (
            id serial primary key, call_sid text, role text,
            text text, ts timestamptz default now())""")


@app.post("/incoming")
async def incoming(req: Request):
    host = req.headers["host"]
    # TwiML that tells Twilio to open a bidirectional media stream to /media
    return Response(content=f"""<?xml version="1.0" encoding="UTF-8"?>
<Response><Connect><Stream url="wss://{host}/media" /></Connect></Response>""",
                    media_type="application/xml")


@app.websocket("/media")
async def media(ws: WebSocket):
    await ws.accept()
    async with connect(
        "wss://api.openai.com/v1/realtime?model=gpt-realtime",
        additional_headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "OpenAI-Beta": "realtime=v1",
        },
    ) as ai:
        # Twilio sends and expects 8 kHz mu-law audio, so use g711_ulaw both ways
        await ai.send(json.dumps({
            "type": "session.update",
            "session": {
                "instructions": "You are a concise voice agent.",
                "voice": "marin",
                "input_audio_format": "g711_ulaw",
                "output_audio_format": "g711_ulaw",
                "turn_detection": {"type": "server_vad"},
            },
        }))
        sid = ""

        async def to_ai():
            # Twilio -> OpenAI: remember the stream SID, forward audio frames
            nonlocal sid
            async for raw in ws.iter_text():
                ev = json.loads(raw)
                sid = ev.get("streamSid") or sid
                if ev.get("event") == "media":
                    await ai.send(json.dumps({
                        "type": "input_audio_buffer.append",
                        "audio": ev["media"]["payload"],
                    }))

        async def to_caller():
            # OpenAI -> Twilio: stream audio deltas back, log each finished turn
            async for raw in ai:
                ev = json.loads(raw)
                if ev["type"] == "response.audio.delta":
                    await ws.send_text(json.dumps({
                        "event": "media",
                        "streamSid": sid,
                        "media": {"payload": ev["delta"]},
                    }))
                elif ev["type"] == "response.done":
                    try:
                        text = ev["response"]["output"][0]["content"][0]["transcript"]
                    except (KeyError, IndexError):
                        continue  # response had no spoken output (e.g. cancelled)
                    async with pool.acquire() as c:
                        await c.execute(
                            "insert into turns(call_sid, role, text) values($1, 'assistant', $2)",
                            sid, text)

        await asyncio.gather(to_ai(), to_caller())
```
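The bridge is essentially a two-way translation between Twilio Media Streams frames and Realtime events. A minimal sketch of that mapping as pure functions follows, which makes it testable without a live call. The function names `twilio_to_openai` and `openai_to_twilio` are hypothetical and not part of the service code; the event names are the ones the bridge already uses.

```python
import json
from typing import Optional


def twilio_to_openai(raw: str) -> Optional[str]:
    """Map one Twilio frame to a Realtime client event, or None if it has no audio."""
    ev = json.loads(raw)
    if ev.get("event") != "media":
        return None  # connected/start/stop frames carry no audio payload
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": ev["media"]["payload"],  # base64 mu-law, passed through untouched
    })


def openai_to_twilio(raw: str, stream_sid: str) -> Optional[str]:
    """Map one Realtime server event to a Twilio media frame, or None if not audio."""
    ev = json.loads(raw)
    if ev.get("type") != "response.audio.delta":
        return None
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": ev["delta"]},
    })
```

Because both sides speak base64 g711_ulaw, the payload is forwarded as-is and no audio transcoding is needed in the bridge.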
requirements.txt and railway.toml
```
fastapi==0.115.0
uvicorn[standard]==0.32.0
websockets==13.1
asyncpg==0.30.0
```
```toml
[build]
builder = "NIXPACKS"

[deploy]
startCommand = "uvicorn app:app --host 0.0.0.0 --port $PORT"
restartPolicyType = "ON_FAILURE"
```
Railway's Nixpacks builder detects Python automatically; no Dockerfile needed.
In the Railway dashboard: New Project → Deploy from GitHub repo → pick the FastAPI repo. Add a Postgres plugin to the same project; Railway sets DATABASE_URL automatically.
In the service settings, add OPENAI_API_KEY. Railway redeploys on save.
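Since the service depends on both DATABASE_URL and OPENAI_API_KEY, it can help to fail fast at boot when either is missing rather than on the first call. A small sketch, assuming you call it at the top of the startup hook; `missing_vars` and `require_env` are hypothetical helper names.

```python
import os

# The two variables the service reads; extend the tuple if you add more.
REQUIRED_VARS = ("DATABASE_URL", "OPENAI_API_KEY")


def missing_vars(env=os.environ, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not env.get(name)]


def require_env(env=os.environ):
    """Raise at startup with a readable message if configuration is incomplete."""
    missing = missing_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

With this in place, a misconfigured deploy should show up as a clear error in the Railway deploy logs instead of a KeyError in the middle of a call.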
Railway generates a public URL such as https://your-app.up.railway.app. In Twilio, open your number's Voice Webhook setting and point it at POST https://your-app.up.railway.app/incoming.
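When Twilio POSTs to /incoming, the response has to be TwiML that points the call at the /media WebSocket. A minimal sketch of building that document from the request's host header follows; `stream_twiml` is a hypothetical helper name, and the `<Connect><Stream>` shape is Twilio's standard TwiML for bidirectional media streams.

```python
from xml.sax.saxutils import quoteattr


def stream_twiml(host: str, path: str = "/media") -> str:
    """Build TwiML that connects the call to a wss:// media stream on this host."""
    url = f"wss://{host}{path}"
    # quoteattr adds the surrounding quotes and escapes special characters
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f"<Response><Connect><Stream url={quoteattr(url)} /></Connect></Response>"
    )


print(stream_twiml("your-app.up.railway.app"))
```

Deriving the URL from the host header means the same code works on a Railway-generated domain and on a custom domain without a config change.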
Pin the Railway "OpenTelemetry" template, set OTEL_EXPORTER_OTLP_ENDPOINT to a Honeycomb/Tempo URL. Latency per turn shows up in spans automatically with the standard FastAPI OTel instrumentation.
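If you want a per-turn latency number independent of tracing, one lightweight option is to time the gap between the caller finishing speech and the first audio delta coming back. The sketch below does not use OpenTelemetry; `TurnTimer` is a hypothetical class, and it assumes server VAD is enabled so the Realtime API emits `input_audio_buffer.speech_stopped`.

```python
import time
from typing import Callable, Optional


class TurnTimer:
    """Measure ms from end of caller speech to the first audio delta of the reply."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter):
        self._clock = clock
        self._speech_end: Optional[float] = None

    def observe(self, event_type: str) -> Optional[float]:
        """Feed every Realtime event type; returns latency once per turn, else None."""
        if event_type == "input_audio_buffer.speech_stopped":
            self._speech_end = self._clock()
        elif event_type == "response.audio.delta" and self._speech_end is not None:
            latency_ms = (self._clock() - self._speech_end) * 1000
            self._speech_end = None  # ignore the remaining deltas of this turn
            return latency_ms
        return None
```

Calling `timer.observe(ev["type"])` inside the OpenAI-to-caller loop and logging any non-None result, or attaching it to a span attribute, would give one latency figure per turn.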
Bump replicas from 1 to N in the dashboard. Railway puts a load balancer in front. Each call's media stream is a single long-lived WebSocket connection, so it stays on the replica that accepted it for the duration of the call.
- Pro ($5/mo + usage) or pin always-on.
- asyncpg.create_pool(min_size=2, max_size=10).
- runtime.txt or PYTHON_VERSION=3.11 env var.

CallSphere doesn't run on Railway; we use bare k3s + Postgres on Hetzner for cost predictability at our scale (~$1k/mo infra for 6 verticals). For early-stage builders, Railway is the fastest way to ship a real voice agent with a real database. CallSphere's 37 agents, 90+ tools, 115+ DB tables, 6 verticals run on FastAPI :8084 with the same code patterns shown here. $149/$499/$1499, 14-day trial, 22% affiliate.
Q: Railway vs Render vs Fly?
Railway: easiest CLI + UI, Postgres bundled. Render: similar, slightly slower deploys. Fly: best for multi-region. Pick Railway for speed.
Q: Can I use a one-click template?
Yes — Railway's marketplace has Deploy OpenAI Voice Assistant and Deploy Faster Whisper templates that wire most of this for you.
Q: Latency?
Railway runs in us-west and us-east; expect roughly 750 ms voice-to-voice with Twilio and OpenAI both on the East Coast.
Q: Cost at 100k call-min/month?
Compute ~$30, Postgres ~$10, OpenAI Realtime ~$30k. Infra is a rounding error, so pick whatever keeps you productive.
Q: HIPAA?
Railway doesn't sign BAAs as of May 2026. For HIPAA, run on AWS/GCP/Azure with their BAA.
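The cost answer above is easy to sanity-check with a few lines of arithmetic. The figures are the article's own rough estimates for 100k call-minutes per month; only the variable names are made up for this sketch.

```python
CALL_MINUTES = 100_000

# Approximate monthly costs in USD, taken from the FAQ estimate
MONTHLY_USD = {"compute": 30, "postgres": 10, "openai_realtime": 30_000}

total = sum(MONTHLY_USD.values())
infra = MONTHLY_USD["compute"] + MONTHLY_USD["postgres"]
infra_share = infra / total
per_minute = MONTHLY_USD["openai_realtime"] / CALL_MINUTES

print(f"total per month:     ${total:,}")
print(f"infra share of bill: {infra_share:.2%}")
print(f"model cost per min:  ${per_minute:.2f}")
```

At these estimates hosting is well under 1% of the bill, which is the sense in which infra is a rounding error next to model usage.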
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.