Build a Bun + Hono + OpenAI Realtime Voice Agent on the Edge (2026)
Bun 1.3 + Hono is 2x faster than Node + Express for WebSocket relays. Wire it to gpt-realtime-2 and deploy to Fly.io edge for sub-500ms voice-to-voice in 6 regions.
TL;DR — Bun 1.3 starts in ~25ms cold, Hono is ~14kB, and OpenAI's gpt-realtime-2 (introduced 2026) gives you GPT-5-class reasoning over voice. Combined: a single `bun run server.ts` ships a 6-region edge voice agent.
What you'll build
A WebSocket relay between browser PCM and OpenAI Realtime, deployed to 6 Fly.io regions with anycast. Browsers get routed to the nearest edge, p95 voice-to-voice ~480ms.
Prerequisites
- Bun 1.3+, `hono@^4.6`.
- Fly.io CLI (`brew install flyctl`) and an OpenAI key.
- Domain on Cloudflare for TLS pass-through.
Architecture
```mermaid
flowchart LR
  BR[Browser] --> CF[Cloudflare anycast]
  CF --> FY[Fly edge nearest of 6]
  FY -- WS --> H[Hono relay on Bun]
  H -- WS --> OA[OpenAI Realtime gpt-realtime-2]
```
Step 1 — server.ts
```ts
import { Hono } from "hono";
import { createBunWebSocket } from "hono/bun";

// Hono's Bun adapter: upgradeWebSocket for routes, websocket for Bun.serve
const { upgradeWebSocket, websocket } = createBunWebSocket();
const app = new Hono();
const OPENAI_URL = "wss://api.openai.com/v1/realtime?model=gpt-realtime-2";

app.get("/ws", upgradeWebSocket(() => {
  let oa: WebSocket;
  return {
    onOpen: (_e, ws) => {
      // Bun's WebSocket accepts custom headers (a non-standard extension),
      // hence the cast.
      oa = new WebSocket(OPENAI_URL, {
        headers: {
          Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
          "OpenAI-Beta": "realtime=v1",
        },
      } as any);
      oa.onopen = () =>
        oa.send(JSON.stringify({
          type: "session.update",
          session: {
            voice: "verse",
            turn_detection: { type: "semantic_vad" },
          },
        }));
      // Relay upstream audio/events straight back to the browser.
      oa.onmessage = (m) => ws.send(m.data);
    },
    onMessage: (e) => {
      if (oa?.readyState === WebSocket.OPEN) oa.send(e.data);
    },
    onClose: () => oa?.close(),
  };
}));

export default { port: 8787, fetch: app.fetch, websocket };
```
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
Step 2 — Dockerfile
```dockerfile
FROM oven/bun:1.3-alpine
WORKDIR /app
COPY bun.lockb package.json ./
RUN bun install --frozen-lockfile
COPY . .
EXPOSE 8787
CMD ["bun", "run", "server.ts"]
```
Step 3 — fly.toml
```toml
app = "voice-edge"
primary_region = "iad"

[build]

[http_service]
  internal_port = 8787
  force_https = true

[deploy]
  strategy = "rolling"
```
`fly deploy`, then `fly regions add lhr nrt syd fra gru` for 6 edges.
Step 4 — Browser PCM
Use a 24kHz AudioWorklet to capture PCM16 chunks every 20ms and forward them as base64-wrapped `input_audio_buffer.append` events.
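The append step can be sketched as a small helper. This assumes the Realtime API's `input_audio_buffer.append` event shape (`{ type, audio }`) and encodes little-endian PCM16 bytes with `btoa`, which is global in browsers and Bun; the AudioWorklet capture itself is omitted.

```typescript
// Wrap one 20ms PCM16 chunk (an Int16Array from your AudioWorklet) into the
// base64 "input_audio_buffer.append" event the relay forwards to OpenAI.
// Field names are assumptions; check the current Realtime API docs.
function pcmChunkToAppendEvent(chunk: Int16Array): string {
  // View the samples as raw little-endian bytes (PCM16LE).
  const bytes = new Uint8Array(chunk.buffer, chunk.byteOffset, chunk.byteLength);
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return JSON.stringify({
    type: "input_audio_buffer.append",
    audio: btoa(binary),
  });
}

// Usage, inside your worklet's message handler:
// ws.send(pcmChunkToAppendEvent(chunkFromWorklet));
```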
Step 5 — gpt-realtime-2 reasoning
The 2026 gpt-realtime-2 model handles complex multi-tool calls in one turn. Set `session.instructions` with up to 8K tokens of policy without hurting latency.
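A `session.update` carrying long instructions plus a tool might look like this. The tool schema follows the Realtime API's function-tool shape, and `lookupOrder` is a hypothetical tool name for illustration; verify the field names against the current docs.

```typescript
// Hypothetical session.update payload: policy instructions plus one tool.
const sessionUpdate = {
  type: "session.update",
  session: {
    // Up to ~8K tokens of policy can live here.
    instructions: "You are a voice agent for order support. Confirm the caller's order ID before answering.",
    tools: [
      {
        type: "function",
        name: "lookupOrder", // hypothetical tool
        description: "Fetch an order's status by ID.",
        parameters: {
          type: "object",
          properties: { orderId: { type: "string" } },
          required: ["orderId"],
        },
      },
    ],
  },
};

// Send it once the upstream socket opens:
// oa.send(JSON.stringify(sessionUpdate));
```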
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
Step 6 — Anycast tip
Cloudflare's *.callsphere.ai → voice-edge.fly.dev with proxied = false (DNS-only) so WebSocket round-trips bypass the proxy.
Pitfalls
- Bun's WS spec drift: some `ws.send` overloads differ from Node — test on Bun, not just locally.
- Fly cold start: set `min_machines_running = 1` per region to keep voice cold-starts <50ms.
- gpt-realtime-2 cost: audio in/out roughly $0.07/$0.27 per minute (estimate; check current pricing).
How CallSphere does this in production
CallSphere's edge voice fleet handles 1.2M+ minutes/month across 6 verticals with 37 agents and 90+ tools. Healthcare (FastAPI), OneRoof (Next.js 16 + React 19), Salon (NestJS 10 + Prisma), Sales (Node.js 20 + React 18 + Vite). All voice flows route through a Bun + Hono relay. $149/$499/$1,499, 14-day trial, 22% affiliate.
FAQ
Why not Node? Bun's WebSocket implementation is ~2x faster on raw throughput.
Cloudflare Workers? Workers cap WS connections at 6 hours and have no persistent state — Fly + Bun is simpler.
TURN servers? WebSocket relays don't need them; only WebRTC direct does.
Cost? Fly: ~$5/region/month for 256MB shared CPU. OpenAI: ~$0.20-$0.30/min.
Sources
- Hono on Bun WebSockets - https://hono.dev/helpers/websocket
- QuotyAI - Bun + Hono backend - https://quotyai.com/blog/why-i-picked-bun-and-hono/
- OpenAI gpt-realtime - https://openai.com/index/introducing-gpt-realtime/
- StartupHub - GPT-Realtime-2 2026 - https://www.startuphub.ai/ai-news/artificial-intelligence/2026/openai-s-new-voice-api-models
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.