By Sagar Shankaran, Founder of CallSphere
Hono and the OpenAI Realtime API: wire Hono's WebSocket helpers, the OpenAI Realtime API, and the Bun runtime into a sub-700ms voice agent. Real TypeScript code, deploy targets, and pitfalls.
Key takeaways
TL;DR: Hono lets you ship a one-file WebSocket relay between a browser and the OpenAI Realtime API. With gpt-realtime ($32/M audio-in, $64/M audio-out as of late 2025) you can hit ~600-800ms voice-to-voice on a single Bun process. Hono's edge-friendly routing means the same code runs on Cloudflare Workers, Vercel Edge, Deno Deploy, or Node 22.
A TypeScript backend that serves a static HTML mic page and exposes a /realtime WebSocket. Browser audio (PCM16 24kHz) is forwarded to OpenAI Realtime; model audio + transcripts are streamed back. Tool calls (e.g. book_appointment) are handled server-side and the result is fed back into the same session.
hono@^4.6, @hono/node-ws (Node) or built-in Bun WS.
gpt-realtime (GA from Aug 2025).
getUserMedia (Chrome 120+, Safari 17+).
flowchart LR
BR[Browser mic] -- WS PCM16 --> H[Hono /realtime]
H -- WS gpt-realtime --> OA[OpenAI Realtime API]
OA -- audio.delta --> H --> BR
OA -- response.function_call --> H
H -- tool result --> OA
```ts
import { Hono } from "hono";
// `websocket` is Bun's handler object; older Hono 4.x releases expose it via createBunWebSocket().
import { upgradeWebSocket, websocket } from "hono/bun";

const app = new Hono();
const OPENAI_WS = "wss://api.openai.com/v1/realtime?model=gpt-realtime";

// Serve the mic page; /client.js holds the browser code.
app.get("/", (c) => c.html(`<script type="module" src="/client.js"></script>`));

app.get(
  "/realtime",
  upgradeWebSocket(() => {
    // One upstream socket per browser connection, held in this closure.
    let oa: WebSocket | undefined;
    return {
      onOpen: (_e, ws) => {
        // Bun's WebSocket client accepts custom headers (a non-standard extension).
        oa = new WebSocket(OPENAI_WS, {
          headers: {
            Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
            "OpenAI-Beta": "realtime=v1",
          },
        } as any);
        oa.onmessage = (m) => ws.send(m.data); // OpenAI -> browser
      },
      onMessage: (e) => oa?.send(e.data as any), // browser -> OpenAI
      onClose: () => oa?.close(),
    };
  }),
);

export default { port: 8787, fetch: app.fetch, websocket };
```
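One gap in the relay above: the browser can start sending frames before the upstream OpenAI socket has finished connecting. A minimal sketch of a bounded queue that holds messages until the upstream is open might look like this. `createBufferedSender` and `SocketLike` are hypothetical names introduced here for illustration; they are not part of Hono or any OpenAI library.

```typescript
// Minimal shape of a WebSocket that this helper needs (hypothetical interface).
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

const OPEN = 1; // numeric value of WebSocket.OPEN

// Returns a sender that queues messages while the socket is not open.
// Call flush() from the upstream socket's onopen handler.
function createBufferedSender(socket: SocketLike, maxQueued = 256) {
  const queue: string[] = [];
  return {
    send(data: string): void {
      if (socket.readyState === OPEN) {
        socket.send(data);
      } else if (queue.length < maxQueued) {
        queue.push(data); // hold until the upstream connects
      }
      // Beyond maxQueued, frames are dropped to bound memory use.
    },
    flush(): void {
      while (queue.length > 0 && socket.readyState === OPEN) {
        socket.send(queue.shift()!);
      }
    },
    get queued(): number {
      return queue.length;
    },
  };
}
```

In the relay, this would mean calling `sender.send(...)` from `onMessage` and `sender.flush()` from `oa.onopen`. The cap of 256 queued frames is an arbitrary assumption; tune it to your audio chunk size.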
```ts
oa.onopen = () =>
  oa.send(JSON.stringify({
    type: "session.update",
    session: {
      voice: "alloy",
      input_audio_transcription: { model: "gpt-4o-mini-transcribe" },
      turn_detection: { type: "server_vad", threshold: 0.55 },
      tools: [{
        type: "function",
        name: "book_appointment",
        description: "Book a slot",
        parameters: { type: "object", properties: { iso: { type: "string" } } },
      }],
    },
  }));
```
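```ts
// Capture mic audio at 24 kHz to match the PCM16 24kHz format forwarded to the Realtime API.
const ctx = new AudioContext({ sampleRate: 24000 });
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const src = ctx.createMediaStreamSource(stream);
// pcm-worklet.js must register a processor named "pcm" that posts PCM16 ArrayBuffers.
await ctx.audioWorklet.addModule("/pcm-worklet.js");
const node = new AudioWorkletNode(ctx, "pcm");
src.connect(node);
// Use wss:// instead when the page is served over HTTPS.
const ws = new WebSocket(`ws://${location.host}/realtime`);
// Base64-encode each PCM chunk and append it to the model's input buffer.
node.port.onmessage = (e) => ws.readyState === 1 && ws.send(JSON.stringify({
  type: "input_audio_buffer.append",
  audio: btoa(String.fromCharCode(...new Uint8Array(e.data))),
}));
```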
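The client snippet loads `/pcm-worklet.js` but the article does not show it. The core of such a worklet is converting Web Audio's Float32 samples into 16-bit PCM and then base64-encoding the bytes. The sketch below shows only that conversion step, assuming the worklet posts Int16 buffers to the main thread. `floatToPcm16` and `toBase64` are hypothetical helper names, not part of any library.

```typescript
// Convert Web Audio float samples (range -1..1) to 16-bit signed PCM.
function floatToPcm16(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, input[i]));
    // Negative values scale to -32768, positive values to 32767.
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}

// Base64-encode raw bytes one character at a time, which avoids the
// argument-count limit that a large spread into String.fromCharCode can hit.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}
```

Inside an `AudioWorkletProcessor`, `floatToPcm16` would run on each input channel buffer in `process()`, with the resulting `buffer` posted through `this.port.postMessage`.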
```ts
oa.onmessage = async (m) => {
  const evt = JSON.parse(m.data.toString());
  if (evt.type === "response.function_call_arguments.done") {
    const args = JSON.parse(evt.arguments);
    const result = await db.book(args.iso);
    // Return the tool result to the same session, keyed by call_id.
    oa.send(JSON.stringify({
      type: "conversation.item.create",
      item: { type: "function_call_output", call_id: evt.call_id, output: JSON.stringify(result) },
    }));
    // Ask the model to continue now that it has the tool output.
    oa.send(JSON.stringify({ type: "response.create" }));
  }
  ws.send(m.data); // forward all events to browser
};
```
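The handler above has no failure path: if the arguments are malformed or the booking call throws, the model never receives a `function_call_output`. A minimal sketch of a dispatcher that always produces an output event, reporting errors as JSON, could look like this. `runTool` and `ToolHandler` are hypothetical names, and the sketch assumes the caller supplies the tool name alongside `call_id` and the raw argument string.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

interface FunctionCallOutputEvent {
  type: "conversation.item.create";
  item: { type: "function_call_output"; call_id: string; output: string };
}

// Runs the named tool and always resolves to a function_call_output event,
// so the session gets a reply even when parsing or the handler fails.
async function runTool(
  handlers: Record<string, ToolHandler>,
  name: string,
  callId: string,
  rawArgs: string,
): Promise<FunctionCallOutputEvent> {
  let output: unknown;
  try {
    // Own-property check so names like "toString" are not treated as tools.
    const handler = Object.prototype.hasOwnProperty.call(handlers, name)
      ? handlers[name]
      : undefined;
    if (!handler) throw new Error(`unknown tool: ${name}`);
    output = await handler(JSON.parse(rawArgs));
  } catch (err) {
    output = { error: err instanceof Error ? err.message : String(err) };
  }
  return {
    type: "conversation.item.create",
    item: { type: "function_call_output", call_id: callId, output: JSON.stringify(output) },
  };
}
```

The returned event can be passed to `oa.send(JSON.stringify(...))` and followed by `response.create`, as in the handler above, so the model can tell the caller that the booking failed instead of going silent.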
Build with bun build --target=bun src/index.ts, then deploy with fly deploy or wrangler deploy (Hono's WebSocket adapter ships for both). On Fly, run fly scale memory 512. On either platform, set OPENAI_API_KEY as a secret.
commit on server VAD: Don't send input_audio_buffer.commit when turn_detection: server_vad is set; the model commits automatically.
CallSphere runs 37 production agents across 6 verticals with 90+ tools and 115+ Postgres tables. The Healthcare stack (FastAPI), OneRoof real-estate (Next.js 16 + React 19), Salon (NestJS 10 + Prisma), and Sales (Node.js 20 + React 18 + Vite) all share a Hono-based realtime relay that handles 1.2M concurrent voice minutes/month with ~720ms p95 voice-to-voice. Pricing is $149/$499/$1,499 with a 14-day no-card trial and a 22% recurring affiliate.
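The server-VAD pitfall noted above can also be enforced in the relay, so that a client bug cannot double-commit the buffer. The sketch below is one way to filter browser events before forwarding them upstream. `shouldForward` is a hypothetical helper name, and dropping non-JSON frames is an assumption about how strict you want the relay to be.

```typescript
// Returns true when a browser-originated client event should be relayed upstream.
function shouldForward(rawEvent: string, serverVad: boolean): boolean {
  let type: unknown;
  try {
    type = (JSON.parse(rawEvent) as { type?: unknown }).type;
  } catch {
    return false; // not valid JSON (or not an object): drop it
  }
  if (typeof type !== "string") return false;
  // With server VAD the buffer is committed automatically, so a manual commit is redundant.
  if (serverVad && type === "input_audio_buffer.commit") return false;
  return true;
}
```

In the relay's `onMessage`, this would wrap the forward call, for example `if (shouldForward(String(e.data), true)) oa?.send(e.data)`.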
Why Hono over Express? Hono is ~14kb, runs on every JS runtime, and has first-class WebSocket helpers for Bun, Node, Workers, and Deno without code changes.
Can I use Node instead of Bun? Yes — swap hono/bun for @hono/node-ws. Bun is ~2x faster on cold start.
What's the cost per minute? gpt-realtime is ~$0.06/min audio in + $0.24/min audio out — call it ~$0.20/min for typical voice agent traffic.
Does WebRTC work too? Yes. For browser-direct WebRTC, mint an ephemeral key via /v1/realtime/sessions and skip the relay entirely.
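One way to read the cost answer above is as a talk-time-weighted blend of the two per-minute rates. The sketch below is a rough model only: the 20/80 split between caller and agent audio is an assumption for illustration, not a measured figure, and the model ignores text tokens and any caching discounts. `blendedCostPerMinute` is a hypothetical helper name.

```typescript
// Rough per-minute cost: each rate is weighted by the share of the minute it applies to.
function blendedCostPerMinute(
  audioInPerMin: number,   // USD per minute of input audio
  audioOutPerMin: number,  // USD per minute of output audio
  inputShare: number,      // fraction of each minute that is caller audio (0..1)
): number {
  if (inputShare < 0 || inputShare > 1) {
    throw new RangeError("inputShare must be between 0 and 1");
  }
  return audioInPerMin * inputShare + audioOutPerMin * (1 - inputShare);
}

// Using the rates from the FAQ and a hypothetical 20% caller / 80% agent split:
const estimate = blendedCostPerMinute(0.06, 0.24, 0.2);
console.log(estimate.toFixed(3)); // roughly 0.204 USD per minute
```

Under that assumed split the blend is about $0.20/min; a call where the caller does most of the talking would come out cheaper.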
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.