
How to Build a Next.js 14 Voice Demo with OpenAI Realtime + WebRTC

Mint an ephemeral OpenAI key from a Next.js Route Handler, connect via WebRTC from the browser, and ship a working voice demo to Vercel in one afternoon.

TL;DR — Don't ship your API key to the browser. Use a Next.js Route Handler to mint a 60-second ephemeral_key, then let the browser open a WebRTC peer connection straight to OpenAI. Audio capture, playback, and barge-in come for free with WebRTC.

What you'll build

A Next.js 14 (App Router) page with a single "Talk" button. Click it, grant microphone permission, and speak — the OpenAI Realtime model replies through WebRTC with sub-500ms latency on a good connection. Deploy to Vercel and the same code becomes a public voice demo.

Prerequisites

  1. Next.js 14+ (App Router), React 18.
  2. OpenAI API key with Realtime access.
  3. Node 20+ (no extra npm dependencies required for the core).
  4. Familiarity with React Server Components and Route Handlers.
  5. Browser supporting WebRTC + getUserMedia (everything modern).

Architecture

```mermaid
sequenceDiagram
  participant B as Browser
  participant N as Next.js (Route Handler)
  participant O as OpenAI Realtime
  B->>N: GET /api/realtime/session
  N->>O: POST /v1/realtime/sessions (Bearer key)
  O-->>N: { client_secret.value }
  N-->>B: ephemeral_key
  B->>O: SDP offer + Bearer ephemeral_key
  O-->>B: SDP answer
  B<<->>O: Audio (RTP) + DataChannel events
```

Step 1 — Route Handler that mints an ephemeral key

```ts
// app/api/realtime/session/route.ts
export async function GET() {
  const r = await fetch("https://api.openai.com/v1/realtime/sessions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY!}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-realtime-preview-2025-06-03",
      voice: "alloy",
      modalities: ["audio", "text"],
      instructions: "You are a CallSphere demo assistant. Be concise and warm.",
    }),
  });
  return Response.json(await r.json());
}
```

This returns { client_secret: { value, expires_at } } — the value is the short-lived bearer your browser will use.
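Before dialing WebRTC, it's worth validating that shape on the client. A minimal sketch (the extractEphemeralKey helper is our own, not part of any SDK):

```typescript
// Shape of the sessions response as described above; fields are optional here
// so a malformed response fails loudly instead of throwing deep inside WebRTC setup.
type SessionResponse = {
  client_secret?: { value?: string; expires_at?: number };
};

// Pull out the short-lived bearer, or throw with a useful message.
function extractEphemeralKey(json: SessionResponse): string {
  const value = json.client_secret?.value;
  if (!value) throw new Error("no client_secret.value in session response");
  return value;
}
```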

Step 2 — Client component with the talk button

```tsx
// app/page.tsx
"use client";
import { useRef, useState } from "react";

export default function Page() {
  const [active, setActive] = useState(false);
  const pcRef = useRef<RTCPeerConnection | null>(null);
  const audioRef = useRef<HTMLAudioElement | null>(null);

  async function start() {
    // 1. Get a short-lived key from our own Route Handler
    const { client_secret } = await fetch("/api/realtime/session").then((r) => r.json());
    const ephemeral = client_secret.value;

    const pc = new RTCPeerConnection();
    pcRef.current = pc;

    // 2. Play remote audio when the model's track arrives
    pc.ontrack = (e) => {
      audioRef.current!.srcObject = e.streams[0];
    };

    // 3. Send the mic track
    const ms = await navigator.mediaDevices.getUserMedia({ audio: true });
    pc.addTrack(ms.getAudioTracks()[0]);

    // 4. Data channel for JSON events (transcripts, tool calls, etc.)
    const dc = pc.createDataChannel("oai-events");
    dc.onmessage = (e) => console.log("event:", JSON.parse(e.data));

    // 5. SDP offer/answer handshake with the Realtime endpoint
    const offer = await pc.createOffer();
    await pc.setLocalDescription(offer);

    const sdpRes = await fetch(
      "https://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2025-06-03",
      {
        method: "POST",
        body: offer.sdp,
        headers: {
          Authorization: `Bearer ${ephemeral}`,
          "Content-Type": "application/sdp",
        },
      }
    );
    await pc.setRemoteDescription({ type: "answer", sdp: await sdpRes.text() });
    setActive(true);
  }

  // Minimal UI: one Talk button plus the element that plays remote audio
  return (
    <main>
      <button onClick={start} disabled={active}>
        Talk
      </button>
      <audio ref={audioRef} autoPlay />
    </main>
  );
}
```

Step 3 — Send a session.update via DataChannel

The default session config is fine, but you usually want to override the system prompt:

```ts
dc.onopen = () =>
  dc.send(
    JSON.stringify({
      type: "session.update",
      session: {
        instructions: "You are CallSphere. Always end with: would you like a demo?",
        turn_detection: { type: "server_vad", threshold: 0.5 },
      },
    })
  );
```
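Incoming DataChannel messages are JSON objects keyed by a type field, so a tiny dispatcher keeps the onmessage handler tidy. A sketch: the routeEvent helper and the event names used below are our own illustration; check the Realtime event reference for the full list.

```typescript
// Generic shape of a Realtime event: always a "type" field, plus type-specific data.
type RealtimeEvent = { type: string; [key: string]: unknown };

// Parse one raw DataChannel message and hand it to a matching handler, if any.
// Unknown event types are ignored rather than crashing the call; the parsed
// type is returned so callers can log it.
function routeEvent(
  raw: string,
  handlers: Record<string, (e: RealtimeEvent) => void>
): string {
  const evt = JSON.parse(raw) as RealtimeEvent;
  const handler = handlers[evt.type];
  if (handler) handler(evt);
  return evt.type;
}
```

Wired up in the component, this replaces the console.log: `dc.onmessage = (e) => routeEvent(e.data, myHandlers);`.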

Step 4 — Add a "hang up" button

```tsx
function stop() {
  pcRef.current?.getSenders().forEach((s) => s.track?.stop());
  pcRef.current?.close();
  pcRef.current = null;
  setActive(false);
}
```

Still reading? Stop comparing — try CallSphere live.

CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.

Step 5 — Deploy to Vercel

```bash
vercel --prod
```

Set OPENAI_API_KEY in your Vercel project settings. The Route Handler runs on either the Edge or Node.js runtime — both work. The deployment's public URL is your demo.
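If you want to pin the runtime rather than rely on the default, Next.js route segment config supports an exported runtime constant in the Route Handler file (shown here for Node.js):

```typescript
// In app/api/realtime/session/route.ts: Next.js reads this export to pick the
// runtime. "edge" also works for this handler, since it only calls fetch.
export const runtime = "nodejs";
```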

Common pitfalls

  • Sending API key to client: never. Always go through the Route Handler.
  • Ephemeral key expired: it lasts ~60s. Mint a fresh one per session.
  • Autoplay blocked: the <audio autoPlay> works only after a user gesture — your "Talk" button satisfies this.
  • Parsing errors on the /v1/realtime POST response: OpenAI returns the SDP answer with Content-Type: application/sdp (plain text, not JSON), so read it with res.text(), never res.json().
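For the expired-key pitfall, a small guard can run before dialing. A sketch, assuming expires_at is a Unix timestamp in seconds (matching the response shape from Step 1); mint a fresh key whenever it returns true:

```typescript
// True when the ephemeral key's expiry has passed. nowMs is injectable so the
// check is testable with a fixed clock; it defaults to the real clock.
function isKeyExpired(expiresAt: number, nowMs: number = Date.now()): boolean {
  return nowMs / 1000 >= expiresAt;
}
```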

How CallSphere does this in production

The public demo at /demo uses this exact pattern with per-industry prompts (Healthcare, Real Estate, Salon, Forex, Hospitality, Behavioral Health). The Real Estate "OneRoof" demo additionally connects to a Go gateway over NATS for tool calls — but the WebRTC handshake is the same Next.js code. See it live at /demo or start a 14-day trial.

FAQ

Is WebRTC faster than WebSocket? Yes, by 100–300ms typically — RTC handles the audio path natively without your code re-encoding chunks.

Can I record the conversation? Yes — pc.getSenders()[0].track gives you the local mic; pipe it to a MediaRecorder. Remote audio is the ontrack stream.

Does WebRTC work behind corporate firewalls? Mostly — you may need a TURN server. OpenAI's endpoint typically traverses NAT cleanly.

How do I add tools? Send a session.update with a tools array via DataChannel; handle response.function_call_arguments.done events.
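As a sketch of that first half, here is a session.update payload registering a single function tool. The envelope and tool shape (type, name, description, JSON Schema parameters) follow the Realtime API's function-tool format; the bookDemo tool itself is invented for illustration.

```typescript
// Build a session.update that registers one function tool. Send it over the
// DataChannel once it opens; the model can then emit function-call events.
function toolsSessionUpdate() {
  return {
    type: "session.update",
    session: {
      tools: [
        {
          type: "function",
          name: "bookDemo", // hypothetical example tool
          description: "Book a product demo for the caller",
          parameters: {
            type: "object",
            properties: { email: { type: "string", description: "Caller's email" } },
            required: ["email"],
          },
        },
      ],
    },
  };
}
// On the wire: dc.send(JSON.stringify(toolsSessionUpdate()));
```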


Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available, no signup required.