---
title: "How to Build a Next.js 14 Voice Demo with OpenAI Realtime + WebRTC"
description: "Mint an ephemeral OpenAI key from a Next.js Route Handler, connect via WebRTC from the browser, and ship a working voice demo to Vercel in one afternoon."
canonical: https://callsphere.ai/blog/vw1h-build-nextjs-14-voice-demo-webrtc-openai-realtime-ephemeral
category: "AI Voice Agents"
tags: ["Tutorial", "Build", "Next.js", "WebRTC", "OpenAI Realtime"]
author: "CallSphere Team"
published: 2026-03-24T00:00:00.000Z
updated: 2026-05-07T06:45:00.317Z
---

# How to Build a Next.js 14 Voice Demo with OpenAI Realtime + WebRTC

> Mint an ephemeral OpenAI key from a Next.js Route Handler, connect via WebRTC from the browser, and ship a working voice demo to Vercel in one afternoon.

> **TL;DR** — Don't ship your API key to the browser. Use a Next.js Route Handler to mint a 60-second `ephemeral_key`, then let the browser open a WebRTC peer connection straight to OpenAI. Audio capture, playback, and barge-in come for free with WebRTC.

## What you'll build

A Next.js 14 (App Router) page with a single "Talk" button. Click it, grant microphone permission, and speak — the OpenAI Realtime model replies through WebRTC with sub-500ms latency on a good connection. Deploy to Vercel and the same code becomes a public voice demo.

## Prerequisites

1. Next.js 14+ (App Router), React 18.
2. OpenAI API key with Realtime access.
3. Node 20+ and `npm install` (no extra deps required for the core).
4. Familiarity with React Server Components and Route Handlers.
5. Browser supporting WebRTC + getUserMedia (everything modern).

## Architecture

```mermaid
sequenceDiagram
  participant B as Browser
  participant N as Next.js (Route Handler)
  participant O as OpenAI Realtime
  B->>N: GET /api/realtime/session
  N->>O: POST /v1/realtime/sessions (Bearer key)
  O-->>N: { client_secret.value }
  N-->>B: ephemeral_key
  B->>O: SDP offer + Bearer ephemeral_key
  O-->>B: SDP answer
  Note over B,O: Audio (RTP) + DataChannel events
```

## Step 1 — Route Handler that mints an ephemeral key

```ts
// app/api/realtime/session/route.ts
export async function GET() {
  const r = await fetch("https://api.openai.com/v1/realtime/sessions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY!}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-realtime-preview-2025-06-03",
      voice: "alloy",
      modalities: ["audio", "text"],
      instructions: "You are a CallSphere demo assistant. Be concise and warm.",
    }),
  });
  return Response.json(await r.json());
}
```

This returns `{ client_secret: { value, expires_at } }` — the `value` is the short-lived bearer your browser will use.
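Rather than destructuring blindly on the client, you can fail loudly when the shape is off. This helper is our own sketch, not part of any SDK; the field names mirror the sessions response above:

```typescript
// Hypothetical helper: pull client_secret.value out of the /sessions JSON,
// throwing if the response does not have the expected shape.
type SessionResponse = {
  client_secret?: { value?: string; expires_at?: number };
};

function extractEphemeralKey(body: SessionResponse): string {
  const value = body.client_secret?.value;
  if (!value) throw new Error("session response missing client_secret.value");
  return value;
}
```

In the client's `start()` this would replace the bare destructuring: `const ephemeral = extractEphemeralKey(await res.json());`.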

## Step 2 — Client component with the talk button

```tsx
// app/page.tsx
"use client";
import { useRef, useState } from "react";

export default function Page() {
  const [active, setActive] = useState(false);
  const pcRef = useRef<RTCPeerConnection | null>(null);
  const audioRef = useRef<HTMLAudioElement | null>(null);

  async function start() {
    const { client_secret } = await fetch("/api/realtime/session").then((r) => r.json());
    const ephemeral = client_secret.value;

    const pc = new RTCPeerConnection();
    pcRef.current = pc;

    // Receive remote audio
    pc.ontrack = (e) => { audioRef.current!.srcObject = e.streams[0]; };

    // Send mic
    const ms = await navigator.mediaDevices.getUserMedia({ audio: true });
    pc.addTrack(ms.getAudioTracks()[0]);

    // Data channel for events
    const dc = pc.createDataChannel("oai-events");
    dc.onmessage = (e) => console.log("event:", JSON.parse(e.data));

    const offer = await pc.createOffer();
    await pc.setLocalDescription(offer);

    const sdpRes = await fetch(
      "https://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2025-06-03",
      {
        method: "POST",
        body: offer.sdp,
        headers: {
          Authorization: `Bearer ${ephemeral}`,
          "Content-Type": "application/sdp",
        },
      }
    );
    await pc.setRemoteDescription({ type: "answer", sdp: await sdpRes.text() });
    setActive(true);
  }

  return (
    <main>
      <button onClick={start} disabled={active}>
        {active ? "Live" : "Talk"}
      </button>
      <audio ref={audioRef} autoPlay />
    </main>
  );
}
```

## Step 3 — Send a session.update via DataChannel

The default session config is fine, but you usually want to override the system prompt:

```ts
dc.onopen = () => dc.send(JSON.stringify({
  type: "session.update",
  session: {
    instructions: "You are CallSphere. Always end with: would you like a demo?",
    turn_detection: { type: "server_vad", threshold: 0.5 },
  },
}));
```
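If you send several of these updates, a small builder keeps the payload consistent. This is our own sketch; the event and field names follow the snippet above, while the 0.5 VAD threshold default is an assumption, not an API requirement:

```typescript
// Sketch of a typed builder for session.update events.
// The default threshold of 0.5 is an assumed value, not an API default.
type TurnDetection = { type: "server_vad"; threshold: number };

function buildSessionUpdate(
  instructions: string,
  turnDetection: TurnDetection = { type: "server_vad", threshold: 0.5 }
): string {
  return JSON.stringify({
    type: "session.update",
    session: { instructions, turn_detection: turnDetection },
  });
}
```

Usage: `dc.onopen = () => dc.send(buildSessionUpdate("You are CallSphere. Always end with: would you like a demo?"));`.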

## Step 4 — Add a "hang up" button

```tsx
function stop() {
  pcRef.current?.getSenders().forEach((s) => s.track?.stop());
  pcRef.current?.close();
  pcRef.current = null;
  setActive(false);
}
```

## Step 5 — Deploy to Vercel

```bash
vercel --prod
```

Set `OPENAI_API_KEY` in your Vercel project settings. The Route Handler works on both the Edge and Node.js runtimes. The resulting public URL is your demo.
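If you prefer the CLI to the dashboard, the same variable can be set from the terminal (assuming the Vercel CLI is installed and linked to this project):

```shell
# Store the server-side key as a Vercel environment variable;
# it is only ever read inside the Route Handler, never shipped to the browser.
vercel env add OPENAI_API_KEY production
```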

## Common pitfalls

- **Sending API key to client**: never. Always go through the Route Handler.
- **Ephemeral key expired**: it lasts ~60s. Mint a fresh one per session.
- **Autoplay blocked**: the `<audio autoplay>` element works only after a user gesture — your "Talk" button satisfies this.
- **Parsing the `/v1/realtime` POST response**: OpenAI returns the SDP answer with `Content-Type: application/sdp`; read it with `res.text()`, not `res.json()`.
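For the expiry pitfall, a tiny guard lets the client decide whether to mint a fresh key before reconnecting. This helper is our own sketch and assumes `expires_at` is a unix timestamp in seconds, as in the sessions response:

```typescript
// Assumed: expiresAt is unix seconds (matching the sessions response).
// Treat the key as stale a few seconds early so we never connect with a dying key.
function isKeyFresh(expiresAt: number, nowMs: number = Date.now(), marginS = 5): boolean {
  return expiresAt - marginS > nowMs / 1000;
}
```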

## How CallSphere does this in production

The public demo at `/demo` uses this exact pattern with per-industry prompts (Healthcare, Real Estate, Salon, Forex, Hospitality, Behavioral Health). The Real Estate "OneRoof" demo additionally connects to a Go gateway over NATS for tool calls — but the WebRTC handshake is the same Next.js code. [See it live at /demo](/demo) or [start a 14-day trial](/trial).

## FAQ

**Is WebRTC faster than WebSocket?** Yes, by 100–300ms typically — RTC handles the audio path natively without your code re-encoding chunks.

**Can I record the conversation?** Yes — `pc.getSenders()[0].track` gives you the local mic; pipe it to a `MediaRecorder`. Remote audio is the `ontrack` stream.

**Does WebRTC work behind corporate firewalls?** Mostly — you may need a TURN server. OpenAI's endpoint typically traverses NAT cleanly.
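If you do end up needing TURN, pass the servers when constructing the peer connection. The URLs and credentials below are placeholders, not real servers:

```typescript
// Placeholder ICE servers; swap in your own STUN/TURN deployment.
const rtcConfig = {
  iceServers: [
    { urls: "stun:stun.l.google.com:19302" },
    { urls: "turn:turn.example.com:3478", username: "demo", credential: "secret" },
  ],
};
// In start(): const pc = new RTCPeerConnection(rtcConfig);
```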

**How do I add tools?** Send a `session.update` with a `tools` array via DataChannel; handle `response.function_call_arguments.done` events.
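A minimal sketch of both halves, with a hypothetical `book_demo` tool (the event type names follow the Realtime API; the tool itself is invented for illustration):

```typescript
// 1) Register one tool via session.update (send this on dc.onopen).
const toolUpdate = JSON.stringify({
  type: "session.update",
  session: {
    tools: [{
      type: "function",
      name: "book_demo", // hypothetical tool
      description: "Book a product demo for the caller",
      parameters: {
        type: "object",
        properties: { email: { type: "string" } },
        required: ["email"],
      },
    }],
  },
});

// 2) Dispatch completed function calls arriving on the data channel.
function parseFunctionCall(raw: string): { name: string; args: unknown } | null {
  const evt = JSON.parse(raw);
  if (evt.type !== "response.function_call_arguments.done") return null;
  return { name: evt.name, args: JSON.parse(evt.arguments) };
}
```

Wire `parseFunctionCall` into `dc.onmessage`; when it returns non-null, run the tool and send the result back as a conversation item.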

## Sources

- [OpenAI Realtime WebRTC guide](https://platform.openai.com/docs/guides/realtime-webrtc)
- [Next.js Route Handlers](https://nextjs.org/docs/app/building-your-application/routing/route-handlers)
- [WebRTC RTCPeerConnection MDN](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection)
- [OpenAI Realtime sessions endpoint](https://platform.openai.com/docs/api-reference/realtime-sessions)

---

Source: https://callsphere.ai/blog/vw1h-build-nextjs-14-voice-demo-webrtc-openai-realtime-ephemeral
