Build an AI Voice Agent on the T3 Stack (Next.js + tRPC + Prisma, 2026)
create-t3-app gives you Next.js 15, TypeScript, Tailwind, tRPC v11, Prisma, and Auth.js v5 in one CLI. Add OpenAI Realtime and you have a typed voice agent in 90 minutes.
TL;DR —
`pnpm create t3-app@latest` scaffolds a Next.js 15 App Router project with tRPC v11, Prisma, Auth.js v5, and Tailwind. Bolt OpenAI Realtime + WebRTC ephemeral keys onto it and you ship a typed, authenticated voice agent in one afternoon.
What you'll build
A logged-in user opens /voice, the page mints an ephemeral OpenAI key from a tRPC procedure, the browser opens a WebRTC peer connection to OpenAI Realtime, and call transcripts are written to Postgres via Prisma — all type-safe.
Prerequisites
- Node 20+, pnpm 9
- `pnpm create t3-app@latest --noInstall` and pick: tRPC, Prisma, Tailwind, Auth.js
- An `OPENAI_API_KEY` and a Postgres `DATABASE_URL`
Architecture
```mermaid
flowchart TD
  U[User] --> NX[Next.js 15 App Router]
  NX --> TR[tRPC v11 - mintEphemeral]
  TR --> OA[POST /v1/realtime/sessions]
  OA --> NX
  NX -- WebRTC SDP --> RT[OpenAI Realtime]
  NX --> PR[Prisma transcripts]
```
Step 1 — Add the realtime procedure
```ts
// server/api/routers/voice.ts
import { z } from "zod";
import { protectedProcedure, createTRPCRouter } from "@/server/api/trpc";

export const voiceRouter = createTRPCRouter({
  mintEphemeral: protectedProcedure
    .input(z.object({ voice: z.enum(["alloy", "verse"]).default("alloy") }))
    .mutation(async ({ input }) => {
      const r = await fetch("https://api.openai.com/v1/realtime/sessions", {
        method: "POST",
        headers: {
          Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ model: "gpt-realtime", voice: input.voice }),
      });
      if (!r.ok) throw new Error(`Realtime session request failed: ${r.status}`);
      return (await r.json()) as { client_secret: { value: string } };
    }),
});
```
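The server hands the browser only the ephemeral key, so it pays to validate the response shape instead of trusting the blind `as` cast above. A minimal sketch; `parseSessionResponse` is a hypothetical helper, and the response shape is assumed from the cast in the snippet:

```typescript
// Hypothetical guard: narrow untyped session JSON to the shape the
// client code relies on, throwing early when the key is missing.
type SessionResponse = { client_secret: { value: string } };

function parseSessionResponse(json: unknown): SessionResponse {
  const cs = (json as { client_secret?: { value?: unknown } })?.client_secret;
  if (typeof cs?.value !== "string" || cs.value.length === 0) {
    throw new Error("Realtime session response missing client_secret.value");
  }
  return { client_secret: { value: cs.value } };
}
```

Call it on `await r.json()` before returning, so a malformed upstream response fails on the server rather than mid-handshake in the browser.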
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
Step 2 — Browser WebRTC handshake
```tsx
// app/voice/voice-call.tsx
"use client";
import { useRef } from "react";
import { api } from "@/trpc/react";

export function VoiceCall() {
  const audioEl = useRef<HTMLAudioElement>(null);
  const mint = api.voice.mintEphemeral.useMutation();

  async function start() {
    const { client_secret } = await mint.mutateAsync({ voice: "alloy" });
    const pc = new RTCPeerConnection();
    // Play the model's audio track as it arrives.
    pc.ontrack = (e) => (audioEl.current!.srcObject = e.streams[0]);
    const ms = await navigator.mediaDevices.getUserMedia({ audio: true });
    ms.getTracks().forEach((t) => pc.addTrack(t, ms));
    // Data channel for JSON events (transcripts, tool calls).
    const dc = pc.createDataChannel("oai-events");
    const offer = await pc.createOffer();
    await pc.setLocalDescription(offer);
    const ans = await fetch(
      "https://api.openai.com/v1/realtime?model=gpt-realtime",
      {
        method: "POST",
        body: offer.sdp,
        headers: {
          Authorization: `Bearer ${client_secret.value}`,
          "Content-Type": "application/sdp",
        },
      },
    );
    await pc.setRemoteDescription({ type: "answer", sdp: await ans.text() });
  }

  return (
    <>
      <audio ref={audioEl} autoPlay />
      <button onClick={start}>Start call</button>
    </>
  );
}
```
Step 3 — Persist transcripts
```prisma
model Transcript {
  id        String   @id @default(cuid())
  userId    String
  text      String
  role      String
  createdAt DateTime @default(now())
}
```
```ts
dc.addEventListener("message", async (e) => {
  const evt = JSON.parse(e.data);
  if (evt.type === "response.audio_transcript.done") {
    await api.voice.saveTurn.mutate({ text: evt.transcript, role: "assistant" });
  }
});
```
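The listener mixes event parsing with persistence; factoring the parsing into a pure helper keeps it testable. The assistant event name comes from the snippet above; the user-speech event (`conversation.item.input_audio_transcription.completed`, fired when input transcription is enabled) is an assumption about the Realtime API worth verifying against the docs:

```typescript
// Hypothetical helper: map a raw data-channel message to a transcript row,
// or null when the event carries no finished transcript.
type TranscriptRow = { text: string; role: "assistant" | "user" };

function extractTranscript(raw: string): TranscriptRow | null {
  const evt = JSON.parse(raw);
  // Finished assistant speech.
  if (evt.type === "response.audio_transcript.done") {
    return { text: evt.transcript, role: "assistant" };
  }
  // Finished user speech, if input transcription is turned on.
  if (evt.type === "conversation.item.input_audio_transcription.completed") {
    return { text: evt.transcript, role: "user" };
  }
  return null;
}
```

The listener then collapses to `const row = extractTranscript(e.data); if (row) await api.voice.saveTurn.mutate(row);`.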
Step 4 — Auth gate
Wrap the page in auth() — Auth.js v5 has stable App Router support. Unauthenticated users see a redirect; the tRPC procedure already uses protectedProcedure.
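The redirect decision itself is small enough to sketch as a pure function; the session shape and the sign-in path here are assumptions, so adapt them to your Auth.js config:

```typescript
// Hypothetical guard: given the result of auth(), decide whether the
// user stays on the voice page or gets redirected to sign in.
type Session = { user?: { id: string } } | null;

function gateVoicePage(session: Session): "/voice" | "/api/auth/signin" {
  return session?.user ? "/voice" : "/api/auth/signin";
}
```

In the App Router page, call `auth()`, pass the result through the guard, and `redirect()` when it doesn't return `/voice`.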
Step 5 — Deploy
vercel --prod works out of the box; set OPENAI_API_KEY and DATABASE_URL as environment variables. Use Vercel Postgres or Neon.
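Since a missing variable otherwise only surfaces on the first minted session, a fail-fast check at startup is cheap insurance. A sketch under the assumption that these two variables are the only required ones; `missingEnv` is a hypothetical helper:

```typescript
// Hypothetical startup check: list required variables that are unset,
// so deploys fail loudly instead of erroring on the first live call.
const REQUIRED_ENV = ["OPENAI_API_KEY", "DATABASE_URL"] as const;

function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((k) => !env[k]);
}
```

Run it against `process.env` in an instrumentation hook or at the top of the root layout and throw if the returned list is non-empty.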
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
Pitfalls
- WebRTC + corporate firewalls: some networks block UDP 3478. Fall back to WebSocket if `pc.iceConnectionState` stays `checking` for more than 5 s.
- Ephemeral key TTL: the default is 60 s, so mint the key right before `createOffer`.
- Auth.js v5 cookies: Edge middleware sometimes strips `__Secure-` cookies; set `useSecureCookies: process.env.NODE_ENV === "production"` explicitly.
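The firewall fallback boils down to a timer plus a state check. A sketch of just the decision logic, using the 5-second threshold from the pitfall above; the fallback transport itself (reconnecting over WebSocket) is left to your client:

```typescript
// Hypothetical fallback decision: true when ICE has been stuck in
// "checking" longer than the threshold, suggesting UDP is blocked.
type IceState =
  | "new" | "checking" | "connected" | "completed"
  | "failed" | "disconnected" | "closed";

function shouldFallbackToWebSocket(
  state: IceState,
  stuckMs: number,
  thresholdMs = 5_000,
): boolean {
  return state === "checking" && stuckMs > thresholdMs;
}
```

Wire it to a `setTimeout` started in `oniceconnectionstatechange` when the state enters `checking`, and clear the timer once it reaches `connected`.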
How CallSphere does this in production
CallSphere's platform combines T3-style typing across 37 agents, 90+ tools, 115+ DB tables, and 6 verticals. OneRoof (Next.js 16 + React 19) is the closest analog: Auth.js gates the realtime endpoint, Prisma persists transcripts, and tRPC carries every tool call. $149/$499/$1,499, 14-day no-card trial, 22% affiliate.
FAQ
Is T3 still relevant in 2026? Yes — best free TS full-stack starter, but consider T4 (T3 + Vercel AI SDK + RAG) for AI-heavy apps.
Can I swap Prisma for Drizzle? create-t3-app --drizzle is supported in 2026.
Why WebRTC over WebSocket? Browsers handle echo cancellation + jitter natively over WebRTC.
Where does the API key live? Server only — clients receive only ephemeral keys.
Sources
- Create T3 App - https://create.t3.gg/
- T3 Stack 2026 review - https://starterpick.com/blog/t3-stack-2026
- OpenAI Realtime WebRTC - https://developers.openai.com/api/docs/guides/realtime-webrtc
- T4 stack - https://www.sitepoint.com/t4-stack-nextjs-16-vercel-ai-sdk-local-rag-tutorial/
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.