AI Infrastructure

How to Build a Voice Agent Data Layer with PostgreSQL and Prisma

Design the schema for calls, turns, tool calls, transcripts, sentiment, and lead scoring. Real Prisma schema, indexes that matter, and query patterns that scale to 1M calls.

TL;DR — Don't dump JSON blobs in one table. Split call → turn → tool_call → transcript → analytics into normalized tables with the right indexes. CallSphere runs 115+ tables; this post shows the 6 that matter for voice.

What you'll build

A Prisma schema that captures every call, every turn, every tool invocation, plus post-call sentiment and lead scoring — and queries it in <50ms even at 1M calls. You'll see the indexes, the foreign keys, and the JSONB fields where they belong.

Prerequisites

  1. PostgreSQL 15+ recommended (anything 13+ works).
  2. npm install prisma @prisma/client.
  3. Familiarity with Prisma migrations.
  4. A voice agent already producing transcripts (see post 1 or 2).
  5. ~30 minutes for the first migration.

Architecture

```mermaid
erDiagram
  CALL ||--o{ TURN : contains
  CALL ||--o{ TOOL_CALL : invokes
  CALL ||--o| ANALYTICS : has
  TURN ||--o{ TRANSCRIPT_CHUNK : has
  AGENT ||--o{ CALL : handles
```

Step 1 — The Prisma schema

```prisma
// schema.prisma
generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

model Agent {
  id         String   @id @default(cuid())
  name       String
  vertical   String // healthcare, salon, real_estate, ...
  promptHash String
  voice      String
  createdAt  DateTime @default(now())
  calls      Call[]
}

model Call {
  id           String     @id @default(cuid())
  agentId      String
  agent        Agent      @relation(fields: [agentId], references: [id])
  fromNumber   String
  toNumber     String
  startedAt    DateTime   @default(now())
  endedAt      DateTime?
  durationMs   Int?
  status       CallStatus @default(in_progress)
  recordingUrl String?
  meta         Json?
  turns        Turn[]
  toolCalls    ToolCall[]
  analytics    Analytics?

  @@index([agentId, startedAt])
  @@index([fromNumber, startedAt])
  @@index([status])
}

enum CallStatus {
  in_progress
  completed
  failed
  transferred
}

model Turn {
  id        String   @id @default(cuid())
  callId    String
  call      Call     @relation(fields: [callId], references: [id], onDelete: Cascade)
  role      String // "user" | "assistant"
  text      String   @db.Text
  startedAt DateTime
  endedAt   DateTime
  audioUrl  String?

  @@index([callId, startedAt])
}

model ToolCall {
  id        String   @id @default(cuid())
  callId    String
  call      Call     @relation(fields: [callId], references: [id], onDelete: Cascade)
  name      String
  args      Json
  result    Json?
  errorText String?
  latencyMs Int
  invokedAt DateTime @default(now())

  @@index([callId, invokedAt])
  @@index([name, invokedAt])
}


model Analytics {
  callId     String   @id
  call       Call     @relation(fields: [callId], references: [id], onDelete: Cascade)
  sentiment  Float // -1.0 to 1.0
  leadScore  Int // 0-100
  intent     String
  entities   Json
  computedAt DateTime @default(now())

  @@index([leadScore])
  @@index([sentiment])
}
```

Step 2 — Migrate

```bash
npx prisma migrate dev --name init_voice_schema
```

Step 3 — Persist a call from your bridge

```ts
import { PrismaClient } from "@prisma/client";

const db = new PrismaClient();

// On Twilio "start" event
const call = await db.call.create({
  data: {
    agentId,
    fromNumber: from,
    toNumber: to,
    meta: { streamSid, callSid },
  },
});

// On every model turn
await db.turn.create({
  data: { callId: call.id, role: "assistant", text, startedAt, endedAt },
});

// On every tool execution
const t0 = Date.now();
const result = await runTool(name, args);
await db.toolCall.create({
  data: { callId: call.id, name, args, result, latencyMs: Date.now() - t0 },
});

// On call end
await db.call.update({
  where: { id: call.id },
  data: {
    endedAt: new Date(),
    durationMs: Date.now() - call.startedAt.getTime(),
    status: "completed",
  },
});
```

Step 4 — Compute analytics post-call

```ts
import OpenAI from "openai";

const oai = new OpenAI();

async function analyze(callId: string) {
  const turns = await db.turn.findMany({
    where: { callId },
    orderBy: { startedAt: "asc" },
  });
  const transcript = turns.map((t) => `${t.role}: ${t.text}`).join("\n");

  const r = await oai.chat.completions.create({
    model: "gpt-4o-mini",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "user",
        content: `Score this call. Return JSON {sentiment:-1..1, lead_score:0..100, intent:string, entities:object}.

${transcript}`,
      },
    ],
  });

  // Pull lead_score out before spreading, so the snake_case key from the
  // model never reaches Prisma (Analytics only knows leadScore).
  const { lead_score, ...rest } = JSON.parse(r.choices[0].message.content!);
  await db.analytics.create({ data: { callId, ...rest, leadScore: lead_score } });
}
```
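JSON mode guarantees parseable output, not sane values — a model can still return `sentiment: 3` or a string for `lead_score`. A hedged sketch of a sanitizer you could run before the `db.analytics.create` write; `sanitizeAnalytics` is an illustrative helper, and the field names match the Step-4 prompt and the schema comments:

```typescript
// Sketch: clamp model output into the ranges the Analytics schema documents
// (sentiment -1..1 as Float, leadScore 0..100 as Int) before writing.
interface AnalyticsInput {
  sentiment: number;
  lead_score: number;
  intent: string;
  entities: object;
}

function sanitizeAnalytics(raw: any): AnalyticsInput {
  const clamp = (v: number, lo: number, hi: number) =>
    Math.min(hi, Math.max(lo, Number.isFinite(v) ? v : lo));
  return {
    sentiment: clamp(Number(raw?.sentiment), -1, 1),
    lead_score: Math.round(clamp(Number(raw?.lead_score), 0, 100)),
    intent: typeof raw?.intent === "string" ? raw.intent : "unknown",
    entities:
      raw?.entities && typeof raw.entities === "object" ? raw.entities : {},
  };
}

// In analyze(), assuming the parsed model response from above:
// const data = sanitizeAnalytics(JSON.parse(r.choices[0].message.content!));
// await db.analytics.create({
//   data: {
//     callId,
//     sentiment: data.sentiment,
//     leadScore: data.lead_score,
//     intent: data.intent,
//     entities: data.entities,
//   },
// });
```

Clamping out-of-range values at write time keeps the `leadScore`/`sentiment` indexes meaningful for the dashboards in Step 5.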

Step 5 — Query patterns that scale

```ts
// Top leads in last 7 days for a vertical
const topLeads = await db.call.findMany({
  where: {
    agent: { vertical: "healthcare" },
    startedAt: { gte: new Date(Date.now() - 7 * 86_400_000) },
    analytics: { leadScore: { gte: 70 } },
  },
  include: { analytics: true },
  orderBy: { analytics: { leadScore: "desc" } },
  take: 50,
});
```

The compound index (agentId, startedAt) plus (leadScore) keeps this under 30ms at 1M rows.
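The same index discipline pays off for operational queries. A sketch of a tool-latency report that leans on the `(name, invokedAt)` compound index from the ToolCall model — `toolLatencyReport` and `since` are illustrative names, and `db` is assumed to be the PrismaClient from Step 3:

```typescript
// Sketch: average tool latency over a recent window, grouped by tool name.
// The where clause on invokedAt plus groupBy on name rides the
// (name, invokedAt) compound index.
function since(hours: number, now: number = Date.now()): Date {
  return new Date(now - hours * 3_600_000);
}

async function toolLatencyReport(db: any) {
  return db.toolCall.groupBy({
    by: ["name"],
    where: { invokedAt: { gte: since(24) } },
    _avg: { latencyMs: true },
    _count: { _all: true },
    orderBy: { _avg: { latencyMs: "desc" } },
  });
}
```

The slowest tool surfaces first, which is usually the one adding dead air to your calls.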

Step 6 — Partitioning at 10M+ rows

When Call crosses 10M rows, switch to monthly partitions:

```sql
-- Requires the call table to have been created with
-- PARTITION BY RANGE (started_at); plain tables cannot gain
-- partitions after the fact.
CREATE TABLE call_2026_05 PARTITION OF call
  FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');
```

Common pitfalls

  • JSONB everywhere: indexable but slow for filters. Promote hot fields to columns.
  • No onDelete: Cascade: orphan turns when calls get deleted. Always cascade child rows.
  • Single-column indexes only: (agent_id, started_at) compound is 10x faster for vertical+time queries.
  • Computing analytics inline: blocks the call end. Push to a queue and process async.
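The last pitfall deserves a concrete shape. A minimal in-process sketch of the decoupling — in production you would reach for a real queue (BullMQ, pg-boss), and `AnalyticsQueue` here is an illustrative stand-in, not a library API:

```typescript
// Sketch: a tiny in-process job queue so analytics never blocks call end.
// Jobs run one at a time in the background; failures are logged, not thrown.
type Job = () => Promise<void>;

class AnalyticsQueue {
  private jobs: Job[] = [];
  private running = false;

  enqueue(job: Job): void {
    this.jobs.push(job);
    void this.drain(); // fire and forget; caller returns immediately
  }

  private async drain(): Promise<void> {
    if (this.running) return; // a drain loop is already working the queue
    this.running = true;
    while (this.jobs.length > 0) {
      const job = this.jobs.shift()!;
      try {
        await job();
      } catch (e) {
        // In production: retry with backoff or park in a dead-letter table.
        console.error("analytics job failed", e);
      }
    }
    this.running = false;
  }
}

// At call end: finish the Call row first, then hand analyze(call.id)
// (from Step 4) to the queue instead of awaiting it inline:
// queue.enqueue(() => analyze(call.id));
```

The caller gets control back as soon as the job is pushed, so the Twilio "stop" handler stays fast even when the scoring model is slow.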

How CallSphere does this in production

CallSphere runs 115+ DB tables across 6 verticals on Postgres 15. The schema above is a simplified slice — production adds tenant isolation, soft-delete, and 14 indexes per call table. Healthcare's analytics worker computes sentiment + lead_score within 60s of call end and writes back via this exact Analytics table. Pricing covers all storage.

FAQ

MongoDB instead? Possible, but joins on call → turns → analytics are clean in Postgres and harder in Mongo. We chose Postgres.

Why store the full transcript text? Search, eval datasets, and BI. Disk is cheap; embeddings are derived.

How big does this get? ~50KB per 5-minute call (turns + tool calls + analytics). 1M calls ≈ 50GB. Manageable on a single db.r6g.large.

Audio recordings in DB? No — S3 with a URL pointer. Postgres for metadata only.


Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available -- no signup required.