AI Infrastructure

Postgres + TimescaleDB for AI Voice Metrics: Hypertables That Don't Slow Down (2026)

Voice agents emit ASR latency, TTS latency, barge-in counts, and turn-level token use. TimescaleDB's hypertables + continuous aggregates handle billions of rows while keeping dashboards under 200 ms.

TL;DR — TimescaleDB turns Postgres into a time-series engine optimized for ingest + downsampled rollups. For AI voice telemetry — ASR, TTS, LLM turn latencies — it gives you Prometheus-grade queries while staying in your transactional DB.

What you'll build

A voice_metrics hypertable storing per-turn measurements, two continuous aggregates (1-minute and 1-hour rollups), and a Grafana panel that powers a real-time voice agent SLO dashboard.

Schema

CREATE EXTENSION IF NOT EXISTS timescaledb;

CREATE TABLE voice_metrics (
  ts TIMESTAMPTZ NOT NULL,
  tenant_id UUID NOT NULL,
  agent_id UUID NOT NULL,
  call_id UUID NOT NULL,
  asr_ms INT,
  llm_ms INT,
  tts_ms INT,
  total_ms INT,
  barge_in BOOLEAN DEFAULT false,
  tokens_in INT,
  tokens_out INT
);

SELECT create_hypertable('voice_metrics', 'ts',
  chunk_time_interval => INTERVAL '1 day');

CREATE INDEX ON voice_metrics (tenant_id, ts DESC);
CREATE INDEX ON voice_metrics (agent_id, ts DESC);
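Nothing in this schema stops a negative latency or an instrumentation glitch from skewing the rollups. A small hypothetical guard (the function name and the 5-minute outlier threshold are ours, not part of the article's codebase) that validates a turn before it reaches the hypertable:

```typescript
export interface TurnLatencies {
  asrMs: number;
  llmMs: number;
  ttsMs: number;
}

// Hypothetical pre-insert guard: reject impossible latencies and
// recompute total_ms so the stored sum always matches its parts.
export function validateTurn(t: TurnLatencies): { ok: boolean; totalMs: number } {
  const parts = [t.asrMs, t.llmMs, t.ttsMs];
  // Negative or non-finite values are always bugs; > 5 minutes per stage
  // is treated as an instrumentation error (threshold is an assumption).
  const ok = parts.every((v) => Number.isFinite(v) && v >= 0 && v <= 300_000);
  return { ok, totalMs: ok ? t.asrMs + t.llmMs + t.ttsMs : -1 };
}
```

Dropping a bad row is cheaper than explaining a 9-hour p95 spike to whoever reads the dashboard.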

Architecture

flowchart LR
  AGENT[Voice agent turn] --> EMIT[Emit metric]
  EMIT --> HYP[(voice_metrics<br/>hypertable)]
  HYP --> CAG1[1-min continuous agg]
  HYP --> CAG2[1-hour continuous agg]
  CAG1 --> GRAFANA[Grafana SLO panel]
  CAG2 --> ALERTS[Alertmanager]

Step 1 — Continuous aggregates

CREATE MATERIALIZED VIEW voice_metrics_1min
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 minute', ts) AS bucket,
       tenant_id, agent_id,
       avg(total_ms) AS avg_ms,
       percentile_cont(0.95) WITHIN GROUP (ORDER BY total_ms) AS p95_ms,
       count(*) AS turns,
       sum((barge_in)::int) AS barges
FROM voice_metrics
GROUP BY bucket, tenant_id, agent_id;

SELECT add_continuous_aggregate_policy('voice_metrics_1min',
  start_offset => INTERVAL '2 hours',
  end_offset   => INTERVAL '1 minute',
  schedule_interval => INTERVAL '30 seconds');
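time_bucket truncates each timestamp to the start of its fixed-width bucket. A minimal TypeScript mirror of that truncation, useful for client-side grouping in tests (the real grouping stays in SQL):

```typescript
// Mirrors time_bucket('1 minute', ts) for UTC timestamps: truncate
// the epoch to the bucket boundary at or before it.
export function timeBucket(ts: Date, widthMs: number): Date {
  const epoch = ts.getTime();
  return new Date(epoch - (epoch % widthMs));
}
```

The SQL version also supports custom origins and offsets; this sketch covers only the default UTC-aligned case.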

Step 2 — Compression policy

ALTER TABLE voice_metrics SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'tenant_id, agent_id',
  timescaledb.compress_orderby = 'ts DESC'
);

SELECT add_compression_policy('voice_metrics', INTERVAL '7 days');

Compressed chunks typically shrink 10–30×, and scans often get faster rather than slower: compressed data is stored in columnar batches, so range queries over the segmentby columns read far fewer pages.


Step 3 — Retention policy

SELECT add_retention_policy('voice_metrics', INTERVAL '180 days');

Old chunks drop automatically — no manual cleanup, no DELETE bloat.

Step 4 — Emit metrics from the agent

import { prisma } from "@/lib/db";

export async function recordTurn(m: {
  tenantId: string; agentId: string; callId: string;
  asrMs: number; llmMs: number; ttsMs: number;
  tokensIn: number; tokensOut: number; bargeIn: boolean;
}) {
  await prisma.$executeRaw`
    INSERT INTO voice_metrics (
      ts, tenant_id, agent_id, call_id,
      asr_ms, llm_ms, tts_ms, total_ms,
      tokens_in, tokens_out, barge_in
    ) VALUES (
      now(), ${m.tenantId}::uuid, ${m.agentId}::uuid, ${m.callId}::uuid,
      ${m.asrMs}, ${m.llmMs}, ${m.ttsMs}, ${m.asrMs + m.llmMs + m.ttsMs},
      ${m.tokensIn}, ${m.tokensOut}, ${m.bargeIn}
    )
  `;
}
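One round trip per turn is fine at low volume; past a few hundred turns per second you will want multi-row inserts. A hypothetical batching helper (buildBatchInsert is our name, not part of the article's codebase) that flattens queued turns into a single statement for a node-postgres-style query(text, params) call:

```typescript
export interface Turn {
  tenantId: string; agentId: string; callId: string;
  asrMs: number; llmMs: number; ttsMs: number;
  tokensIn: number; tokensOut: number; bargeIn: boolean;
}

// Flatten a batch of turns into one multi-row INSERT.
// Returns placeholder SQL plus a flat params array for query(text, params).
export function buildBatchInsert(turns: Turn[]): { text: string; params: unknown[] } {
  const params: unknown[] = [];
  const rows = turns.map((m) => {
    const base = params.length;
    params.push(
      m.tenantId, m.agentId, m.callId,
      m.asrMs, m.llmMs, m.ttsMs, m.asrMs + m.llmMs + m.ttsMs,
      m.tokensIn, m.tokensOut, m.bargeIn,
    );
    // 10 placeholders per row, numbered after the rows already queued.
    const ph = Array.from({ length: 10 }, (_, i) => `$${base + i + 1}`);
    return `(now(), ${ph[0]}::uuid, ${ph[1]}::uuid, ${ph[2]}::uuid, ${ph.slice(3).join(", ")})`;
  });
  return {
    text:
      "INSERT INTO voice_metrics (ts, tenant_id, agent_id, call_id, " +
      "asr_ms, llm_ms, tts_ms, total_ms, tokens_in, tokens_out, barge_in) " +
      `VALUES ${rows.join(", ")}`,
    params,
  };
}
```

Flush the queue on a size or time threshold (say, 500 rows or 250 ms, whichever comes first) so a quiet agent still lands its metrics promptly.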

Step 5 — Query for dashboards

SELECT bucket, p95_ms, turns, barges
FROM voice_metrics_1min
WHERE tenant_id = $1
  AND bucket >= now() - interval '6 hours'
ORDER BY bucket;

Grafana hits this directly via the Postgres data source.
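One wrinkle: minutes with zero turns produce no row in the aggregate, so dashboard lines break. SQL-side, TimescaleDB's time_bucket_gapfill handles this; a client-side equivalent sketch (types and names are ours) for when you shape the series in an API layer instead:

```typescript
export interface Point { bucket: string; p95Ms: number; turns: number }

// Emit one point per minute over [fromIso, toIso), inserting zero rows
// where the continuous aggregate had no bucket (no turns that minute).
export function fillGaps(
  points: Point[], fromIso: string, toIso: string, stepMs = 60_000,
): Point[] {
  const byBucket = new Map(points.map((p) => [p.bucket, p]));
  const out: Point[] = [];
  for (let t = Date.parse(fromIso); t < Date.parse(toIso); t += stepMs) {
    const iso = new Date(t).toISOString();
    out.push(byBucket.get(iso) ?? { bucket: iso, p95Ms: 0, turns: 0 });
  }
  return out;
}
```

Whether a missing minute should render as zero or as null is a dashboard decision; zero is shown here as the simpler assumption.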

Step 6 — Alert when p95 budget breaks

# Prometheus-style alert rule (usable from Grafana Alerting); assumes
# p95 from voice_metrics_1min is exported as the gauge voice_metrics_p95_ms
expr: voice_metrics_p95_ms{agent="appointments"} > 1500
for:  5m
labels: { severity: page }
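That metric name presumes a small exporter that reads voice_metrics_1min and republishes per-agent p95. A sketch of the formatting half in the Prometheus exposition format (the DB read and HTTP handler are elided; the helper name is ours):

```typescript
export interface P95Row { agent: string; p95Ms: number }

// Render one gauge sample per agent in Prometheus exposition format,
// producing the voice_metrics_p95_ms series the alert rule above scrapes.
export function renderP95Metrics(rows: P95Row[]): string {
  const lines = [
    "# HELP voice_metrics_p95_ms p95 turn latency from voice_metrics_1min",
    "# TYPE voice_metrics_p95_ms gauge",
    ...rows.map((r) => `voice_metrics_p95_ms{agent="${r.agent}"} ${r.p95Ms}`),
  ];
  return lines.join("\n") + "\n";
}
```

In production you would more likely reach for a client library such as prom-client rather than hand-formatting, but the output it must produce is exactly this.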

Pitfalls

  • Chunk interval too small — chunk count grows with elapsed time, not row rate, and sub-day intervals pile up thousands of chunks whose per-chunk indexes and planning overhead add up. One-day chunks fit typical voice loads; size so the active chunk's indexes stay in memory.
  • Continuous agg without policy — view exists but never refreshes. Add the policy.
  • Querying raw table for dashboards — always hit the continuous aggregate; raw is for forensics.
  • Retention tighter than compression — keep the compression interval (7 days here) well inside the retention interval (180 days), so chunks spend most of their lifetime compressed rather than being dropped while still full-size.

CallSphere production note

CallSphere's voice telemetry hypertable powers per-tenant SLO dashboards across 115+ DB tables and 37 agents. Healthcare's HIPAA bucket lives on a separate healthcare_voice Prisma schema with stricter retention; OneRoof aggregates per-property metrics under RLS; UrackIT mirrors counters into Supabase + ChromaDB. 90+ tools across 6 verticals; p95 dashboard load stays under 180 ms even with 6 months of retention. Plans: $149/$499/$1,499 with a 14-day trial and 22% affiliate program.

FAQ

Q: TimescaleDB or vanilla partitioned Postgres? Continuous aggregates and compression are the wins. Below ~10M rows/day, vanilla works.


Q: Does TimescaleDB break logical replication? Hypertables can be replicated when both sides run the same extension version, but native logical replication does not carry DDL, so newly created chunks need handling; physical streaming replication of the whole instance avoids the issue.

Q: Compression hurts ad-hoc queries? Usually the opposite — scans filtered on the segmentby and orderby columns get faster because batches are columnar and pruned; filters on other columns can be slower, since whole batches must be decompressed to evaluate them.

Q: Can I use Tiger Data Cloud (managed)? Yes — same hypertable API, fully managed; pricing scales with data volume.

Q: Continuous aggregates lag behind by how much? The policy refreshes every 30 seconds with a 1-minute end_offset, so materialized data can trail by up to roughly 90 seconds. If you need the unmaterialized tail too, set timescaledb.materialized_only = false on the view and TimescaleDB computes it on the fly at query time.

