---
title: "Tracing OpenAI Realtime Calls End-to-End"
description: "OpenAI Realtime traces look great in the OpenAI dashboard but vanish when the call leaves their servers. Here's how to stitch SIP, WebRTC, your tools, and Realtime into one trace."
canonical: https://callsphere.ai/blog/vw3c-tracing-openai-realtime-end-to-end
category: "AI Infrastructure"
tags: ["OpenAI Realtime", "Tracing", "Voice AI", "Observability"]
author: "CallSphere Team"
published: 2026-03-26T00:00:00.000Z
updated: 2026-05-07T09:59:38.162Z
---

# Tracing OpenAI Realtime Calls End-to-End

> OpenAI Realtime traces look great in the OpenAI dashboard but vanish when the call leaves their servers. Here's how to stitch SIP, WebRTC, your tools, and Realtime into one trace.

> **TL;DR** — OpenAI's Traces dashboard ends at OpenAI. To trace a real voice call you need to inject your own `traceparent` and join SIP, WebRTC media, model events, and tools into one root.

## What goes wrong

```mermaid
flowchart TD
  Client[Client] --> Edge[Cloudflare Worker]
  Edge -->|WS upgrade| DO[Durable Object]
  DO --> AI[(OpenAI Realtime WS)]
  AI --> DO
  DO --> Client
  DO -.hibernation.-> Storage[(Persisted state)]
```

CallSphere reference architecture

The OpenAI Agents SDK emits beautiful traces — model calls, tool calls, handoffs — into OpenAI's dashboard. The Realtime API does too, via session-level traces. Both stop at OpenAI's edge. Your phone-system layer (Twilio, Telnyx, your SIP trunk), your media transport (WebRTC), and your tool executors (databases, CRM, calendars) sit *outside* their view. When a call goes wrong you're flipping between three dashboards and a Postgres query, manually correlating timestamps.

The fix is to make *your* trace the parent and have OpenAI's traces become children. Attach your trace ID to the Realtime session when you open it, whether that happens over a WebSocket upgrade or an HTTPS POST, and propagate the same ID through your tool calls, RAG lookups, and SIP signaling.

## How to monitor

Build a single root span per call:

1. Root: `callsphere.call` (one per inbound call)
2. Child: `sip.invite` (Twilio webhook → your gateway)
3. Child: `webrtc.peer_connection` (media negotiation)
4. Child: `gen_ai.realtime.session` (the OpenAI session — they emit nested spans inside)
5. Children of (4): `gen_ai.tool.execute` per tool, `gen_ai.client` per model turn

Use OTel context propagation wherever you control both ends of a hop. The Realtime API doesn't accept `traceparent` directly, but you can store your trace ID in the session metadata and use it to re-attach OpenAI's spans to your root afterwards.
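
As a minimal sketch of that workaround, the snippet below packs a trace ID and span ID into a W3C `traceparent` string that can be stored as a metadata value and parsed back later. The metadata key `traceparent` and both helper names are assumptions for illustration, not part of the Realtime API, and the parser only accepts version `00`.

```python
import re
import secrets

# W3C Trace Context: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def build_traceparent(trace_id: int, span_id: int, sampled: bool = True) -> str:
    """Format a version-00 traceparent value from integer IDs."""
    flags = 0x01 if sampled else 0x00
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


def parse_traceparent(value: str) -> tuple[int, int, bool]:
    """Return (trace_id, span_id, sampled); raise ValueError if malformed."""
    match = _TRACEPARENT_RE.match(value.strip().lower())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    trace_id = int(match.group(1), 16)
    span_id = int(match.group(2), 16)
    if trace_id == 0 or span_id == 0:
        raise ValueError("all-zero trace or span ID is invalid")
    sampled = bool(int(match.group(3), 16) & 0x01)
    return trace_id, span_id, sampled


# Hypothetical metadata payload attached when the session is created.
trace_id = secrets.randbits(128) or 1
span_id = secrets.randbits(64) or 1
session_metadata = {"traceparent": build_traceparent(trace_id, span_id)}
```

Storing the full `traceparent` rather than the bare trace ID also preserves the parent span ID and the sampled flag, which helps when the exported spans are re-attached later.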

## CallSphere stack

CallSphere runs Realtime for the Healthcare and Real Estate verticals. The Healthcare FastAPI service on `:8084` answers Twilio webhooks, mints a Realtime ephemeral key, and proxies the SDP through our edge. We open a root `callsphere.call` span when Twilio fires the inbound webhook and store the trace ID in the Realtime session metadata. Tool calls (insurance verification, EHR lookup) reuse the same trace context via OTel's HTTP propagator.

Real Estate's 6-container NATS pod is harder — the trace context flies across six microservices over NATS. We custom-coded a NATS header propagator (NATS doesn't carry HTTP headers natively) so the trace ID survives. The Sales WebSocket layer (PM2 + 8 workers) and the After-hours Bull/Redis queue use the same propagator pattern. The result: one click in Honeycomb shows the entire call, including the OpenAI-internal spans we pull from their trace export.

We see ~480ms first-token-out on Realtime calls, and the trace tells us how much of that 480ms came from us versus OpenAI. The $1499 enterprise tier on [/pricing](/pricing) includes per-call trace links in the call recording UI.

## Implementation

1. **Mint the trace ID at call ingress.**

```python
from fastapi import FastAPI, Request
from opentelemetry import trace

app = FastAPI()
tracer = trace.get_tracer(__name__)

@app.post("/twilio/inbound")
async def inbound(request: Request):
    with tracer.start_as_current_span("callsphere.call") as root:
        trace_id = root.get_span_context().trace_id  # 128-bit int
        ephemeral = await mint_realtime_key(metadata={"trace_id": format(trace_id, "032x")})
        return twiml_with_session(ephemeral)
```

2. **Read OpenAI's trace export** (their Traces API supports webhook export as of Q1 2026) and graft their spans under your root using the metadata `trace_id`.
3. **Propagate over NATS** with a custom header carrier:

```python
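from opentelemetry.propagate import inject


async def publish_with_trace(subject, payload):
    # Assumes `nats` is a connected nats-py client, whose publish() is a coroutine.
    headers = {}
    inject(headers)  # writes traceparent/tracestate for the current span
    await nats.publish(subject, payload, headers=headers)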
```

4. **Tag tool spans** with `gen_ai.tool.name` and `gen_ai.tool.call.id` so they line up under the model turn that requested them.
5. **Persist the call_id ↔ trace_id map** in Postgres (we use the `calls` table) so support engineers can paste a phone number and get the trace.
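
The lookup in the last step can be sketched with an in-memory SQLite database standing in for Postgres. The column names (`call_sid`, `from_number`, `trace_id`, `started_at`) and both helper functions are assumptions; only the `calls` table name comes from the setup described above.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE calls (
        call_sid    TEXT PRIMARY KEY,
        from_number TEXT NOT NULL,
        trace_id    TEXT NOT NULL,  -- 32-char lowercase hex
        started_at  TEXT NOT NULL   -- ISO 8601 UTC
    )
    """
)
conn.execute("CREATE INDEX calls_by_number ON calls (from_number, started_at)")


def record_call(call_sid: str, from_number: str, trace_id: int, started_at: str) -> None:
    """Store the mapping at call ingress, next to the root span."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO calls VALUES (?, ?, ?, ?)",
            (call_sid, from_number, f"{trace_id:032x}", started_at),
        )


def traces_for_number(from_number: str, limit: int = 5) -> list[str]:
    """Return the most recent trace IDs for a caller, newest first."""
    rows = conn.execute(
        "SELECT trace_id FROM calls WHERE from_number = ? "
        "ORDER BY started_at DESC LIMIT ?",
        (from_number, limit),
    )
    return [row[0] for row in rows]


record_call("CA001", "+15550100", 0xABC, "2026-03-26T10:00:00Z")
record_call("CA002", "+15550100", 0xDEF, "2026-03-26T11:30:00Z")
print(traces_for_number("+15550100"))
```

Writing the row in the same request that opens the root span means the mapping exists even if the call fails before any other span is exported.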

## FAQ

**Q: Does the Realtime API natively emit OTel spans?**
A: As of Q1 2026, no — it emits OpenAI-format traces accessible via the dashboard and an export webhook. You graft them under your root.
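
Grafting can be sketched as a pure re-parenting step. The export shape below (a list of dicts with `span_id`, `parent_id`, `name`, and a `metadata.trace_id` field) is a hypothetical stand-in, not OpenAI's actual payload format, so the field names will need adapting to whatever the export really delivers.

```python
from typing import Any


def graft_spans(
    exported: list[dict[str, Any]],
    root_trace_id: str,
    session_span_id: str,
) -> list[dict[str, Any]]:
    """Re-home exported spans under our own session span.

    Spans whose metadata carries a different trace ID are skipped.
    Top-level exported spans (no parent) become children of the local
    `gen_ai.realtime.session` span; nested ones keep their parent.
    """
    grafted = []
    for span in exported:
        if span.get("metadata", {}).get("trace_id") != root_trace_id:
            continue  # belongs to another call
        grafted.append(
            {
                "trace_id": root_trace_id,
                "span_id": span["span_id"],
                "parent_id": span["parent_id"] or session_span_id,
                "name": span["name"],
            }
        )
    return grafted


# Hypothetical export covering two different calls.
ours, theirs = "ab" * 16, "cd" * 16
exported = [
    {"span_id": "a1", "parent_id": None, "name": "response", "metadata": {"trace_id": ours}},
    {"span_id": "a2", "parent_id": "a1", "name": "function_call", "metadata": {"trace_id": ours}},
    {"span_id": "b1", "parent_id": None, "name": "response", "metadata": {"trace_id": theirs}},
]
grafted = graft_spans(exported, root_trace_id=ours, session_span_id="5e55")
```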

**Q: How do I trace TURN/STUN delays?**
A: We instrument the WebRTC client with timing events (`onicegatheringstatechange`, etc.) and emit them as span events on `webrtc.peer_connection`.

**Q: Can I trace barge-in events?**
A: Yes — emit a span event `gen_ai.audio.barge_in` with `audio.elapsed_ms` so you can see how often users interrupt.

**Q: Does sampling break voice traces?**
A: Tail-sample at the collector and *always keep* traces with errors or first-token latency (FTL) > 1500ms. Head-sampling decides before the call has played out, so it will drop the calls you most need.
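
A minimal sketch of that keep rule, written as a plain function rather than collector configuration. The 1500ms threshold comes from the answer above; the 10% baseline rate, the `FinishedTrace` type, and its field names are assumptions.

```python
from dataclasses import dataclass

FTL_THRESHOLD_MS = 1500    # from the rule above
BASELINE_KEEP_RATE = 0.10  # assumed; tune to your volume


@dataclass
class FinishedTrace:
    trace_id: int                  # 128-bit trace ID
    has_error: bool                # any span ended with an error status
    first_token_latency_ms: float  # measured on the completed call


def keep_trace(trace: FinishedTrace) -> bool:
    """Decide after the call ends, once every span is visible."""
    if trace.has_error:
        return True
    if trace.first_token_latency_ms > FTL_THRESHOLD_MS:
        return True
    # Deterministic baseline: the low 56 bits of the trace ID act as a
    # uniform random number, so every collector makes the same decision.
    bucket = trace.trace_id & ((1 << 56) - 1)
    return bucket < BASELINE_KEEP_RATE * (1 << 56)
```

Keying the baseline on the trace ID instead of a fresh random draw keeps the decision consistent if the same trace is evaluated more than once.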

**Q: Is this worth it for a 5-call/day startup?**
A: No. Use the OpenAI dashboard until you're past 1k calls/day. Try the [14-day trial](/trial) first.

## Sources

- [OpenAI Agents SDK — Tracing](https://openai.github.io/openai-agents-python/tracing/)
- [OpenAI — Integrations and observability](https://developers.openai.com/api/docs/guides/agents/integrations-observability)
- [DEV — AI Agent Observability in 2026](https://dev.to/chunxiaoxx/ai-agent-observability-in-2026-openai-agents-sdk-langsmith-and-opentelemetry-3ale)
- [Forasoft — OpenAI Realtime API Production Voice Agents 2026](https://www.forasoft.com/blog/article/openai-realtime-api-voice-agent-production-guide-2026)
