By Sagar Shankaran, Founder of CallSphere
OpenAI Realtime traces look great in the OpenAI dashboard but vanish when the call leaves their servers. Here's how to stitch SIP, WebRTC, your tools, and Realtime into one trace.
Key takeaways
TL;DR: OpenAI's Traces dashboard ends at OpenAI. To trace a real voice call you need to inject your own traceparent and join SIP, WebRTC media, model events, and tools into one root.
flowchart TD
Client[Client] --> Edge[Cloudflare Worker]
Edge -->|WS upgrade| DO[Durable Object]
DO --> AI[(OpenAI Realtime WS)]
AI --> DO
DO --> Client
DO -.hibernation.-> Storage[(Persisted state)]
The OpenAI Agents SDK emits beautiful traces (model calls, tool calls, handoffs) into OpenAI's dashboard. The Realtime API does too, via session-level traces. Both stop at OpenAI's edge. Your phone-system layer (Twilio, Telnyx, your SIP trunk), your media transport (WebRTC), and your tool executors (databases, CRM, calendars) sit outside their view. When a call goes wrong you're flipping between three dashboards and a Postgres query, manually correlating timestamps.
The fix is to make your trace the parent and have OpenAI's traces become children. Inject a traceparent header on the WebSocket upgrade or HTTPS POST that opens the Realtime session, and propagate that ID through your tool calls, RAG lookups, and SIP signaling.
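For reference, a W3C Trace Context traceparent value is four dash-separated lowercase hex fields: version, a 16-byte trace ID, an 8-byte parent span ID, and trace flags. The sketch below builds and parses one using only the standard library. The helper names are hypothetical, and in a real service OTel's propagator would normally do this for you.

```python
import re
import secrets

# Version 00 layout: 00-<32 hex trace id>-<16 hex parent span id>-<2 hex flags>
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def make_traceparent(trace_id=None, span_id=None, sampled=True):
    """Build a version-00 traceparent value (hypothetical helper)."""
    trace_id = trace_id or secrets.token_hex(16)  # 32 hex chars
    span_id = span_id or secrets.token_hex(8)     # 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(value):
    """Return (trace_id, span_id, sampled), or None if the value is malformed."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero trace or span IDs are invalid under the spec.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return trace_id, span_id, bool(int(flags, 16) & 0x01)
```

Only the trace ID needs to survive every hop; each service replaces the span ID with its own before passing the value on.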
Build a single root span per call:
- callsphere.call (one per phone number ringing in)
- sip.invite (Twilio webhook → your gateway)
- webrtc.peer_connection (media negotiation)
- gen_ai.realtime.session (the OpenAI session; they emit nested spans inside)
- gen_ai.tool.execute per tool, gen_ai.client per model turn
Use OTel context propagation. The Realtime API doesn't accept traceparent directly, but you can stash your trace ID in the session metadata and re-attach on the model side.
CallSphere runs Realtime for the Healthcare and Real Estate verticals. The Healthcare FastAPI on :8084 answers Twilio webhooks, mints a Realtime ephemeral key, and proxies the SDP through our edge. We open a root callsphere.call span when Twilio fires the inbound webhook. The trace ID is shoved into Realtime session metadata. Tool calls (insurance verification, EHR lookup) reuse the same trace context via OTel's HTTP propagator.
Real Estate's 6-container NATS pod is harder — the trace context flies across six microservices over NATS. We custom-coded a NATS header propagator (NATS doesn't carry HTTP headers natively) so the trace ID survives. The Sales WebSocket layer (PM2 + 8 workers) and the After-hours Bull/Redis queue use the same propagator pattern. The result: one click in Honeycomb shows the entire call, including the OpenAI-internal spans we pull from their trace export.
We see ~480ms first-token-out on Realtime calls; the trace tells us exactly which 480ms came from us vs them. $1499 enterprise tier on /pricing gets per-call trace links in the call recording UI.
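from fastapi import FastAPI, Request
from opentelemetry import trace

app = FastAPI()
tracer = trace.get_tracer(__name__)

@app.post("/twilio/inbound")
async def inbound(request: Request):
    with tracer.start_as_current_span("callsphere.call") as root:
        # mint_realtime_key and twiml_with_session are app helpers (not shown).
        trace_id = root.get_span_context().trace_id
        ephemeral = await mint_realtime_key(metadata={"trace_id": format(trace_id, "032x")})
        return twiml_with_session(ephemeral)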
Read OpenAI's trace export (their Traces API supports webhook export as of Q1 2026) and graft their spans under your root using the metadata trace_id.
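The grafting step might look roughly like the sketch below. The shape of OpenAI's exported spans isn't shown in this post, so the field names used here (id, parent_id, metadata) are assumptions for illustration, as is the function name.

```python
def graft_under_root(exported_spans, root_trace_id, root_span_id):
    """Re-parent exported spans under our root call span.

    Assumed (hypothetical) span shape: dicts with "id", "parent_id", and a
    "metadata" dict carrying the "trace_id" stashed at session creation.
    """
    grafted = []
    for span in exported_spans:
        if span.get("metadata", {}).get("trace_id") != root_trace_id:
            continue  # belongs to a different call
        copy = dict(span)  # leave the caller's data untouched
        copy["trace_id"] = root_trace_id
        # Top-level exported spans hang off our root; nested ones keep their parent.
        if copy.get("parent_id") is None:
            copy["parent_id"] = root_span_id
        grafted.append(copy)
    return grafted
```

Matching on the stashed trace_id is what makes the join safe when several calls are exported in the same batch.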
Propagate over NATS with a custom header carrier:
from opentelemetry.propagate import inject

async def publish_with_trace(nc, subject, payload):
    # nc is a connected nats-py client; inject() writes traceparent into the dict.
    headers = {}
    inject(headers)
    await nc.publish(subject, payload, headers=headers)
Tag tool spans with gen_ai.tool.name and gen_ai.tool.call.id so they line up under the model turn that requested them.
Persist the call_id ↔ trace_id map in Postgres (we use the calls table) so support engineers can paste a phone number and get the trace.
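A minimal sketch of that lookup follows. It uses the standard library's sqlite3 as a stand-in for Postgres so it runs anywhere, and the column and function names are assumptions, not the actual calls table schema.

```python
import sqlite3


def open_calls_db(path=":memory:"):
    """Create the (hypothetical) calls table if needed; return a connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS calls (
               call_id      TEXT PRIMARY KEY,
               phone_number TEXT NOT NULL,
               trace_id     TEXT NOT NULL,
               started_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    conn.execute("CREATE INDEX IF NOT EXISTS calls_phone ON calls (phone_number)")
    return conn


def record_call(conn, call_id, phone_number, trace_id):
    """Store the call_id <-> trace_id mapping when the root span opens."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO calls (call_id, phone_number, trace_id) VALUES (?, ?, ?)",
            (call_id, phone_number, trace_id),
        )


def traces_for_number(conn, phone_number):
    """Return trace IDs for a phone number, newest call first."""
    rows = conn.execute(
        "SELECT trace_id FROM calls WHERE phone_number = ? "
        "ORDER BY started_at DESC, rowid DESC",
        (phone_number,),
    )
    return [row[0] for row in rows]
```

Indexing on phone_number keeps the support lookup cheap, since one number usually maps to several calls and therefore several traces.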
Q: Does the Realtime API natively emit OTel spans?
A: As of Q1 2026, no. It emits OpenAI-format traces accessible via the dashboard and an export webhook. You graft them under your root.
Q: How do I trace TURN/STUN delays?
A: We instrument the WebRTC client with timing events (onicegatheringstatechange, etc.) and emit them as span events on webrtc.peer_connection.
Q: Can I trace barge-in events?
A: Yes — emit a span event gen_ai.audio.barge_in with audio.elapsed_ms so you can see how often users interrupt.
Q: Does sampling break voice traces?
A: Head-sampling does: it decides before the call has had a chance to go wrong, so it will drop the calls you most need. Tail-sample at the collector instead, and always keep traces with errors or FTL > 1500ms.
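The keep/drop rule can be written as a small decision function, sketched below. In practice this policy would usually live in collector configuration (for example the OpenTelemetry Collector's tail-sampling processor) rather than in application code. The span shape, the ftl_ms field (reading FTL as first-token latency), and the 10% baseline rate are assumptions for illustration.

```python
import random

FTL_THRESHOLD_MS = 1500  # threshold from the rule above


def keep_trace(spans, baseline_rate=0.1, rng=random.random):
    """Decide, once the call has ended, whether to keep the whole trace.

    Assumed (hypothetical) span shape: dicts with an optional "error" bool
    and an optional "ftl_ms" number (first-token latency in milliseconds).
    """
    if any(span.get("error") for span in spans):
        return True  # always keep failed calls
    if any(span.get("ftl_ms", 0) > FTL_THRESHOLD_MS for span in spans):
        return True  # always keep slow calls
    # Healthy calls are kept at the baseline rate only.
    return rng() < baseline_rate
```

The decision runs over the complete set of spans for a call, which is exactly what head-sampling cannot do.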
Q: Is this worth it for a 5-call/day startup?
A: No. Use the OpenAI dashboard until you're past 1k calls/day. Try the 14-day trial first.
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.