
Telnyx Call Control API and AI Bridging in 2026: Carrier-Grade Voice for AI Agents

Telnyx Call Control plus Media Streaming over WebSockets, paired with LiveKit on Telnyx, targets a sub-200ms round trip, with carrier-native STT and TTS advertised at half the cost of LiveKit Cloud. Here is the 2026 wiring.

Telnyx is the CPaaS that is also a carrier: it owns its network, which means tighter media SLAs and fewer hops. The April 2026 launch of LiveKit on Telnyx puts colocated GPU inference next to that carrier-native voice path, and Telnyx claims 50% lower STT/TTS cost than LiveKit Cloud. For AI voice teams that care about latency and unit economics, the math is now on Telnyx's side.

Background

Telnyx Call Control is a REST and webhook API for the call lifecycle: dial, answer, hangup, transfer, gather DTMF, play audio, record. Media Streaming is the WebSocket extension that forks the call audio to your endpoint. Together they let you build a voice agent in which your application drives the call (Call Control) while your AI pipeline processes the audio (Media Streaming).

April 2026 brought two big launches:

  • LiveKit on Telnyx: fully hosted LiveKit agents on Telnyx infrastructure, sub-200ms round-trip, 50% cheaper STT/TTS than LiveKit Cloud, session fees waived in beta.
  • WhatsApp Business Calling: programmable voice over WhatsApp through the same Call Control API, so the same agent code can serve WhatsApp calls.

The Telnyx LiveKit Plugin wraps Telnyx STT and TTS as LiveKit-native processors, so your existing LiveKit agent gets carrier-native speech without code changes.

Architecture

graph LR
    A[PSTN / WhatsApp] --> B[Telnyx Carrier Network]
    B -->|webhook| C[Your Call Control App]
    C -->|REST commands| B
    B -->|wss media stream| D[Audio Bridge]
    D --> E[LiveKit on Telnyx GPU]
    E -->|Telnyx STT| F[LLM]
    F -->|Telnyx TTS| E
    E -->|RTP| B

# Start media streaming via Call Control API
curl -X POST https://api.telnyx.com/v2/calls/$CALL_ID/actions/streaming_start \
  -H "Authorization: Bearer $TELNYX_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "stream_url": "wss://bridge.callsphere.ai/telnyx-realtime",
    "stream_track": "both_tracks",
    "stream_bidirectional_mode": "rtp",
    "stream_bidirectional_codec": "PCMU"
  }'
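
For services that issue this command from application code rather than a shell, the same request can be sketched in Python. This is a minimal sketch using only the standard library: `build_streaming_start` and its `extra` parameter are hypothetical names introduced here, and any field beyond the three shown (the codec setting, for example) is passed in by the caller rather than assumed.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

TELNYX_API = "https://api.telnyx.com/v2"


def build_streaming_start(call_control_id: str, api_key: str, stream_url: str,
                          extra: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) the streaming_start request from the curl example.

    `extra` is a hypothetical hook for additional body fields such as the codec.
    """
    body = {
        "stream_url": stream_url,
        "stream_track": "both_tracks",
        "stream_bidirectional_mode": "rtp",
    }
    body.update(extra or {})
    # Keep ":" unescaped in the path segment; escape anything else unusual.
    call_id = urllib.parse.quote(call_control_id, safe=":")
    return urllib.request.Request(
        url=f"{TELNYX_API}/calls/{call_id}/actions/streaming_start",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Building the request separately from sending it keeps the payload easy to unit test; passing the result to `urllib.request.urlopen(req, timeout=5)` would perform the actual call.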

CallSphere implementation

CallSphere terminates every product on Twilio across all six verticals: Healthcare AI (FastAPI on :8084 bridging to OpenAI Realtime), Real Estate AI, Sales Calling AI (5 concurrent outbound calls), Salon AI, IT Helpdesk AI, and After-Hours AI (simultaneous Twilio call and SMS with a 120-second timeout). The platform spans 37 agents, 90+ tools, and 115+ DB tables, with HIPAA + SOC 2, $149/$499/$1499 plans, a 14-day trial, and a 22% affiliate program. We monitor the Telnyx LiveKit launch closely because the unit economics matter at scale; for prospects pushing 1M+ minutes/month, our reference architecture demonstrates a Telnyx fallback trunk that puts the same OpenAI Realtime adapter behind a CPaaS abstraction layer in our shared bridge.

Build steps

  1. Sign up for Telnyx, configure a Call Control Application, attach a phone number.
  2. Set the webhook URL on the application; the webhook receives call.initiated, call.answered, call.hangup events.
  3. On call.answered, call POST /v2/calls/{id}/actions/streaming_start with your WebSocket URL.
  4. Implement the WebSocket handler: Telnyx sends JSON text messages, and each media message carries a base64-encoded audio payload; what you send back (RTP payloads or MP3) depends on stream_bidirectional_mode.
  5. Forward audio to OpenAI Realtime or Telnyx STT; receive responses.
  6. To send audio back, use streaming_send_payload action or Call Control playback for static media.
  7. For LiveKit on Telnyx: deploy your existing LiveKit agent Dockerfile via Telnyx API; bind the agent to a SIP trunk; route the Call Control call into the LiveKit room.
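
Steps 2 through 4 can be sketched as two small pure functions: one that maps a webhook to the next Call Control command, and one that pulls raw audio bytes out of a JSON-wrapped media message. This is a sketch under stated assumptions, not Telnyx's reference code: the function names are hypothetical, and the field names (`data.event_type`, `data.payload.call_control_id`, `media.payload`) reflect our reading of the webhook and stream formats and should be checked against the current docs.

```python
import base64
import json
from typing import Optional, Tuple


def next_command(webhook_body: dict, stream_url: str) -> Optional[Tuple[str, dict]]:
    """Return (path, json_body) for the command to issue, or None if no action.

    Assumed envelope: {"data": {"event_type": ..., "payload": {"call_control_id": ...}}}
    """
    data = webhook_body.get("data", {})
    call_id = data.get("payload", {}).get("call_control_id")
    if data.get("event_type") == "call.answered" and call_id:
        # Step 3: start forking audio to our WebSocket once the call is answered.
        return (f"/v2/calls/{call_id}/actions/streaming_start",
                {"stream_url": stream_url})
    # call.initiated, call.hangup, and anything else need no command here.
    return None


def decode_media_message(raw: str) -> Optional[bytes]:
    """Return the audio bytes from a JSON media message, or None for other events."""
    msg = json.loads(raw)
    if msg.get("event") != "media":
        return None
    # Assumption: the audio arrives base64-encoded under media.payload.
    return base64.b64decode(msg["media"]["payload"])
```

Keeping both functions free of network I/O means the webhook handler and the WebSocket loop stay thin wrappers, and the routing logic can be tested without a live call.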

Pitfalls

  • stream_bidirectional_mode has two values: rtp (base64-encoded RTP payloads sent back over the WebSocket for real-time, two-way audio) and mp3 (base64-encoded MP3 clips sent back for playback). Pick rtp for AI.
  • Webhook delivery has retries; idempotency by call_control_id is your responsibility.
  • Telnyx and Twilio stream JSON envelopes differ; the audio codec is the same but the wrapper isn't.
  • LiveKit on Telnyx is in beta as of mid-2026; production SLA may differ from Telnyx core voice.
  • The new WhatsApp Calling channel is separate from PSTN; rate limits and number formats differ.
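
Two of the pitfalls above, webhook retries and differing stream envelopes, lend themselves to a short sketch. The names here (`WebhookDeduper`, `normalize_media`) are hypothetical. The dedupe key of call_control_id plus event type is our assumption, since one call emits several events under the same call_control_id. The envelope field names (`stream_id` and `sequence_number` for Telnyx, `streamSid` and `sequenceNumber` for Twilio) are likewise assumptions to verify against each provider's documentation.

```python
import base64
from typing import Optional


class WebhookDeduper:
    """In-memory guard against retried webhooks (illustrative only).

    A real deployment would likely use a shared store with a TTL instead.
    """

    def __init__(self) -> None:
        self._seen = set()

    def first_delivery(self, webhook_body: dict) -> bool:
        data = webhook_body["data"]
        # Assumed key: one entry per (call, event type) pair.
        key = (data["payload"]["call_control_id"], data["event_type"])
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def normalize_media(msg: dict) -> Optional[dict]:
    """Map a Telnyx-style or Twilio-style media message to one internal shape."""
    if msg.get("event") != "media":
        return None
    stream_id = msg.get("stream_id", msg.get("streamSid"))
    sequence = msg.get("sequence_number", msg.get("sequenceNumber"))
    return {
        "stream_id": stream_id,
        "sequence": int(sequence) if sequence is not None else None,
        "track": msg["media"].get("track"),
        "audio": base64.b64decode(msg["media"]["payload"]),
    }
```

With a normalizer like this at the edge of the bridge, the model adapter behind it only ever sees one message shape, whichever provider carried the call.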

FAQ

How does Telnyx latency compare to Twilio? Telnyx claims a sub-200ms round trip on LiveKit on Telnyx; Twilio Media Streams typically land at 600-900ms end to end with external models. A direct comparison is hard because Twilio has more model partners.

Is Telnyx STT competitive with Deepgram? Telnyx's own STT is workmanlike; for top accuracy on hard domains (medical, legal) most teams still pick Deepgram or Speechmatics.

WhatsApp Business Calling API limits? Same as Telnyx PSTN at launch; Meta-side rate limits apply per business account.

Can I keep my LiveKit code unchanged? Yes. LiveKit on Telnyx accepts the same agent Dockerfile; you change the room URL and the SIP trunk binding.

Is Telnyx HIPAA-eligible? Yes, with a signed BAA on enterprise plans.

Start a 14-day trial of our Twilio-based stack, see pricing, or contact us about a Telnyx fallback configuration for high-volume tenants.
