AI Engineering

Migrate from Bland AI to OpenAI Realtime + Your Own Stack in 2026

Bland charges $0.09/min bundled. Move to OpenAI Realtime + your own Twilio trunk and cut per-minute cost in half — full migration code, parity tests, and ramp plan.

TL;DR — Bland's $0.09/min is bundled (LLM + STT + TTS + telephony). With OpenAI Realtime + your own Twilio Elastic SIP trunk you'll land near $0.045/min on volume, keep call recordings inside your VPC, and own the conversation logic instead of renting a black-box pathway. Full code below.

What you'll build

A drop-in replacement for a Bland /v1/calls outbound flow: a FastAPI service that holds the OpenAI Realtime websocket, a Twilio Media Streams bridge, and a parity harness that replays your last 200 Bland transcripts against the new stack so you can ship behind a flag.

Prerequisites

  1. Bland account with API access and 30 days of call logs (you'll diff these).
  2. Twilio account with Elastic SIP Trunking enabled (E.164 numbers, not just programmable voice).
  3. OpenAI API key with Realtime access — gpt-4o-realtime-preview-2025-06-03 or newer.
  4. Python 3.11+, FastAPI, websockets, twilio SDK.
  5. A staging DID you can dial without affecting production.

Architecture

```mermaid
sequenceDiagram
  participant C as Caller
  participant T as Twilio Elastic SIP
  participant B as Your bridge (FastAPI)
  participant O as OpenAI Realtime
  C->>T: PSTN INVITE
  T->>B: WSS Media Streams (mulaw)
  B->>O: WSS /v1/realtime?model=...
  B->>O: input_audio_buffer.append (transcoded)
  O-->>B: response.audio.delta
  B-->>T: media frames (mulaw)
  T-->>C: PSTN audio
```
Step 1 — Mirror your Bland pathway config into a JSON spec

Bland encodes flow as a "pathway" graph. Export it (GET /v1/pathway/{id}), then translate to a single OpenAI session.update payload. Most pathways collapse into one system prompt + 4–8 tools; do not try to recreate the graph node-for-node.
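The export's exact shape varies by account; assuming a pathway JSON whose `nodes` carry a `prompt` field (these field names are assumptions, not Bland's documented schema), a minimal sketch of flattening the graph into one instructions string might look like:

```python
# Hypothetical sketch: flatten a Bland pathway export into one system
# prompt. The {"nodes": [{"id", "type", "prompt"}, ...]} shape is an
# assumption — adjust to whatever GET /v1/pathway/{id} actually returns.
def collapse_pathway(pathway: dict) -> str:
    """Concatenate every node's prompt into a single instructions block."""
    parts = []
    for node in pathway.get("nodes", []):
        prompt = node.get("prompt")
        if prompt:
            parts.append(prompt.strip())
    return "\n\n".join(parts)

example = {
    "nodes": [
        {"id": "greet", "type": "default", "prompt": "Greet the caller."},
        {"id": "book", "type": "webhook", "prompt": "Offer open slots."},
    ]
}
print(collapse_pathway(example))
```

Feed the result into `instructions` below, then prune by hand — the point is one prompt, not a mechanical node-for-node port.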

```python
session_update = {
    "type": "session.update",
    "session": {
        "instructions": open("prompts/booking.md").read(),
        "voice": "alloy",
        "input_audio_format": "g711_ulaw",
        "output_audio_format": "g711_ulaw",
        "turn_detection": {"type": "server_vad", "silence_duration_ms": 380},
        "tools": [book_appointment_tool, transfer_to_human_tool],
    },
}
```

Step 2 — Stand up the Twilio Media Streams bridge

```python
import asyncio
import json
import os

import websockets
from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/twilio")
async def twilio_ws(ws: WebSocket):
    await ws.accept()
    headers = {
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta": "realtime=v1",
    }
    url = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2025-06-03"
    async with websockets.connect(url, extra_headers=headers) as oai:
        await oai.send(json.dumps(session_update))

        async def t2o():
            # Twilio -> OpenAI: forward base64 mulaw frames unchanged.
            async for msg in ws.iter_text():
                ev = json.loads(msg)
                if ev["event"] == "media":
                    await oai.send(json.dumps({
                        "type": "input_audio_buffer.append",
                        "audio": ev["media"]["payload"],
                    }))

        async def o2t():
            # OpenAI -> Twilio: stream audio deltas back to the caller.
            async for raw in oai:
                ev = json.loads(raw)
                if ev["type"] == "response.audio.delta":
                    await ws.send_json({
                        "event": "media",
                        "media": {"payload": ev["delta"]},
                    })

        await asyncio.gather(t2o(), o2t())
```

Hear it before you finish reading

Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.

Try Live Demo →

Step 3 — Port Bland tools to OpenAI function tools

A Bland webhookNode becomes a single function tool. Keep the same JSON contract so your downstream services don't change:

```python
book_appointment_tool = {
    "type": "function",
    "name": "book_appointment",
    "description": "Book a slot. Mirrors Bland webhookNode book_appt.",
    "parameters": {
        "type": "object",
        "required": ["name", "phone", "slot_iso"],
        "properties": {
            "name": {"type": "string"},
            # Raw string so \+ and \d aren't treated as invalid escapes.
            "phone": {"type": "string", "pattern": r"^\+?[1-9]\d{7,14}$"},
            "slot_iso": {"type": "string", "format": "date-time"},
        },
    },
}
```

Step 4 — Run the parity harness against 200 Bland transcripts

For every call you replay, compare booking outcome, transferred-to-human flag, and time-to-first-booking. Any drift over 4% gets flagged:

```python
def parity_score(bland_outcomes, new_outcomes):
    matches = sum(
        1 for b, n in zip(bland_outcomes, new_outcomes)
        if b["booked"] == n["booked"]
    )
    return matches / len(bland_outcomes)
```
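The single-metric score generalizes to the other fields the harness tracks. A sketch that scores each metric separately and flags anything drifting past the 4% threshold (the `booked` and `transferred` keys are assumptions about your transcript schema):

```python
# Sketch: per-metric parity between Bland and new-stack outcomes.
DRIFT_THRESHOLD = 0.04  # 4% from the ramp criteria above

def metric_parity(bland, new, key):
    """Fraction of replayed calls where the metric matched."""
    matches = sum(1 for b, n in zip(bland, new) if b[key] == n[key])
    return matches / len(bland)

def drifted_metrics(bland, new, keys=("booked", "transferred")):
    """Return the metrics whose mismatch rate exceeds the threshold."""
    return [k for k in keys if 1 - metric_parity(bland, new, k) > DRIFT_THRESHOLD]

bland = [{"booked": True, "transferred": False}] * 20
new = [{"booked": True, "transferred": False}] * 19 + [
    {"booked": False, "transferred": False}
]
print(drifted_metrics(bland, new))  # 1/20 booking mismatch = 5% > 4%
```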

Step 5 — Cutover with a percentage flag

Route 5% of inbound to the new stack via Twilio's <Dial><Sip> weighted target. Watch CSAT and transfer-rate dashboards for 72 hours, then ramp 5 → 25 → 60 → 100.
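If you would rather gate in your own webhook than rely on trunk weighting, a deterministic percentage flag keyed on the Call SID keeps retries of the same call on the same stack. A minimal sketch (the `NEW_STACK_RAMP_PCT` env var name is an assumption):

```python
import hashlib
import os

def route_to_new_stack(call_sid: str, ramp_pct: int) -> bool:
    """Hash the Call SID into a stable 0-99 bucket so the same call
    always lands on the same stack at a given ramp percentage."""
    bucket = int(hashlib.sha256(call_sid.encode()).hexdigest(), 16) % 100
    return bucket < ramp_pct

ramp = int(os.environ.get("NEW_STACK_RAMP_PCT", "5"))
print(route_to_new_stack("CA1234567890abcdef", ramp))
```

Bump the env var at each ramp stage (5 → 25 → 60 → 100) without redeploying.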

Still reading? Stop comparing — try CallSphere live.

CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.

Step 6 — Decommission and reclaim spend

Cancel Bland after 14 days of zero traffic. Pull final usage CSV, archive transcripts to S3 (S3 Object Lock if HIPAA), and delete API keys.

Common pitfalls

  • Mulaw mismatch. Twilio sends 8 kHz mulaw; OpenAI accepts g711_ulaw natively — set both `input_audio_format` and `output_audio_format` to `g711_ulaw` and do not transcode in Python.
  • Server VAD too tight. Bland defaults are aggressive; silence_duration_ms: 380 matches their feel.
  • Tool schema drift. Add a JSON-schema diff in CI between Bland exports and your new tool definitions.
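For the schema-drift check, a minimal CI sketch that diffs required fields and property names between a Bland export and your new tool definition (the export's parameter layout is an assumption; adapt the accessors to your actual export):

```python
def schema_drift(bland_params: dict, new_params: dict) -> list[str]:
    """Report differences in required fields and property names
    between two JSON-schema parameter objects."""
    issues = []
    if set(bland_params.get("required", [])) != set(new_params.get("required", [])):
        issues.append("required fields differ")
    b_props = set(bland_params.get("properties", {}))
    n_props = set(new_params.get("properties", {}))
    for missing in sorted(b_props - n_props):
        issues.append(f"property dropped: {missing}")
    for added in sorted(n_props - b_props):
        issues.append(f"property added: {added}")
    return issues

old = {"required": ["name"], "properties": {"name": {}, "phone": {}}}
new = {"required": ["name"], "properties": {"name": {}}}
print(schema_drift(old, new))
```

Fail the CI job on any non-empty result so a contract change never ships silently.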

How CallSphere does this in production

CallSphere runs 37 specialist agents across 6 verticals on top of OpenAI Realtime, ElevenLabs and a Pion-based WebRTC mesh — never one giant pathway. Healthcare uses a 14-tool FastAPI service on port 8084 (HIPAA controls baked in), OneRoof Property uses 10 specialists over WebRTC + NATS, and Salon runs 4 agents with ElevenLabs voices and GB-YYYYMMDD-### booking refs. Pricing is flat $149/$499/$1499 with a 14-day trial and a 22% affiliate program — no per-minute surprise bills like Bland.

FAQ

Will my Bland pathway logic survive? Mostly. Linear flows port cleanly. Highly branched ones are better expressed as one prompt + tools.

What about Bland's voice clones? Re-clone in ElevenLabs or use OpenAI's stock voices; clone fidelity is comparable on 30-second samples.

Can I keep my Bland number? Yes — port the DID to Twilio (10-day SLA in the US).

Is this cheaper at low volume? Below ~3,000 minutes/month, Bland's bundled price wins. Above it, your own stack wins.
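The crossover is easy to sanity-check: with Bland at $0.09/min bundled and the new stack at $0.045/min plus a fixed monthly infra cost, break-even is fixed cost divided by the per-minute saving. The $135/month figure below is an illustrative assumption (bridge hosting plus monitoring), not a quoted price:

```python
BLAND_PER_MIN = 0.09
NEW_PER_MIN = 0.045
FIXED_INFRA = 135.0  # assumed monthly fixed cost for the new stack

def monthly_cost(minutes: int) -> tuple[float, float]:
    """Return (bland_cost, new_stack_cost) for a month's traffic."""
    return BLAND_PER_MIN * minutes, NEW_PER_MIN * minutes + FIXED_INFRA

breakeven = FIXED_INFRA / (BLAND_PER_MIN - NEW_PER_MIN)
print(breakeven)  # 3000.0 minutes/month
```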

Does CallSphere support migration? Yes — see /compare/bland-ai and /demo.
