By Sagar Shankaran, Founder of CallSphere
Using GPT-Realtime-2 for healthcare voice agents. BAA scope, PHI handling, retention, logging, and why a managed platform usually wins this build.
Key takeaways
On May 7, 2026, OpenAI launched GPT-Realtime-2 — 128K context, GPT-5-class reasoning, $32/1M audio input, $64/1M output, $0.40/1M cached. The first question every healthcare team asked: can we actually use this for patient-facing voice?
The short answer: yes, but the BAA, PHI handling, retention, and audit story matter more than the model spec. This post covers what the right deployment actually looks like.
Five concrete obligations that apply to any AI voice agent touching PHI:
The model is one item on this list. The other four are operational, and they are where most healthcare voice projects stall.
Two paths in 2026:
OpenAI direct: OpenAI now offers BAA-eligible deployments under its enterprise tier for the GPT-Realtime line. The data handling commitment includes zero data retention for inputs and outputs unless the customer opts in.
Azure AI Foundry: Microsoft has had HIPAA BAA scope on Azure OpenAI for over a year. Foundry inherits the same coverage for GPT-Realtime-2 as it rolls out by region.
In practice most large healthcare deployments still pick Foundry — not because the model is different, but because procurement, networking, and audit infrastructure already live on Azure.
For a typical primary-care call:
At GPT-Realtime-2 pricing with caching, per-call model spend is $0.55–$0.85 for a typical primary-care call. A practice doing 1,500 patient calls/mo sits around $1,000/mo in model spend — small next to the labor it replaces, large next to a chatbot.
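The arithmetic behind these figures can be sketched in a few lines. The per-million-token rates and the $0.55 to $0.85 per-call range come from this post; the token counts in the last example are hypothetical, chosen only to show how a call could land inside that range, and the function names are illustrative.

```python
# Rates quoted in this post, in USD per 1M tokens.
AUDIO_INPUT_PER_M = 32.00
AUDIO_OUTPUT_PER_M = 64.00
CACHED_INPUT_PER_M = 0.40


def call_cost(uncached_in: int, cached_in: int, out: int) -> float:
    """Model spend in USD for one call, given token counts per price tier."""
    total = (
        uncached_in * AUDIO_INPUT_PER_M
        + cached_in * CACHED_INPUT_PER_M
        + out * AUDIO_OUTPUT_PER_M
    )
    return total / 1_000_000


def monthly_spend(calls_per_month: int, per_call_usd: float) -> float:
    """Monthly model spend in USD for a given call volume."""
    return calls_per_month * per_call_usd


# Monthly range for 1,500 calls at the per-call range quoted above.
low = monthly_spend(1_500, 0.55)
high = monthly_spend(1_500, 0.85)
print(f"${low:,.0f} to ${high:,.0f} per month")

# Hypothetical token mix (assumption, not measured data):
# 8K uncached audio input, 100K cached input, 6K audio output.
example = call_cost(uncached_in=8_000, cached_in=100_000, out=6_000)
print(f"${example:.2f} per call")
```

The range works out to $825 to $1,275 per month, which is where the "around $1,000/mo" figure comes from. Most of the per-call cost in the hypothetical mix comes from uncached input and output audio; the cached prefix contributes only a few cents.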
Three patterns we see in production healthcare voice deployments:
The model is rarely the bottleneck. The real timeline costs:
A solo healthcare team building from scratch is looking at 3–6 months before the first compliant patient call. A managed platform compresses this to 3–5 business days because the BAA, network controls, audit pipeline, disclosure scripts, and staff training materials are pre-built.
CallSphere is a managed AI voice and chat agent platform. Healthcare is one of our 6 live verticals, alongside real estate, sales, salon/beauty, IT helpdesk, and after-hours escalation. The platform is HIPAA-friendly, supports BAA workflows, ships ~14 function tools (including appointment scheduling, patient verification, provider messaging, escalation), and covers 57+ languages for multilingual practices.
Pricing: Starter $149/mo (2,000 interactions), Growth $499/mo (10,000), Scale $1,499/mo (50,000). Most healthcare practices go live in 3–5 business days.
For practices that want to build on GPT-Realtime-2 directly, that path is real — it is just a different timeline and a different ops surface. For practices that want to take patient calls next week, the managed path is what we built.
See it in action: callsphere.ai/demo.
Q: Can I run GPT-Realtime-2 on-prem for HIPAA?
A: No. Both OpenAI direct and Azure Foundry are cloud-only. The compliance story is BAA-and-cloud, not on-prem.
Q: Does the cached input pricing apply to PHI tokens?
A: Cached input is a billing optimization, not a compliance change. PHI in the cached prefix is still PHI and still scoped under the BAA.
Q: What happens if the AI says something clinically wrong?
A: The deployment must include explicit scoping (the AI does not diagnose, prescribe, or give clinical advice) and an escalation path to a human. This is your responsibility regardless of which model you use.
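As a rough illustration of the scoping-plus-escalation idea, here is a minimal sketch of a first-pass filter that routes clinical-sounding requests to a human. The pattern list, the `Routing` type, and `route_utterance` are all hypothetical names invented for this example. A keyword filter like this is coarse and would need to sit alongside prompt-level scoping and staff review, not replace them.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for clinical-sounding requests (an assumption for
# illustration; a real deployment would need a reviewed, much broader list).
CLINICAL_PATTERNS = [
    r"\bdiagnos\w*",
    r"\bprescri\w*",
    r"\bdos(e|age)\b",
    r"\bshould i (take|stop)\b",
    r"\bside effects?\b",
]
_CLINICAL_RE = re.compile("|".join(CLINICAL_PATTERNS), re.IGNORECASE)


@dataclass
class Routing:
    escalate: bool
    reason: str


def route_utterance(text: str) -> Routing:
    """Flag utterances that look clinical so a human can take over."""
    if _CLINICAL_RE.search(text):
        return Routing(True, "clinical question: hand off to staff")
    return Routing(False, "in scope")


# A dosage question is escalated; a scheduling request stays with the agent.
clinical = route_utterance("What dosage of ibuprofen should I take?")
scheduling = route_utterance("I need to reschedule my appointment")
print(clinical.escalate, scheduling.escalate)
```

The design choice here is to fail toward escalation: anything that matches goes to a person, and false positives cost a handoff rather than a clinical error.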
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.