By Sagar Shankaran, Founder of CallSphere
Deploy GPT-Realtime-2 on Azure AI Foundry. Region availability, networking, data residency, BAA, and the gotchas teams hit in the first 48 hours.
Key takeaways
Alongside OpenAI's direct launch of GPT-Realtime-2 on May 7, 2026, Microsoft made the same model family available through Azure AI Foundry. For enterprises that already buy AI through Azure — for procurement, compliance, BAA, data residency, or BYOC reasons — this is the deployment path that matters.
This is a practical guide to what is different on Azure vs OpenAI direct, and the gotchas that have surfaced in the first 72 hours.
Five durable reasons that have nothing to do with the model itself:
Six things to know if you have been on OpenAI direct and are moving to Foundry:
Foundry's headline pricing for GPT-Realtime-2 mirrors OpenAI's:
Translate ($0.034/min) and Whisper streaming ($0.017/min) are also on Foundry's rate card. Enterprise commit customers may have negotiated rates that differ.
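As a rough worked example of how these per-minute rates turn into a monthly bill, here is a minimal sketch that uses only the two rates quoted above. The 10,000-minute volume is a made-up figure for illustration, and negotiated enterprise rates would change the result.

```python
from decimal import Decimal

# Per-minute list rates quoted in this article (USD).
RATES_PER_MIN = {
    "translate": Decimal("0.034"),
    "whisper_streaming": Decimal("0.017"),
}


def monthly_cost(service: str, minutes: int) -> Decimal:
    """List-price cost in USD for a number of audio minutes."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    # Decimal avoids float rounding drift; quantize rounds to whole cents.
    return (RATES_PER_MIN[service] * minutes).quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Hypothetical volume: 10,000 minutes per month on each service.
    for name in RATES_PER_MIN:
        print(name, monthly_cost(name, 10_000))
```

At that assumed volume, Translate comes to $340.00 and Whisper streaming to $170.00 per month at list price.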
The default networking story on Foundry deployments:
For voice specifically, the websocket path needs careful firewall configuration. The most common day-one deployment delay we have seen is a network team that has not yet allowed the streaming websocket connection through corporate egress.
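One way to catch an egress block before launch day is a quick reachability probe run from inside the corporate network. The sketch below uses only the Python standard library. The function names are ours, and the URL you pass in is whatever realtime endpoint your Foundry deployment exposes (not shown here). It checks only TCP and TLS reachability; it does not perform the websocket upgrade or authenticate, so a proxy that blocks upgrades can still pass this check.

```python
import socket
import ssl
from urllib.parse import urlsplit


def ws_target(url: str) -> tuple[str, int, bool]:
    """Return (host, port, use_tls) for a ws:// or wss:// URL."""
    parts = urlsplit(url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError(f"not a websocket URL: {url!r}")
    if not parts.hostname:
        raise ValueError(f"no host in URL: {url!r}")
    use_tls = parts.scheme == "wss"
    # Fall back to the scheme's default port when none is given.
    port = parts.port or (443 if use_tls else 80)
    return parts.hostname, port, use_tls


def probe_egress(url: str, timeout: float = 5.0) -> str:
    """Attempt a TCP connect (plus TLS handshake for wss) to the URL's host."""
    host, port, use_tls = ws_target(url)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if use_tls:
                ctx = ssl.create_default_context()
                with ctx.wrap_socket(sock, server_hostname=host):
                    return "tcp+tls ok"
            return "tcp ok"
    except OSError as exc:  # timeouts, DNS failures, and TLS errors
        return f"blocked or unreachable: {exc}"
```

Running `probe_egress("wss://<your-endpoint-host>/...")` from a machine behind the corporate firewall gives the network team a concrete failure message to work from.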
Three patterns that have already surfaced this week:
CallSphere is a managed AI voice and chat agent platform. We do not require customers to pick a cloud or manage Foundry quotas. The platform is the abstraction: customers pay per-interaction plan pricing (Starter at $149/mo for 2,000 interactions, Growth at $499/mo for 10,000, Scale at $1,499/mo for 50,000) without owning the deployment surface. For enterprises with a hard Azure-only mandate, we accommodate that on Scale-tier deployments; for everyone else, the cloud underneath is something we operate.
Talk to us about deployment options: callsphere.ai/demo.
Q: Is Foundry strictly worse on raw speed than OpenAI direct? A: No. In our testing the two are within margin of error: some regions are faster, some slower. The differences are small next to the effect of tuning your own production stack.
Q: Can I run hybrid — Foundry for prod, OpenAI direct for dev? A: Yes. Most teams do exactly this. Pin model versions explicitly so a Foundry rollout does not surprise prod.
Q: When does the BAA cover the new realtime models? A: Microsoft has confirmed that BAA coverage is rolling out in parallel with model availability. Confirm coverage in writing before any HIPAA traffic flows.
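The advice above to pin model versions in a hybrid setup can be enforced in configuration. This is a minimal sketch, not a real SDK: the `ModelTarget` class, the provider labels, the model identifier string, and the list of "floating" aliases are all our own illustrative assumptions, and the version strings are placeholders for whatever identifiers your provider publishes.

```python
from dataclasses import dataclass

# Aliases assumed to float to whatever the provider currently serves.
FLOATING = {"", "latest", "auto", "default"}


@dataclass(frozen=True)
class ModelTarget:
    provider: str  # our own label, e.g. "azure-foundry" or "openai-direct"
    model: str     # assumed identifier; check your provider's model list
    version: str   # explicit version or snapshot string

    def __post_init__(self) -> None:
        # Reject unpinned versions at config-load time, not at call time.
        if self.version.strip().lower() in FLOATING:
            raise ValueError(
                f"{self.provider}/{self.model}: version must be pinned, "
                f"got {self.version!r}"
            )


# Placeholder versions: replace with the pinned identifiers you have tested.
TARGETS = {
    "prod": ModelTarget("azure-foundry", "gpt-realtime-2", "<pinned-version>"),
    "dev": ModelTarget("openai-direct", "gpt-realtime-2", "<pinned-version>"),
}
```

Failing fast on a floating alias means a provider-side rollout cannot change prod behavior until someone edits the pinned value deliberately.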
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.