By Sagar Shankaran, Founder of CallSphere
Wire ACS Call Automation bidirectional streaming to Voice Live API for production PSTN AI agents. Real C# Web App, EventGrid hookup, mid-call barge-in, transfer-to-human flow.
Key takeaways
TL;DR: As of January 2026, ACS Call Automation bidirectional streaming is GA. You purchase a phone number through ACS, hook EventGrid to your webhook, accept the call with AnswerCall, pass MediaStreamingOptions with EnableBidirectional=true, and pipe frames into Voice Live API. The whole loop fits in a small ASP.NET service.
What you'll build: a C# ASP.NET service that answers ACS-routed calls, opens a bidirectional media stream, and bridges audio to Voice Live API. Mid-call, the agent can call a tool to fetch order status, and a "transfer to human" intent triggers AddParticipant with a queue's phone number.
Required packages: the Azure.Communication.CallAutomation and Azure.Identity NuGet packages.

```mermaid
flowchart TD
    PSTN[Caller PSTN] --> ACS[ACS Call Automation]
    ACS -->|EventGrid IncomingCall| API[ASP.NET Webhook]
    API -->|AnswerCall + MediaStreaming| ACS
    ACS <-->|wss audio frames| API
    API <-->|wss| VL[Voice Live API]
    VL --> GPT[gpt-realtime-mini]
    API -->|AddParticipant| QUEUE[Human Queue Number]
```
```csharp
[HttpPost("/incoming")]
public async Task // ... the rest of this handler is missing from the source
```

```csharp
[HttpGet("/media")]
public async Task Media()
{
    // ACS connects to this endpoint over WebSocket; reject plain HTTP requests.
    if (!HttpContext.WebSockets.IsWebSocketRequest)
    {
        HttpContext.Response.StatusCode = 400;
        return;
    }

    var acs = await HttpContext.WebSockets.AcceptWebSocketAsync();

    // Outbound socket to Voice Live, authenticated with an AAD bearer token.
    using var vl = new ClientWebSocket();
    vl.Options.SetRequestHeader("Authorization", $"Bearer {await GetAadToken()}");
    await vl.ConnectAsync(new Uri("wss://vox-foundry.cognitiveservices.azure.com/voice-agent/realtime?api-version=2025-05-01-preview&model=gpt-realtime"), default);

    await SendSessionUpdate(vl);

    var t1 = Pump(acs, vl, ParseAcsFrame);  // ACS -> Voice Live
    var t2 = Pump(vl, acs, FormatAcsFrame); // Voice Live -> ACS

    // Stop bridging as soon as either direction ends.
    await Task.WhenAny(t1, t2);
}
```
ACS bidirectional frames are JSON-wrapped base64 PCM at 24kHz mono — the same sample rate Voice Live wants natively. No resampling.
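Because both sides carry base64 PCM at the same sample rate, bridging is mostly a matter of re-wrapping JSON. The sketch below is one way to do that, not the post's actual helpers. It assumes inbound ACS frames use the field names from the records in this post (kind, audioData.data), that Voice Live accepts input_audio_buffer.append and emits response.audio.delta as OpenAI Realtime does, and that ACS accepts outbound frames in the same shape it sends. FrameBridge and its method names are hypothetical; check the exact field casing against the ACS audio streaming schema before relying on it.

```csharp
#nullable enable
using System;
using System.Text;
using System.Text.Json;

// Hypothetical helper; the event and field names are assumptions described above.
public static class FrameBridge
{
    // ACS -> Voice Live: re-wrap the base64 audio as an input_audio_buffer.append event.
    // Returns null for frames that carry no audio payload.
    public static byte[]? AcsToVoiceLive(byte[] raw)
    {
        using var doc = JsonDocument.Parse(raw);
        var root = doc.RootElement;
        if (!root.TryGetProperty("kind", out var kind) || kind.GetString() != "AudioData")
            return null;
        if (!root.TryGetProperty("audioData", out var audio) ||
            !audio.TryGetProperty("data", out var data))
            return null;

        // The base64 string passes through untouched: no decoding, no resampling.
        var payload = new { type = "input_audio_buffer.append", audio = data.GetString() };
        return JsonSerializer.SerializeToUtf8Bytes(payload);
    }

    // Voice Live -> ACS: re-wrap an audio delta as an ACS audio frame.
    // Returns null for every other event type (transcripts, tool calls, and so on).
    public static byte[]? VoiceLiveToAcs(byte[] raw)
    {
        using var doc = JsonDocument.Parse(raw);
        var root = doc.RootElement;
        if (!root.TryGetProperty("type", out var type) ||
            type.GetString() != "response.audio.delta")
            return null;
        if (!root.TryGetProperty("delta", out var delta))
            return null;

        var payload = new { kind = "AudioData", audioData = new { data = delta.GetString() } };
        return JsonSerializer.SerializeToUtf8Bytes(payload);
    }
}
```

With helpers along these lines, a pump loop would forward only the non-null results, so control events from either side never reach the other socket as audio.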
```csharp
record AcsAudioFrame(string kind, AcsAudioData audioData);
record AcsAudioData(string data, string timestamp, string participantRawID, bool silent);

byte[] ParseAcsFrame(byte[] raw) {
    var f = JsonSerializer.Deserialize
    // ... the rest of this method is missing from the source
```
In session.update, register the tool. When Voice Live emits response.function_call_arguments.done, dispatch to your CRM SDK and reply with conversation.item.create (function_call_output) + response.create. Same pattern as OpenAI Realtime.
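A minimal sketch of that round trip, using only System.Text.Json. It assumes Voice Live follows the OpenAI Realtime event shapes: a session.update carrying a tools array, and a response.function_call_arguments.done event carrying call_id, name, and a JSON-string arguments field. ToolDispatch, the get_order_status tool, and the canned lookup are hypothetical stand-ins for your own CRM call.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper; event shapes are assumed to match OpenAI Realtime.
public static class ToolDispatch
{
    // session.update payload registering one illustrative tool.
    public static string BuildSessionUpdate() => JsonSerializer.Serialize(new
    {
        type = "session.update",
        session = new
        {
            tools = new[]
            {
                new
                {
                    type = "function",
                    name = "get_order_status",
                    description = "Look up the status of an order by its ID.",
                    parameters = new
                    {
                        type = "object",
                        properties = new { order_id = new { type = "string" } },
                        required = new[] { "order_id" }
                    }
                }
            },
            tool_choice = "auto"
        }
    });

    // Placeholder lookup; a real service would call its CRM SDK here.
    static string LookupOrder(JsonElement args)
    {
        var orderId = args.TryGetProperty("order_id", out var id) ? id.GetString() : null;
        return JsonSerializer.Serialize(new { order_id = orderId, status = "shipped" });
    }

    // Returns the messages to send back to Voice Live, in order.
    // Events other than a finished function call produce no messages.
    public static IReadOnlyList<string> HandleEvent(string eventJson)
    {
        using var doc = JsonDocument.Parse(eventJson);
        var root = doc.RootElement;
        if (!root.TryGetProperty("type", out var type) ||
            type.GetString() != "response.function_call_arguments.done")
            return Array.Empty<string>();

        var callId = root.GetProperty("call_id").GetString();
        var name = root.TryGetProperty("name", out var n) ? n.GetString() : null;
        using var args = JsonDocument.Parse(root.GetProperty("arguments").GetString() ?? "{}");

        var output = name == "get_order_status"
            ? LookupOrder(args.RootElement)
            : JsonSerializer.Serialize(new { error = "unknown tool" });

        var reply = new
        {
            type = "conversation.item.create",
            item = new { type = "function_call_output", call_id = callId, output }
        };
        return new[]
        {
            JsonSerializer.Serialize(reply),                          // the tool result
            JsonSerializer.Serialize(new { type = "response.create" }) // ask the model to continue
        };
    }
}
```

Sending the function_call_output before response.create matters: the model only speaks about the result once it has been added to the conversation.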
When the user says "agent please", parse the model's signal (a tool call request_transfer is the cleanest), then:
```csharp
await _calls.GetCallConnection(callConnectionId).AddParticipantAsync(
    new CallInvite(
        new PhoneNumberIdentifier("+1800SUPPORT"),
        new PhoneNumberIdentifier("+1YourACSNumber")));
```
ACS handles SIP REFER under the covers; the AI can stay in the call as a transcriber or drop with HangUpAsync.
Enable StartRecordingAsync for compliance. ACS recordings drop into your storage account; pipe through Azure AI Speech batch transcription + Foundry sentiment analysis for post-call analytics.
Same as the previous post: az containerapp create with --user-assigned for AAD, --ingress external, scaling to 50 replicas. No sticky-session setup is needed for the media stream in Container Apps: once a WebSocket is established, it stays on the replica that accepted it.
Answer Event Grid's SubscriptionValidationEvent with the validation code or the topic stays unverified.
Set EnableBidirectional=true for the new GA bidirectional path; otherwise you get one-way out-of-call streaming (legacy).

CallSphere uses ACS for select Microsoft-aligned enterprise tenants in Healthcare (HIPAA + BAA on AAD) but our default voice path is OpenAI Realtime over Twilio Media Streams via FastAPI :8084. We've measured ACS bidirectional median latency at ~750ms vs ~650ms for Twilio + OpenAI, but ACS wins on data residency for EU customers. 37 agents, 90+ tools, 115+ DB tables, 6 verticals. $149/$499/$1499, 14-day trial, 22% affiliate.
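The Event Grid validation handshake mentioned above can be sketched without any Azure SDK. This assumes the subscription uses the Event Grid event schema (a JSON array of events, each with eventType and data), not CloudEvents. EventGridHandshake and TryBuildValidationResponse are hypothetical names; the webhook would return the resulting JSON with HTTP 200 and handle the batch as normal events when the result is null.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Hypothetical helper for the Event Grid subscription validation handshake.
public static class EventGridHandshake
{
    const string ValidationEventType = "Microsoft.EventGrid.SubscriptionValidationEvent";

    // Returns the JSON body to echo back, or null when the batch
    // contains no validation event (for example, an IncomingCall batch).
    public static string? TryBuildValidationResponse(string requestBody)
    {
        using var doc = JsonDocument.Parse(requestBody);
        foreach (var evt in doc.RootElement.EnumerateArray())
        {
            if (evt.TryGetProperty("eventType", out var eventType) &&
                eventType.GetString() == ValidationEventType)
            {
                var code = evt.GetProperty("data").GetProperty("validationCode").GetString();
                return JsonSerializer.Serialize(new { validationResponse = code });
            }
        }
        return null;
    }
}
```

Running this check at the top of the /incoming handler, before any call handling, keeps the handshake from being mistaken for an incoming call.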
Q: Can I use ACS without Voice Live?
Yes. Bridge to any STT+LLM+TTS stack; Voice Live just removes the integration tax.
Q: How do I get an EU number?
ACS supports number purchase in 30+ countries via the portal; pick country during PurchasePhoneNumbers.
Q: Latency vs Twilio Media Streams?
On East US 2 with Voice Live: ~750ms. With Twilio + OpenAI Realtime: ~650ms. ACS catches up in EU regions where Twilio adds a transatlantic hop.
Q: How do I do warm transfers?
Use AddParticipantAsync then MuteParticipantAsync for the AI; the live agent picks up the same call leg.
Q: Can I record with redaction?
Yes — pipe recordings through Azure AI Speech batch transcription with profanity=Masked + a Foundry redaction prompt for PII/PHI before storing.
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.