By Sagar Shankaran, Founder of CallSphere
How to test an AI voice agent end-to-end with a headless browser, fake audio, and a deterministic harness in 2026. Playwright wins; here is the production setup.
Key takeaways
You cannot ship an AI voice agent that is only tested by humans. Headless browsers with fake-audio capture and deterministic media injection are the 2026 way to run thousands of voice scenarios in CI.
A full-stack voice agent test exercises: ephemeral token mint, ICE gathering, SDP exchange, DataChannel function calls, audio playback, and end-of-turn detection. None of that runs reliably against an HTTP mock. You need a real browser, real WebRTC, and a real way to feed audio in and capture audio out.
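One way to keep those layers honest is to have the test page log a marker for each stage and fail the run if any marker is missing. The following is a minimal sketch of that idea; the stage labels and the `missingStages` helper are hypothetical names chosen for illustration, not part of any real API or of CallSphere's harness.

```ts
// The six stages named above, in the order a call is expected to reach them.
// These labels are hypothetical; use whatever your page actually logs.
const CALL_STAGES = [
  "token_minted",
  "ice_gathered",
  "sdp_exchanged",
  "tool_call_received",
  "audio_played",
  "end_of_turn_detected",
] as const;

type CallStage = (typeof CALL_STAGES)[number];

// Returns every expected stage that never appeared in the observed log,
// preserving the expected order so the first gap is easy to spot.
function missingStages(observed: readonly string[]): CallStage[] {
  return CALL_STAGES.filter((stage) => observed.indexOf(stage) === -1);
}
```

A test could collect the observed markers from the page and assert that `missingStages(...)` returns an empty array, which turns "the call worked" into six separate checks.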
In 2026 the practical answer is Playwright with Chromium. Selenium still works (and Selenium's WebDriver BiDi has caught up), but Playwright's auto-wait, tracing, and built-in browser launch flags make it 5–10x less flaky for media tests. New greenfield projects in 2026 should not start on Selenium unless your team has heavy Java or C# investment, or Safari is in your test matrix and you need legacy WebDriver paths.
The model has shifted twice in two years: first away from Puppeteer (which Playwright outgrew on cross-browser support), then toward MCP-driven test harnesses where Claude or Cursor invoke browser tools directly. Both still rely on the same fake-mic primitives.
```mermaid
flowchart LR
  CI[CI runner] -- launch headless --> Chromium
  Chromium -- fake mic --> WAV[(scripted audio)]
  Chromium -- WebRTC --> Agent[AI voice agent]
  Agent -- audio --> Chromium
  Chromium -- captured WAV --> Asserts[Whisper / golden checks]
```
The browser is launched with two key Chromium flags: `--use-fake-ui-for-media-stream` (auto-grants mic permission) and `--use-file-for-fake-audio-capture=/path/to/in.wav` (substitutes the mic with a WAV). The file flag only takes effect alongside `--use-fake-device-for-media-stream`, so pass all three. Outbound audio is captured by wrapping the remote `MediaStreamTrack` in a `MediaStream` and recording it with a `MediaRecorder`.
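The fake-mic flag needs a WAV file on disk before the browser starts. As a minimal sketch, assuming 16-bit mono PCM is an acceptable input format for your Chromium build, the helpers below wrap raw samples in a standard 44-byte RIFF/WAVE header. Both `encodeWavPcm16` and `sineTone` are hypothetical names introduced here; a tone only proves the media path is alive, so real scenarios would still use scripted speech.

```ts
// Wraps 16-bit mono PCM samples in a 44-byte RIFF/WAVE header.
function encodeWavPcm16(samples: Int16Array, sampleRate: number): Uint8Array {
  const dataBytes = samples.length * 2; // 2 bytes per 16-bit sample
  const out = new Uint8Array(44 + dataBytes);
  const view = new DataView(out.buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataBytes, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size for plain PCM
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // channel count (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // byte rate = rate * channels * 2
  view.setUint16(32, 2, true); // block align = channels * 2
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataBytes, true);

  // Samples are stored little-endian, immediately after the header.
  for (let i = 0; i < samples.length; i++) {
    view.setInt16(44 + i * 2, samples[i], true);
  }
  return out;
}

// Generates a half-amplitude sine tone, useful as a smoke-test fixture.
function sineTone(freqHz: number, seconds: number, sampleRate: number): Int16Array {
  const count = Math.round(seconds * sampleRate);
  const samples = new Int16Array(count);
  for (let i = 0; i < count; i++) {
    const phase = (2 * Math.PI * freqHz * i) / sampleRate;
    samples[i] = Math.round(0.5 * 32767 * Math.sin(phase));
  }
  return samples;
}
```

Writing the returned bytes to a file and pointing `--use-file-for-fake-audio-capture` at that path gives the harness a known input whose length and content never change between runs.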
CallSphere runs ~600 headless voice scenarios per night across the six verticals (real estate, healthcare, behavioral health, legal, salon, insurance). The pipeline:
Across 37 agents, 90+ tools, and 115+ database tables, this has caught regressions in tool registries, latency budgets, and SDP munging within hours of merge. SOC 2 + HIPAA test fixtures use synthetic data only. Pricing is $149/$499/$1499 with the 14-day trial; affiliates earn 22% (see /affiliate).
```ts
import { test, expect, chromium } from "@playwright/test";

test("real-estate agent schedules tour", async () => {
  const browser = await chromium.launch({
    args: [
      "--use-fake-ui-for-media-stream", // auto-grant mic permission
      "--use-fake-device-for-media-stream", // swap real devices for fake ones
      "--use-file-for-fake-audio-capture=fixtures/realestate-schedule.wav", // scripted caller audio
      "--autoplay-policy=no-user-gesture-required", // let agent audio play without a click
    ],
  });
  try {
    const ctx = await browser.newContext();
    const page = await ctx.newPage();
    await page.goto("https://callsphere.ai/demo?vertical=real-estate");
    await page.click("button[data-test=start-call]");
    // Wait until the DataChannel emits a confirmed tool call.
    const event = await page.waitForFunction(() => (window as any).__lastToolCall);
    expect(await event.jsonValue()).toMatchObject({ name: "schedule_showing" });
  } finally {
    // Close the browser even when an assertion fails.
    await browser.close();
  }
});
```
Why not curl the model directly? That misses ICE, SDP, DataChannel ordering, and audio jitter — exactly the layers that break in production.
Does WebRTC work in headless Chrome? Yes — JavaScript execution, WebRTC, service workers all work the same as headed.
What about Safari? WebKit's headless support lags. Run a manual cross-browser pass weekly; do not block CI on Safari.
How do I avoid LLM nondeterminism? Use `temperature: 0` for tests, lock prompts, and assert on tool calls (deterministic) rather than wording.
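Asserting on tool calls rather than wording can be sketched as a partial match: the tool name must be equal, every argument the test cares about must be equal, and anything else the model adds is ignored. The `ToolCall` shape, the `args` field, and `matchesToolCall` are assumptions made for this example; only the `schedule_showing` tool name comes from the test above.

```ts
// Hypothetical shape of a tool call as surfaced to the test page.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ExpectedToolCall {
  name: string;
  args?: Record<string, unknown>;
}

// True when the tool name matches and every expected argument is present
// with an equal value. Extra arguments on the actual call are ignored.
// Values are compared via JSON.stringify, so nested objects must list
// their keys in the same order to count as equal.
function matchesToolCall(actual: ToolCall, expected: ExpectedToolCall): boolean {
  if (actual.name !== expected.name) return false;
  const wanted = expected.args || {};
  const keys = Object.keys(wanted);
  for (let i = 0; i < keys.length; i++) {
    const key = keys[i];
    if (JSON.stringify(actual.args[key]) !== JSON.stringify(wanted[key])) {
      return false;
    }
  }
  return true;
}
```

The free-text fields a model tends to reword (summaries, notes, confirmations) stay out of the expected object, so a harmless change in phrasing does not fail the build.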
Can I run hundreds in parallel? Yes — one Chromium per worker, ~250 MB RAM each. We run 32 in parallel on a single c6i.8xlarge.
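The sizing arithmetic behind that answer is simple enough to sketch. The helper below caps workers at one per vCPU and checks that the memory budget also fits; `maxParallelWorkers` and the headroom parameter are hypothetical, and the 32 vCPU / 64 GiB figures in the usage note are my reading of the c6i.8xlarge spec, so verify them against the AWS instance table.

```ts
// Largest worker count that fits both the CPU and the memory budget.
// reserveMiB is headroom left for the OS, the test runner, and artifacts.
function maxParallelWorkers(
  vcpus: number,
  ramMiB: number,
  perWorkerMiB: number,
  reserveMiB: number,
): number {
  const byMemory = Math.floor((ramMiB - reserveMiB) / perWorkerMiB);
  return Math.max(0, Math.min(vcpus, byMemory));
}
```

With 32 vCPUs, 65536 MiB of RAM, roughly 250 MiB per Chromium, and 4096 MiB held back, memory would allow 245 workers, so CPU is the binding limit and the result is 32. On a smaller box with 8192 MiB the same call returns 16, and memory becomes the constraint.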
Does Selenium support fake mic? Yes via Chromium options — but Playwright handles them more cleanly.
What about MCP-driven tests? Playwright MCP is great for ad-hoc exploration; for repeatable CI use the standard Playwright runner.
Can I record real users? Not without consent. For QA fixtures, synthesize with a TTS pass instead of recording prospects.
Three rules from running 600 nightly scenarios:
We also run a weekly chaos pass: random jitter and packet loss injected via `tc netem` on the runner. It catches degradation we would otherwise only see in P99 customer reports.
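A chaos pass like this can be driven from the same TypeScript harness by building the `tc` argument list in code. This is a minimal sketch under stated assumptions: the interface name, the value ranges, and the helper names are all hypothetical, and the random source is injected so a failing run can be replayed with the same numbers.

```ts
// Impairment settings for one chaos run.
interface NetemProfile {
  delayMs: number; // base one-way delay
  jitterMs: number; // variation around the base delay
  lossPercent: number; // share of packets dropped
}

// Picks jitter and loss within the given ceilings. `rand` must return a
// number in [0, 1); pass a seeded generator to make runs reproducible.
function randomProfile(
  rand: () => number,
  baseDelayMs: number,
  maxJitterMs: number,
  maxLossPercent: number,
): NetemProfile {
  return {
    delayMs: baseDelayMs,
    jitterMs: Math.round(rand() * maxJitterMs),
    lossPercent: Math.round(rand() * maxLossPercent * 10) / 10, // one decimal place
  };
}

// Arguments for `tc` that attach a netem qdisc to the interface.
function netemAddArgs(iface: string, profile: NetemProfile): string[] {
  return [
    "qdisc", "add", "dev", iface, "root", "netem",
    "delay", `${profile.delayMs}ms`, `${profile.jitterMs}ms`,
    "loss", `${profile.lossPercent}%`,
  ];
}

// Arguments for `tc` that remove the root qdisc again after the run.
function netemDelArgs(iface: string): string[] {
  return ["qdisc", "del", "dev", iface, "root"];
}
```

The arrays are meant to be passed to something like Node's `execFileSync("tc", args)` on a runner with root or `CAP_NET_ADMIN`, with the delete call placed in a `finally` block so a failed scenario does not leave the interface impaired.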
Try the production agent on /demo, check /pricing, or start a /trial.
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.