AI Infrastructure

AWS Bedrock + Transcribe + Polly Stitched vs Realtime: Real Cost

Bedrock Claude + Transcribe streaming + Polly Neural runs $0.06–$0.10 per minute on paper. The honest math reveals where the AWS-native stack beats and where it loses to OpenAI Realtime.

The cost problem

```mermaid
flowchart TD
  Client[Client] --> Edge[Cloudflare Worker]
  Edge -->|WS upgrade| DO[Durable Object]
  DO --> AI[(OpenAI Realtime WS)]
  AI --> DO
  DO --> Client
  DO -.hibernation.-> Storage[(Persisted state)]
```
CallSphere reference architecture

Enterprises with AWS commits often default to building voice agents on the AWS-native stack: Transcribe for STT, Bedrock for LLM, and Polly for TTS. The pitch is "use your committed spend, stay in VPC, single billing." The trap is that the AWS stack is a stitched cascade — three services with three latency penalties — and the per-minute cost looks great only until you add Bedrock token cost honestly.

We modeled it against gpt-realtime to find the real break-even.

How AWS prices it

Amazon Transcribe (streaming):

  • Tier 1 (first 250k minutes/month): $0.024/min
  • Tier 2: $0.015/min (38% discount)
  • Tier 3: $0.0102/min (58% discount)
  • Speaker ID adds 20–40% extra
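
The tier mechanics above can be sketched as a blended-rate calculator. The 250k-minute Tier 1 boundary comes from the list; the Tier 2 width (next 750k minutes) is an assumption for illustration, so check your own AWS pricing page for exact breakpoints:

```python
def transcribe_streaming_cost(minutes: float) -> float:
    """Blended Amazon Transcribe streaming cost ($) using the
    per-minute rates listed above. Tier 2's width is assumed."""
    tiers = [
        (250_000, 0.024),        # Tier 1: first 250k minutes/month
        (750_000, 0.015),        # Tier 2: next 750k minutes (assumed width)
        (float("inf"), 0.0102),  # Tier 3: everything above
    ]
    cost, remaining = 0.0, minutes
    for width, rate in tiers:
        used = min(remaining, width)
        cost += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return cost

print(round(transcribe_streaming_cost(5), 3))        # 0.12 — one 5-min Tier 1 call
print(round(transcribe_streaming_cost(300_000), 0))  # 6750.0 — a fleet spilling into Tier 2
```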

Amazon Polly:

  • Standard voices: $4.00 per 1M characters
  • Neural voices: $16.00 per 1M characters
  • Long-Form voices: $100.00 per 1M characters
  • Generative voices (newer): higher than Long-Form
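
Polly bills by character, so TTS cost is driven by how much the agent speaks. A minimal sketch, assuming ~150 words per minute and ~5 characters per word (the same assumptions used in the profiles below):

```python
WPM = 150            # assumed speaking rate
CHARS_PER_WORD = 5   # assumed average, including spaces

POLLY_PER_M_CHARS = {  # $ per 1M characters, from the list above
    "standard": 4.00,
    "neural": 16.00,
    "long_form": 100.00,
}

def polly_cost(speech_minutes: float, voice: str = "neural") -> float:
    chars = speech_minutes * WPM * CHARS_PER_WORD
    return chars / 1_000_000 * POLLY_PER_M_CHARS[voice]

print(round(polly_cost(2, "neural"), 4))     # 0.024 — 2 min of agent speech
print(round(polly_cost(2, "long_form"), 3))  # 0.15 — same speech on Long-Form
```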

Amazon Bedrock (May 2026):

  • Claude 3.5 Haiku: $0.80/M input · $4.00/M output
  • Claude 3.5 Sonnet: $3.00/M input · $15.00/M output
  • Bedrock prompt caching: 90% discount on cached input where supported
  • Provisioned Throughput: from $21.18/hour per model unit
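
Token cost with caching can be modeled with a simple split between fresh and cached input. This is a simplified sketch: it applies the full 90% cache-read discount and ignores any cache-write premium, so it will slightly underestimate real bills:

```python
BEDROCK = {  # $ per 1M tokens, from the list above
    "haiku":  {"input": 0.80, "output": 4.00},
    "sonnet": {"input": 3.00, "output": 15.00},
}
CACHE_READ_DISCOUNT = 0.90  # up to 90% off cached input where supported

def bedrock_cost(model: str, input_tokens: int, output_tokens: int,
                 cached_fraction: float = 0.0) -> float:
    """Per-call Bedrock cost ($), splitting input into fresh vs cached."""
    p = BEDROCK[model]
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    in_cost = (fresh * p["input"]
               + cached * p["input"] * (1 - CACHE_READ_DISCOUNT)) / 1e6
    out_cost = output_tokens * p["output"] / 1e6
    return in_cost + out_cost

# 12k input (90% cached) + 2k output on Haiku:
print(round(bedrock_cost("haiku", 12_000, 2_000, cached_fraction=0.9), 4))  # 0.0098
```

Cache writes and multi-turn re-reads push the real figure toward the ~$0.018 used in Profile A below.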

Honest math

Profile A — 5-minute support call, Claude 3.5 Haiku, Polly Neural, Tier 1 Transcribe:

  • Transcribe: 5 × $0.024 = $0.12
  • Polly Neural (2 min × ~150 wpm × ~5 chars/word ÷ 1M × $16): $0.024
  • Bedrock Haiku (12k input cached + 2k output): ~$0.018
  • Total: ~$0.162/call → $0.032/min

But that uses Tier 1 Transcribe at $0.024/min. A production fleet that reaches Tier 2 ($0.015/min) drops the per-call total to $0.117 → $0.023/min.
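
Profile A's arithmetic can be reproduced with one small function (the ~$0.018 Haiku figure is taken from above; speaking-rate assumptions are 150 wpm × 5 chars/word):

```python
def per_call(call_min: float, tts_min: float, transcribe_rate: float,
             llm_cost: float, polly_rate: float = 16.00,
             wpm: int = 150, cpw: int = 5) -> float:
    """Total stitched-stack cost ($) for one call."""
    stt = call_min * transcribe_rate                 # Transcribe streaming
    tts = tts_min * wpm * cpw / 1e6 * polly_rate     # Polly Neural characters
    return stt + tts + llm_cost                      # plus Bedrock tokens

# Profile A: 5-min call, 2 min of agent speech, Tier 1, ~$0.018 Haiku
a = per_call(5, 2, 0.024, 0.018)
print(round(a, 3), round(a / 5, 3))   # 0.162 0.032

# Same call at the Tier 2 Transcribe rate:
b = per_call(5, 2, 0.015, 0.018)
print(round(b, 3), round(b / 5, 3))   # 0.117 0.023
```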

Profile B — 12-minute healthcare intake, Claude Sonnet, Polly Neural, 22k prompt:

  • Transcribe: 12 × $0.024 = $0.288
  • Polly Neural (5 min × 150 wpm × 5 chars ÷ 1M × $16): $0.060
  • Bedrock Sonnet (22k cached input over 18 turns + 8k output): ~$0.21
  • Total: ~$0.558 → $0.047/min

Profile C — Same as B but on gpt-realtime cached:

  • ~$0.96 → $0.080/min

So AWS stitched is ~40% cheaper than OpenAI Realtime cached on long, complex calls. The savings come from cheap Transcribe tier-2 + Bedrock prompt caching + Polly Neural.

The downside: latency. The cascaded AWS stack runs 700–900ms voice-to-voice on best-tuned configurations. gpt-realtime sits at ~430ms.
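
The 700–900ms cascade number is the sum of stage latencies that a single realtime model avoids. The per-stage budgets below are illustrative assumptions, not measurements, but they show how quickly a stitched pipeline eats the budget:

```python
# Illustrative stage budgets (ms) for the stitched AWS path — assumed values
cascade = {
    "transcribe_final_partial": 200,  # STT endpointing + final partial
    "bedrock_first_token": 350,       # LLM time-to-first-token
    "polly_first_audio_byte": 150,    # TTS synthesis start
    "network_and_buffering": 100,
}
total_ms = sum(cascade.values())
print(total_ms)  # 800 — inside the 700–900ms band

# Trade-off vs gpt-realtime, using the per-minute figures above:
print(f"{1 - 0.047 / 0.080:.0%} cheaper, but ~{total_ms - 430}ms slower")
# 41% cheaper, but ~370ms slower
```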

When AWS wins, when it loses

AWS wins when:

  • You have a Transcribe commit pulling you to Tier 2 or 3
  • Your prompt is huge (Bedrock cache rate is competitive)
  • Latency tolerance is 600ms+ (not premium support flows)
  • Compliance requires AWS VPC + KMS + CloudTrail end-to-end
  • You already pay for Bedrock provisioned throughput

AWS loses when:

  • Sub-500ms voice-to-voice is required
  • You are below 250k Transcribe minutes/month (Tier 1 pricing is uncompetitive)
  • You want the latest expressive voices (Polly is solid but trails the newest TTS models)
  • Your team is not deep in AWS — operational complexity is real

How CallSphere optimizes

CallSphere does not run pure AWS-stitched in production today, but we do use AWS for non-voice paths where it makes sense — AWS SES for cold outreach mail, S3 for call recording archives, and Bedrock as a fallback LLM for one Healthcare post-call analytics pipeline that needs the data residency story.

For voice itself we land on OpenAI Realtime + ElevenLabs for premium and Deepgram + GPT-4o-mini + Aura-2 for cost-sensitive — see our other posts in this batch for the math. Across 6 verticals — 37 agents, 90+ tools, 115+ DB tables — AWS is part of the back-of-house but not the realtime hot path.

If you are running on AWS already and considering a switch, the ROI calculator on our site lets you plug in your current AWS unit cost and compare to our pricing tiers ($149 / $499 / $1499). The 14-day no-card trial lets you A/B against your AWS-stitched baseline.

Optimization checklist

  1. Compute your real Transcribe tier — Tier 1 is rough; Tier 2/3 unlocks AWS savings.
  2. Use Polly Neural unless you need Long-Form quality (6.25× the price for marginal gains).
  3. Use Bedrock prompt caching aggressively — same 90% discount as Anthropic direct.
  4. Choose Claude Haiku for short flows, Sonnet for complex.
  5. Watch out for Bedrock Provisioned Throughput — only worth it at very high concurrency.
  6. Consider Polly's Generative Voices for brand voice — but benchmark vs ElevenLabs.
  7. Stay in one region to avoid cross-region egress charges.
  8. Use Speaker Diarization only if you need it — adds 20–40%.
  9. Pre-warm Bedrock with a small inference at start-of-shift to dodge cold-start.
  10. Monitor latency p95 with X-Ray; add Lambda Provisioned Concurrency if cold starts hurt.
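
On item 3: Bedrock's prompt caching is driven by where you place a cache marker in the request. A sketch of building a Converse API request with the large, stable system prompt marked cacheable — the `cachePoint` payload shape follows Bedrock's convention, but verify it and the example model ID against current AWS docs before relying on it:

```python
def build_converse_request(model_id: str, system_prompt: str,
                           user_text: str) -> dict:
    """Build kwargs for bedrock_runtime.converse() with the system
    prompt marked as cacheable via a cachePoint block."""
    return {
        "modelId": model_id,
        "system": [
            {"text": system_prompt},
            {"cachePoint": {"type": "default"}},  # cache everything above this
        ],
        "messages": [
            {"role": "user", "content": [{"text": user_text}]},
        ],
    }

req = build_converse_request(
    "anthropic.claude-3-5-haiku-20241022-v1:0",   # example model ID
    "You are a support voice agent ...",          # the 12k-token prompt in practice
    "Hi, I need to reschedule my appointment.",
)
# boto3.client("bedrock-runtime").converse(**req)  # requires AWS credentials
```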

FAQ

Is AWS Transcribe cheaper than Deepgram? On Tier 1, no — Deepgram Nova-3 ($0.0048/min) beats Transcribe Tier 1 ($0.024/min) 5×. On Tier 3, Transcribe ($0.0102) gets close.

Can I use Bedrock with prompt caching? Yes — Bedrock supports prompt caching for Claude models with up to 90% discount on cached input.

Should I use Polly Long-Form voices? Only for brand voice or audiobook use cases. The 6.25× price multiplier over Neural is hard to justify for live agents.

What about AWS Lex for the orchestration? Lex bundles intents and slot filling, but its LLM is dated. Most teams skip Lex and orchestrate directly.

Can I bring HIPAA workloads here? Yes — Transcribe, Polly, and Bedrock are all HIPAA-eligible with a BAA in place. Same as our Healthcare Voice Agent stack.

Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available, no signup required.