Build a Voice Agent on AWS App Runner with FastAPI + Bedrock (2026)
Deploy a HIPAA-eligible voice agent on AWS App Runner: FastAPI WebSocket bridge, Bedrock Claude for reasoning, ECR auto-deploy from GitHub, VPC connector for private RDS.
TL;DR — App Runner is AWS's fully-managed container service: point it at an ECR image, set CPU/memory, and it autoscales 1-25 instances behind a public HTTPS URL with WebSocket support. Pair with Bedrock Claude 4.7 Sonnet, an RDS Postgres in a VPC, and Twilio Media Streams for a HIPAA-eligible voice agent without managing EKS or ECS.
What you'll build
A FastAPI service packaged as a container image in Amazon ECR and deployed via App Runner, with a VPC connector to a private RDS for PostgreSQL instance. The service bridges Twilio Media Streams to Bedrock Nova Sonic (Amazon's speech-to-speech model) for sub-second voice. For CI, GitHub Actions builds and pushes the image, and App Runner auto-deploys it.
Prerequisites
- AWS account with App Runner, ECR, Bedrock, RDS access in us-east-1.
- Bedrock model access for `amazon.nova-sonic-v1:0` and/or `anthropic.claude-sonnet-4-7-20250620-v1:0`.
- GitHub repo + Actions for CI.
- Twilio number.
Architecture
flowchart TD
GH[GitHub Actions] -->|push image| ECR[Amazon ECR]
ECR -->|auto-deploy| AR[AWS App Runner]
AR -->|VPC connector| RDS[(RDS Postgres private)]
T[Twilio] -->|wss| AR
AR <-->|InvokeModelWithBidirectionalStream| BS[Bedrock Nova Sonic]
AR -->|InvokeModel fallback| CL[Bedrock Claude 4.7]
Step 1 — Containerize FastAPI
```dockerfile
FROM public.ecr.aws/docker/library/python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8080
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080", "--workers", "1"]
```
App Runner only forwards traffic to one port; use a single uvicorn worker per container and let App Runner scale instances.
Step 2 — Push to ECR
```bash
# ACCOUNT is your 12-digit AWS account ID.
ACCOUNT=$(aws sts get-caller-identity --query Account --output text)

aws ecr create-repository --repository-name voice-agent --region us-east-1
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin $ACCOUNT.dkr.ecr.us-east-1.amazonaws.com
docker build -t voice-agent .
docker tag voice-agent:latest $ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/voice-agent:latest
docker push $ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/voice-agent:latest
```
Step 3 — Create the App Runner service
```bash
# The VpcConnectorArn below refers to the connector from Step 5; create it first.
aws apprunner create-service \
  --service-name voice-agent \
  --source-configuration '{
    "ImageRepository": {
      "ImageIdentifier": "'$ACCOUNT'.dkr.ecr.us-east-1.amazonaws.com/voice-agent:latest",
      "ImageRepositoryType": "ECR",
      "ImageConfiguration": {"Port": "8080"}
    },
    "AutoDeploymentsEnabled": true,
    "AuthenticationConfiguration": {
      "AccessRoleArn": "arn:aws:iam::'$ACCOUNT':role/AppRunnerECRAccessRole"
    }
  }' \
  --instance-configuration '{
    "Cpu": "1 vCPU",
    "Memory": "2 GB",
    "InstanceRoleArn": "arn:aws:iam::'$ACCOUNT':role/AppRunnerVoiceAgentRole"
  }' \
  --network-configuration '{
    "EgressConfiguration": {"EgressType":"VPC","VpcConnectorArn":"arn:aws:apprunner:us-east-1:'$ACCOUNT':vpcconnector/voice-vpc/1/...."}
  }'
```
AutoDeploymentsEnabled: true makes App Runner pull a new image whenever ECR has a fresh :latest tag.
Step 4 — Wire to Bedrock Nova Sonic
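```python
import json

import boto3

br = boto3.client("bedrock-runtime", region_name="us-east-1")


async def nova_sonic_session(twilio_ws, stream_sid):
    # Sketch only. Confirm that your SDK version exposes the bidirectional
    # streaming operation; AWS's Nova Sonic samples use the experimental
    # aws_sdk_bedrock_runtime package rather than boto3.
    # streaming_body_iter is a placeholder for a helper (not shown) that
    # yields Nova Sonic input events built from Twilio media frames.
    response = br.invoke_model_with_bidirectional_stream(
        modelId="amazon.nova-sonic-v1:0",
        body=streaming_body_iter(twilio_ws),
    )
    async for chunk in response["body"]:
        ev = json.loads(chunk["chunk"]["bytes"])
        if ev["type"] == "audioOutput":
            # stream_sid is the streamSid from Twilio's "start" event.
            await twilio_ws.send_text(json.dumps({
                "event": "media",
                "streamSid": stream_sid,
                "media": {"payload": ev["audioOutput"]["content"]},
            }))
```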
Nova Sonic is Amazon's speech-to-speech model — drop-in replacement for STT+LLM+TTS.
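The bridge needs the `streamSid` from Twilio before it can send audio back. A minimal sketch of the Twilio side of the bridge is below: it parses inbound Media Streams messages and builds the outbound `media` message. The function names `parse_twilio_event` and `build_media_message` are hypothetical helpers, not part of any SDK, and any audio format conversion between Twilio and the model is left out.

```python
import base64
import json


def parse_twilio_event(raw: str) -> dict:
    """Reduce a Twilio Media Streams JSON message to the fields the bridge needs."""
    msg = json.loads(raw)
    event = msg.get("event")
    if event == "start":
        # The "start" event carries the streamSid used for all outbound messages.
        return {"kind": "start", "stream_sid": msg["start"]["streamSid"]}
    if event == "media":
        # Inbound audio arrives as a base64-encoded payload.
        return {"kind": "media", "audio": base64.b64decode(msg["media"]["payload"])}
    if event == "stop":
        return {"kind": "stop"}
    return {"kind": "other"}


def build_media_message(stream_sid: str, audio: bytes) -> str:
    """Build the JSON text frame that plays audio back to the caller."""
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": base64.b64encode(audio).decode("ascii")},
    })
```

In a FastAPI handler, the `stream_sid` captured from the `start` event would be passed to whatever coroutine forwards model audio to Twilio.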
Step 5 — VPC connector to private RDS
```bash
aws apprunner create-vpc-connector \
  --vpc-connector-name voice-vpc \
  --subnets subnet-aaa subnet-bbb \
  --security-groups sg-rds-client
```
App Runner egress goes through this connector, so RDS can be private (no public IP). Inbound (Twilio → App Runner) still hits the public HTTPS URL — no change needed.
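One way to confirm the connector and security groups are wired correctly is a TCP reachability check at container startup. The sketch below uses only the standard library; `can_reach` is a hypothetical helper, and the host name in the usage note is a placeholder for your RDS endpoint.

```python
import socket


def can_reach(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures.
        return False
```

For example, calling `can_reach("your-db-endpoint.example", 5432)` from a startup hook and logging the result makes a misconfigured subnet or security group visible before the first call arrives.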
Step 6 — IAM role for the instance
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithBidirectionalStream"],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "rds-db:connect",
      "Resource": "arn:aws:rds-db:us-east-1::dbuser:db-*/voice_user"
    }
  ]
}
```
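For production, the Bedrock statement can be scoped to specific models and the `rds-db` ARN needs an account ID. The sketch below generates such a policy document; `build_instance_policy` is a hypothetical helper, the account ID in the usage note is a dummy value, and depending on how a model is invoked other resource types may be required.

```python
import json


def build_instance_policy(region, account_id, model_ids, db_resource_id, db_user):
    """Return an IAM policy dict scoped to the given Bedrock models and DB user."""
    model_arns = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for model_id in model_ids
    ]
    db_arn = f"arn:aws:rds-db:{region}:{account_id}:dbuser:{db_resource_id}/{db_user}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithBidirectionalStream",
                ],
                "Resource": model_arns,
            },
            {"Effect": "Allow", "Action": "rds-db:connect", "Resource": db_arn},
        ],
    }


if __name__ == "__main__":
    # 123456789012 is a dummy account ID used only for illustration.
    policy = build_instance_policy(
        "us-east-1", "123456789012", ["amazon.nova-sonic-v1:0"], "db-*", "voice_user"
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the instance role in place of the wildcard version.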
Step 7 — GitHub Actions for CI
```yaml
name: ci
on:
  push:
    branches: [main]
permissions:
  id-token: write   # needed to assume the role via GitHub OIDC
  contents: read
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::ACCT:role/GhaDeployer
          aws-region: us-east-1
      - run: aws ecr get-login-password | docker login --username AWS --password-stdin ACCT.dkr.ecr.us-east-1.amazonaws.com
      - run: |
          docker build -t voice-agent .
          docker tag voice-agent ACCT.dkr.ecr.us-east-1.amazonaws.com/voice-agent:latest
          docker push ACCT.dkr.ecr.us-east-1.amazonaws.com/voice-agent:latest
```
App Runner picks up the new image automatically.
Pitfalls
- WebSocket idle timeout is 120s by default — adjust via service config to 600s.
- Single port: App Runner exposes one HTTP/WS port. Use uvicorn or sidecars carefully.
- No persistent disk — ephemeral. Logs to CloudWatch via the App Runner integration.
- VPC connector cold start adds ~1-2s on cold scale-out. Keep min instances = 2.
- Bedrock Nova Sonic regional: us-east-1 only as of May 2026.
- HIPAA: App Runner is HIPAA-eligible; sign BAA, encrypt RDS with KMS CMK, turn off CloudWatch detailed logging for PHI.
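For the idle-timeout pitfall, an application-level heartbeat is one way to keep a quiet connection from being closed. The sketch below is an assumption-laden illustration, not App Runner or Twilio API: `heartbeat` is a hypothetical coroutine, `send` stands in for something like a WebSocket's `send_text`, and the payload must be a message the peer accepts.

```python
import asyncio
from typing import Awaitable, Callable


async def heartbeat(
    send: Callable[[str], Awaitable[None]],
    interval_s: float,
    stop: asyncio.Event,
    payload: str = '{"event": "keepalive"}',
) -> int:
    """Send payload every interval_s seconds until stop is set; return the count sent."""
    sent = 0
    while not stop.is_set():
        try:
            # Wake early if stop is set; otherwise time out after interval_s.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            await send(payload)
            sent += 1
    return sent
```

Run it as a background task alongside the audio loop and set `stop` when the call ends, with an interval comfortably below the idle timeout.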
How CallSphere does this in production
CallSphere's HIPAA Healthcare vertical runs on EKS in a private VPC with FastAPI :8084 — App Runner was tempting but the cost crossed over for our scale (37 agents, 90+ tools, 115+ DB tables, 6 verticals). For teams under ~50k call-min/day, App Runner is cheaper than EKS and far less ops. We use Pion Go + NATS for the OneRoof multi-family vertical's SIP layer. $149/$499/$1499, 14-day trial, 22% affiliate.
FAQ
Q: App Runner vs ECS Fargate vs EKS?
App Runner has near-zero ops but becomes expensive per call-minute beyond ~50k call-min/day. Fargate sits in the middle. EKS has the best cost at scale and the most ops. Start with App Runner.
Q: Does App Runner support Bedrock streaming?
Yes — InvokeModelWithBidirectionalStream works fine with App Runner's WS support.
Q: Multi-region?
App Runner is regional; for global coverage, deploy in 2-3 regions and use Route 53 latency routing.
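Q: Cost at 1k call-min/day?
A 1 vCPU / 2 GB instance at $0.064/hour x 24 h x ~3 instances ≈ $4.61/day of compute (0.064 x 24 x 3 = 4.608). Check whether that hourly rate includes memory, which App Runner bills separately per GB-hour. Bedrock and Twilio charges dominate.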
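The compute estimate can be reproduced with a few lines of arithmetic. This is a sketch using the article's own inputs ($0.064/hour, ~3 instances, 1k call-min/day); the function names are hypothetical and current AWS pricing should be checked before relying on the numbers.

```python
def daily_compute_cost(hourly_rate_usd: float, instances: int, hours: float = 24.0) -> float:
    """Daily cost of running `instances` containers at a flat hourly rate."""
    return hourly_rate_usd * hours * instances


def average_concurrent_calls(call_minutes_per_day: float) -> float:
    """Average number of simultaneous calls implied by a daily call-minute volume."""
    return call_minutes_per_day / (24 * 60)


if __name__ == "__main__":
    print(round(daily_compute_cost(0.064, 3), 2))     # 4.61
    print(round(average_concurrent_calls(1_000), 2))  # 0.69
```

At 1k call-min/day the average concurrency is below one call, so the ~3 instances reflect headroom for peaks and warm capacity rather than steady load.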
Q: Can I deploy from GitHub source (no Docker)?
Yes. App Runner supports source-code mode for Python/Node, but for voice agents Docker gives you better control over runtime versions.