Build a Voice Agent on AWS App Runner with FastAPI + Bedrock (2026)

Deploy a HIPAA-eligible voice agent on AWS App Runner: FastAPI WebSocket bridge, Bedrock Claude for reasoning, ECR auto-deploy from GitHub, VPC connector for private RDS.

TL;DR — App Runner is AWS's fully-managed container service: point it at an ECR image, set CPU/memory, and it autoscales 1-25 instances behind a public HTTPS URL with WebSocket support. Pair with Bedrock Claude 4.7 Sonnet, an RDS Postgres in a VPC, and Twilio Media Streams for a HIPAA-eligible voice agent without managing EKS or ECS.

What you'll build

A FastAPI service, containerized and pushed to a private Amazon ECR repository, then deployed via App Runner with a VPC connector to a private RDS for PostgreSQL instance. The service bridges Twilio Media Streams to Bedrock Nova Sonic (Amazon's speech-to-speech model) for sub-second voice responses. CI: GitHub Actions builds and pushes the image, and App Runner auto-deploys it.

Prerequisites

  1. AWS account with App Runner, ECR, Bedrock, RDS access in us-east-1.
  2. Bedrock model access for amazon.nova-sonic-v1:0 and/or anthropic.claude-sonnet-4-7-20250620-v1:0.
  3. GitHub repo + Actions for CI.
  4. Twilio number.

Architecture

flowchart TD
  GH[GitHub Actions] -->|push image| ECR[Amazon ECR]
  ECR -->|auto-deploy| AR[AWS App Runner]
  AR -->|VPC connector| RDS[(RDS Postgres private)]
  T[Twilio] -->|wss| AR
  AR <-->|InvokeModelWithBidirectionalStream| BS[Bedrock Nova Sonic]
  AR -->|InvokeModel fallback| CL[Bedrock Claude 4.7]

Step 1 — Containerize FastAPI

```dockerfile
FROM public.ecr.aws/docker/library/python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8080
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080", "--workers", "1"]
```

App Runner only forwards traffic to one port; use a single uvicorn worker per container and let App Runner scale instances.

Step 2 — Push to ECR

```bash
# Resolve the account ID used in the registry hostname below.
ACCOUNT=$(aws sts get-caller-identity --query Account --output text)

aws ecr create-repository --repository-name voice-agent
aws ecr get-login-password | docker login --username AWS --password-stdin $ACCOUNT.dkr.ecr.us-east-1.amazonaws.com
docker build -t voice-agent .
docker tag voice-agent:latest $ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/voice-agent:latest
docker push $ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/voice-agent:latest
```

Step 3 — Create the App Runner service

```bash
# The VPC connector (Step 5) and both IAM roles must exist before this call.
aws apprunner create-service \
  --service-name voice-agent \
  --source-configuration '{
    "ImageRepository": {
      "ImageIdentifier": "'$ACCOUNT'.dkr.ecr.us-east-1.amazonaws.com/voice-agent:latest",
      "ImageRepositoryType": "ECR",
      "ImageConfiguration": {"Port": "8080"}
    },
    "AutoDeploymentsEnabled": true,
    "AuthenticationConfiguration": {
      "AccessRoleArn": "arn:aws:iam::'$ACCOUNT':role/AppRunnerECRAccessRole"
    }
  }' \
  --instance-configuration '{
    "Cpu": "1 vCPU",
    "Memory": "2 GB",
    "InstanceRoleArn": "arn:aws:iam::'$ACCOUNT':role/AppRunnerVoiceAgentRole"
  }' \
  --network-configuration '{
    "EgressConfiguration": {
      "EgressType": "VPC",
      "VpcConnectorArn": "arn:aws:apprunner:us-east-1:'$ACCOUNT':vpcconnector/voice-vpc/1/...."
    }
  }'
```

AutoDeploymentsEnabled: true makes App Runner pull a new image whenever ECR has a fresh :latest tag.

Step 4 — Wire to Bedrock Nova Sonic

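```python
import json

import boto3

br = boto3.client("bedrock-runtime", region_name="us-east-1")


async def nova_sonic_session(twilio_ws, stream_sid):
    # NOTE: confirm that your SDK version exposes this operation. AWS's Nova
    # Sonic samples use the experimental aws_sdk_bedrock_runtime package for
    # bidirectional streaming rather than boto3.
    response = br.invoke_model_with_bidirectional_stream(
        modelId="amazon.nova-sonic-v1:0",
        # streaming_body_iter is a helper (not shown here) that turns Twilio's
        # inbound audio frames into Nova Sonic input events.
        body=streaming_body_iter(twilio_ws),
    )
    async for chunk in response["body"]:
        ev = json.loads(chunk["chunk"]["bytes"])
        if ev["type"] == "audioOutput":
            # stream_sid is the streamSid from Twilio's "start" event.
            await twilio_ws.send_text(json.dumps({
                "event": "media",
                "streamSid": stream_sid,
                "media": {"payload": ev["audioOutput"]["content"]},
            }))
```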

Nova Sonic is Amazon's speech-to-speech model — drop-in replacement for STT+LLM+TTS.
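On the Twilio side, the bridge has to read the `streamSid` from the "start" event, decode each "media" payload, and wrap outgoing audio in the same envelope. The sketch below covers that framing with the standard library only. The function names are hypothetical. Twilio Media Streams carries base64-encoded 8 kHz mu-law audio, so a G.711 decoder is included on the assumption that the model side expects linear PCM (worth verifying against the Nova Sonic documentation):

```python
import base64
import json


def parse_twilio_event(raw: str):
    """Return (kind, data) for one Twilio Media Streams text frame.

    kind is Twilio's "event" field. data is the streamSid for "start",
    the decoded audio bytes for "media", and None for anything else.
    """
    msg = json.loads(raw)
    kind = msg.get("event")
    if kind == "start":
        return kind, msg["start"]["streamSid"]
    if kind == "media":
        # The payload is base64-encoded 8 kHz mu-law audio.
        return kind, base64.b64decode(msg["media"]["payload"])
    return kind, None


def media_message(stream_sid: str, audio: bytes) -> str:
    """Build the outbound "media" frame that Twilio plays to the caller."""
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": base64.b64encode(audio).decode("ascii")},
    })


def ulaw_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~byte & 0xFF
    magnitude = ((((u & 0x0F) << 3) + 0x84) << ((u >> 4) & 0x07)) - 0x84
    return -magnitude if u & 0x80 else magnitude
```

Inside a FastAPI WebSocket handler, each text frame received from Twilio would pass through `parse_twilio_event`, and each audio chunk coming back from the model would be sent with `media_message` after conversion to 8 kHz mu-law.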

Step 5 — VPC connector to private RDS

```bash
aws apprunner create-vpc-connector \
  --vpc-connector-name voice-vpc \
  --subnets subnet-aaa subnet-bbb \
  --security-groups sg-rds-client
```

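App Runner egress goes through this connector, so RDS can be private (no public IP). Because all outbound traffic now routes through the VPC, the connector's subnets also need a NAT gateway or a Bedrock VPC endpoint so the service can still reach Bedrock. Inbound traffic (Twilio → App Runner) still hits the public HTTPS URL, so no change is needed there.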

Step 6 — IAM role for the instance

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithBidirectionalStream"],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "rds-db:connect",
      "Resource": "arn:aws:rds-db:us-east-1:ACCT:dbuser:db-*/voice_user"
    }
  ]
}
```
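The `rds-db:connect` statement implies IAM database authentication: instead of a stored password, the service requests a short-lived token and presents it as the password over TLS. A rough sketch of that flow follows, assuming boto3's `generate_db_auth_token` on the RDS client. The helper names and the database name `voice` are hypothetical, and `voice_user` is the user from the policy above:

```python
def rds_auth_token(host: str, user: str = "voice_user",
                   port: int = 5432, region: str = "us-east-1") -> str:
    """Request a short-lived IAM auth token for the given database user."""
    import boto3  # imported lazily so connection_kwargs stays dependency-free

    rds = boto3.client("rds", region_name=region)
    return rds.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user, Region=region
    )


def connection_kwargs(host: str, token: str, user: str = "voice_user",
                      dbname: str = "voice", port: int = 5432) -> dict:
    """Build libpq-style keyword arguments; the token is used as the password."""
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": token,
        "dbname": dbname,
        "sslmode": "require",  # IAM authentication requires an SSL connection
    }
```

Tokens expire after 15 minutes, so a connection pool would need to fetch a fresh one when opening new connections rather than caching it for the life of the process. The resulting dictionary can be passed to a Postgres driver such as psycopg.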

Step 7 — GitHub Actions for CI

```yaml
name: ci
on:
  push:
    branches: [main]
permissions:
  id-token: write   # needed for role-to-assume via GitHub OIDC
  contents: read
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::ACCT:role/GhaDeployer
          aws-region: us-east-1
      - run: aws ecr get-login-password | docker login --username AWS --password-stdin ACCT.dkr.ecr.us-east-1.amazonaws.com
      - run: |
          docker build -t voice-agent .
          docker tag voice-agent ACCT.dkr.ecr.us-east-1.amazonaws.com/voice-agent:latest
          docker push ACCT.dkr.ecr.us-east-1.amazonaws.com/voice-agent:latest
```

App Runner picks up the new image automatically.

Pitfalls

  • WebSocket idle timeout is 120s by default — adjust via service config to 600s.
  • Single port: App Runner exposes one HTTP/WS port, so serve health checks, HTTP routes, and the WebSocket endpoint from the same uvicorn process.
  • No persistent disk — ephemeral. Logs to CloudWatch via the App Runner integration.
  • VPC connector cold start adds ~1-2s on cold scale-out. Keep min instances = 2.
  • Bedrock Nova Sonic regional: us-east-1 only as of May 2026.
  • HIPAA: App Runner is HIPAA-eligible; sign BAA, encrypt RDS with KMS CMK, turn off CloudWatch detailed logging for PHI.
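For the PHI logging pitfall, one complementary safeguard is to scrub identifiers before log lines leave the process, since App Runner forwards application output to CloudWatch. The filter below is a small sketch using the standard `logging` module. The class name and the phone-number pattern are hypothetical, and a real deployment would need patterns for every identifier its calls can carry:

```python
import logging
import re

# Hypothetical pattern: loosely matches phone numbers such as +1 (555) 010-9999.
_PHONE = re.compile(r"\+?\d[\d\-\s().]{8,}\d")


class RedactPHI(logging.Filter):
    """Replace phone-number-like substrings in every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message first so values passed as args are scrubbed too.
        record.msg = _PHONE.sub("[REDACTED]", record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted


def install_redaction(logger: logging.Logger) -> None:
    """Attach the filter to each handler so propagated records are covered."""
    for handler in logger.handlers:
        handler.addFilter(RedactPHI())
```

Attaching the filter to handlers rather than to a logger matters here: logger-level filters do not run for records that propagate up from child loggers, while handler-level filters see everything the handler emits.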

How CallSphere does this in production

CallSphere's HIPAA Healthcare vertical runs on EKS in a private VPC, with FastAPI on port 8084. App Runner was tempting, but at our scale (37 agents, 90+ tools, 115+ DB tables, 6 verticals) EKS came out cheaper. For teams under ~50k call-min/day, App Runner is cheaper than EKS and needs far less ops work. We use Pion Go + NATS for the SIP layer of the OneRoof multi-family vertical. Pricing: $149/$499/$1499, 14-day trial, 22% affiliate.

FAQ

Q: App Runner vs ECS Fargate vs EKS? App Runner: zero ops, but the cost per call-minute gets expensive beyond ~50k call-min/day. Fargate: the middle ground. EKS: best cost at scale, most ops. Start with App Runner.

Q: Does App Runner support Bedrock streaming? Yes. InvokeModelWithBidirectionalStream is an outbound HTTP/2 call made from your container, so it runs alongside the inbound WebSocket connection that App Runner serves.

Q: Multi-region? App Runner is regional; for global, deploy in 2-3 regions and use Route 53 latency routing.

Q: Cost at 1k call-min/day? 1 vCPU 2GB instance @ $0.064/hour x 24h x ~3 instances ≈ $4.61/day in vCPU charges (App Runner bills memory separately). Bedrock + Twilio dominate.
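Spelling out the arithmetic behind that compute figure, using only the numbers quoted in the answer (the hourly rate, 24 hours, and roughly 3 instances), gives a per-minute view as well. The variable names are illustrative:

```python
VCPU_RATE_PER_HOUR = 0.064    # USD, rate quoted in the FAQ answer
HOURS_PER_DAY = 24
AVG_INSTANCES = 3             # approximate instance count from the answer
CALL_MINUTES_PER_DAY = 1_000


def daily_compute_cost(rate: float, hours: int, instances: int) -> float:
    """Instances are billed per hour, so cost is rate x hours x instances."""
    return rate * hours * instances


daily = daily_compute_cost(VCPU_RATE_PER_HOUR, HOURS_PER_DAY, AVG_INSTANCES)
per_call_minute = daily / CALL_MINUTES_PER_DAY

print(f"compute per day: ${daily:.2f}")
print(f"compute per call-minute: ${per_call_minute:.4f}")
```

At this volume the compute share works out to under half a cent per call-minute, which is why the model and telephony charges dominate the bill.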

Q: Can I deploy from GitHub source (no Docker)? Yes — App Runner supports source-code mode for Python/Node, but for voice agents Docker gives you better control over runtime versions.
