By Sagar Shankaran, Founder of CallSphere
Blue/green deploy an AI voice agent without dropping calls. ALB stickiness, draining timeouts tuned for WebSockets, Redis-backed session state, and a clean cutover.
Key takeaways
TL;DR — Blue/green for voice means: stand up green, drain blue with sticky sessions intact, cut over new calls only, keep state in Redis so reconnects work. Stickiness duration on the LB must match your call SLO, not 12 hours.
The setup runs two parallel ReplicaSets (voice-agent-blue and voice-agent-green) behind an Application Load Balancer with target-group stickiness, keeps session state in Redis so a reconnecting call can resume on either color, and uses a cutover script that gates on "all blue calls drained".
```mermaid
flowchart TD
    CLIENT[Caller WS] --> ALB[ALB stickiness=5m]
    ALB --> BLUE[voice-agent-blue]
    ALB --> GREEN[voice-agent-green]
    BLUE --> REDIS[(Redis call state)]
    GREEN --> REDIS
    CUT[Cutover] -->|weight 0/100| ALB
    DRAIN[Drain blue] --> BLUE
```
```hcl
resource "aws_lb_target_group" "blue" {
  name             = "voice-blue"
  port             = 8080
  protocol         = "HTTP"
  protocol_version = "HTTP1" # WebSocket

  stickiness {
    type            = "lb_cookie"
    cookie_duration = 300 # 5 minutes, matches our max call length
    enabled         = true
  }

  deregistration_delay = 600 # let active calls finish (10 min)

  health_check {
    path     = "/healthz/realtime"
    interval = 10
    timeout  = 3
  }
}

resource "aws_lb_target_group" "green" {
  # ... identical with name = "voice-green"
}
```
cookie_duration = 300 (5 min) is the critical knob. The default is 1 day (86400 seconds), so a customer who reconnects hours later still carries a cookie for the old color, which keeps blue alive far longer than any call needs.
```hcl
resource "aws_lb_listener_rule" "voice" {
  listener_arn = aws_lb_listener.https.arn

  action {
    type = "forward"
    forward {
      target_group {
        arn    = aws_lb_target_group.blue.arn
        weight = 100
      }
      target_group {
        arn    = aws_lb_target_group.green.arn
        weight = 0
      }
      stickiness {
        duration = 300
        enabled  = true
      }
    }
  }

  condition {
    path_pattern {
      values = ["/realtime/*"]
    }
  }
}
```
The voice agent must not keep call context in process memory. Otherwise a green pod can't resume a blue pod's call. Move to Redis:
```python
import json

import redis.asyncio as redis

r = redis.from_url("redis://voice-redis:6379")


async def get_call(call_id):
    # Return the stored call state, or an empty dict if the call is unknown.
    return json.loads(await r.get(f"call:{call_id}") or "{}")


async def set_call(call_id, state):
    # Expire state after 30 minutes so abandoned calls do not accumulate.
    await r.setex(f"call:{call_id}", 1800, json.dumps(state))
```
Now any pod (blue or green) can pick up the state. The cookie still pins an in-progress session to its original color, and a reconnect that lands on the other color can resume safely.
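To make the hand-off concrete, here is a minimal sketch of a reconnect handler. `InMemoryStore` is a hypothetical stand-in that exposes the same `get`/`setex` calls used above, so the example runs without a Redis server. `resume_or_start` and the state fields (`turns`, `last_color`) are likewise assumptions for illustration, not CallSphere's actual schema.

```python
import asyncio
import json


class InMemoryStore:
    """Hypothetical stand-in for a Redis client (get/setex only); the TTL is ignored."""

    def __init__(self):
        self._data = {}

    async def get(self, key):
        return self._data.get(key)

    async def setex(self, key, ttl_seconds, value):
        self._data[key] = value


async def resume_or_start(store, call_id, color):
    """Load existing call state (or start fresh) and record which color served it."""
    raw = await store.get(f"call:{call_id}")
    state = json.loads(raw) if raw else {"turns": 0}
    state["last_color"] = color
    await store.setex(f"call:{call_id}", 1800, json.dumps(state))
    return state


async def demo():
    store = InMemoryStore()
    # A blue pod handles the first two turns of the call.
    state = await resume_or_start(store, "abc123", "blue")
    state["turns"] = 2
    await store.setex("call:abc123", 1800, json.dumps(state))
    # The caller reconnects and lands on a green pod, which sees the same turn count.
    return await resume_or_start(store, "abc123", "green")


resumed = asyncio.run(demo())
```

Because neither pod holds the turn count in process memory, the green pod continues from turn 2 rather than restarting the conversation.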
```bash
kubectl apply -f voice-agent-green.yaml
kubectl rollout status deploy/voice-agent-green --timeout=120s
```
Green is live and registered in its target group, but weight = 0 means no traffic.
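```bash
# Build the JSON in double quotes: inside single quotes, ${BLUE_TG} and
# ${GREEN_TG} would be passed literally instead of being expanded.
ACTIONS="[{\"Type\":\"forward\",\"ForwardConfig\":{\"TargetGroups\":[
  {\"TargetGroupArn\":\"${BLUE_TG}\",\"Weight\":0},
  {\"TargetGroupArn\":\"${GREEN_TG}\",\"Weight\":100}],
  \"TargetGroupStickinessConfig\":{\"Enabled\":true,\"DurationSeconds\":300}}}]"

aws elbv2 modify-rule --rule-arn "$RULE" --actions "$ACTIONS"
```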
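Hand-written JSON inside a shell string is easy to get wrong, so one option is to generate the `--actions` payload instead. The sketch below builds the same structure as the command above. The function name `build_forward_actions` and the placeholder ARNs are assumptions for illustration.

```python
import json


def build_forward_actions(blue_tg_arn, green_tg_arn, blue_weight, green_weight,
                          stickiness_seconds=300):
    """Return the JSON string to pass to `aws elbv2 modify-rule --actions`."""
    if blue_weight < 0 or green_weight < 0 or blue_weight + green_weight == 0:
        raise ValueError("weights must be non-negative and not both zero")
    actions = [{
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": blue_tg_arn, "Weight": blue_weight},
                {"TargetGroupArn": green_tg_arn, "Weight": green_weight},
            ],
            "TargetGroupStickinessConfig": {
                "Enabled": True,
                "DurationSeconds": stickiness_seconds,
            },
        },
    }]
    return json.dumps(actions)


# Placeholder ARNs for illustration only.
cutover = build_forward_actions("arn:blue-tg", "arn:green-tg",
                                blue_weight=0, green_weight=100)
print(cutover)
```

The same function with the weights swapped produces the rollback payload, which keeps the cutover and rollback commands structurally identical.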
Then poll until blue drains:
```bash
while [ "$(aws cloudwatch get-metric-data ... blue active_calls)" != "0" ]; do
  sleep 30
done
```
```bash
kubectl delete deploy voice-agent-blue
```
Delete the blue Deployment only after blue's active_calls has stayed at 0 for at least 5 minutes (a buffer for stragglers). The target group's deregistration_delay (10 min) covers connections that drain late.
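The "zero for at least 5 minutes" rule can be sketched as a small gate that resets whenever a straggler call shows up. This is an illustration under assumptions: `read_active_calls` is a hypothetical caller-supplied function (for example, a wrapper around the CloudWatch query), and `wait_for_drain` is not part of any library.

```python
import time


def wait_for_drain(read_active_calls, stable_seconds=300, poll_seconds=30,
                   sleep=time.sleep, max_polls=1000):
    """Return True once active calls have read 0 continuously for stable_seconds.

    Returns False if max_polls readings are taken without reaching that point.
    """
    zero_for = None  # seconds since the first reading of the current zero streak
    for _ in range(max_polls):
        if read_active_calls() == 0:
            zero_for = 0 if zero_for is None else zero_for + poll_seconds
            if zero_for >= stable_seconds:
                return True
        else:
            zero_for = None  # a straggler appeared, so restart the buffer
        sleep(poll_seconds)
    return False


# Simulated readings: blue drains, one straggler reconnects, then it drains for good.
readings = iter([3, 1, 0, 0, 2, 0, 0, 0])
sleeps = []
drained = wait_for_drain(lambda: next(readings), stable_seconds=60,
                         poll_seconds=30, sleep=sleeps.append)
```

Injecting `sleep` keeps the gate testable without real waiting. In a cutover script the default `time.sleep` would apply.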
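```bash
# Roll back to blue if the smoke test against green fails.
# An explicit if is used because `a || b && exit 1` would also exit 1 on success.
if ! python smoke/realtime_ping.py --url wss://agent.example.com/realtime --color green; then
  # Shorthand from the original: pass the --actions JSON with blue=100, green=0.
  aws elbv2 modify-rule ... --weights blue=100,green=0
  exit 1
fi
```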
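The same cutover-then-verify logic can be sketched as a function. Both callables are hypothetical and supplied by the caller: `set_weights(blue, green)` would wrap the `modify-rule` call, and `smoke_test()` would wrap the ping script and return True on success. Rolling back this way presumably only works while the blue Deployment still exists, so the check belongs before blue is deleted.

```python
def cutover_with_rollback(set_weights, smoke_test):
    """Shift new calls to green, then restore blue if the smoke test fails."""
    set_weights(0, 100)
    try:
        healthy = smoke_test()
    except Exception:
        healthy = False  # treat a crashing smoke test as a failed one
    if not healthy:
        set_weights(100, 0)  # send new calls back to blue
    return healthy


# Simulated run: record weight changes and force the smoke test to fail.
calls = []
ok = cutover_with_rollback(lambda blue, green: calls.append((blue, green)),
                           lambda: False)
```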
ssl_policy: the default ELBSecurityPolicy-2016-08 lacks TLS 1.3.
CallSphere does blue/green for major voice-agent versions and canary (Argo Rollouts) for prompt changes. Stickiness is 300s; call state lives in Redis Sentinel; the deregistration delay is 10 min. Across our k3s + Cloudflare Tunnel stack we cut over ~12 times a month with zero call drops on the blue/green path.
Q: Blue/green vs canary for voice? Blue/green for infrastructure swaps (LB config, runtime version). Canary for agent changes (prompts, models). Use both.
Q: Why not WebRTC instead of WebSocket signaling? WebRTC media is point-to-point and doesn't go through the ALB. Only the signaling WebSocket needs sticky handling.
Q: Redis instead of in-memory state — what's the latency cost? ~1-2 ms per turn. Negligible vs voice round-trip.
Q: Can I do this with k8s Services only, no ALB? Yes. Use sessionAffinity: ClientIP and two Services with weighted Ingress. Less elegant than ALB target weights, but it works.