By Sagar Shankaran, Founder of CallSphere
Kamailio 6.0's dispatcher module is how you horizontally scale AI voice bridges behind a SIP front-end. Round-robin is the easy answer; call-load and weight-based dispatching is the right one.
Key takeaways
A round-robin SIP dispatcher works for the first hundred concurrent AI calls. By call number 200 you discover that one bridge node has 80% of the GPU pressure and another has 10%. Kamailio 6.0's dispatcher does better, but only if you configure it past defaults.
flowchart LR
Phone["PSTN caller"] --> Carrier["Carrier"]
Carrier -- "SIP INVITE" --> SBC["Session Border Controller"]
SBC -- "SIP" --> PBX["Twilio / Asterisk"]
PBX -- "RTP · Opus" --> Bridge["AI Voice Gateway"]
Bridge --> AI["OpenAI Realtime"]
AI --> Bridge
Bridge --> PBX
Kamailio is a SIP server, not a media gateway; it routes signaling. The dispatcher module distributes SIP requests across a pool of downstream destinations using one of several algorithms: round-robin, hash over From/To/Call-ID, weight-based, call-load-based, and a few others. Kamailio 6.0 (released 2025) and 6.1 (2026) refined the call-load algorithm and added new health-check options.
For AI voice in 2026, Kamailio fronts a fleet of bridge nodes (FreeSWITCH or your own FastAPI bridge) that each terminate calls and connect to OpenAI Realtime. The dispatcher picks which bridge gets the next call. Round-robin is the obvious default; under real production load with non-uniform call durations and tool-call-heavy conversations, it leaves bridges unbalanced. Call-load dispatching tracks active calls per destination and routes to the least-loaded node.
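The imbalance described above can be reproduced with a toy simulation. The sketch below is not Kamailio code; the arrival rate, the heavy-tailed duration distribution, and the `simulate` function are all assumptions chosen for illustration. It replays the same synthetic call trace under both policies and reports the peak concurrency each node reached.

```python
import heapq
import random


def simulate(policy, num_nodes=4, num_calls=2000, seed=7):
    """Return the peak number of concurrent calls observed on each node."""
    rng = random.Random(seed)         # same seed gives the same call trace
    active = [0] * num_nodes          # calls currently on each node
    peak = [0] * num_nodes            # highest concurrency seen per node
    ending = []                       # min-heap of (end_time, node)
    now = 0.0
    for i in range(num_calls):
        now += rng.expovariate(1.0)   # assumed: about one new call per time unit
        # Release every call that finished before this arrival.
        while ending and ending[0][0] <= now:
            _, done_node = heapq.heappop(ending)
            active[done_node] -= 1
        if policy == "round_robin":
            node = i % num_nodes
        elif policy == "least_loaded":
            node = min(range(num_nodes), key=lambda n: active[n])
        else:
            raise ValueError(f"unknown policy: {policy}")
        active[node] += 1
        peak[node] = max(peak[node], active[node])
        # Assumed heavy-tailed durations: most calls short, a few very long.
        duration = rng.paretovariate(1.5) * 10
        heapq.heappush(ending, (now + duration, node))
    return peak


if __name__ == "__main__":
    print("round robin peaks: ", simulate("round_robin"))
    print("least loaded peaks:", simulate("least_loaded"))
```

Because both runs see identical arrivals and durations, any difference in the peaks comes from the selection policy alone. Least-loaded selection keeps the worst node no higher than the round-robin worst case on the same trace.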
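# /etc/kamailio/dispatcher.list
# Columns: setid destination flags priority attrs
1 sip:bridge-1.callsphere.ai:5060 0 0 duid=bridge-1;weight=10
1 sip:bridge-2.callsphere.ai:5060 0 0 duid=bridge-2;weight=10
1 sip:bridge-3.callsphere.ai:5060 0 0 duid=bridge-3;weight=8
1 sip:bridge-4.callsphere.ai:5060 0 0 duid=bridge-4;weight=10
# kamailio.cfg snippet
loadmodule "tm.so"          # required for keepalive pings and failure_route
loadmodule "dispatcher.so"
modparam("dispatcher", "list_file", "/etc/kamailio/dispatcher.list")
modparam("dispatcher", "flags", 2)          # keep remaining destinations for ds_next_dst()
modparam("dispatcher", "ds_hash_size", 9)   # must be > 0 to enable call-load tracking
modparam("dispatcher", "ds_ping_interval", 10)
modparam("dispatcher", "ds_probing_threshold", 3)
modparam("dispatcher", "ds_probing_mode", 1)
route {
    if (is_method("INVITE")) {
        # Algorithm 10 = call-load (active calls per destination, keyed by duid)
        if (!ds_select_dst("1", "10")) {
            # Algorithm 9 = weight-based fallback
            ds_select_dst("1", "9");
        }
        t_on_failure("FAIL_FALLBACK");
        route(RELAY);
    }
}
failure_route[FAIL_FALLBACK] {
    if (t_check_status("5[0-9][0-9]")) {
        ds_mark_dst("ip");       # mark this destination inactive and keep probing it
        if (ds_next_dst()) {     # try the next stored destination, if one remains
            t_on_failure("FAIL_FALLBACK");
            t_relay();
        }
    }
}
The call-load algorithm (10) does not rely on the dialog module. The dispatcher keeps its own per-destination counters, which requires ds_hash_size to be greater than 0 and a unique duid attribute on every destination. The counters only stay accurate if the config also calls ds_load_update() for BYE and CANCEL requests and for 2xx replies to INVITE, and ds_load_unset() for failed INVITEs; that wiring is omitted from the snippet above. The weight-based algorithm (9) uses the static weight attribute from the dispatcher list, which works if your nodes are heterogeneous (different vCPU sizes). The module documentation describes weight as a percentage of calls, so check how your version treats a set whose weights do not add up to 100.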
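The bookkeeping behind call-load dispatching can be modeled in a few lines of Python. This is a rough illustration of the idea, not Kamailio's implementation; the `CallLoadTracker` class and its method names are hypothetical.

```python
class CallLoadTracker:
    """Toy model: route each new call to the destination with the fewest active calls."""

    def __init__(self, destinations):
        self.load = {dest: 0 for dest in destinations}  # active calls per destination
        self.calls = {}                                 # Call-ID -> destination

    def select(self, call_id):
        # Pick the least-loaded destination; ties go to the first one listed.
        dest = min(self.load, key=self.load.get)
        self.load[dest] += 1
        self.calls[call_id] = dest
        return dest

    def release(self, call_id):
        # Called when the call ends (BYE/CANCEL) or the INVITE fails.
        dest = self.calls.pop(call_id, None)
        if dest is not None:
            self.load[dest] -= 1
        return dest


if __name__ == "__main__":
    tracker = CallLoadTracker(["bridge-1", "bridge-2"])
    print(tracker.select("call-a"))
    print(tracker.select("call-b"))
    tracker.release("call-a")
    print(tracker.select("call-c"))
```

The release step matters as much as the select step: if call endings are never reported, the counters only grow and the selection gradually loses touch with real load.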
For AI voice, the additional knob is "tool-call density". A bridge node handling a Healthcare AI call with frequent EHR lookups consumes more CPU than one handling a Salon AI call that only books appointments. Pure call count under-counts that difference. A custom route built on the xhttp module can let bridges report their load back to Kamailio and feed a custom dispatcher state table.
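One way to picture a load signal that accounts for tool-call density is a score that adds a per-tool-call cost to the raw call count. The sketch below is an assumption-heavy illustration: the `BridgeReport` fields, the `tool_call_cost` factor of 0.25, and the function names are hypothetical and would need calibrating against real CPU measurements.

```python
from dataclasses import dataclass


@dataclass
class BridgeReport:
    """Load snapshot a bridge might report (hypothetical fields)."""
    name: str
    active_calls: int
    tool_calls_per_min: float


def load_score(report, tool_call_cost=0.25):
    # Assumed model: each tool call per minute costs a fraction of one call slot.
    return report.active_calls + tool_call_cost * report.tool_calls_per_min


def pick_bridge(reports):
    # Route the next call to the bridge with the lowest combined score.
    return min(reports, key=load_score).name


if __name__ == "__main__":
    reports = [
        BridgeReport("healthcare-bridge", active_calls=10, tool_calls_per_min=40),
        BridgeReport("salon-bridge", active_calls=12, tool_calls_per_min=4),
    ]
    print(pick_bridge(reports))
```

In this example the salon bridge carries more calls but scores lower, because the healthcare bridge is doing ten times as many tool calls per minute. A plain call count would have picked the healthcare bridge instead.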
CallSphere uses Twilio Programmable Voice across all six verticals; Twilio handles the SIP front-end, and we do not run Kamailio in our default cloud SKU. For customers requiring on-prem multi-region deployments (primarily regulated Healthcare AI prospects), Kamailio dispatcher in front of FreeSWITCH bridge nodes is the documented architecture. In such deployments, a horizontally scaled bridge fleet behind Kamailio would serve Healthcare AI (FastAPI on :8084), Real Estate AI, Sales Calling AI (5 concurrent outbound calls per tenant), Salon AI, IT Helpdesk AI, and After-Hours AI (Twilio simultaneous call + SMS with a 120-second timeout). The dispatch policy there is call-load primary with weight fallback and active health checks. The platform as a whole spans 37 agents, 90+ tools, and 115+ DB tables, with HIPAA + SOC 2 alignment, $149/$499/$1499 pricing, a 14-day trial, and a 22% affiliate program.
Can Kamailio handle media too? No. Kamailio is signaling only. Media goes to FreeSWITCH, RTPProxy, or directly between endpoints (depending on topology).
Does Twilio expose this kind of dispatch internally? Twilio's edge does its own load balancing across their bridges. You do not see or tune it.
What about Kubernetes-native SIP? Kamailio runs in Kubernetes fine; the harder part is keeping SIP UDP and TCP/TLS ports addressable through the LoadBalancer service. Use a NodePort or hostPort for SIP.
How many concurrent calls per Kamailio node? A single well-tuned Kamailio handles tens of thousands of CPS and millions of concurrent dialogs. The bottleneck is almost always the bridges.
What is new in Kamailio 6.0/6.1? 6.0 stabilized HTTP/2 client; 6.1 (2026) added improved dispatcher metrics and finer tlsf integration. KamailioWorld 2026 highlighted production deployments mixing dispatcher with WebSocket-backed mobile push.
Start a 14-day trial, see pricing for $149/$499/$1499 tiers, or contact us about scaled on-prem AI voice deployments.
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.