Kamailio Dispatcher for AI Voice Scaling in 2026: Round-Robin Is Not Enough

Kamailio 6.0's dispatcher module is how you horizontally scale AI voice bridges behind a SIP front-end. Round-robin is the easy answer; call-load and weight-based dispatching is the right one.

A round-robin SIP dispatcher works for the first hundred concurrent AI calls. By call number 200 you discover that one bridge node has 80% of the GPU pressure and another has 10%. Kamailio 6.0's dispatcher does better, but only if you configure it past defaults.

Background

flowchart LR
  Phone["PSTN caller"] --> Carrier["Carrier"]
  Carrier -- "SIP INVITE" --> SBC["Session Border Controller"]
  SBC -- "SIP" --> PBX["Twilio / Asterisk"]
  PBX -- "RTP · Opus" --> Bridge["AI Voice Gateway"]
  Bridge --> AI["OpenAI Realtime"]
  AI --> Bridge
  Bridge --> PBX
CallSphere reference architecture

Kamailio is a SIP server, not a media gateway; it routes signaling. The dispatcher module distributes SIP requests across a pool of downstream destinations using one of several algorithms: round-robin, hash over From/To/Call-ID, weight-based, call-load-based, and a few others. Kamailio 6.0 (released 2025) and 6.1 (2026) refined the call-load algorithm and added new health-check options.
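To make the hash-based option concrete, here is a minimal Python sketch of hashing a Call-ID onto a pool. It is an illustration only: `zlib.crc32`, the `BRIDGES` list, and `pick_by_hash` are assumptions made for this example, and Kamailio uses its own internal hash function.

```python
import zlib

# Hypothetical pool names, for illustration only.
BRIDGES = ["bridge-1", "bridge-2", "bridge-3", "bridge-4"]


def pick_by_hash(call_id: str, destinations: list[str]) -> str:
    """Map a Call-ID to a destination; the same Call-ID always maps to the same node."""
    index = zlib.crc32(call_id.encode("utf-8")) % len(destinations)
    return destinations[index]


if __name__ == "__main__":
    for cid in ("a1b2@carrier", "c3d4@carrier", "a1b2@carrier"):
        print(cid, "->", pick_by_hash(cid, BRIDGES))
```

Hashing spreads calls roughly evenly by count and keeps a given key sticky to one node, but it knows nothing about how long each call lasts or how much work it generates.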

For AI voice in 2026, Kamailio fronts a fleet of bridge nodes (FreeSWITCH or your own FastAPI bridge) that each terminate calls and connect to OpenAI Realtime. The dispatcher picks which bridge gets the next call. Round-robin is the obvious default; under real production load with non-uniform call durations and tool-call-heavy conversations, it leaves bridges unbalanced. Call-load dispatching tracks active calls per destination and routes to the least-loaded node.
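The imbalance is easy to reproduce in a toy model. The sketch below is not Kamailio's implementation; it assumes one call arrives per tick, call durations alternate between long (10 ticks) and short (1 tick), and there are two bridge nodes. All function names are hypothetical.

```python
def simulate(durations, n_nodes, pick):
    """Assign one call per tick and return the peak number of active calls per node."""
    active = [[] for _ in range(n_nodes)]  # end times of active calls, per node
    peak = [0] * n_nodes
    for t, duration in enumerate(durations):
        # Drop calls that have already ended at time t.
        for calls in active:
            calls[:] = [end for end in calls if end > t]
        node = pick(t, [len(calls) for calls in active])
        active[node].append(t + duration)
        peak[node] = max(peak[node], len(active[node]))
    return peak


def round_robin(t, loads):
    """Ignore load; rotate through the nodes."""
    return t % len(loads)


def least_loaded(t, loads):
    """Pick the node with the fewest active calls (first one wins ties)."""
    return loads.index(min(loads))


if __name__ == "__main__":
    durations = [10, 1] * 20  # alternating long and short calls
    print("round-robin peaks:", simulate(durations, 2, round_robin))    # [5, 1]
    print("least-loaded peaks:", simulate(durations, 2, least_loaded))  # [3, 3]
```

In this contrived pattern round-robin sends every long call to the same node, while least-loaded selection keeps both nodes at a peak of three calls. Real traffic is less adversarial, but the mechanism is the same.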

Technical deep-dive

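# /etc/kamailio/dispatcher.list
# Columns: setid destination flags priority attrs
# weight = percent of calls for algorithm 9 (10:10:8:10 ratio scaled to sum to 100)
# duid identifies the node for call-load tracking; maxload=50 is an illustrative cap
1 sip:bridge-1.callsphere.ai:5060 0 0 duid=bridge1;maxload=50;weight=26
1 sip:bridge-2.callsphere.ai:5060 0 0 duid=bridge2;maxload=50;weight=26
1 sip:bridge-3.callsphere.ai:5060 0 0 duid=bridge3;maxload=50;weight=22
1 sip:bridge-4.callsphere.ai:5060 0 0 duid=bridge4;maxload=50;weight=26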
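# kamailio.cfg snippet
loadmodule "tm.so"          # needed for failure_route and ds_next_dst() retries
loadmodule "dispatcher.so"
modparam("dispatcher", "list_file", "/etc/kamailio/dispatcher.list")
modparam("dispatcher", "flags", 2)           # failover: keep the remaining destinations
modparam("dispatcher", "ds_hash_size", 9)    # call-tracking table used by algorithm 10
modparam("dispatcher", "ds_ping_interval", 10)
modparam("dispatcher", "ds_probing_threshold", 3)
modparam("dispatcher", "ds_probing_mode", 1)

route {
    if (is_method("BYE|CANCEL")) {
        ds_load_update();   # release the call counted against its destination
    }
    if (is_method("INVITE")) {
        # Algorithm 10 = call-load (fewest active calls per destination)
        if (!ds_select_dst("1", "10")) {
            # 9 = weight-based fallback, e.g. when every bridge is at maxload
            if (!ds_select_dst("1", "9")) {
                send_reply("503", "No bridge available");   # from the sl module
                exit;
            }
        }
        t_on_failure("FAIL_FALLBACK");
        route(RELAY);
    }
}

onreply_route {
    if (is_method("INVITE")) {
        if (status =~ "2[0-9][0-9]") ds_load_update();         # call confirmed
        else if (status =~ "[3-7][0-9][0-9]") ds_load_unset();  # call never set up
    }
}

failure_route[FAIL_FALLBACK] {
    if (t_is_canceled()) exit;
    # Retry on 5xx or when the bridge never answered
    if (t_check_status("5[0-9][0-9]") || (t_branch_timeout() && !t_branch_replied())) {
        ds_mark_dst("ip");   # inactive + probing, so pings can bring it back
        if (ds_next_dst()) {
            t_on_failure("FAIL_FALLBACK");
            route(RELAY);
            exit;
        }
    }
}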

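The call-load algorithm relies on the dispatcher module's own lightweight call tracking, so it knows how many calls each node currently holds. That tracking is enabled with the ds_hash_size parameter, needs a duid attribute on each destination, and depends on ds_load_update() being called in the routing script; the separate dialog module is not required for it. The weight-based algorithm uses static weights from the dispatcher list, which works if your nodes are heterogeneous (different vCPU sizes).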
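The dispatcher documentation describes the weight attribute as a percentage of calls. Assuming that reading, relative capacities such as vCPU counts need converting into integer percentages that add up to 100. The sketch below does this with a largest-remainder split; the function name and the sample capacities are hypothetical.

```python
def to_percent_weights(capacity: dict[str, int]) -> dict[str, int]:
    """Scale relative capacities to integer percentages that sum to exactly 100."""
    total = sum(capacity.values())
    exact = {name: value * 100 / total for name, value in capacity.items()}
    weights = {name: int(share) for name, share in exact.items()}  # floor each share
    leftover = 100 - sum(weights.values())
    # Hand the leftover points to the nodes with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda n: exact[n] - weights[n], reverse=True)
    for name in by_remainder[:leftover]:
        weights[name] += 1
    return weights


if __name__ == "__main__":
    # Hypothetical fleet: two 8-vCPU bridges and one 4-vCPU bridge.
    print(to_percent_weights({"bridge-a": 8, "bridge-b": 8, "bridge-c": 4}))
```

Rerunning a helper like this whenever the fleet changes keeps the weights consistent with actual capacity instead of drifting as nodes are added.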

For AI voice, the additional knob is "tool-call density". A bridge node handling a Healthcare AI call with frequent EHR lookups consumes more CPU than one handling a Salon AI call that only books appointments, so a pure call count understates its load. One option is a custom xhttp route that lets bridges report their load back to Kamailio, which can keep it in a shared state table (for example, with the htable module) and consult it when routing.
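As a rough illustration of why call count alone misleads, the sketch below scores a bridge by active calls plus a penalty for tool-call rate. The `BridgeLoad` fields, the `tool_call_cost` factor of 0.05, and the sample numbers are all hypothetical; a real deployment would calibrate them against measured CPU.

```python
from dataclasses import dataclass


@dataclass
class BridgeLoad:
    active_calls: int
    tool_calls_per_min: float


def load_score(load: BridgeLoad, tool_call_cost: float = 0.05) -> float:
    """Active calls plus a (hypothetical) cost per tool call per minute."""
    return load.active_calls + tool_call_cost * load.tool_calls_per_min


def pick_bridge(fleet: dict[str, BridgeLoad]) -> str:
    """Return the bridge with the lowest load score."""
    return min(fleet, key=lambda name: load_score(fleet[name]))


if __name__ == "__main__":
    fleet = {
        "healthcare-bridge": BridgeLoad(active_calls=20, tool_calls_per_min=120),
        "salon-bridge": BridgeLoad(active_calls=22, tool_calls_per_min=10),
    }
    print(pick_bridge(fleet))
```

Here the salon bridge holds more calls but scores lower, so it would receive the next call. A score like this is what a bridge could report back to the SIP front-end.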

CallSphere implementation

CallSphere uses Twilio Programmable Voice across all six verticals; Twilio handles the SIP front-end, and we do not run Kamailio in our default cloud SKU. For customers requiring on-prem multi-region deployments (primarily regulated Healthcare AI prospects), the documented architecture is Kamailio dispatcher in front of FreeSWITCH bridge nodes. In such deployments, a horizontally scaled bridge fleet behind Kamailio would serve Healthcare AI (FastAPI on :8084), Real Estate AI, Sales Calling AI (5 concurrent outbound calls per tenant), Salon AI, IT Helpdesk AI, and After-Hours AI (Twilio simultaneous call + SMS with a 120-second timeout). The platform spans 37 agents, 90+ tools, and 115+ DB tables, with HIPAA + SOC 2 alignment, $149/$499/$1499 pricing, a 14-day trial, and a 22% affiliate program; the dispatch policy in those deployments uses call-load as the primary algorithm, with weight-based fallback and active health checks.

Implementation steps

  1. Build a fleet of bridge nodes that each register with Kamailio (or are statically listed in dispatcher.list).
  2. Set ds_ping_interval to 5-15 seconds so dispatcher probes each node with SIP OPTIONS (the transport follows the destination URI); mark unhealthy nodes inactive within a minute.
  3. Use algorithm 10 (call-load) as the primary for AI voice; calls vary in duration too much for hash-based dispatching to balance.
  4. Set per-destination weights matching vCPU + GPU capacity; reset weights when scaling up.
  5. Enable failure_route to retry on 5xx responses against the next destination automatically.
  6. Set ds_hash_size, give each destination a duid, and call ds_load_update() in the routing script so dispatcher can track active call counts.
  7. Expose Kamailio dispatcher metrics via xhttp and scrape into Prometheus.
  8. Pre-provision capacity: at 90% bridge utilization across the fleet, scale up the auto-scaling group; do not wait until 100%.
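Two of the numbers in the steps above can be sanity-checked with simple arithmetic. The sketch below is a rough model under stated assumptions: it ignores the SIP transaction timeout of each failed ping, and the function names and the per-bridge capacity of 50 calls are hypothetical.

```python
def detection_seconds(ping_interval_s: int, probing_threshold: int) -> int:
    """Approximate time for a dead node to fail `probing_threshold` pings in a row."""
    return ping_interval_s * probing_threshold


def should_scale_up(active_calls: list[int], capacity_per_bridge: int,
                    threshold: float = 0.9) -> bool:
    """True once fleet-wide utilization reaches the threshold."""
    utilization = sum(active_calls) / (capacity_per_bridge * len(active_calls))
    return utilization >= threshold


if __name__ == "__main__":
    # ds_ping_interval=10 and ds_probing_threshold=3, as in the config above.
    print(detection_seconds(10, 3), "seconds to mark a dead bridge inactive")
    print(should_scale_up([45, 47, 44, 46], capacity_per_bridge=50))
```

With a 10-second interval and a threshold of 3, a dead bridge is taken out of rotation in roughly 30 seconds, which fits the one-minute target. Widening the interval to 15 seconds still stays under a minute; raising the threshold as well would not.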

FAQ

Can Kamailio handle media too? No. Kamailio is signaling only. Media goes to FreeSWITCH, RTPProxy, or directly between endpoints (depending on topology).

Does Twilio expose this kind of dispatch internally? Twilio's edge does its own load balancing across their bridges. You do not see or tune it.

What about Kubernetes-native SIP? Kamailio runs in Kubernetes fine; the harder part is keeping SIP UDP and TCP/TLS ports addressable through the LoadBalancer service. Use a NodePort or hostPort for SIP.

How many concurrent calls per Kamailio node? A single well-tuned Kamailio can handle thousands of call setups per second and a very large number of concurrent dialogs, far more than a typical bridge fleet can absorb. The bottleneck is almost always the bridges.

What is new in Kamailio 6.0/6.1? 6.0 stabilized HTTP/2 client; 6.1 (2026) added improved dispatcher metrics and finer tlsf integration. KamailioWorld 2026 highlighted production deployments mixing dispatcher with WebSocket-backed mobile push.
