---
title: "Build a Voice Agent with Asterisk + ARI + Open LLM (2026)"
description: "Asterisk + ARI + AudioSocket + an open LLM = a voice agent that drops into your existing PBX. No SIP-trunking provider lock-in — full Python orchestration."
canonical: https://callsphere.ai/blog/vw4h-build-voice-agent-asterisk-ari-open-llm
category: "AI Infrastructure"
tags: ["Asterisk", "ARI", "AudioSocket", "PBX", "Tutorial"]
author: "CallSphere Team"
published: 2026-04-22T00:00:00.000Z
updated: 2026-05-07T16:13:46.993Z
---

# Build a Voice Agent with Asterisk + ARI + Open LLM (2026)

> Asterisk + ARI + AudioSocket + an open LLM = a voice agent that drops into your existing PBX. No SIP-trunking provider lock-in — full Python orchestration.

> **TL;DR** — If you already run Asterisk or FreePBX, you don't need Twilio. Use ARI for call control plus AudioSocket for raw PCM audio over TCP, and bolt on faster-whisper + Llama via Ollama + Piper. Production-ready and SIP-trunk-agnostic.

## What you'll build

An Asterisk dialplan that, on incoming calls, hands call control to a Python ARI app and streams the media over AudioSocket. The app runs STT, calls the LLM, synthesizes the reply, and pipes the audio back into the channel.

## Prerequisites

1. Asterisk 20 LTS (built with `app_audiosocket` and `chan_audiosocket`).
2. Linux box on the same network (or co-located).
3. Python 3.11, `pip install ari faster-whisper piper-tts ollama numpy webrtcvad`.
4. Ollama with `llama3.1:8b`.
5. ARI configured in `/etc/asterisk/ari.conf` and a user (`username = aiuser`).
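
Before wiring anything up, it's worth a quick sanity pass over the moving parts (module names assume a stock Asterisk 20 build):

```bash
# AudioSocket support: app_audiosocket (dialplan app) and chan_audiosocket (channel driver)
asterisk -rx "module show like audiosocket"

# Pull the model Ollama will serve
ollama pull llama3.1:8b

# Python dependencies (webrtcvad is used in Step 5)
pip install ari faster-whisper piper-tts ollama numpy webrtcvad
```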

## Architecture

```mermaid
flowchart LR
  PSTN[Caller] -->|SIP| AST[Asterisk 20]
  AST -->|ARI events| APP[Python ARI app]
  AST -->|AudioSocket TCP| APP
  APP --> FW[faster-whisper]
  APP --> OLL[Ollama llama3.1:8b]
  APP --> PIP[Piper TTS]
```

## Step 1 — Asterisk dialplan

```ini
; /etc/asterisk/extensions.conf
[default]
exten => 100,1,NoOp(Inbound to AI)
 same => n,Answer()
 same => n,Stasis(ai-voice)        ; hand to ARI app named ai-voice
 same => n,Hangup()
```

```ini
; /etc/asterisk/ari.conf
[general]
enabled = yes
[aiuser]
type = user
read_only = no
password = supersecret
```

```ini
; /etc/asterisk/http.conf
[general]
enabled = yes
bindaddr = 0.0.0.0
```

Reload: `asterisk -rx "core reload"`.
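
With `ari.conf` and `http.conf` reloaded, you can confirm ARI is reachable before writing any Python — `GET /ari/asterisk/info` is a standard ARI endpoint:

```bash
# Should return a JSON blob with Asterisk version and build info
curl -u aiuser:supersecret http://127.0.0.1:8088/ari/asterisk/info
```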

## Step 2 — ARI app skeleton

```python
# app.py
import ari

client = ari.connect("http://127.0.0.1:8088", "aiuser", "supersecret")

def on_start(channel_obj, ev):
    chan = channel_obj["channel"]
    print("New call:", chan.id)
    # Log the call, then resume the dialplan, which runs AudioSocket()
    chan.continueInDialplan()

client.on_channel_event("StasisStart", on_start)
client.run(apps="ai-voice")   # blocks; run under systemd or a supervisor
```

## Step 3 — Push the channel into AudioSocket

Update the dialplan to hand the channel to AudioSocket once Stasis logs the call:

```ini
[default]
exten => 100,1,Answer()
 same => n,Stasis(ai-voice)                              ; ARI app logs the call, then continues
 same => n,AudioSocket(${CHANNEL(uniqueid)},127.0.0.1:9090)
 same => n,Hangup()
```

AudioSocket sends signed-linear PCM (16-bit mono, 8 kHz by default) over a raw TCP socket — easy to parse, though you'll want to resample to 16 kHz before feeding Whisper.
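
The framing is simple enough to sanity-check in isolation: each frame is a 1-byte kind plus a 2-byte big-endian payload length, then the payload. A minimal pack/parse roundtrip (constants match the server in Step 4):

```python
import struct

KIND_SLIN = 0x10  # signed-linear audio frame

def pack_frame(kind, payload):
    # 3-byte header: kind, then big-endian uint16 payload length
    return struct.pack(">BH", kind, len(payload)) + payload

def parse_frame(data):
    kind, length = struct.unpack(">BH", data[:3])
    return kind, data[3:3 + length]

# One 20 ms frame at 8 kHz: 160 samples x 2 bytes = 320 bytes
silence = b"\x00\x00" * 160
frame = pack_frame(KIND_SLIN, silence)
kind, payload = parse_frame(frame)
assert kind == KIND_SLIN and payload == silence
print(len(frame))  # 3-byte header + 320-byte payload = 323 bytes
```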

## Step 4 — TCP server that drives the conversation

```python
# audiosocket_server.py
import socket
import struct
import subprocess

import numpy as np
import ollama
from faster_whisper import WhisperModel

stt = WhisperModel("small.en", device="cpu", compute_type="int8")

# AudioSocket frame kinds: hangup, call UUID, signed-linear audio, error
KIND_HANGUP, KIND_ID, KIND_SLIN, KIND_ERROR = 0x00, 0x01, 0x10, 0xFF

AST_RATE = 8000      # AudioSocket() sends 8 kHz slin by default
PIPER_RATE = 22050   # en_US-amy-medium emits 22.05 kHz raw PCM

def resample(pcm, src_rate, dst_rate):
    # Crude linear-interpolation resampler; adequate for telephony audio
    n = int(len(pcm) * dst_rate / src_rate)
    return np.interp(np.linspace(0, len(pcm) - 1, n),
                     np.arange(len(pcm)), pcm)

def recvall(s, n):
    data = bytearray()
    while len(data) < n:
        part = s.recv(n - len(data))
        if not part:
            return None            # peer closed the socket
        data.extend(part)
    return bytes(data)

def read_frame(s):
    h = recvall(s, 3)              # 1-byte kind + 2-byte big-endian length
    if h is None:
        return None, b""
    kind, length = h[0], struct.unpack(">H", h[1:3])[0]
    return kind, (recvall(s, length) or b"")

def send_slin(s, pcm_int16):
    # 160 samples = 320 bytes = one 20 ms frame at 8 kHz
    for i in range(0, len(pcm_int16), 160):
        chunk = pcm_int16[i:i+160].tobytes()
        s.sendall(struct.pack(">BH", KIND_SLIN, len(chunk)) + chunk)

def transcribe(buf):
    pcm = np.frombuffer(buf, dtype=np.int16).astype(np.float32) / 32768
    pcm = resample(pcm, AST_RATE, 16000).astype(np.float32)  # Whisper expects 16 kHz
    segs, _ = stt.transcribe(pcm, language="en", vad_filter=True)
    return " ".join(s.text for s in segs).strip()

def llm(history, text):
    history.append({"role": "user", "content": text})
    r = ollama.chat(model="llama3.1:8b", messages=history,
                    options={"num_predict": 140})
    history.append(r["message"])
    return r["message"]["content"]

def piper(text):
    p = subprocess.run(
        ["piper", "--model", "en_US-amy-medium", "--output-raw"],
        input=text.encode(), capture_output=True)
    pcm = np.frombuffer(p.stdout, dtype=np.int16)
    return resample(pcm, PIPER_RATE, AST_RATE).astype(np.int16)

srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 9090))
srv.listen()
print("AudioSocket listening :9090")

while True:
    conn, _ = srv.accept()
    history, buf = [], bytearray()
    while True:
        kind, payload = read_frame(conn)
        if kind is None or kind == KIND_HANGUP:
            break
        if kind == KIND_SLIN:
            buf.extend(payload)
            if len(buf) > AST_RATE * 2 * 2:   # ~2 s of 8 kHz 16-bit audio
                text = transcribe(bytes(buf))
                buf.clear()
                if text:
                    send_slin(conn, piper(llm(history, text)))
    conn.close()
```

## Step 5 — VAD for natural turn-taking

Wrap the buffer logic with `webrtcvad` instead of fixed 2-second windows:

```python
import webrtcvad
vad = webrtcvad.Vad(2)  # 0–3, higher = more aggressive

# Feed 20ms (320 byte) frames; trigger transcribe on trailing silence

```
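
The trigger logic itself is a small state machine: collect frames while speech is detected, and end the turn after a run of silent frames. A sketch with a pluggable `is_speech` — on a live call this would be `webrtcvad.Vad(2).is_speech(frame, 8000)` — and a 25-frame (~500 ms) silence threshold, which is an assumption to tune:

```python
def endpoint(frames, is_speech, silence_frames=25):
    """Yield utterances: lists of 20 ms frames, ended by trailing silence."""
    utterance, silent = [], 0
    for f in frames:
        if is_speech(f):
            utterance.append(f)
            silent = 0
        elif utterance:
            utterance.append(f)
            silent += 1
            if silent >= silence_frames:        # 25 * 20 ms = 500 ms of silence
                yield utterance[:-silence_frames]  # drop the trailing silence
                utterance, silent = [], 0
    if utterance:
        yield utterance

# Stubbed check: 10 "speech" frames followed by 30 "quiet" frames
frames = ["s"] * 10 + ["q"] * 30
utts = list(endpoint(frames, lambda f: f == "s"))
print(utts)  # one utterance of 10 speech frames
```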

## Step 6 — Run + test

```bash
ollama serve &
python audiosocket_server.py &
asterisk -rx "module load app_audiosocket"   # dialplan AudioSocket() app

# dial extension 100 from a softphone

```

## Common pitfalls

- **AudioSocket module not loaded.** `asterisk -rx "module show like audiosocket"` should list it.
- **Sample rate.** The `AudioSocket()` dialplan app sends 8 kHz slin; Whisper expects 16 kHz, so resample in your app rather than assuming 16 kHz on the wire.
- **Endianness.** Length is big-endian; getting it wrong silently drops frames.

## How CallSphere does this in production

CallSphere bridges to Asterisk-style PBXs via SIP for enterprise deployments while running its own 37-agent stack on cloud Realtime + ElevenLabs. Healthcare uses 14 HIPAA tools on FastAPI :8084; OneRoof Property runs 10 specialists on Pion WebRTC; Salon, Dental, F&B and Behavioral fill out 6 verticals. 90+ tools · 115+ DB tables. [14-day trial](/trial) · [22% affiliate](/affiliate) · [/pricing](/pricing).

## FAQ

**FreePBX support?** Yes — same Asterisk under the hood.

**Why not chan_pjsip + ExternalMedia?** AudioSocket is simpler and more portable than ExternalMedia/RTP.

**Latency?** ~600–800 ms with VAD + small.en + 8B Q4.

**Concurrent calls?** Limited by your STT/LLM throughput; Asterisk handles thousands.

**HIPAA?** Lock down logs, set `recording=no` unless consented, encrypt SIP with TLS + SRTP.
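
For the TLS + SRTP point, a minimal `chan_pjsip` sketch — file paths and the endpoint name are placeholders to adapt to your deployment:

```ini
; pjsip.conf — TLS signaling + SRTP media (illustrative values)
[transport-tls]
type = transport
protocol = tls
bind = 0.0.0.0:5061
cert_file = /etc/asterisk/keys/asterisk.crt
priv_key_file = /etc/asterisk/keys/asterisk.key
method = tlsv1_2

[trunk-endpoint]
type = endpoint
media_encryption = sdes    ; SRTP keys exchanged in SDP over the TLS leg
```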

## Sources

- [Asterisk AI Voice Agent](https://github.com/hkjarral/AVA-AI-Voice-Agent-for-Asterisk)
- [Real-time Asterisk AudioSocket guide](https://medium.com/@anilmathewm/real-time-ai-voice-agents-with-asterisk-audiosocket-2026-guide-6d00d7efa840)
- [Asterisk + Realtime API repo](https://towardsai.net/p/machine-learning/how-to-build-an-ai-voice-agent-with-openai-realtime-api-asterisk-sip-2025-using-python-with-github-repo)
- [BrightCoding open-source AI for FreePBX](https://www.blog.brightcoding.dev/2025/12/07/the-ultimate-open-source-ai-voice-agent-for-asterisk-freepbx-transform-your-phone-system-in-5-minutes/)

---

Source: https://callsphere.ai/blog/vw4h-build-voice-agent-asterisk-ari-open-llm
