---
title: "Build an Elixir Phoenix Channels Voice Agent with a LiveView UI"
description: "Combine Phoenix Channels for streaming audio frames with LiveView for the chat UI. BEAM concurrency means one node can hold tens of thousands of voice sessions."
canonical: https://callsphere.ai/blog/vw2h-build-elixir-phoenix-channels-voice-agent-liveview
category: "AI Voice Agents"
tags: ["Tutorial", "Build", "Elixir", "Phoenix", "LiveView", "Channels"]
author: "CallSphere Team"
published: 2026-03-22T00:00:00.000Z
updated: 2026-05-07T09:27:39.592Z
---

# Build an Elixir Phoenix Channels Voice Agent with a LiveView UI

> Combine Phoenix Channels for streaming audio frames with LiveView for the chat UI. BEAM concurrency means one node can hold tens of thousands of voice sessions.

> **TL;DR** — A Phoenix node on a 2-vCPU box can hold 50k idle voice sockets without breaking a sweat. Channels move audio bytes; LiveView paints the transcript. The OpenAI Realtime client lives in a per-session GenServer.

## What you'll build

A Phoenix 1.7 app that opens a Channel for binary audio frames, runs each call in a supervised GenServer that talks to OpenAI Realtime via `websockex`, and uses LiveView to render the live transcript. Pipe audio in, see the answer appear word-by-word, hear it back in the browser.

## Prerequisites

1. Elixir 1.17+, Erlang/OTP 27+.
2. `mix phx.new voice --no-ecto` (LiveView ships by default in Phoenix 1.7).
3. `{:websockex, "~> 0.4.3"}` in `mix.exs`.
4. `OPENAI_API_KEY` in your runtime config.
5. A browser that supports `MediaRecorder` (any modern browser).
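For reference, the relevant `deps/0` entries might look like this (version pins are illustrative; check Hex for current releases):

```elixir
# mix.exs — dependency sketch for this tutorial (versions illustrative)
defp deps do
  [
    {:phoenix, "~> 1.7"},
    {:phoenix_live_view, "~> 0.20"},
    {:websockex, "~> 0.4.3"},
    {:jason, "~> 1.4"}
  ]
end
```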

## Architecture

```mermaid
flowchart LR
  B[Browser MediaRecorder] -- WS audio --> C[Phoenix Channel]
  C -- cast frames --> G[Per-call GenServer]
  G -- WebSocket --> O[OpenAI Realtime]
  O -- delta --> G
  G -- broadcast --> L[LiveView Transcript]
```

## Step 1 — Boot the channel

In `lib/voice_web/channels/voice_channel.ex`:

```elixir
defmodule VoiceWeb.VoiceChannel do
  use Phoenix.Channel

  def join("voice:" <> session_id, _params, socket) do
    {:ok, pid} = Voice.Session.start_link(session_id, self())
    {:ok, assign(socket, :session, pid)}
  end

  def handle_in("audio", {:binary, payload}, socket) do
    Voice.Session.send_audio(socket.assigns.session, payload)
    {:noreply, socket}
  end

  # The session GenServer sends us synthesized audio; push it to the browser.
  def handle_info({:audio_chunk, b64}, socket) do
    push(socket, "audio_chunk", %{data: b64})
    {:noreply, socket}
  end
end
```

Wire it into the user socket:

```elixir
defmodule VoiceWeb.UserSocket do
  use Phoenix.Socket

  channel "voice:*", VoiceWeb.VoiceChannel

  def connect(_params, socket, _connect_info), do: {:ok, socket}
  def id(_socket), do: nil
end
```
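The socket also needs to be mounted in the endpoint — something along these lines (the `/socket` path is an assumption; use whatever your `app.js` connects to):

```elixir
# lib/voice_web/endpoint.ex — mount the Channel socket; Phoenix's
# default V2 serializer handles binary frames out of the box
socket "/socket", VoiceWeb.UserSocket,
  websocket: true,
  longpoll: false
```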

## Step 2 — The per-call GenServer

```elixir
defmodule Voice.Session do
  use GenServer
  require Logger

  def start_link(id, channel_pid),
    do: GenServer.start_link(__MODULE__, {id, channel_pid})

  def send_audio(pid, bytes), do: GenServer.cast(pid, {:audio, bytes})

  @impl true
  def init({id, channel}) do
    {:ok, oai} = Voice.OpenAIClient.start_link(self())
    {:ok, %{id: id, channel: channel, oai: oai}}
  end

  @impl true
  def handle_cast({:audio, bytes}, %{oai: oai} = s) do
    Voice.OpenAIClient.append(oai, bytes)
    {:noreply, s}
  end

  @impl true
  def handle_info({:openai, %{"type" => "response.audio.delta", "delta" => b64}}, s) do
    send(s.channel, {:audio_chunk, b64})
    {:noreply, s}
  end

  @impl true
  def handle_info({:openai, %{"type" => "response.text.delta", "delta" => t}}, s) do
    Phoenix.PubSub.broadcast(Voice.PubSub, "transcript:" <> s.id, {:token, t})
    {:noreply, s}
  end
end
```

## Step 3 — websockex client to OpenAI

```elixir
defmodule Voice.OpenAIClient do
  use WebSockex

  @url "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2025-06-03"

  def start_link(parent) do
    key = System.fetch_env!("OPENAI_API_KEY")

    WebSockex.start_link(@url, __MODULE__, %{parent: parent},
      extra_headers: [
        {"Authorization", "Bearer " <> key},
        {"OpenAI-Beta", "realtime=v1"}
      ]
    )
  end

  def append(pid, bytes) do
    msg =
      Jason.encode!(%{
        type: "input_audio_buffer.append",
        audio: Base.encode64(bytes)
      })

    WebSockex.send_frame(pid, {:text, msg})
  end

  def handle_frame({:text, raw}, %{parent: p} = s) do
    send(p, {:openai, Jason.decode!(raw)})
    {:ok, s}
  end
end
```
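Shortly after the socket opens you'll usually send a `session.update` event to pick a voice and audio formats. A minimal sketch of the payload — `Voice.SessionConfig` is our own helper, not part of any library; encode the map with `Jason.encode!/1` and send it via `WebSockex.send_frame/2` from a `handle_connect/2` callback:

```elixir
defmodule Voice.SessionConfig do
  # Builds the session.update event the Realtime API expects shortly
  # after connect. Field names follow the beta docs; tweak to taste.
  def build(voice \\ "alloy") do
    %{
      type: "session.update",
      session: %{
        voice: voice,
        input_audio_format: "pcm16",
        output_audio_format: "pcm16"
      }
    }
  end
end
```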

## Step 4 — LiveView transcript

```elixir
defmodule VoiceWeb.TranscriptLive do
  use VoiceWeb, :live_view

  def mount(%{"id" => id}, _session, socket) do
    # Subscribe only on the connected mount to avoid duplicate messages
    if connected?(socket), do: Phoenix.PubSub.subscribe(Voice.PubSub, "transcript:" <> id)
    {:ok, assign(socket, transcript: "", id: id)}
  end

  def handle_info({:token, t}, socket),
    do: {:noreply, assign(socket, :transcript, socket.assigns.transcript <> t)}

  def render(assigns) do
    ~H"""
    <div id="mic" phx-hook="Mic" data-id={@id}>
      <p id="transcript"><%= @transcript %></p>
    </div>
    """
  end
end
```
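Don't forget a route for the LiveView (the path here is arbitrary):

```elixir
# lib/voice_web/router.ex — inside the :browser scope
live "/calls/:id", VoiceWeb.TranscriptLive
```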

## Step 5 — Browser hook

```javascript
import {Socket} from "phoenix"

let Hooks = {}
Hooks.Mic = {
  mounted() {
    // A dedicated Channel socket for audio, separate from the LiveView socket
    let socket = new Socket("/socket")
    socket.connect()
    let chan = socket.channel("voice:" + this.el.dataset.id)
    chan.join()
    navigator.mediaDevices.getUserMedia({audio: true}).then(stream => {
      let rec = new MediaRecorder(stream, {mimeType: "audio/webm;codecs=opus"})
      rec.ondataavailable = (e) => e.data.arrayBuffer().then(buf => {
        // An ArrayBuffer payload goes over the wire as a binary frame
        chan.push("audio", buf)
      })
      rec.start(100) // emit a chunk every 100 ms
    })
  }
}

export default Hooks
```

Pass `Hooks` into your `LiveSocket` constructor in `app.js` (`new LiveSocket("/live", Socket, {hooks: Hooks, ...})`). One caveat: `MediaRecorder` emits WebM/Opus, while the Realtime API's `input_audio_buffer.append` expects raw `pcm16` (or G.711) — in a real app, capture PCM with an `AudioWorklet` or transcode server-side before appending.

## Common pitfalls

- **Single GenServer for all calls** — kills concurrency; spawn one per session.
- **Sending raw binary as text frames** — Phoenix Channels support `{:binary, _}`; use it.
- **No supervisor** — wrap `Voice.Session` under a `DynamicSupervisor`.
- **Forgetting PubSub** — LiveView and the GenServer aren't the same process.
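The supervisor pitfall above is usually solved with a `DynamicSupervisor` in the application tree — a sketch (`Voice.SessionSupervisor` is our own name):

```elixir
# lib/voice/application.ex — children, abridged
children = [
  {Phoenix.PubSub, name: Voice.PubSub},
  {DynamicSupervisor, name: Voice.SessionSupervisor, strategy: :one_for_one},
  VoiceWeb.Endpoint
]

# In the channel's join/3, start the session under the supervisor
# instead of linking it directly to the channel process:
{:ok, pid} =
  DynamicSupervisor.start_child(
    Voice.SessionSupervisor,
    %{
      id: Voice.Session,
      start: {Voice.Session, :start_link, [session_id, self()]},
      restart: :temporary
    }
  )
```

With `restart: :temporary`, a crashed call isn't restarted into a dead conversation; the browser simply rejoins.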

## How CallSphere does this in production

While CallSphere's voice stack is FastAPI + Pion, our **status board** uses Phoenix LiveView for live agent metrics across 37 agents and 90+ tools. BEAM is unbeatable for many-tiny-process fan-out. Run our [Real Estate agent demo](/industries/real-estate) to see what 50k concurrent socket scaling looks like in practice.

## FAQ

**Can BEAM really hold 50k voice sockets?** Yes — each idle connection is a lightweight process with a heap on the order of 50 KB, so RAM, not CPU, is the limit.

**Why websockex instead of mint?** It's a higher-level supervised client; mint is fine if you want full control.

**How do I add tools (functions)?** Listen for `response.function_call_arguments.done` and reply with `conversation.item.create`.
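For the tools question above, a pure helper that builds the reply event might look like this (a sketch; `Voice.ToolReply` is our own name, and per the docs `output` must itself be a JSON-encoded string):

```elixir
defmodule Voice.ToolReply do
  # Builds the conversation.item.create event that answers a completed
  # function call (sent after response.function_call_arguments.done).
  def build(call_id, output_json) when is_binary(output_json) do
    %{
      type: "conversation.item.create",
      item: %{
        type: "function_call_output",
        call_id: call_id,
        output: output_json
      }
    }
  end
end
```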

**Audio format?** `g711_ulaw` for telephony, `pcm16` for browser.

**Where do I deploy?** fly.io or Gigalixir; both support clustered Phoenix.

## Sources

- [Phoenix LiveView docs](https://hexdocs.pm/phoenix_live_view)
- [Phoenix Channels guide](https://hexdocs.pm/phoenix/channels.html)
- [websockex hexdocs](https://hexdocs.pm/websockex)
- [Implementing Conversational Agents in Elixir (Sean Moriarity)](https://seanmoriarity.com/2024/02/25/implementing-natural-conversational-agents-with-elixir/)

---

Source: https://callsphere.ai/blog/vw2h-build-elixir-phoenix-channels-voice-agent-liveview
