TL;DR — ElevenLabs Conversational AI gives you a hosted agent with TTS-grade voices and built-in turn-taking. The @elevenlabs/client package handles the WebSocket; you only write your tool handlers and a thin Express layer for signed URLs.

What you'll build

A small Express service that mints a signed conversation URL, plus a browser client (written in TypeScript) that joins the agent, executes registered "client tools" (like get_weather or book_slot), and streams audio. By the end you'll have a working voice loop with one of ElevenLabs' premium voices and tools the agent can call mid-conversation.

Prerequisites

ElevenLabs account with a created Conversational AI agent (note the agent_id).
npm install express @elevenlabs/client elevenlabs node-fetch.
Node 20+ and a microphone (or audio file source).
ELEVENLABS_API_KEY and AGENT_ID in env (the server in Step 2 reads both).
Basic familiarity with WebSocket auth (signed URLs).

Architecture

flowchart LR
  Browser -->|GET /signed-url| Express
  Express -->|REST| ElevenLabs
  Browser -->|WS conversation| ElevenLabs
  Browser -->|client_tool calls| ToolHandler

Step 1 — Create the agent in ElevenLabs

In the dashboard, create an agent, paste a system prompt, pick a voice (e.g., Rachel), and define one client tool with name get_booking_slots and a JSON Schema for params. Copy the agent_id.
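As a rough illustration, the parameter schema for get_booking_slots could look like the sketch below. The single `date` field is an assumption, inferred from the Step 4 handler that reads `params["date"]`; the names `getBookingSlotsParams` and `isGetBookingSlotsParams` are hypothetical and not part of any ElevenLabs SDK.

```typescript
// Hypothetical parameter schema for the get_booking_slots client tool.
// The `date` field is an assumption; adjust it to whatever your tool needs.
const getBookingSlotsParams = {
  type: "object",
  properties: {
    date: {
      type: "string",
      description: "Day to look up, formatted as YYYY-MM-DD",
    },
  },
  required: ["date"],
} as const;

// Matching TypeScript type for the handler's argument.
type GetBookingSlotsParams = { date: string };

// Runtime guard so a malformed tool call fails loudly instead of silently.
function isGetBookingSlotsParams(
  value: unknown,
): value is GetBookingSlotsParams {
  if (typeof value !== "object" || value === null) return false;
  const date = (value as { date?: unknown }).date;
  return typeof date === "string" && /^\d{4}-\d{2}-\d{2}$/.test(date);
}
```

Keeping the schema and the guard side by side makes it easier to notice when the dashboard definition and the handler drift apart.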

Step 2 — Express endpoint to mint a signed URL

For private agents you need a server-signed URL — never ship your API key to the browser.

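```ts
// server.ts
import express from "express";
import fetch from "node-fetch";

const app = express();

// Mint a signed URL server-side so the API key never reaches the browser.
app.get("/signed-url", async (_req, res) => {
  const r = await fetch(
    `https://api.elevenlabs.io/v1/convai/conversation/get-signed-url?agent_id=${process.env.AGENT_ID}`,
    { headers: { "xi-api-key": process.env.ELEVENLABS_API_KEY! } }
  );
  if (!r.ok) {
    // Surface upstream failures instead of returning an undefined URL.
    res.status(502).json({ error: "could not get signed URL" });
    return;
  }
  const { signed_url } = (await r.json()) as { signed_url: string };
  res.json({ signed_url });
});

app.listen(3000);
```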
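The endpoint depends on two settings, AGENT_ID and ELEVENLABS_API_KEY, and a missing one only shows up as a failed request at call time. A minimal sketch of a fail-fast check follows; `requireEnv` and `signedUrlEndpoint` are hypothetical helper names, and the environment is passed in as a plain object so the functions stay easy to test.

```typescript
// Hypothetical helper: fail fast when required settings are missing.
function requireEnv(
  env: Record<string, string | undefined>,
  names: readonly string[],
): Record<string, string> {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const out: Record<string, string> = {};
  for (const name of names) out[name] = env[name] as string;
  return out;
}

// Build the signed-URL endpoint with the agent id URL-encoded.
function signedUrlEndpoint(agentId: string): string {
  return (
    "https://api.elevenlabs.io/v1/convai/conversation/get-signed-url" +
    `?agent_id=${encodeURIComponent(agentId)}`
  );
}
```

In server.ts this could be called once before `app.listen`, for example `requireEnv(process.env, ["AGENT_ID", "ELEVENLABS_API_KEY"])`, so a misconfigured deployment stops at startup.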

Step 3 — Browser conversation with client tools

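```ts
// client.ts (bundled to browser)
import { Conversation } from "@elevenlabs/client";

async function start() {
  const { signed_url } = await fetch("/signed-url").then(r => r.json());

  // Assumed reconstruction: the session setup is missing from this listing.
  // Check the option names against your installed @elevenlabs/client version.
  const conversation = await Conversation.startSession({
    signedUrl: signed_url,
    clientTools: {
      // The key must match the tool name defined in the dashboard (Step 1).
      get_booking_slots: async (params: { date: string }) =>
        // Placeholder result; replace with a real lookup.
        JSON.stringify({ date: params.date, slots: [] }),
    },
  });

  document.getElementById("end")!.onclick = () => conversation.endSession();
}
start();
```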

Step 4 — Python alternative (server-side)

If your tool execution belongs server-side (DB writes, secrets), run the agent from Python and stream audio over your own transport:

```python
import os

from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation, ClientTools
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface

client = ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"])
tools = ClientTools()

# `db` stands in for your own async data-access layer; it is not defined here.
async def get_booking_slots(params):
    return await db.fetch_slots(params["date"])

tools.register("get_booking_slots", get_booking_slots, is_async=True)

conv = Conversation(
    client=client,
    agent_id=os.environ["AGENT_ID"],
    requires_auth=True,
    audio_interface=DefaultAudioInterface(),
    client_tools=tools,
)
conv.start_session()
```

Step 5 — Wire to Twilio for phone calls

ElevenLabs has a native Twilio integration: import a Twilio number into the ElevenLabs dashboard, and inbound calls are routed to your agent automatically. For outbound, hit the Twilio outbound endpoint:

```ts
// KEY, AGENT_ID, and PHONE_ID are your own constants (API key, agent id,
// and the id of the imported Twilio number).
await fetch(`https://api.elevenlabs.io/v1/convai/twilio/outbound-call`, {
  method: "POST",
  headers: { "xi-api-key": KEY, "content-type": "application/json" },
  body: JSON.stringify({
    agent_id: AGENT_ID,
    agent_phone_number_id: PHONE_ID,
    to_number: "+18453884261",
  }),
});
```
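Since a malformed destination number is an easy mistake to make, it may help to validate it before placing the call. The sketch below builds the same request body and checks that `to_number` looks like an E.164 number; `buildOutboundCallBody` and `OutboundCallBody` are hypothetical names, and the regex is a common approximation of E.164 rather than a full validator.

```typescript
// Shape of the JSON body used in the outbound-call request above.
interface OutboundCallBody {
  agent_id: string;
  agent_phone_number_id: string;
  to_number: string;
}

// Approximate E.164 check: "+", a non-zero digit, then up to 14 more digits.
const E164 = /^\+[1-9]\d{1,14}$/;

// Hypothetical helper: reject bad numbers before they reach the API.
function buildOutboundCallBody(
  agentId: string,
  phoneNumberId: string,
  toNumber: string,
): OutboundCallBody {
  if (!E164.test(toNumber)) {
    throw new Error(`Not an E.164 phone number: ${toNumber}`);
  }
  return {
    agent_id: agentId,
    agent_phone_number_id: phoneNumberId,
    to_number: toNumber,
  };
}
```

The result can be passed straight to `JSON.stringify` as the request body.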

Step 6 — Logging tool calls

Every clientTool invocation is a great audit hook — log the args and the response to your DB so you can replay conversations later.
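One way to get that audit trail without touching each handler is a small wrapper. This is a sketch under assumptions: `withAudit`, `AuditRecord`, and the `sink` callback are hypothetical names, and the sink here is synchronous for simplicity; in practice it would enqueue or insert a row in your own database.

```typescript
// A tool handler takes the agent's arguments and returns text for the agent.
type ToolHandler = (
  params: Record<string, unknown>,
) => Promise<string> | string;

// One audit row per tool invocation.
interface AuditRecord {
  tool: string;
  args: Record<string, unknown>;
  result?: string;
  error?: string;
  startedAt: string;
  durationMs: number;
}

// Hypothetical wrapper: records args plus result (or error) for every call.
function withAudit(
  tool: string,
  handler: ToolHandler,
  sink: (record: AuditRecord) => void,
): (params: Record<string, unknown>) => Promise<string> {
  return async (params) => {
    const started = Date.now();
    const base = {
      tool,
      args: params,
      startedAt: new Date(started).toISOString(),
    };
    let result: string;
    try {
      result = await handler(params);
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      sink({ ...base, error, durationMs: Date.now() - started });
      throw err; // Re-throw so the caller still sees the failure.
    }
    sink({ ...base, result, durationMs: Date.now() - started });
    return result;
  };
}
```

Wrapping at registration time, for example `get_booking_slots: withAudit("get_booking_slots", handler, saveRecord)` where `saveRecord` is your own function, keeps the logging in one place.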

Common pitfalls

Shipping the API key to the browser: always proxy through /signed-url.
Tool returning non-string: always JSON.stringify the response — the agent reads it as text.
Microphone permission failures on iOS Safari: must be triggered by a user gesture (button click), not on page load.
Agent says "I don't know": check that tool names match exactly between the dashboard and your handlers.
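Two of these pitfalls, non-string results and mismatched tool names, can be caught with a few lines of code. The helpers below are a sketch with hypothetical names (`toToolResult`, `missingHandlers`); the list of dashboard tool names is something you would maintain yourself.

```typescript
// Serialize any handler return value to a string for the agent.
function toToolResult(value: unknown): string {
  if (typeof value === "string") return value;
  // JSON.stringify yields undefined for undefined or functions; fall back.
  return JSON.stringify(value) ?? "null";
}

// Report dashboard tool names that have no matching handler function.
function missingHandlers(
  dashboardToolNames: readonly string[],
  handlers: Record<string, unknown>,
): string[] {
  return dashboardToolNames.filter(
    (name) => typeof handlers[name] !== "function",
  );
}
```

Running `missingHandlers` once at startup and logging its output makes a name mismatch visible before the agent ever tries to call the tool.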

How CallSphere does this in production

CallSphere's Salon vertical runs 4 ElevenLabs Conversational AI agents (booking, rescheduling, FAQ, retention) with GB-YYYYMMDD-### booking references handed back to the agent as tool results. Healthcare uses OpenAI Realtime PCM16 24kHz instead, but the tool registration pattern is identical. Pricing tiers are $149/$499/$1499, with a 14-day trial.
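To make the reference format concrete, here is a sketch of how a GB-YYYYMMDD-### string could be produced and checked. It is not CallSphere's code: it assumes the date is taken in UTC and that ### is a zero-padded sequence number from 0 to 999, and `formatBookingRef` and `BOOKING_REF` are hypothetical names.

```typescript
// Pattern for references shaped like GB-YYYYMMDD-###.
const BOOKING_REF = /^GB-\d{8}-\d{3}$/;

// Hypothetical formatter: UTC date plus a zero-padded sequence number.
function formatBookingRef(date: Date, seq: number): string {
  if (!Number.isInteger(seq) || seq < 0 || seq > 999) {
    throw new RangeError(`Sequence must be an integer in 0..999, got ${seq}`);
  }
  // toISOString() is always UTC, e.g. "2026-04-01T00:00:00.000Z".
  const ymd = date.toISOString().slice(0, 10).replace(/-/g, "");
  return `GB-${ymd}-${String(seq).padStart(3, "0")}`;
}
```

For example, `formatBookingRef(new Date(Date.UTC(2026, 3, 1)), 7)` yields `GB-20260401-007`, which a tool handler could return to the agent as part of its result string.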

FAQ

OpenAI Realtime vs ElevenLabs Conversational AI? ElevenLabs ships with premium voices and a hosted dashboard. OpenAI Realtime is rawer but cheaper and lower-latency for phone.

Can I bring my own LLM? ElevenLabs supports custom LLMs (Claude, GPT-4o) via the agent settings.

Are tool calls billed separately? No — tool execution happens in your code, billing covers only conversation minutes.

How long can a session last? Up to 30 minutes per ElevenLabs limits as of April 2026.

