---
title: "Build a Voice Agent with Pipecat: Python Pipeline Framework (2026)"
description: "Pipecat 0.0.7x ships a frame-based pipeline for real-time voice. Wire Daily WebRTC, Deepgram, GPT-4o, and Cartesia into a working agent — code + pitfalls."
canonical: https://callsphere.ai/blog/vw9h-build-voice-agent-pipecat-python-framework-2026
category: "AI Voice Agents"
tags: ["Pipecat", "Voice Agent", "Python", "Daily", "Cartesia"]
author: "CallSphere Team"
published: 2026-03-17T00:00:00.000Z
updated: 2026-05-08T03:13:52.472Z
---

# Build a Voice Agent with Pipecat: Python Pipeline Framework (2026)

> Pipecat 0.0.7x ships a frame-based pipeline for real-time voice. Wire Daily WebRTC, Deepgram, GPT-4o, and Cartesia into a working agent — code + pitfalls.

> **TL;DR**: Pipecat is an open-source, frame-based pipeline framework from Daily.co. You compose a list of processors (transport → STT → context → LLM → TTS → transport), and Pipecat handles the timing, interruption, and back-pressure between them.

## What you'll build

A voice agent that joins a Daily room, listens with Deepgram, reasons with GPT-4o, and speaks back with Cartesia Sonic-3. It runs locally with `python bot.py` and can be deployed to Daily Bots, Cerebrium, or Modal.

## Architecture

```mermaid
flowchart LR
  RM[Daily room] --> TR[DailyTransport]
  TR --> STT[Deepgram STT]
  STT --> CTX[OpenAILLMContext]
  CTX --> LLM[OpenAI GPT-4o]
  LLM --> TTS[Cartesia Sonic-3]
  TTS --> TR --> RM
```
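
To make the frame idea concrete, here is a toy model in plain Python. It is not Pipecat's API: `Frame`, `fake_stt`, `fake_llm`, `fake_tts`, and `run_pipeline` are hypothetical names used only to show how each stage consumes frames from the previous one and emits new frames downstream.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class Frame:
    kind: str      # "audio" or "text" in this toy model
    payload: str


Processor = Callable[[Iterator[Frame]], Iterator[Frame]]


def fake_stt(frames: Iterator[Frame]) -> Iterator[Frame]:
    # Turn audio frames into text frames; pass everything else through.
    for f in frames:
        yield Frame("text", f"transcript of {f.payload}") if f.kind == "audio" else f


def fake_llm(frames: Iterator[Frame]) -> Iterator[Frame]:
    # Turn a text frame (the transcript) into a text frame (the reply).
    for f in frames:
        yield Frame("text", f"reply to [{f.payload}]") if f.kind == "text" else f


def fake_tts(frames: Iterator[Frame]) -> Iterator[Frame]:
    # Turn text frames back into audio frames.
    for f in frames:
        yield Frame("audio", f"speech for [{f.payload}]") if f.kind == "text" else f


def run_pipeline(processors: List[Processor], source: Iterable[Frame]) -> List[Frame]:
    # Chain the processors so each one reads the previous one's output.
    stream = iter(source)
    for processor in processors:
        stream = processor(stream)
    return list(stream)
```

Calling `run_pipeline([fake_stt, fake_llm, fake_tts], [Frame("audio", "hello.wav")])` pushes one audio frame through all three stages. The real framework does the same kind of hand-off asynchronously and adds interruption handling on top.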

## Step 1 — Install

```bash
python -m venv .venv && source .venv/bin/activate
pip install "pipecat-ai[daily,deepgram,openai,cartesia,silero]"
```

## Step 2 — Build the pipeline

```python
import asyncio
import os
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask, PipelineParams
from pipecat.transports.services.daily import DailyTransport, DailyParams
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.audio.vad.silero import SileroVADAnalyzer
```

## Step 3 — Wire it up

```python
async def main(room_url: str, token: str):
    transport = DailyTransport(
        room_url, token, "Pipecat Bot",
        DailyParams(audio_in_enabled=True, audio_out_enabled=True,
                    vad_analyzer=SileroVADAnalyzer()),
    )
    stt = DeepgramSTTService(api_key=os.environ["DEEPGRAM_API_KEY"])
    llm = OpenAILLMService(api_key=os.environ["OPENAI_API_KEY"], model="gpt-4o")
    tts = CartesiaTTSService(
        api_key=os.environ["CARTESIA_API_KEY"],
        voice_id="79a125e8-cd45-4c13-8a67-188112f4dd22",
        model="sonic-3",
    )
    ctx = OpenAILLMContext([{"role": "system",
                             "content": "You are a helpful clinic concierge."}])
    agg = llm.create_context_aggregator(ctx)
    pipeline = Pipeline([
        transport.input(), stt, agg.user(),
        llm, tts, transport.output(), agg.assistant(),
    ])
    task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
    await PipelineRunner().run(task)
```
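
Because `main()` reads the three API keys with `os.environ[...]`, a missing key surfaces as a bare `KeyError` part-way through setup. A small pre-flight check, sketched below, can report all missing keys at once. `REQUIRED_KEYS` and `missing_keys` are hypothetical helper names, not part of Pipecat.

```python
import os

# The three environment variables read in main() above.
REQUIRED_KEYS = ("DEEPGRAM_API_KEY", "OPENAI_API_KEY", "CARTESIA_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

One way to use it is to call `missing_keys()` at the top of `bot.py` and exit with a readable message if the list is not empty.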

## Step 4 — Hook the join event

```python
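# Register this inside main(), after `task` is created and before
# PipelineRunner().run(task), so `transport`, `task`, `ctx`, and `agg` are in scope.
@transport.event_handler("on_first_participant_joined")
async def on_join(transport, participant):
    # Only needed if you rely on Daily's built-in transcription;
    # Deepgram already handles STT in this pipeline.
    await transport.capture_participant_transcription(participant["id"])
    # Add a greeting instruction to the context, then queue a context
    # frame so the LLM responds to it.
    ctx.add_message({"role": "system",
                     "content": "Greet the caller and ask how you can help."})
    await task.queue_frames([agg.user().get_context_frame()])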

## Step 5 — Run

```bash
DEEPGRAM_API_KEY=... OPENAI_API_KEY=... CARTESIA_API_KEY=... \
  python bot.py --url https://yourorg.daily.co/agent --token "${DAILY_TOKEN}"
```
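
The command above assumes `bot.py` parses `--url` and `--token` and passes them to `main()`. A minimal sketch of that entry point follows. The `parse_args` helper is a hypothetical name, and the flag names are simply taken from the command line shown here.

```python
import argparse


def parse_args(argv=None):
    """Parse the --url and --token flags used in the run command."""
    parser = argparse.ArgumentParser(description="Run the Pipecat voice bot")
    parser.add_argument("--url", required=True, help="Daily room URL")
    parser.add_argument("--token", default="", help="Daily meeting token")
    return parser.parse_args(argv)


# At the bottom of bot.py, with main() from Step 3 in scope:
#
# if __name__ == "__main__":
#     args = parse_args()
#     asyncio.run(main(args.url, args.token))
```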

## Step 6 — Function calling

Pipecat's `OpenAILLMContext` accepts the OpenAI tools schema directly. Pass `tools=[...]` when you create the context and register a handler for each tool with `llm.register_function("name", handler)`. The LLM service emits `FunctionCallInProgressFrame` and `FunctionCallResultFrame` around each call.
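
As an illustration, the sketch below defines one tool in the OpenAI tools format plus the plain async function that does the work. The tool name `get_clinic_hours`, its `day` parameter, and the `HOURS` data are all made up for this example. The handler signature that `register_function` expects has changed between Pipecat releases, so the wiring is left as comments; check it against the version you have installed.

```python
import asyncio

# OpenAI tools schema. The tool name and parameter are hypothetical.
tools = [{
    "type": "function",
    "function": {
        "name": "get_clinic_hours",
        "description": "Return the clinic's opening hours for a weekday.",
        "parameters": {
            "type": "object",
            "properties": {
                "day": {"type": "string", "description": "Weekday name, e.g. monday"},
            },
            "required": ["day"],
        },
    },
}]

# Made-up data standing in for a real lookup.
HOURS = {"saturday": "09:00-13:00", "sunday": "closed"}


async def get_clinic_hours(day: str) -> dict:
    """Business logic for the tool, kept independent of Pipecat."""
    key = day.lower()
    return {"day": key, "hours": HOURS.get(key, "08:00-18:00")}


# Wiring inside main() (assumed shape; verify against your Pipecat version):
#   ctx = OpenAILLMContext(messages, tools=tools)
#   llm.register_function("get_clinic_hours", handler)
# where `handler` unpacks the call arguments, awaits get_clinic_hours(...),
# and returns the result through the callback Pipecat provides.
```

Keeping the lookup in its own function means it can be unit-tested without starting a call.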

## Pitfalls

- **VAD positioning**: Place `SileroVADAnalyzer` on the transport, NOT in the pipeline — frames must be VAD-tagged before they reach the aggregator.
- **Aggregator order**: `agg.user()` goes BEFORE the LLM and `agg.assistant()` goes last, AFTER `transport.output()` (as in Step 3). Reversing them loses tool messages.
- **`allow_interruptions`**: Off by default in some templates; turn it on or the agent talks over the user.
- **Cartesia voice IDs**: Region matters — pull voice IDs from your account, not docs.
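
The first two pitfalls are ordering rules, so they can be written down as a quick sanity check. The sketch below works on plain stage names rather than real Pipecat processors. `check_order` and the stage labels are hypothetical and mirror the pipeline list from Step 3.

```python
def check_order(stages):
    """Return a list of ordering problems for a pipeline given as stage names."""
    problems = []
    if "vad" in stages:
        problems.append("VAD belongs on the transport params, not in the pipeline")
    required = ("agg.user", "llm", "transport.output", "agg.assistant")
    missing = [name for name in required if name not in stages]
    if missing:
        return problems + [f"missing stage: {name}" for name in missing]
    pos = {name: stages.index(name) for name in required}
    if pos["agg.user"] > pos["llm"]:
        problems.append("agg.user must come before llm")
    if pos["agg.assistant"] < pos["transport.output"]:
        problems.append("agg.assistant must come after transport.output")
    return problems


# The ordering used in Step 3.
STEP_3 = ["transport.input", "stt", "agg.user", "llm",
          "tts", "transport.output", "agg.assistant"]
```

`check_order(STEP_3)` returns an empty list, while a list with the two aggregators swapped reports both ordering problems.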

## How CallSphere does this

CallSphere runs **37 agents** in **6 verticals** with **90+ tools** and **115+ DB tables**. Pipecat powers the salon and behavioral health products at a steady ~680ms p50. **$149/$499/$1,499** plans, **14-day trial**, **22% affiliate**.

## FAQ

**Pipecat vs LiveKit Agents?** Pipecat is lower-level — you control every frame. LiveKit Agents is higher-level with built-in dispatch.

**Can I swap transports?** Yes. `DailyTransport`, `LiveKitTransport`, `WebsocketServerTransport`, and `FastAPIWebsocketTransport` all share the same interface, and Twilio calls connect through a WebSocket transport with a Twilio frame serializer.

**Is it production-ready?** NVIDIA NIM ships Pipecat as their reference voice agent blueprint and AWS published a multi-part guide pairing it with Bedrock.

**How do I observe it?** Pipecat emits OpenTelemetry spans for every processor — point your collector at the runner.

## Sources

- [Pipecat Docs: Introduction](https://docs.pipecat.ai/getting-started/introduction)
- [GitHub: pipecat-ai/pipecat](https://github.com/pipecat-ai/pipecat)
- [HackerNoon: Real-Time Voice Agent with Pipecat](https://hackernoon.com/how-to-build-a-real-time-voice-agent-with-pipecat)
- [AWS: Intelligent AI Voice Agents with Pipecat + Bedrock](https://aws.amazon.com/blogs/machine-learning/building-intelligent-ai-voice-agents-with-pipecat-and-amazon-bedrock-part-1/)

