---
title: "Replace Synthflow With a Self-Hosted FastAPI Voice Agent"
description: "Synthflow's no-code builder hits walls fast — branching tools, custom auth, real CRMs. Move to a self-hosted FastAPI agent and unlock everything in 600 lines."
canonical: https://callsphere.ai/blog/vw3h-replace-synthflow-with-self-hosted-fastapi-voice-agent
category: "AI Engineering"
tags: ["Synthflow", "FastAPI", "Self-hosted", "Voice AI", "Tutorial"]
author: "CallSphere Team"
published: 2026-03-22T00:00:00.000Z
updated: 2026-05-07T09:59:34.367Z
---

# Replace Synthflow With a Self-Hosted FastAPI Voice Agent

> Synthflow's no-code builder hits walls fast — branching tools, custom auth, real CRMs. Move to a self-hosted FastAPI agent and unlock everything in 600 lines.

> **TL;DR** — Synthflow charges $0.09/min for the engine plus an LLM markup. Self-hosting the same flow on a $40/mo box drops you to LLM-cost-only and unlocks any tool integration that Synthflow's drag-and-drop can't reach.

## What you'll build

A FastAPI service running on a single 4-vCPU VM that connects Twilio inbound calls to OpenAI Realtime, executes Python tools (any code you want, no JSON-only nodes), persists transcripts in Postgres, and exposes a /sessions admin UI similar to Synthflow's dashboard.

## Prerequisites

1. Synthflow account with at least one published agent and tool configurations exported (screenshots are fine).
2. A VM (Hetzner CX32, AWS t3.medium, or equivalent) with Docker + Postgres.
3. Twilio number and OpenAI Realtime key.
4. Python 3.11, FastAPI, `asyncpg`, `websockets`.
5. Reverse proxy (Caddy or Traefik) for TLS — Twilio Media Streams requires WSS.

## Architecture

```mermaid
flowchart TB
  TW[Twilio] --> CADDY[Caddy WSS]
  CADDY --> APP[FastAPI :8000]
  APP --> OAI[OpenAI Realtime]
  APP --> PG[(Postgres)]
  APP --> TOOLS[Your Python tools]
```

## Step 1 — Schema and tool registry

```sql
CREATE TABLE call_sessions (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  call_sid text UNIQUE,
  started_at timestamptz DEFAULT now(),
  ended_at timestamptz,
  transcript jsonb DEFAULT '[]'::jsonb,
  outcome text
);
CREATE INDEX ON call_sessions (started_at DESC);
```

## Step 2 — FastAPI shell with tool registration

```python
from fastapi import FastAPI, WebSocket
from typing import Callable
app = FastAPI()
TOOLS: dict[str, Callable] = {}

def tool(name: str, schema: dict):
    def deco(fn):
        TOOLS[name] = (fn, schema)
        return fn
    return deco

@tool("lookup_customer", {
    "type": "object",
    "required": ["phone"],
    "properties": {"phone": {"type": "string"}},
})
async def lookup_customer(phone: str):
    # any Python you want — psycopg, httpx, internal SDKs
    return {"name": "Maria", "tier": "gold"}
```

## Step 3 — Realtime bridge

```python
import websockets, json, os, asyncio

async def bridge(twilio_ws: WebSocket):
    headers = [("Authorization", f"Bearer {os.environ['OPENAI_API_KEY']}"),
               ("OpenAI-Beta", "realtime=v1")]
    url = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2025-06-03"
    async with websockets.connect(url, additional_headers=headers) as oai:
        await oai.send(json.dumps({"type": "session.update", "session": {
            "instructions": open("system.md").read(),
            "voice": "alloy",
            "input_audio_format": "g711_ulaw",
            "output_audio_format": "g711_ulaw",
            "turn_detection": {"type": "server_vad"},
            "tools": [{"type":"function","name":n,**({"description":""}),"parameters":s}
                      for n,(_,s) in TOOLS.items()],
        }}))
        await asyncio.gather(pump_in(twilio_ws, oai),
                             pump_out(twilio_ws, oai))
```

## Step 4 — Tool execution loop

```python
async def handle_function_call(oai, ev):
    name = ev["name"]; args = json.loads(ev["arguments"])
    fn, _ = TOOLS[name]
    try:
        result = await fn(**args)
    except Exception as e:
        result = {"error": str(e)}
    await oai.send(json.dumps({
        "type": "conversation.item.create",
        "item": {"type": "function_call_output",
                 "call_id": ev["call_id"],
                 "output": json.dumps(result)},
    }))
    await oai.send(json.dumps({"type": "response.create"}))
```

## Step 5 — Transcript persistence

Subscribe to `response.audio_transcript.done` and `conversation.item.input_audio_transcription.completed`, append to `call_sessions.transcript`, set `outcome` from any tool that calls `set_outcome`.

## Step 6 — Admin UI

A 60-line Next.js page hits `GET /sessions` and renders waveforms + transcripts. Replace Synthflow's dashboard in a day.

## Step 7 — Deploy

Caddyfile:

```
agent.example.com {
  reverse_proxy 127.0.0.1:8000
}
```

Twilio Voice URL: `https://agent.example.com/twilio-voice` (returns TwiML ``).

## Common pitfalls

- **Reverse proxy buffering kills audio.** Disable buffering for the WSS path.
- **VAD too generous on noisy lines.** Tune `silence_duration_ms` per locale.
- **Forgetting `response.create` after a tool result.** Calls go silent forever.

## How CallSphere does this in production

This pattern *is* CallSphere — at 100x scale. Healthcare's FastAPI on :8084 ships 14 tools (PHI redaction, eligibility lookup, appointment booking) under HIPAA. OneRoof Property uses 10 specialists over WebRTC + Pion + NATS. Salon runs 4 ElevenLabs agents with `GB-YYYYMMDD-###` references. Pricing: $149/$499/$1499 with [14-day trial](/trial). Compare on [/compare/synthflow](/compare/synthflow).

## FAQ

**Is FastAPI fast enough?** Easily — async I/O dominates.

**What about phone number pools?** Twilio Elastic SIP supports number pooling natively.

**HIPAA?** Add audit logs, encrypt-at-rest, BAA with Twilio + OpenAI. CallSphere's healthcare stack does this end-to-end.

**Branching like Synthflow's flow builder?** Use sub-agents (handoffs) instead of nodes.

**Cost at 30k min/mo?** Synthflow: ~$2,700+. Self-host: ~$1,400.

## Sources

- [Synthflow pricing](https://synthflow.ai/blog/voice-ai-cost)
- [FastAPI WebSockets](https://fastapi.tiangolo.com/advanced/websockets/)
- [OpenAI Realtime](https://platform.openai.com/docs/guides/realtime)
- [/compare/synthflow](https://callsphere.ai/compare/synthflow)

---

Source: https://callsphere.ai/blog/vw3h-replace-synthflow-with-self-hosted-fastapi-voice-agent
