baresip and libre for Lightweight SIP AI Clients in 2026
baresip is a portable modular SIP user agent with audio and video, built on the libre real-time-comms library. For headless AI bots, kiosks, and IoT devices that need to dial out and stream audio, the 2026 builds are smaller and faster than they look.
baresip ships as a 1 MB binary with a fully working SIP user agent and a module system that lets you load only what you need. baresip-android 79.1.0 (April 29 2026) is on F-Droid, libbaresip ships in major Linux distros, and the project is BSD-licensed. For headless AI dial-out bots and kiosk-class IoT, it punches above its weight.
Background
baresip and libre are sibling projects originally created by Alfred Heggestad. libre is the foundation: a generic real-time communications library with async IO, a SIP stack, RTP/RTCP, SRTP, ICE, TURN, and WebSocket. baresip is the SIP user agent built on libre. Together they cover roughly the same surface as PJSIP at a smaller binary footprint.
baresip's module system lets you assemble a custom build: alsa or pulseaudio for audio IO, opus or codec2 for narrowband, gst for GStreamer pipelines, presence for SIP SIMPLE, mqtt for IoT control. The aubridge module is particularly interesting for AI voice: it lets you splice an audio source/sink into a call, which means your AI brain reads from one named pipe and writes to another while baresip handles SIP and RTP.
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
Architecture
graph TD
A[IoT Device or Server] --> B[baresip + libre]
B -->|alsa| C[Microphone]
B -->|alsa| D[Speaker]
B -->|aubridge| E[Named Pipe / FIFO]
E --> F[Python AI Brain]
F --> G[OpenAI Realtime]
G --> F
F --> E
B -->|SIP+RTP| H[Carrier or AI SIP Bridge]
# Minimal baresip config for AI dial-out
echo 'audio_player aubridge,from-ai' >> ~/.baresip/config
echo 'audio_source aubridge,to-ai' >> ~/.baresip/config
# accounts file
echo '<sip:[email protected]>;auth_pass=secret' > ~/.baresip/accounts
# headless run
baresip -e "/dial sip:[email protected]"
The aubridge module turns the call into named pipes /tmp/baresip-from-ai and /tmp/baresip-to-ai (paths configurable). Your AI brain reads PCM from one and writes PCM to the other. Latency is single-digit milliseconds because there is no extra encode/decode pass on the bridge.
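A minimal sketch of the AI-brain side of that bridge, assuming the pipe paths above and a 20 ms frame size at 8 kHz 16-bit mono (320 bytes, an assumption you should match to your codec config). The loop here is a passthrough; a real brain would put its STT/LLM/TTS pipeline where the echo is:

```python
FRAME_BYTES = 320  # 20 ms of 8 kHz 16-bit mono PCM -- adjust to your config


def frames(raw: bytes, frame_bytes: int = FRAME_BYTES):
    """Split a raw PCM byte stream into fixed-size frames, dropping any tail."""
    return [raw[i:i + frame_bytes]
            for i in range(0, len(raw) - frame_bytes + 1, frame_bytes)]


def bridge_loop(from_call="/tmp/baresip-to-ai", to_call="/tmp/baresip-from-ai"):
    """Echo loop: read call audio from one FIFO, write replies to the other.

    Replace the passthrough write with your STT -> LLM -> TTS pipeline.
    """
    with open(from_call, "rb", buffering=0) as rx, \
         open(to_call, "wb", buffering=0) as tx:
        while True:
            frame = rx.read(FRAME_BYTES)
            if not frame:
                break  # caller hung up; baresip closed its end of the pipe
            tx.write(frame)  # passthrough; an AI brain would transform here
```

Opening the read side before the write side matters with FIFOs: `open()` on a named pipe blocks until the other end is attached, so start baresip (or your test writer) first.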
CallSphere implementation
CallSphere does not ship baresip clients today. All inbound and outbound calls (Healthcare AI on FastAPI :8084 to OpenAI Realtime, Real Estate AI, Sales Calling AI with 5 concurrent outbound, Salon AI, IT Helpdesk AI, After-Hours AI with Twilio simultaneous call+SMS and a 120-second timeout) terminate on Twilio Programmable Voice. The platform runs 37 agents and 90+ tools over 115+ DB tables, is HIPAA and SOC 2 aligned, and sells at $149/$499/$1499 plans with a 14-day trial and a 22% affiliate program. For prospects who want a software phone for their on-call team, or a kiosk-class device that can talk to our AI over their existing SIP trunk, our reference design uses baresip with the aubridge module to splice into our /api/voice-agent/capture endpoint over a small WebSocket relay.
Build steps
- apt install baresip libbaresip-dev on Debian/Ubuntu, or build from source for non-x86 targets.
- Configure modules in ~/.baresip/config; load only the ones you need to keep memory tight (~6 MB resident with audio).
- For AI dial-out, enable aubridge module and a codec module (opus, g722).
- Add SIP account to ~/.baresip/accounts pointing at your SIP bridge (LiveKit SIP, Twilio SIP domain, your own Drachtio).
- Run baresip with -e command-line script syntax to dial automatically and exit on hangup.
- Wire your AI brain to read/write the aubridge named pipes; convert the audio to 16-bit linear PCM (L16) at 16 kHz before pushing it to OpenAI Realtime.
- Test on the Android build (baresip-studio on F-Droid) for a portable POC.
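The sample-rate conversion in the wiring step above can be sketched with naive linear interpolation, assuming the bridge delivers 8 kHz 16-bit mono and Realtime wants 16 kHz; for production quality use a real resampler (soxr, libsamplerate):

```python
from array import array


def upsample_2x(pcm: bytes) -> bytes:
    """Upsample 16-bit mono PCM from 8 kHz to 16 kHz by linear interpolation.

    POC-grade: inserts the midpoint between each pair of samples and repeats
    the final sample. A proper resampler filters as well as interpolates.
    """
    src = array("h")          # native-endian signed 16-bit samples
    src.frombytes(pcm)
    out = array("h")
    for i, s in enumerate(src):
        out.append(s)
        nxt = src[i + 1] if i + 1 < len(src) else s
        out.append((s + nxt) // 2)  # midpoint between adjacent samples
    return out.tobytes()
```

Doubling every sample without interpolation also "works" but adds audible imaging; the midpoint is a cheap improvement for a demo.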
Pitfalls
- aubridge timing relies on consistent reader/writer cadence; jitter on either side causes glitches. Add a small ring buffer.
- Some packaged baresip builds disable opus by default; check the module list baresip prints at startup and rebuild if needed.
- The libre license is BSD; some bundled modules may pull in LGPL deps (libgsm). Audit if you ship binaries.
- baresip is single-process per UA; for many concurrent calls run multiple processes or use the SIP server side instead.
- Default codec preference list is wider than you probably want; pin to opus or g722 for deterministic interop.
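The ring-buffer fix from the first pitfall can be sketched as a fixed-capacity byte buffer that drops the oldest audio on overflow instead of blocking the writer (capacity and drop policy are design choices, not baresip requirements):

```python
from collections import deque


class PcmRingBuffer:
    """Fixed-capacity byte ring buffer to absorb reader/writer jitter on the
    aubridge pipes. On overflow the oldest audio is dropped, never the newest,
    so a stalled reader costs you stale audio rather than a blocked call."""

    def __init__(self, capacity: int = 16000):  # ~1 s of 8 kHz 16-bit mono
        self.buf = deque(maxlen=capacity)

    def write(self, data: bytes) -> None:
        self.buf.extend(data)  # oldest bytes fall off when full

    def read(self, n: int) -> bytes:
        take = min(n, len(self.buf))
        return bytes(self.buf.popleft() for _ in range(take))
```

Size it to the worst jitter you expect (a few hundred milliseconds); oversizing just adds latency.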
FAQ
Is baresip suitable for production AI bots? For low-concurrency outbound bots (one call per process, dozens of processes per server) yes. For high concurrency use a server-side framework like Pipecat or LiveKit Agents.
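The one-call-per-process model above is easy to supervise from Python. A minimal sketch, assuming a stock baresip binary on PATH and the documented `-f` (config path) and `-e` (execute command) flags:

```python
import subprocess


def dial_cmd(target: str, config_dir: str = "~/.baresip") -> list:
    """Build the argv for a one-shot baresip dial-out (one process per call)."""
    return ["baresip", "-f", config_dir, "-e", f"/dial {target}"]


def spawn_calls(targets):
    """Launch one baresip process per target; the caller waits and reaps.

    Dozens of these per server is the practical ceiling before a
    server-side framework makes more sense.
    """
    return [subprocess.Popen(dial_cmd(t)) for t in targets]
```

A real supervisor would add per-call timeouts, restart policy, and log capture; this shows only the fan-out shape.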
baresip vs PJSIP for embedded? baresip has a smaller default footprint and a clearer module boundary; PJSIP has more bake-time and more codec coverage. Both are valid.
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
Does baresip support WebRTC? Partially: it can speak DTLS-SRTP and ICE for one-to-one media, but it is not a full WebRTC peer and does not implement the complete browser-style SDP offer/answer.
Can I script baresip from Python? Yes, via the ctrl_tcp module's TCP control interface you can issue JSON commands; the httpd module adds an HTTP control surface.
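A sketch of that TCP control path, assuming the commonly documented netstring framing and default port 4444 for ctrl_tcp (verify both against your build's config before relying on them):

```python
import json
import socket


def netstring(payload: bytes) -> bytes:
    """Frame a payload as a netstring: <length>:<payload>,"""
    return str(len(payload)).encode() + b":" + payload + b","


def send_command(command: str, params: str = "",
                 host: str = "127.0.0.1", port: int = 4444) -> bytes:
    """Send one JSON command to baresip's ctrl_tcp module and return the
    raw (netstring-framed) response bytes."""
    body = json.dumps({"command": command, "params": params}).encode()
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(netstring(body))
        return s.recv(4096)
```

Usage would look like `send_command("dial", "sip:[email protected]")`, with the response parsed by stripping the netstring envelope and decoding the JSON inside.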
Is the Android app actively maintained? Yes; version 79.1.0 was published April 29 2026 on F-Droid.
Start a 14-day trial of our managed cloud voice, see pricing for tiers, or contact us about a baresip reference design for kiosk and IoT use cases.
Production view
In production, this topic resolves into one engineering question: when do you use the OpenAI Realtime API versus an async pipeline? Realtime wins on latency for live calls. Async wins on cost, retries, and structured tool reliability for callbacks and SMS flows. Most teams need both, and the routing layer between them becomes the most load-bearing piece of the stack.
Shipping the agent to production
Production AI agents live or die on three loops: evals, retries, and handoff state. CallSphere runs 37 agents across 6 verticals, each with its own eval suite: synthetic call transcripts replayed nightly with assertion checks on extracted entities (date, time, party size, insurance, address). Without that loop, prompt regressions ship silently and you only find out when bookings drop.
Structured tools beat free-form text every time. Our 90+ function tools all enforce JSON schemas validated server-side; if the model hallucinates an integer where a string is required, we retry with a corrective system message before falling back to a deterministic path. For long-running flows, we treat agent handoffs as a state machine (booking, confirmation, SMS) so context survives turn boundaries.
The Realtime-versus-async decision usually comes down to whether the user is holding the phone right now. If yes, Realtime; if no (callback queue, after-hours voicemail), async wins on cost per conversation, which we track per agent in 115+ database tables spanning all 6 verticals.
Production FAQ
Is this realistic for a small business, or is it enterprise-only? 57+ languages are supported out of the box, and the platform is HIPAA and SOC 2 aligned, which removes most of the procurement friction in regulated verticals. For lightweight SIP AI clients, that means you are not starting from scratch; you are configuring an agent template that has already been hardened across thousands of conversations.
What does launch week look like? Day one is integration mapping (scheduler, CRM, messaging) and prompt tuning against your top 20 real call transcripts. Days two through five are shadow mode, where the agent transcribes and recommends but a human still answers, so you can compare side by side. Go-live is the moment your eval pass rate clears your internal bar.
How far does it scale? The honest answer: it scales until your tool catalog gets stale. The agent is only as good as the integrations it can actually call, so the operational discipline is keeping schemas, webhooks, and fallback paths green. The platform handles the rest (observability, retries, multi-region routing) without your team owning the GPU layer.
Talk to us
Want to see how this maps to your stack? Book a live walkthrough at calendly.com/sagar-callsphere/new-meeting, or try the vertical-specific demo at urackit.callsphere.tech. 14-day trial, no credit card, pilot live in 3-5 business days.
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.