By Sagar Shankaran, Founder of CallSphere
baresip is a portable modular SIP user agent with audio and video, built on the libre real-time-comms library. For headless AI bots, kiosks, and IoT devices that need to dial out and stream audio, the 2026 builds are smaller and faster than they look.
Key takeaways
baresip ships as a 1 MB binary with a fully working SIP user agent and a module system that lets you load only what you need. baresip-android 79.1.0 (April 29 2026) is on F-Droid, libbaresip ships in major Linux distros, and the project is BSD-licensed. For headless AI dial-out bots and kiosk-class IoT, it punches above its weight.
baresip and libre are sibling projects originally created by Alfred Heggestad. libre is the foundation: a generic real-time communications library with async IO, a SIP stack, RTP/RTCP, SRTP, ICE, TURN, and WebSocket. baresip is the SIP user agent built on libre. Together they cover roughly the same surface as PJSIP at a smaller binary footprint.
baresip's module system lets you assemble a custom build: alsa or pulseaudio for audio IO, opus or codec2 for narrowband, gst for GStreamer pipelines, presence for SIP SIMPLE, mqtt for IoT control. The aubridge module is particularly interesting for AI voice: it lets you splice an audio source/sink into a call, which means your AI brain reads from one named pipe and writes to another while baresip handles SIP and RTP.
graph TD
A[IoT Device or Server] --> B[baresip + libre]
B -->|alsa| C[Microphone]
B -->|alsa| D[Speaker]
B -->|aubridge| E[Named Pipe / FIFO]
E --> F[Python AI Brain]
F --> G[OpenAI Realtime]
G --> F
F --> E
B -->|SIP+RTP| H[Carrier or AI SIP Bridge]
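# Minimal baresip config for AI dial-out
# make sure the config directory exists before appending to it
mkdir -p ~/.baresip
# load the aubridge module, then route call audio through it
echo 'module aubridge.so' >> ~/.baresip/config
echo 'audio_player aubridge,from-ai' >> ~/.baresip/config
echo 'audio_source aubridge,to-ai' >> ~/.baresip/config
# accounts file (note: > overwrites any existing accounts)
echo '<sip:bot@callsphere.local>;auth_pass=secret' > ~/.baresip/accounts
# headless run: execute the dial command at startup
baresip -e "/dial sip:ai-agent@bridge.callsphere.local"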
The aubridge module turns the call into named pipes /tmp/baresip-from-ai and /tmp/baresip-to-ai (paths configurable). Your AI brain reads PCM from one and writes PCM to the other. Latency is single-digit milliseconds because there is no extra encode/decode pass on the bridge.
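The read-one-pipe, write-the-other loop described above can be sketched in Python. This is a minimal sketch, not a reference implementation: it assumes your build really does expose call audio at the two paths named above, that the stream is raw 16-bit mono PCM at 8 kHz (so a 20 ms frame is 320 bytes), and that `process` is a hypothetical stand-in for whatever your AI brain does with a frame. Which pipe carries which direction depends on your configuration, so both paths are parameters.

```python
from typing import BinaryIO, Callable

# Assumption: 20 ms of 8 kHz, 16-bit mono PCM. Adjust to your actual audio format.
FRAME_BYTES = 320


def relay_frames(src: BinaryIO, dst: BinaryIO,
                 process: Callable[[bytes], bytes],
                 frame_bytes: int = FRAME_BYTES) -> int:
    """Read fixed-size PCM frames from src, pass each through process(),
    write the result to dst. Returns the number of full frames handled."""
    frames = 0
    while True:
        frame = src.read(frame_bytes)
        if len(frame) < frame_bytes:
            # End of stream, or a trailing partial frame: stop cleanly.
            break
        dst.write(process(frame))
        dst.flush()  # push audio out immediately instead of buffering it
        frames += 1
    return frames


def run_bridge(read_path: str, write_path: str,
               process: Callable[[bytes], bytes]) -> int:
    """Hypothetical entry point: open both pipes and relay until the call ends.
    Opening a FIFO blocks until the other side has opened it too."""
    with open(read_path, "rb") as src, open(write_path, "wb") as dst:
        return relay_frames(src, dst, process)
```

A call such as `run_bridge("/tmp/baresip-to-ai", "/tmp/baresip-from-ai", process=my_ai_step)` would wire the two paths together, where `my_ai_step` is your own function. Keeping the loop on plain file objects means it can be exercised with in-memory buffers before any SIP call is involved.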
CallSphere does not ship baresip clients today. All inbound and outbound calls terminate on Twilio Programmable Voice: Healthcare AI (FastAPI on :8084 to OpenAI Realtime), Real Estate AI, Sales Calling AI (5 concurrent outbound calls), Salon AI, IT Helpdesk AI, and After-Hours AI (simultaneous Twilio call and SMS with a 120-second timeout). The platform runs 37 agents, 90+ tools, and 115+ database tables, is HIPAA and SOC 2 aligned, and is priced at $149/$499/$1499 with a 14-day trial and a 22% affiliate program. For prospects who want a software phone for their on-call team or a kiosk-class device that can talk to our AI on their existing SIP trunk, our reference design uses baresip with the aubridge module to splice into our /api/voice-agent/capture endpoint over a small WebSocket relay.
apt install baresip libbaresip-dev on Debian/Ubuntu, or build from source for non-x86 targets.
baresip -m output and rebuild if needed.
Is baresip suitable for production AI bots? For low-concurrency outbound bots (one call per process, dozens of processes per server), yes. For high concurrency, use a server-side framework like Pipecat or LiveKit Agents.
baresip vs PJSIP for embedded? baresip has a smaller default footprint and a clearer module boundary; PJSIP has more bake-time and more codec coverage. Both are valid.
Does baresip support WebRTC? Partially: it can speak DTLS-SRTP and ICE for one-to-one media, but it is not a full WebRTC peer.
Can I script baresip from Python? Yes. The ctrl_tcp module exposes a TCP control interface that accepts commands, and the httpd module offers an HTTP control interface.
Is the Android app actively maintained? Yes; version 79.1.0 was published April 29 2026 on F-Droid.
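The Python scripting route from the FAQ above can be sketched as follows. This assumes the TCP control interface accepts JSON objects with command, params, and token fields, framed as netstrings (length, colon, payload, comma), and that it listens on port 4444; confirm both against the module documentation for your build. The function names here are hypothetical helpers, not part of baresip.

```python
import json
import socket
from typing import Tuple


def encode_netstring(payload: bytes) -> bytes:
    """Frame a payload as a netstring: b'<length>:<payload>,'."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(data: bytes) -> Tuple[bytes, bytes]:
    """Split one complete netstring off the front of data.
    Returns (payload, remaining_bytes); raises ValueError if malformed."""
    length_str, sep, rest = data.partition(b":")
    if not sep:
        raise ValueError("missing ':' separator")
    length = int(length_str)
    if len(rest) < length + 1 or rest[length:length + 1] != b",":
        raise ValueError("truncated or unterminated netstring")
    return rest[:length], rest[length + 1:]


def build_command(command: str, params: str = "", token: str = "") -> bytes:
    """Build one framed control message (field names are an assumption)."""
    message = {"command": command, "params": params, "token": token}
    return encode_netstring(json.dumps(message).encode("utf-8"))


def send_command(host: str, port: int, command: str, params: str = "") -> bytes:
    """Send one command and return the raw bytes of the first reply chunk.
    A real client should keep reading until a full netstring has arrived."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(build_command(command, params))
        return conn.recv(4096)
```

For example, `send_command("127.0.0.1", 4444, "dial", "sip:ai-agent@bridge.callsphere.local")` would ask a locally running instance to place the call from the config example, under the port assumption stated above.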
Start a 14-day trial of our managed cloud voice, see pricing for tiers, or contact us about a baresip reference design for kiosk and IoT use cases.
baresip and libre for Lightweight SIP AI Clients in 2026 ultimately resolves into one engineering question: when do you use the OpenAI Realtime API versus an async pipeline? Realtime wins on latency for live calls. Async wins on cost, retries, and structured tool reliability for callbacks and SMS flows. Most teams need both, and the routing layer between them becomes the most load-bearing piece of the stack.
Production AI agents live or die on three loops: evals, retries, and handoff state. CallSphere runs 37 agents across 6 verticals, each with its own eval suite — synthetic call transcripts replayed nightly with assertion checks on extracted entities (date, time, party size, insurance, address). Without that loop, prompt regressions ship silently and you only find out when bookings drop.
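A replay-and-assert eval loop of the kind described above might look like the sketch below. It is illustrative only: `extract` stands in for whatever pipeline pulls entities out of a transcript, and the case format (a transcript paired with expected entity values) is an assumption, not CallSphere's actual suite.

```python
from typing import Callable, Dict, List, Tuple

Entities = Dict[str, str]
Case = Tuple[str, Entities]  # (synthetic transcript, expected entities)


def run_evals(cases: List[Case],
              extract: Callable[[str], Entities]) -> Tuple[float, List[Tuple[str, List[str]]]]:
    """Replay each transcript through extract() and compare field by field.
    Returns (pass_rate, failures), where each failure lists the mismatched fields."""
    failures: List[Tuple[str, List[str]]] = []
    passed = 0
    for transcript, expected in cases:
        got = extract(transcript)
        mismatched = [field for field, want in expected.items()
                      if got.get(field) != want]
        if mismatched:
            failures.append((transcript, mismatched))
        else:
            passed += 1
    pass_rate = passed / len(cases) if cases else 0.0
    return pass_rate, failures
```

Running this nightly and failing the build when `pass_rate` drops below a chosen threshold is one way to catch a prompt regression before it reaches callers.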
Structured tools beat free-form text every time. Our 90+ function tools all enforce JSON schemas validated server-side; if the model hallucinates an integer where a string is required, we retry with a corrective system message before falling back to a deterministic path. For long-running flows, we treat agent handoffs as a state machine — booking → confirmation → SMS — so context survives turn boundaries.
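The validate, retry with a correction, then fall back pattern can be sketched like this. It is a simplified illustration under stated assumptions: the schema is a plain mapping of field name to Python type rather than full JSON Schema, and `model_call` is a hypothetical callable that takes an optional corrective message and returns proposed tool arguments.

```python
from typing import Any, Callable, Dict, List, Optional

Schema = Dict[str, type]


def validate_args(args: Dict[str, Any], schema: Schema) -> List[str]:
    """Return a list of human-readable errors; an empty list means valid."""
    errors: List[str] = []
    for name, expected in schema.items():
        if name not in args:
            errors.append(f"missing field '{name}'")
        elif type(args[name]) is not expected:
            # Strict type check, so an int never passes where a string is required.
            errors.append(
                f"field '{name}' must be {expected.__name__}, "
                f"got {type(args[name]).__name__}"
            )
    return errors


def call_with_retries(model_call: Callable[[Optional[str]], Dict[str, Any]],
                      schema: Schema,
                      fallback: Callable[[], Dict[str, Any]],
                      max_attempts: int = 2) -> Dict[str, Any]:
    """Ask the model for tool arguments, feeding validation errors back as a
    corrective message; use the deterministic fallback if every attempt fails."""
    correction: Optional[str] = None
    for _ in range(max_attempts):
        args = model_call(correction)
        errors = validate_args(args, schema)
        if not errors:
            return args
        correction = "Previous tool arguments were invalid: " + "; ".join(errors)
    return fallback()
```

The corrective message names the exact fields that failed, which gives the model something concrete to fix on the second attempt instead of a generic "try again".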
The Realtime API vs. async decision usually comes down to "is the user holding the phone right now?" If yes, Realtime; if no (callback queue, after-hours voicemail), async wins on cost-per-conversation, which we track per agent in 115+ database tables spanning all 6 verticals.
Is this realistic for a small business, or is it enterprise-only? 57+ languages are supported out of the box, and the platform is HIPAA and SOC 2 aligned, which removes most of the procurement friction in regulated verticals. For a topic like "baresip and libre for Lightweight SIP AI Clients in 2026", that means you're not starting from scratch — you're configuring an agent template that's already been hardened across thousands of conversations.
Which integrations have to be in place before launch? Day one is integration mapping (scheduler, CRM, messaging) and prompt tuning against your top 20 real call transcripts. Day two through five is shadow-mode running, where the agent transcribes and recommends but a human still answers, so you can compare side-by-side. Go-live is the moment your eval pass-rate clears your internal bar.
How do we measure whether it's actually working? The honest answer: it scales until your tool catalog gets stale. The agent is only as good as the integrations it can actually call, so the operational discipline is keeping schemas, webhooks, and fallback paths green. The platform handles the rest — observability, retries, multi-region routing — without your team owning the GPU layer.
Want to see how this maps to your stack? Book a live walkthrough at calendly.com/sagar-callsphere/new-meeting, or try the vertical-specific demo at urackit.callsphere.tech. 14-day trial, no credit card, pilot live in 3–5 business days.
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.