Build a Kotlin Android Voice Agent with the LiveKit SDK

By Sagar Shankaran, Founder of CallSphere

Quick answer

Ship a native Android voice agent with Jetpack Compose and LiveKit's client SDK. Real working Kotlin code for permissions, room connect, and audio level animations.

Key takeaways

TL;DR — LiveKit's Android SDK + Compose components let you ship a production voice agent UI in under 200 lines. The SDK handles WebRTC, ICE, audio focus, and Bluetooth routing.

What you'll build

A Compose-based Android app that joins a LiveKit room hosting a Python LiveKit Agent. The user holds to talk; the agent transcribes the speech, calls a tool to look up a salon appointment, and speaks the answer back. An audio-level orb pulses with the agent's voice.

Prerequisites

  1. Android Studio Iguana+, Kotlin 2.0, Compose BOM 2026.04.00+.
  2. LiveKit Cloud project (free tier) or a self-hosted server.
  3. A Python or Node LiveKit Agent already deployed.
  4. A token-issuing endpoint on your backend.
  5. RECORD_AUDIO permission in AndroidManifest.xml.
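
As a rough sketch of what the token endpoint in item 4 produces: a LiveKit access token is a JWT signed with HS256 using your API secret, with the API key as issuer, the participant identity as subject, and a `video` grant naming the room. The function below builds one with only the JDK. `mintRoomToken` and its parameters are illustrative names, not part of any SDK, and in practice one of LiveKit's server SDKs would normally do this for you.

```kotlin
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// JWT segments use URL-safe Base64 without padding.
private fun b64Url(bytes: ByteArray): String =
    Base64.getUrlEncoder().withoutPadding().encodeToString(bytes)

/**
 * Hypothetical helper: signs a room-join token for one participant.
 * Assumes apiKey, identity and room contain no characters that need JSON escaping.
 */
fun mintRoomToken(
    apiKey: String,
    apiSecret: String,
    identity: String,
    room: String,
    nowSeconds: Long = System.currentTimeMillis() / 1000,
    ttlSeconds: Long = 600,
): String {
    val header = """{"alg":"HS256","typ":"JWT"}"""
    val payload = """{"iss":"$apiKey","sub":"$identity","nbf":$nowSeconds,""" +
        """"exp":${nowSeconds + ttlSeconds},"video":{"room":"$room","roomJoin":true}}"""
    val signingInput = b64Url(header.toByteArray()) + "." + b64Url(payload.toByteArray())

    // HMAC-SHA256 over "header.payload", keyed with the API secret.
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(apiSecret.toByteArray(), "HmacSHA256"))
    return signingInput + "." + b64Url(mac.doFinal(signingInput.toByteArray()))
}
```

The API secret must stay on the server; the app only ever sees the finished token and the server URL.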

Architecture

flowchart LR
  A[Android Compose app] -- LiveKit token --> S[Your /token endpoint]
  A -- WebRTC --> L[LiveKit Cloud]
  L -- WebRTC --> P[LiveKit Agent Python]
  P -- LLM --> O[OpenAI / Workers AI / etc.]

Step 1 — Add deps

In build.gradle.kts:

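```kotlin
dependencies {
    // The Compose BOM (version from the prerequisites) pins material3 below.
    implementation(platform("androidx.compose:compose-bom:2026.04.00"))
    implementation("io.livekit:livekit-android:2.13.0")
    implementation("io.livekit:livekit-android-compose-components:2.1.3")
    implementation("androidx.compose.material3:material3")
    implementation("androidx.lifecycle:lifecycle-runtime-compose:2.8.0")
    // OkHttp is used by the token fetch in Step 3.
    implementation("com.squareup.okhttp3:okhttp:4.12.0")
}
```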

Step 2 — Manifest permissions

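```xml
<!-- Inside <manifest>, above <application> -->
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
```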

Step 3 — Token fetch

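```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
import okhttp3.OkHttpClient
import okhttp3.Request
import org.json.JSONObject
import java.io.IOException

// Reuse one client; OkHttp pools connections per instance.
private val httpClient = OkHttpClient()

// Returns (LiveKit server URL, access token) from the backend.
suspend fun fetchToken(): Pair<String, String> = withContext(Dispatchers.IO) {
    val req = Request.Builder()
        .url("https://api.callsphere.ai/voice/token")
        .build()
    httpClient.newCall(req).execute().use { res ->
        if (!res.isSuccessful) throw IOException("Token request failed: HTTP ${res.code}")
        val body = res.body?.string() ?: throw IOException("Empty token response")
        val json = JSONObject(body)
        json.getString("url") to json.getString("token")
    }
}
```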
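
Access tokens are usually short-lived, so a cached one may be stale by the time the user reopens the screen. A minimal sketch of checking this on the client, assuming the token is a standard three-part JWT with an `exp` claim in seconds; `tokenExpiry` and `needsRefresh` are hypothetical helper names.

```kotlin
import java.util.Base64

/** Returns the `exp` claim (seconds since epoch), or null if the token cannot be read. */
fun tokenExpiry(token: String): Long? {
    val parts = token.split(".")
    if (parts.size != 3) return null
    val payload = try {
        String(Base64.getUrlDecoder().decode(parts[1]), Charsets.UTF_8)
    } catch (e: IllegalArgumentException) {
        return null // not valid Base64
    }
    // A regex keeps the sketch dependency-free; a JSON parser is the sturdier choice.
    return Regex("\"exp\"\\s*:\\s*(\\d+)").find(payload)?.groupValues?.get(1)?.toLongOrNull()
}

/** True when the token is unreadable or expires within skewSeconds of nowSeconds. */
fun needsRefresh(token: String, nowSeconds: Long, skewSeconds: Long = 30): Boolean {
    val exp = tokenExpiry(token) ?: return true
    return nowSeconds + skewSeconds >= exp
}
```

Calling `needsRefresh(token, System.currentTimeMillis() / 1000)` before reusing a cached token is one way to decide whether to call `fetchToken()` again. The client never verifies the signature here; that remains the server's job.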

Step 4 — Compose room screen

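The RoomScope composable manages the connection lifecycle for you: it connects when it enters composition and cleans up when it leaves. Inside it, RoomLocal.current gives you the Room to call methods on.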

```kotlin
import io.livekit.android.compose.local.RoomLocal
import io.livekit.android.compose.local.RoomScope
import io.livekit.android.compose.state.rememberParticipants

@Composable
fun VoiceAgentScreen() {
    var creds by remember { mutableStateOf<Pair<String, String>?>(null) }

    // Fetch the URL and token once when the screen enters composition.
    LaunchedEffect(Unit) {
        creds = fetchToken()
    }

    creds?.let { (url, token) ->
        RoomScope(url = url, token = token, audio = true, connect = true) {
            val room = RoomLocal.current
            // Assumes the agent joins with an identity that starts with "agent".
            val active = rememberParticipants()
                .firstOrNull { it.identity?.value?.startsWith("agent") == true }

            Column(Modifier.fillMaxSize().padding(24.dp)) {
                Text("CallSphere Salon Agent",
                    style = MaterialTheme.typography.headlineSmall)
                AgentOrb(participant = active)
                // disconnect() is a regular call, so there is no need to block the main thread.
                Button(onClick = { room.disconnect() }) {
                    Text("Hang up")
                }
            }
        }
    }
}
```
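
The screen above shows nothing while the token loads and has no path for a failed request. One possible way to model that, sketched in plain Kotlin: a small sealed state that the composable can switch on. `CallUiState` and `toUiState` are hypothetical names introduced for this example.

```kotlin
/** Hypothetical screen states: waiting for a token, ready to connect, or failed. */
sealed interface CallUiState {
    data object Loading : CallUiState
    data class Ready(val url: String, val token: String) : CallUiState
    data class Failed(val reason: String) : CallUiState
}

/** Maps the outcome of a token request (url to token) onto a screen state. */
fun toUiState(result: Result<Pair<String, String>>): CallUiState =
    result.fold(
        onSuccess = { (url, token) ->
            // Treat blank credentials as a failure rather than trying to connect.
            if (url.isBlank() || token.isBlank()) CallUiState.Failed("Empty credentials")
            else CallUiState.Ready(url, token)
        },
        onFailure = { CallUiState.Failed(it.message ?: "Token request failed") },
    )
```

In the composable, the state would start as `CallUiState.Loading`, be set from `toUiState(runCatching { fetchToken() })` inside the `LaunchedEffect`, and a `when` over it would choose between a spinner, the `RoomScope` content, and an error message with a retry button.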

Step 5 — Audio orb that pulses with the agent

LiveKit tracks an audio level between 0 and 1 for each participant, which you can observe to drive the animation:

```kotlin
@Composable
fun AgentOrb(participant: Participant?) {
    // Fall back to a silent level until the agent has joined.
    val level by participant
        ?.let { rememberAudioLevel(it) }
        ?: remember { mutableFloatStateOf(0f) }
    val scale by animateFloatAsState(
        targetValue = 1f + level * 1.5f,
        label = "orb",
    )

    Box(
        Modifier
            .size(180.dp)
            .graphicsLayer { scaleX = scale; scaleY = scale }
            .clip(CircleShape)
            .background(MaterialTheme.colorScheme.primary)
    )
}
```
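
Raw audio levels jump around between updates, which can make the orb flicker. A minimal sketch of one common remedy, an attack/release smoother that rises quickly and falls slowly, written as plain Kotlin so it can be tested off-device. `LevelSmoother`, `orbScale`, and the default factors are assumptions chosen for illustration, not LiveKit APIs.

```kotlin
/**
 * Hypothetical helper: exponential smoothing with separate rise and fall rates.
 * attack and release are fractions in (0, 1]; higher values react faster.
 */
class LevelSmoother(
    private val attack: Float = 0.6f,
    private val release: Float = 0.15f,
) {
    private var current = 0f

    /** Feeds one raw level and returns the smoothed value. */
    fun next(raw: Float): Float {
        val target = raw.coerceIn(0f, 1f)
        // Move toward the target faster when the level rises than when it falls.
        val factor = if (target > current) attack else release
        current += (target - current) * factor
        return current
    }
}

/** Maps a 0..1 level onto the orb scale used above: 1.0 at silence, 1 + maxGrowth at full level. */
fun orbScale(level: Float, maxGrowth: Float = 1.5f): Float =
    1f + level.coerceIn(0f, 1f) * maxGrowth
```

In `AgentOrb`, a smoother held in `remember { LevelSmoother() }` could process each level before it reaches `animateFloatAsState`. The animation already eases each change, so whether the extra smoothing is worth it depends on how jittery the levels look on a real device.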

Step 6 — Request mic permission

```kotlin
// Place inside the composable, before RoomScope, so the mic is available when the room connects.
val micPerm = rememberLauncherForActivityResult(
    ActivityResultContracts.RequestPermission()
) { granted ->
    if (!granted) Log.w("CallSphere", "mic denied")
}

// Ask once when the screen first appears.
LaunchedEffect(Unit) {
    micPerm.launch(android.Manifest.permission.RECORD_AUDIO)
}
```
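
The snippet above only logs a denial, but a voice screen is useless without the microphone, so it helps to decide what happens next. A small sketch of that decision as a pure function: `MicStep` and `nextMicStep` are hypothetical names, and `shouldShowRationale` stands for the value of `ActivityCompat.shouldShowRequestPermissionRationale` read after the result callback fires.

```kotlin
/** Hypothetical next steps after a RECORD_AUDIO permission result. */
enum class MicStep { CONNECT, EXPLAIN_AND_RETRY, OPEN_SETTINGS }

fun nextMicStep(granted: Boolean, shouldShowRationale: Boolean): MicStep = when {
    // Permission granted: safe to enter RoomScope with audio = true.
    granted -> MicStep.CONNECT
    // Denied, but the system will still show the dialog again: explain why, then re-ask.
    shouldShowRationale -> MicStep.EXPLAIN_AND_RETRY
    // Denied and the dialog is suppressed: the user has to enable it in system settings.
    else -> MicStep.OPEN_SETTINGS
}
```

Keeping the rule in a plain function makes it easy to unit test without an emulator; the composable only has to render the matching UI for each step.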

Common pitfalls

  • Forgetting MODIFY_AUDIO_SETTINGS — calls connect but you hear nothing.
  • Doing token fetch on the main thread — use Dispatchers.IO.
  • Calling room.connect from outside RoomScope — leaks; let the composable own lifecycle.
  • Bluetooth routing — set AudioManager.MODE_IN_COMMUNICATION for headsets.

How CallSphere does this in production

CallSphere's salon CRM Android client uses LiveKit on top of NestJS + Prisma. The same backend powers our Real Estate and Healthcare verticals — 6 verticals, 37 agents, 115+ DB tables. Affiliate program pays 22% on referrals — see /affiliate.

FAQ

Why LiveKit and not raw WebRTC? TURN, SFU, codec negotiation, audio focus — LiveKit handles all of it.

Can I use the OpenAI Realtime API directly without LiveKit? Yes, but you'd write your own PeerConnectionFactory glue.

Does it work on Android 13+ predictive back? Yes, the SDK observes lifecycle.

How big is the APK bloat? ~6MB for the LiveKit SDK + Compose components.

Can the agent be on Modal? Yes — see our Modal post in this series.

Written by

Sagar Shankaran· Founder, CallSphere

Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
