Voice Cloning Portability (ElevenLabs): CallSphere vs Vapi Lock-In
Voice cloning lock-in matters when you switch platforms. CallSphere uses ElevenLabs (portable assets); Vapi mixes providers. How to keep voice IP yours.
TL;DR
A voice clone is a brand asset. If you cannot move it, you do not really own it. CallSphere uses ElevenLabs Conversational AI (the "Sarah" voice in Sales) and ElevenLabs TTS/STT in Salon; the cloned voice ID is owned by your ElevenLabs account and is fully portable. Vapi layers TTS providers (ElevenLabs, PlayHT, Azure, OpenAI), but the clone metadata, prompt-bound voice configs, and IVR-style audio cues are tied to Vapi's assistant config, so moving them means rebuilding them by hand.
This post is the asset-lifecycle deep dive: how to keep your voice clone portable, how the lifecycle differs between platforms, and the migration playbook if you are leaving a platform.
What Is Actually Locked In
When you "clone a voice" on a voice AI platform, you accumulate three artifacts:
- The clone itself: model weights or a voice ID at the TTS provider
- The platform binding: assistant config that references the voice with provider-specific settings (stability, similarity_boost, style, use_speaker_boost)
- Production-tuned prompts: system prompts crafted to sound natural with that specific voice
Lock-in shows up in the second and third artifacts: the platform binding and the tuned prompts. The clone itself is usually portable; the surrounding config is not.
Vapi Voice Cloning Approach
Vapi supports multiple TTS providers and lets you reference an ElevenLabs voice ID directly:
{
  "voice": {
    "provider": "11labs",
    "voiceId": "your_eleven_voice_id",
    "stability": 0.5,
    "similarityBoost": 0.75,
    "style": 0.0,
    "useSpeakerBoost": true,
    "model": "eleven_turbo_v2_5"
  }
}
You retain the underlying ElevenLabs voice ID — that is portable. What is not portable:
- The Vapi assistant config that references it
- Vapi-specific tuning that compensates for their pipeline
- Squad members built around that voice
- Custom audio cues stored in Vapi's CDN
- Function-call flows wired to that assistant
If you migrate to another platform, you re-reference the voice ID and rebuild everything else.
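As a rough illustration of what "re-reference the voice ID" involves, the sketch below pulls the portable fields out of a Vapi-style voice block and renames them to the snake_case keys ElevenLabs uses for voice settings. The `PortableVoice` shape and the `extractPortableVoice` helper are hypothetical names for this example, not part of either platform's SDK, and the input type covers only the fields shown in the config above.

```typescript
// Shape of the Vapi "voice" block shown above (only the fields from that example).
interface VapiVoiceBlock {
  provider: string;
  voiceId: string;
  stability: number;
  similarityBoost: number;
  style: number;
  useSpeakerBoost: boolean;
  model: string;
}

// Hypothetical provider-neutral record you could keep in your own repo.
interface PortableVoice {
  voice_id: string;
  model: string;
  voice_settings: {
    stability: number;
    similarity_boost: number;
    style: number;
    use_speaker_boost: boolean;
  };
}

// Copies the ElevenLabs-specific values out of the platform config,
// converting camelCase keys to the snake_case names ElevenLabs uses.
function extractPortableVoice(voice: VapiVoiceBlock): PortableVoice {
  if (voice.provider !== "11labs") {
    throw new Error(`Expected an ElevenLabs voice, got provider "${voice.provider}"`);
  }
  return {
    voice_id: voice.voiceId,
    model: voice.model,
    voice_settings: {
      stability: voice.stability,
      similarity_boost: voice.similarityBoost,
      style: voice.style,
      use_speaker_boost: voice.useSpeakerBoost,
    },
  };
}

const portableVoice = extractPortableVoice({
  provider: "11labs",
  voiceId: "your_eleven_voice_id",
  stability: 0.5,
  similarityBoost: 0.75,
  style: 0.0,
  useSpeakerBoost: true,
  model: "eleven_turbo_v2_5",
});
```

Only the voice block survives this step; squads, audio cues, and function-call flows still have to be rebuilt on the target platform.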
CallSphere Voice Cloning Approach
CallSphere uses ElevenLabs for both Conversational AI ("Sarah" in the Sales platform) and TTS/STT in the Salon vertical. The integration is intentionally thin so the voice asset stays yours.
Asset Ownership Model
The cloned voice lives in your ElevenLabs account, not CallSphere's. CallSphere holds:
- Your ElevenLabs API key (encrypted, per-tenant)
- The voice ID
- Tuning parameters (stability, similarity, style)
- Prompt-voice bindings
When you offboard, you take the API key, the voice ID, and the tuning JSON. Replication on another platform is mechanical.
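A minimal sketch of what such an offboarding bundle could look like follows. The `TenantVoiceRecord` shape and `buildOffboardingBundle` function are assumptions made for illustration, not CallSphere's actual export format. The sketch also assumes the bundle refers to the API key by secret name rather than embedding the key, since the key already lives in your ElevenLabs account.

```typescript
// Hypothetical shape of what the platform holds per tenant (mirrors the list above).
interface TenantVoiceRecord {
  apiKeySecretName: string; // name of the secret, never the key value itself
  voiceId: string;
  tuning: { stability: number; similarity_boost: number; style: number };
  promptBindings: Record<string, string>;
}

// Serializes everything needed to re-create the setup elsewhere as plain JSON.
function buildOffboardingBundle(record: TenantVoiceRecord): string {
  const bundle = {
    elevenlabs_api_key_ref: record.apiKeySecretName,
    voice_id: record.voiceId,
    voice_settings: record.tuning,
    prompt_bindings: record.promptBindings,
  };
  return JSON.stringify(bundle, null, 2);
}

// Sample values: tuning and binding names are taken from the Salon example in this post.
const bundleJson = buildOffboardingBundle({
  apiKeySecretName: "ELEVENLABS_API_KEY",
  voiceId: "voice_abc",
  tuning: { stability: 0.55, similarity_boost: 0.8, style: 0.15 },
  promptBindings: { greeting: "salon-warm-greeting-v3" },
});
```

Because the output is plain JSON with no platform-specific identifiers, it can be checked into the same repository as the prompts it refers to.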
Salon Voice Configuration Example
// shipped in the Salon backend
export const salonVoiceConfig = {
  provider: 'elevenlabs',
  voice_id: process.env.ELEVENLABS_SALON_VOICE_ID,
  model: 'eleven_turbo_v2_5',
  output_format: 'pcm_24000',
  voice_settings: {
    stability: 0.55,
    similarity_boost: 0.8,
    style: 0.15,
    use_speaker_boost: true,
  },
  // Bindings live in source control, not vendor-side
  prompt_bindings: {
    greeting: 'salon-warm-greeting-v3',
    confirm: 'salon-confirm-tone-v2',
    farewell: 'salon-farewell-v1',
  },
};
The prompt_bindings are pointers to system-prompt templates in our git repo. Migrating means copying the JSON, copying the prompts, and pointing the new platform at your ElevenLabs account.
Sales "Sarah" Conversational AI
Sales uses ElevenLabs Conversational AI, which is a higher-level construct than raw TTS — it owns the turn-taking and interruption logic. CallSphere stores the conv-AI agent ID alongside the voice ID; both live in your ElevenLabs account.
const sarahConfig = {
  conv_ai_agent_id: process.env.ELEVENLABS_SARAH_AGENT_ID,
  voice_id: process.env.ELEVENLABS_SARAH_VOICE_ID,
  // CallSphere-side overrides (orchestration, tools)
  hand_off_targets: ['booking_specialist', 'human_sales'],
  outbound_concurrency: 5,
};
When you leave CallSphere, you keep both ElevenLabs IDs and the orchestration overrides; you re-host the orchestration on whatever platform you choose.
Voice Asset Versioning
A subtle lock-in trap: when you tweak a voice (re-clone with new samples, adjust style), do you remember which prompts were tuned for which version?
CallSphere stores voice version + prompt version pairs in Postgres:
CREATE TABLE voice_asset_versions (
  id UUID PRIMARY KEY,
  voice_id TEXT NOT NULL,
  voice_version TEXT NOT NULL,
  prompt_template_id TEXT NOT NULL,
  prompt_version TEXT NOT NULL,
  paired_at TIMESTAMPTZ DEFAULT NOW(),
  retired_at TIMESTAMPTZ
);
Migrations ship the active row only. Old versions stay archived for audit.
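To make "the active row only" concrete, here is a small sketch that filters rows shaped like the table above. It assumes a pairing counts as active when retired_at is NULL, which the schema suggests but the post does not state outright. The `activePairings` helper and the sample rows are illustrative only.

```typescript
// Mirrors the columns of voice_asset_versions.
interface VoiceAssetVersion {
  id: string;
  voice_id: string;
  voice_version: string;
  prompt_template_id: string;
  prompt_version: string;
  paired_at: Date;
  retired_at: Date | null;
}

// Assumption: a pairing is "active" when it has not been retired.
// Returns active pairings for one voice, newest first.
function activePairings(rows: VoiceAssetVersion[], voiceId: string): VoiceAssetVersion[] {
  return rows
    .filter((row) => row.voice_id === voiceId && row.retired_at === null)
    .sort((a, b) => b.paired_at.getTime() - a.paired_at.getTime());
}

// Made-up sample data for illustration.
const sampleRows: VoiceAssetVersion[] = [
  {
    id: "1",
    voice_id: "voice_abc",
    voice_version: "v1",
    prompt_template_id: "salon-warm-greeting",
    prompt_version: "v2",
    paired_at: new Date("2024-01-10T00:00:00Z"),
    retired_at: new Date("2024-06-01T00:00:00Z"),
  },
  {
    id: "2",
    voice_id: "voice_abc",
    voice_version: "v2",
    prompt_template_id: "salon-warm-greeting",
    prompt_version: "v3",
    paired_at: new Date("2024-06-01T00:00:00Z"),
    retired_at: null,
  },
  {
    id: "3",
    voice_id: "voice_xyz",
    voice_version: "v1",
    prompt_template_id: "sales-intro",
    prompt_version: "v1",
    paired_at: new Date("2024-03-01T00:00:00Z"),
    retired_at: null,
  },
];

const toMigrate = activePairings(sampleRows, "voice_abc");
```

With this sample data, only the row with id "2" is selected for voice_abc; the retired v1 pairing stays behind for audit.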
Vapi vs CallSphere Voice Portability Comparison
| Dimension | Vapi | CallSphere |
|---|---|---|
| ElevenLabs voice ID owned by | You (your ElevenLabs account) | You (your ElevenLabs account) |
| Tuning config location | Vapi assistant | Source control in your repo |
| Prompt-voice binding | Vapi UI | Source control |
| Audio cue assets | Vapi CDN | Your S3 bucket |
| ElevenLabs API key scope | Can be shared across tenants | Per-tenant key |
| Rebuild on migration | Manual reconstruct | Copy JSON + prompts |
| Versioning of pairings | Limited | Postgres-tracked |
| Time to migrate (estimated) | 2-5 days | 4-8 hours |
Voice Asset Lifecycle
graph LR
    A[Sample collection<br/>5-10 min audio] --> B[Upload to ElevenLabs]
    B --> C[Generate voice_id]
    C --> D[Tune voice_settings]
    D --> E[Bind to prompt template]
    E --> F[Pair version in Postgres]
    F --> G[Deploy to production]
    G --> H{Tune needed?}
    H -->|yes| D
    H -->|no| I[Archive old pairing]
    G --> J{Migration to new platform?}
    J -->|yes| K[Export voice_id + tuning + prompts]
    K --> L[Re-host on new platform]
    L --> G
Migration Playbook (CallSphere → Anywhere)
- Export the voice IDs from the platform config (a single pnpm exec voice-export command in CallSphere).
- Export tuning JSON: voice_settings + prompt_bindings.
- Export system prompts: already in git, just copy the relevant directory.
- Re-bind on the new platform: most platforms accept ElevenLabs voice IDs directly.
- Run a parallel-call test for 50 calls to compare pre/post quality before cutover.
The migration is engineering-driven, not vendor-blocked. That is the whole point.
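The parallel-call test in the last step can be reduced to a simple comparison. The sketch below assumes each call has already been given a numeric quality score (for example a 1-5 reviewer rating) and that you choose the acceptable drop yourself. The scoring method, the `compareCallQuality` name, and the threshold are assumptions for illustration, not something the playbook prescribes.

```typescript
interface CutoverReport {
  baselineMean: number;
  candidateMean: number;
  drop: number; // positive when the new platform scores lower
  pass: boolean;
}

function meanScore(scores: number[]): number {
  if (scores.length === 0) {
    throw new Error("Need at least one scored call");
  }
  return scores.reduce((sum, s) => sum + s, 0) / scores.length;
}

// Compares mean quality before and after the move; passes when the
// drop in mean score does not exceed maxDrop.
function compareCallQuality(
  baseline: number[],
  candidate: number[],
  maxDrop: number
): CutoverReport {
  const baselineMean = meanScore(baseline);
  const candidateMean = meanScore(candidate);
  const drop = baselineMean - candidateMean;
  return { baselineMean, candidateMean, drop, pass: drop <= maxDrop };
}

// Four scores per side for brevity; the playbook calls for 50 calls.
const cutover = compareCallQuality([4, 5, 4, 5], [4, 4, 4, 5], 0.3);
```

In this example the means are 4.5 and 4.25, so the drop of 0.25 is within the 0.3 tolerance and the cutover check passes. A mean comparison is the simplest possible gate; a real test would likely also look at per-call regressions.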
Anti-Patterns We Avoid
- Storing audio cues on the platform's CDN. Always store in your own S3 bucket, reference by URL.
- Embedding the API key in assistant configs. Always inject from secrets manager at runtime.
- Hard-coding voice IDs in prompts. Always parameterize.
- Skipping version pairing. A voice tweak six months ago that was paired with a now-deleted prompt is a silent quality regression.
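The first three anti-patterns can be caught with a simple pre-deploy check. The sketch below is one possible shape for it: the `VoiceDeployment` type, the `lintVoiceDeployment` function, and the key-detection regex are all hypothetical, and the regex is a heuristic that looks only for key-like field names in the serialized config.

```typescript
// Hypothetical description of what is about to be deployed.
interface VoiceDeployment {
  voiceId: string;
  promptTemplate: string;   // prompt text as stored in source control
  audioCueUrls: string[];
  ownedCueHosts: string[];  // hostnames of buckets you control
  rawConfigJson: string;    // assistant config exactly as it will be sent
}

// Extracts the hostname from an http(s) URL; returns "" if it cannot.
function hostOf(url: string): string {
  const match = /^https?:\/\/([^/:]+)/.exec(url);
  return match?.[1] ?? "";
}

function lintVoiceDeployment(d: VoiceDeployment): string[] {
  const problems: string[] = [];

  // Voice IDs should be injected as parameters, not pasted into prompt text.
  if (d.promptTemplate.indexOf(d.voiceId) !== -1) {
    problems.push("voice ID is hard-coded in the prompt");
  }

  // Audio cues should live in storage you own.
  for (const url of d.audioCueUrls) {
    const host = hostOf(url);
    if (d.ownedCueHosts.indexOf(host) === -1) {
      problems.push(`audio cue hosted outside your buckets: ${host}`);
    }
  }

  // Heuristic: flag config fields whose names look like an embedded API key.
  if (/"(api_?key|xi-api-key)"\s*:/i.test(d.rawConfigJson)) {
    problems.push("config appears to embed an API key");
  }

  return problems;
}

const lintProblems = lintVoiceDeployment({
  voiceId: "voice_abc",
  promptTemplate: "You are Sarah. Speak with voice voice_abc.",
  audioCueUrls: ["https://cdn.example-platform.test/cues/hold.mp3"],
  ownedCueHosts: ["my-bucket.s3.amazonaws.com"],
  rawConfigJson: '{"voice_id":"voice_abc","apiKey":"redacted"}',
});
```

The sample input trips all three checks. Version pairing, the fourth anti-pattern, is better enforced against the pairing table than by inspecting a single config.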
FAQ
Can I clone a voice on CallSphere directly?
We do not host the cloning UI — you clone in your ElevenLabs account, then paste the voice ID into CallSphere config. This is by design: the asset stays in your account.
What if ElevenLabs raises prices?
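Both platforms expose the price pass-through. Switching to a stock PlayHT or Azure neural voice is a config change; moving a cloned voice to another provider means re-cloning it there from your original samples, because voice IDs are provider-specific.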
How long does a voice clone take to make?
ElevenLabs Instant clone: about 1 minute, requires a 1-3 minute sample. Professional clone: about 24 hours, requires a 30+ minute sample.
Are there legal concerns with voice cloning?
Yes — written consent from the voice owner is mandatory. CallSphere requires uploaded consent forms before activating any clone in a production tenant.
Does Vapi support custom voices the same way?
Vapi supports ElevenLabs voice IDs and PlayHT cloned voices. Portability of the IDs themselves is similar; portability of the surrounding config is where Vapi adds friction.
Talk to a Human About Voice Strategy
The /features page documents the voice provider matrix per vertical, and /demo lets you hear the production "Sarah" voice on a live call.