By Sagar Shankaran, Founder of CallSphere
Stand up a Gemini-powered voice agent with Vertex AI Agent Builder (now Gemini Enterprise Agent Platform). Phone gateway, ADK code-first agent, Cloud Run runtime — under 200 lines.
Key takeaways
TL;DR — Vertex AI Agent Builder (rebranded "Gemini Enterprise Agent Platform" at Cloud Next 2026) gives you the ADK code-first kit, Agent Engine managed runtime, and a built-in phone gateway with TTS/STT in 220+ voices and 40+ languages. You write a Python class, deploy with one command, attach a phone number, done.
A code-first voice agent built with the Agent Development Kit (ADK), backed by gemini-2.5-flash for reasoning and Chirp 3 HD voices for TTS. The agent has one tool (lookup_appointment) backed by Firestore, runs on Agent Engine (managed), and answers a real PSTN number through Conversational Agents Phone Gateway.
gcloud CLI authenticated, billing enabled.google-cloud-aiplatform>=1.85, google-adk>=0.5.flowchart TD
PSTN[Caller PSTN] --> CXP[Conversational Agents Phone Gateway]
CXP -->|Chirp 3 STT| AE[Agent Engine Runtime]
AE -->|ADK agent| GEM[gemini-2.5-flash]
AE -->|tool| FS[(Firestore appointments)]
AE -->|text reply| TTS[Chirp 3 HD TTS]
TTS --> CXP
CXP --> PSTN
```python
from google.adk.agents import Agent from google.adk.tools import FunctionTool from google.cloud import firestore
db = firestore.Client()
def lookup_appointment(patient_id: str) -> dict: """Returns the next appointment for the given patient_id.""" doc = db.collection("appointments").document(patient_id).get() return doc.to_dict() or {"error": "not found"}
root_agent = Agent( name="reception_agent", model="gemini-2.5-flash", instruction=( "You are a friendly receptionist. Confirm the patient's name, " "look up their appointment, and read it back. Keep replies short." ), tools=[FunctionTool(func=lookup_appointment)], ) ```
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
```bash pip install google-adk adk web
```
The dev UI shows the full reasoning trace, tool calls, and lets you swap the model in real time.
```python
from vertexai import agent_engines from agent import root_agent
remote = agent_engines.create( agent_engine=root_agent, requirements=["google-adk>=0.5", "google-cloud-firestore"], display_name="reception-agent", ) print(remote.resource_name) ```
gcloud auth application-default login && python deploy.py — Agent Engine builds a container, pushes to Artifact Registry, and gives you a versioned endpoint.
In the Conversational Agents console (formerly Dialogflow CX), create a new agent, choose Use a deployed Agent Engine endpoint, paste the resource name, then under Manage → Integrations → Phone Gateway click Configure new number and pick a country.
The gateway handles SIP, codec negotiation, Chirp 3 STT in (server VAD with 0.6s end-of-speech timeout), Chirp 3 HD TTS out, barge-in, and DTMF passthrough. No code on your side.
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
In the agent's Speech and IVR settings, pick:
chirp_3 model with use_enhanced=trueen-US-Chirp3-HD-Charon (or en-US-Studio-O for Studio voices)600ms (default is too aggressive for elderly callers)If you need a knowledge base, create a Vertex AI Search data store over a GCS bucket of your help docs and add it as a sub-agent or as an ADK VertexAiSearchTool:
```python from google.adk.tools import VertexAiSearchTool search = VertexAiSearchTool( data_store_id="projects/123/locations/global/collections/default_collection/dataStores/help-docs" ) root_agent.tools.append(search) ```
Agent Engine emits Cloud Logging events for every turn, every tool call, and every model response. Pipe them into BigQuery via a Logs Router sink for dashboards.
min_instances=1 for production.CallSphere runs OpenAI Realtime on FastAPI :8084 for Healthcare because GCP's Phone Gateway didn't support our HIPAA chain of custody until late 2025. For our 6 verticals (Healthcare, Multi-Family, Salons, Behavioral, Hospitality, Real Estate), we keep Gemini 2.5 Flash as a fallback model behind our 90+ tools — primarily for non-PHI workloads where its 1M context lets us pass entire CRM histories. 37 agents, 115+ DB tables. Pricing: $149/$499/$1499, 14-day trial, 22% affiliate.
Q: ADK vs Agent Studio (low-code)? Use ADK for code-first teams that want git, tests, CI. Use Agent Studio for non-engineers and rapid prototyping. They share the same runtime.
Q: Gemini 2.5 Flash vs Pro for voice? Flash is the right default for voice — TTFT is ~300ms vs ~700ms on Pro. Save Pro for tool-heavy reasoning loops.
Q: How does this compare to Dialogflow CX classic? Conversational Agents (the new console) replaces both old Dialogflow CX and Agent Builder. ADK is what you write; CX flows are still available for deterministic IVR.
Q: What's the latency target?
Voice-to-voice ~700-900ms with Chirp 3 + Flash on us-central1.
Q: Can I bring my own LLM? Yes — ADK's model param accepts any Vertex Model Garden or LiteLLM-compatible endpoint, including Claude on Vertex.
Written by
Sagar Shankaran· Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
See how AI voice agents work for your industry. Live demo available -- no signup required.
A no-fluff recap of the 7 biggest enterprise AI moves from Google Cloud Next 2026 — Gemini Enterprise, Agentspace, A2A, Gemini 3.1 Ultra, and more.
Workspace Studio puts a Gemini-powered AI agent builder inside Google Workspace. A walkthrough of what it does, who it is for, and where it fits in 2026.
Haystack 2.7's Agent component plus an Ollama-served Llama 3.2 gives you tool-calling RAG with citations. Here's a complete pipeline against your own document store.
At Cloud Next 2026 Google renamed Vertex AI to Gemini Enterprise Agent Platform and absorbed Agentspace. What actually changed and why a rebrand made sense.
Run STT, LLM, and TTS entirely on Cloudflare's edge — no OpenAI, no ElevenLabs. Real working code with Whisper, Llama 3.3 70B, and Deepgram Aura.
Jules's GitHub integration takes an issue, writes a fix, runs tests, and opens a PR — here is the architecture and pricing. Practical context for teams in North Carolina.
© 2026 CallSphere LLC. All rights reserved.
Watch how CallSphere handles real customer calls, schedules appointments, and processes payments — live.
Try Live DemoBook a DemoCalculate Your ROI