IT Helpdesk RAG with ChromaDB: CallSphere 10 Agents vs Vapi

TL;DR

CallSphere's U Rack IT product ships 10 IT specialist agents (Triage, Device, Ticket, Network, Email, Computer, Printer, Phone, Security, Lookup) and a ChromaDB-backed RAG layer with the dedicated Lookup agent as the retrieval specialist. Vapi.ai offers PDF knowledge-base upload — useful but generic — and gives you a single voice agent to query it. There is no specialist taxonomy, no Lookup agent, no 40+ table data model, no role-based dashboard for Admin/Agent/Requester, and no MSP-aware ticketing schema. This post is the architecture comparison and a worked example of an L1 password reset RAG retrieval.

The MSP Problem: L1 Volume Crushes Margins

Managed Service Providers (MSPs) live and die on L1 ticket margin. Service Leadership Index 2025 reports that 62% of MSP ticket volume is L1 — password resets, printer jams, basic network connectivity, email setup, software install — and that L1 tickets average $23 in fully loaded support cost while billing $18-22 per ticket. The math is brutal: every L1 ticket the MSP touches with a human is a margin loser.

The fix every MSP CEO has been chasing for five years is L1 voice automation with knowledge-base retrieval. It is not a chatbot — clients call. It is not a generic voice agent — a printer ticket needs different troubleshooting than a network ticket. It is not a single RAG endpoint — the agent needs to call retrieval as a tool, multiple times, with reranking. It is a structured multi-agent system with a knowledge layer.

That is exactly what U Rack IT is.

The U Rack IT Architecture

U Rack IT is built for IT, MSP, and enterprise helpdesks. The stack:

Backend: Python FastAPI with OpenAI Realtime API + Agents SDK; NestJS + Prisma for the API layer.
Frontend: React + Tailwind, role-based dashboard (Admin / Agent / Requester).
Database: PostgreSQL + Supabase + ChromaDB.
40+ DB models: account_managers, organizations, contacts, devices, support_tickets, call_logs, agent_interactions, ai_usage_logs, daily_metrics, support_agents, locations, plus 30 supporting tables.
10 Specialist Agents: Triage, Device, Ticket, Network, Email, Computer, Printer, Phone, Security, Lookup (RAG via ChromaDB).

The 10 Agents

Agent	Role	Sample Tools
Triage	Identify caller + classify problem	lookup_contact, classify_issue
Device	Identify and triage end-user device	get_device, check_warranty, lookup_serial
Ticket	Create, update, close support tickets	create_ticket, update_ticket, close_ticket
Network	Wi-Fi, VPN, router troubleshooting	check_network_status, run_traceroute, vpn_diag
Email	O365, Google Workspace, IMAP	reset_email_password, check_mail_flow
Computer	OS, drivers, peripherals	get_os_info, check_drivers, sw_install_status
Printer	Printer queue, drivers, setup	check_printer_status, clear_queue, install_driver
Phone	Softphone, deskphone, mobile MDM	check_softphone, mdm_status, reset_mobile
Security	Account lockout, MFA, suspicious activity	lookup_security_event, force_logout, mfa_reset
Lookup	RAG retrieval over ChromaDB KB	retrieve_kb, rerank, summarize_runbook

The Lookup agent is the retrieval specialist. It is called by every other agent when their tool surface is not enough — for example, the Printer agent hits an unfamiliar model and asks Lookup to retrieve the runbook for that model.
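The delegation described above can be sketched as follows. This is a minimal illustration, not U Rack IT's actual code: the `Specialist` class, its `handle` method, and the `fake_lookup` function are hypothetical names, and the tool body is a stub standing in for a real integration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Specialist:
    """Hypothetical specialist agent: runs its own tool or delegates to Lookup."""
    name: str
    lookup: Callable[[str], str]  # stand-in for a call to the Lookup agent
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def handle(self, intent: str, detail: str) -> str:
        tool = self.tools.get(intent)
        if tool is not None:
            # The tool surface is sufficient: run the tool directly.
            return tool(detail)
        # The tool surface is not enough: ask Lookup for a runbook.
        return self.lookup(f"{self.name} {intent} {detail}")


def fake_lookup(query: str) -> str:
    # Stub: a real Lookup agent would run retrieval here.
    return f"runbook for: {query}"


printer = Specialist(
    name="Printer",
    lookup=fake_lookup,
    tools={"clear_queue": lambda device: f"queue cleared on {device}"},
)

known = printer.handle("clear_queue", "HP M404")
unknown = printer.handle("replace_fuser", "HP M404")
```

Here `known` comes from the Printer agent's own tool, while `unknown` is routed to Lookup because no matching tool is registered.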

Vapi's Knowledge Base

Vapi's knowledge base feature lets you upload PDFs or markdown files. The assistant can reference them in answers. It is a competent feature for FAQ-style use cases. It is not a structured RAG layer. There is:

No reranker.
No chunking strategy you control.
No multi-document fusion.
No specialist agent that owns retrieval as its job.
No taxonomy of IT-specific agents.
No 40+ table schema for tickets, devices, contacts.

You can build all of this on Vapi. The platform is flexible. But the time and engineering cost is six to twelve months of focused work, and the product you build will not be tested against the customer base U Rack IT already serves.

Comparison

Capability	U Rack IT	Vapi
IT specialist agents	10 shipped	None — write your own
ChromaDB-backed RAG	Default	PDF upload only
Reranker on retrieval	Cohere/cross-encoder configurable	None native
Specialist Lookup agent	Yes	No
40+ DB models for IT	Shipped	Build it
Role-based dashboard (Admin/Agent/Requester)	Shipped	Build it
Ticket creation as a function call	create_ticket tool	Build the tool
Device/serial lookup	Tools ready	Build it
Knowledge base ingestion pipeline	Crawler + chunker + embed	Manual upload
Multi-organization (MSP) tenancy	Built-in	Build it
Time to L1 automation live	Days	Quarters

The RAG Retrieval Flow

```mermaid
graph TD
    A[Caller Asks Question] --> B[Triage Agent Classifies]
    B --> C{Specialist?}
    C -->|Printer| D[Printer Agent]
    C -->|Network| E[Network Agent]
    C -->|Other| F[Computer Agent]
    D --> G{Tool Surface Sufficient?}
    E --> G
    F --> G
    G -->|Yes| H[Run Tool + Resolve]
    G -->|No| I[Call Lookup Agent]
    I --> J[Embed Query: text-embedding-3-large]
    J --> K[ChromaDB Top-K Retrieve k=8]
    K --> L[Cross-Encoder Rerank to Top 3]
    L --> M[Lookup: Summarize Runbook]
    M --> N{Confidence > 0.7?}
    N -->|Yes| O[Return Steps to Specialist]
    N -->|No| P[Escalate: Open Ticket + Page Human]
    O --> Q[Specialist Walks Caller Through]
    Q --> R{Resolved?}
    R -->|Yes| S[Close Ticket + Log]
    R -->|No| T[Escalate to L2]
    P --> S
    T --> S
```

The retrieval pipeline runs entirely as a tool call from the specialist agent. The Lookup agent never speaks to the caller directly — it returns structured runbook steps that the Printer/Network/Computer agent then narrates.
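A rough sketch of that pipeline, using the numbers from the flow (k=8, rerank to top 3, answer only above 0.7 confidence). The `lookup` function and its return shape are assumptions for illustration. `toy_retrieve` and `toy_score` are deliberately naive stand-ins for the ChromaDB query and the cross-encoder so the example runs with the standard library alone.

```python
from typing import Callable, Dict, List

TOP_K = 8              # candidates pulled from the vector store
TOP_N = 3              # candidates kept after reranking
MIN_CONFIDENCE = 0.7   # the flow answers only when confidence is above this


def lookup(query: str,
           retrieve: Callable[[str, int], List[str]],
           rerank_score: Callable[[str, str], float]) -> Dict[str, object]:
    """Retrieve top-K, rerank to top-N, then answer or escalate."""
    candidates = retrieve(query, TOP_K)
    scored = sorted(((rerank_score(query, c), c) for c in candidates),
                    key=lambda pair: pair[0], reverse=True)[:TOP_N]
    if not scored or scored[0][0] <= MIN_CONFIDENCE:
        # Low confidence: hand back an escalation instead of guessing.
        return {"action": "escalate", "sources": []}
    return {"action": "answer",
            "confidence": scored[0][0],
            "sources": [chunk for _, chunk in scored]}


# Toy stand-ins (assumptions): no vector store, no model.
CORPUS = [
    "Loomly SSO reset runbook for tenant",
    "Printer queue clearing steps",
    "VPN client reinstall guide",
]


def toy_retrieve(query: str, k: int) -> List[str]:
    return CORPUS[:k]


def toy_score(query: str, chunk: str) -> float:
    # Fraction of query words present in the chunk; a real reranker
    # would score the query and chunk jointly with a model.
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0


hit = lookup("loomly sso reset", toy_retrieve, toy_score)
miss = lookup("badge reader firmware", toy_retrieve, toy_score)
```

Passing `retrieve` and `rerank_score` in as functions keeps the gate logic testable without a live index; the calling specialist only ever sees `answer` or `escalate`.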

Worked Example: Password Reset for an Unknown App

A user calls and says "I can't log into Loomly."

Turn 1 (Triage): Identifies the user, classifies as account/access. Hands to Security agent.

Turn 2 (Security): Security agent's tool surface knows Active Directory and O365 but has no built-in tool for Loomly. It calls the Lookup agent with query "Loomly password reset SSO procedure."

Turn 3 (Lookup):

Embeds the query.
ChromaDB returns 8 candidate chunks: 3 from internal MSP runbooks, 4 from Loomly's public docs (crawled), 1 from a forum.
Cross-encoder reranks to top 3.
Top match is internal runbook "Loomly SSO Reset (Tenant XYZ)" with confidence 0.84.

Turn 4 (Lookup): Returns: "1. Confirm SSO is enabled (yes per tenant config). 2. Have user navigate to https://loomly.com/sso/login. 3. Click 'Forgot Password'. 4. Check connected mailbox. 5. If MFA prompts, use Authenticator app. 6. If 2FA email lost, escalate to L2."

Turn 5 (Security): Walks the user through steps 1-5.

Turn 6: User logs in successfully.

Turn 7 (Ticket): Ticket agent creates support_ticket row, classification "L1 / Account Access / Resolved", time-to-resolution 4 minutes 12 seconds, agent_interactions logged for analytics.

On Vapi's PDF knowledge base, the same flow would have:

No reranker (top-K straight retrieval, often surfacing the wrong chunk).
No specialist routing (one assistant trying to hold all IT context).
No structured ticket creation (you build it).
No L2 escalation primitive.

MSP Multi-Tenancy

U Rack IT is multi-tenant by design. The organizations table is the tenant boundary. Every contact, device, ticket, and call_log is scoped to an org. ChromaDB collections are per-organization, so MSP-A's runbooks never bleed into MSP-B's RAG context. The role-based dashboard scopes views by tenant, role, and location.

Vapi has no tenancy model. You build it.

Chunking Strategy: The Detail That Decides Recall

RAG quality is mostly chunking quality. Bad chunks produce bad retrievals which produce bad answers. U Rack IT's chunker is tuned for IT runbooks specifically:

Step-aware splitting: numbered steps stay together as units.
Code-block preservation: shell snippets are not split mid-line.
Heading hierarchy: section headers carry into chunk metadata for keyword filtering.
Overlap: 80-token overlap between adjacent chunks for boundary recall.
Max chunk size: 1024 tokens (with 80-token overlap = 1104 effective).
Metadata: vendor, product, OS, ticket-category, last-updated, source-url.
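The step-aware splitting and overlap rules above might look roughly like this. It is a simplified sketch, not the production chunker: whitespace-separated words stand in for tokenizer tokens, `split_units` and `chunk` are hypothetical names, and code-block preservation and heading metadata are left out.

```python
import re
from typing import List

MAX_TOKENS = 1024   # max chunk size from the list above
OVERLAP = 80        # tokens carried over from the previous chunk

# A line that begins a numbered step, e.g. "1. " or "2) ".
STEP_RE = re.compile(r"^\s*\d+[.)]\s")


def split_units(text: str) -> List[str]:
    """Split text into units; a numbered step and its continuation lines stay together."""
    units: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if not line.strip() or STEP_RE.match(line):
            # A blank line or a new step closes the unit being built.
            if current:
                units.append(" ".join(current))
            current = []
        if line.strip():
            current.append(line.strip())
    if current:
        units.append(" ".join(current))
    return units


def chunk(text: str, max_tokens: int = MAX_TOKENS, overlap: int = OVERLAP) -> List[str]:
    """Pack whole units into chunks, prefixing each new chunk with an overlap tail."""
    chunks: List[str] = []
    current: List[str] = []
    for unit in split_units(text):
        tokens = unit.split()
        if current and len(current) + len(tokens) > max_tokens:
            chunks.append(" ".join(current))
            current = current[-overlap:] if overlap else []
        # A unit longer than max_tokens is kept whole rather than split mid-step.
        current.extend(tokens)
    if current:
        chunks.append(" ".join(current))
    return chunks


runbook = "1. Open the queue\n2. Clear stuck jobs\n3. Restart spooler service now"
small_chunks = chunk(runbook, max_tokens=8, overlap=2)
```

With the tiny limits used in the example, the third step does not fit into the first chunk, so it starts a second chunk that begins with the last two words of the first.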

The metadata is the secret. Retrieval can pre-filter by metadata before vector similarity, so chunks for the wrong vendor or product never compete for the top-K slots. A "printer / HP M404" query first filters chunks where vendor='HP' and product='M404', then runs vector search on the narrowed set. In our internal benchmark, recall rises from 67% to 91%.
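To make the filter-then-search order concrete, here is a small in-memory sketch. The chunk records, two-dimensional vectors, and the `filtered_search` helper are invented for illustration; in ChromaDB the equivalent narrowing is normally expressed through the `where` argument of a collection query rather than hand-written code.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filtered_search(query_vec: Sequence[float],
                    chunks: List[Dict],
                    where: Dict[str, str],
                    k: int = 8) -> List[Dict]:
    """Keep only chunks whose metadata matches `where`, then rank by similarity."""
    narrowed = [c for c in chunks
                if all(c["metadata"].get(key) == value for key, value in where.items())]
    ranked = sorted(narrowed,
                    key=lambda c: cosine(query_vec, c["embedding"]),
                    reverse=True)
    return ranked[:k]


# Hypothetical chunks with toy 2-d embeddings.
chunks = [
    {"id": "hp-m404-jam", "embedding": [0.9, 0.1],
     "metadata": {"vendor": "HP", "product": "M404"}},
    {"id": "hp-m404-driver", "embedding": [0.2, 0.8],
     "metadata": {"vendor": "HP", "product": "M404"}},
    {"id": "brother-jam", "embedding": [1.0, 0.0],
     "metadata": {"vendor": "Brother", "product": "HL-L2350"}},
]

results = filtered_search([1.0, 0.0], chunks, {"vendor": "HP", "product": "M404"})
result_ids = [c["id"] for c in results]
```

The Brother chunk is the closest vector to the query, yet it never appears in `result_ids` because the metadata filter removes it before similarity is computed.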

Vapi's PDF knowledge base does naive chunking (typically 500-character splits without metadata). Recall on the same benchmark: 38%.

Embedding Model Selection

We use OpenAI's text-embedding-3-large (3072-dim) for primary indexing. For high-volume collections we offer text-embedding-3-small (1536-dim) as a cost-optimized option. Customer-specific terminology (product code names, internal acronyms) is handled via a learned reranker fine-tuned on the customer's runbooks.

The reranker is a cross-encoder (BAAI/bge-reranker-v2 by default, or Cohere Rerank 3 on enterprise tier) that re-scores top-K candidates from ChromaDB by joint query+chunk attention. This is the second-largest precision gain after metadata filtering.

Vapi has no reranker option. You implement it yourself.

Per-Tenant Knowledge Base Isolation

Each MSP customer organization has its own ChromaDB collection. The collection is named org_{org_id}. Cross-collection retrieval is hard-disabled — there is no API path that allows MSP-A's agent to read MSP-B's runbooks.

Public vendor documentation (HP service manuals, Microsoft KB articles, Cisco product docs) lives in a shared "public" collection that all orgs can read. The agent always retrieves from org_X first, then falls back to public if confidence is low. This mirrors what a senior MSP technician does: check the internal runbook first, then the vendor's official docs.
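The org-first, public-fallback order can be sketched as below. The `retrieve_with_fallback` helper and the in-memory `store` are hypothetical, and reusing 0.7 as the "low confidence" cutoff is an assumption; the text does not say which threshold triggers the fallback.

```python
from typing import Callable, List, Tuple

MIN_CONFIDENCE = 0.7  # assumed cutoff for "confidence is low"

Hit = Tuple[float, str]  # (confidence, chunk text), best first


def collection_name(org_id: str) -> str:
    # Naming scheme from the text: one collection per organization.
    return f"org_{org_id}"


def retrieve_with_fallback(query: str,
                           org_id: str,
                           search: Callable[[str, str], List[Hit]]) -> Tuple[str, List[Hit]]:
    """Search the tenant's own collection first; use 'public' only if that is weak."""
    own = collection_name(org_id)
    org_hits = search(own, query)
    if org_hits and org_hits[0][0] > MIN_CONFIDENCE:
        return own, org_hits
    return "public", search("public", query)


# Toy store standing in for ChromaDB collections (illustrative data only).
store = {
    "org_42": {
        "loomly reset": [(0.84, "Loomly SSO Reset (internal runbook)")],
        "hp m404 jam": [(0.35, "unrelated internal note")],
    },
    "public": {
        "hp m404 jam": [(0.90, "HP M404 service manual: clearing jams")],
    },
}


def toy_search(collection: str, query: str) -> List[Hit]:
    return store.get(collection, {}).get(query, [])


internal = retrieve_with_fallback("loomly reset", "42", toy_search)
vendor = retrieve_with_fallback("hp m404 jam", "42", toy_search)
```

Because the function only ever builds the caller's own collection name or the literal "public", there is no argument through which another tenant's collection could be requested.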

Tool-Driven Resolution vs Knowledge-Only Answers

Pure RAG can tell you "the steps to reset an O365 password are..." but cannot actually reset the password. U Rack IT goes further: agents have executable tools that perform the action, not just describe it. The Email agent has reset_email_password, which talks to the Microsoft Graph API. The Security agent has force_logout, which revokes tokens. The Computer agent has sw_install_status, which queries the endpoint via the MSP's RMM (NinjaRMM, Atera, ConnectWise Automate).

This is the difference between an answer and a resolution. Vapi-based RAG systems describe; U Rack IT does.

FAQ

How is the ChromaDB knowledge base populated?

A crawler ingests internal runbooks (Confluence, Notion, SharePoint, file shares) and public vendor docs. A chunker splits at semantic boundaries. text-embedding-3-large produces 3072-dim vectors. ChromaDB stores them per-organization.

How often is the RAG re-indexed?

Daily incremental, weekly full. Crawler diff-detects changes to source documents.
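One common way to implement that kind of diff detection is to compare content hashes against what was last indexed. The sketch below assumes that approach; the source does not say how the crawler detects changes, and `fingerprint` and `plan_incremental` are hypothetical names.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(text: str) -> str:
    """Stable content hash used to tell whether a document changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_incremental(sources: Dict[str, str],
                     indexed: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Compare current sources with stored fingerprints.

    sources: doc_id -> current document text
    indexed: doc_id -> fingerprint recorded at the last indexing run
    Returns (doc_ids to re-embed, doc_ids to delete from the index).
    """
    to_embed = sorted(doc_id for doc_id, text in sources.items()
                      if indexed.get(doc_id) != fingerprint(text))
    to_delete = sorted(doc_id for doc_id in indexed if doc_id not in sources)
    return to_embed, to_delete


indexed = {
    "vpn-runbook": fingerprint("v1 of the VPN runbook"),
    "printer-runbook": fingerprint("clear the queue"),
    "retired-doc": fingerprint("old content"),
}
sources = {
    "vpn-runbook": "v2 of the VPN runbook",   # changed since last run
    "printer-runbook": "clear the queue",     # unchanged
    "loomly-runbook": "new SSO reset steps",  # newly added
}

to_embed, to_delete = plan_incremental(sources, indexed)
```

A weekly full run would skip the comparison and re-embed every document in `sources`.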

Can the Lookup agent call multiple specialists' tools?

The Lookup agent only retrieves. It returns structured runbook steps to the calling specialist, which then runs its own tools (e.g., reset_email_password). This separation is what makes the architecture stable.

What about hallucination?

The Lookup agent is constrained: if confidence is below 0.7, it returns "escalate" rather than guessing. Specialists are also instructed never to invent steps that contradict the runbook.

Does U Rack IT integrate with ConnectWise/Datto/Kaseya?

Yes — bidirectional sync for tickets, contacts, and devices via webhook adapters.

How is sensitive data handled?

ChromaDB collections are encrypted at rest with per-tenant keys. Embedded chunks never contain raw passwords or tokens — the ingester scrubs secrets before embedding. Caller authentication is multi-factor when required (caller ID + verification question + optional MFA via SMS).
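As an illustration of scrubbing secrets before embedding, here is a small regex-based sketch. The patterns and the `scrub` function are assumptions chosen for the example; a production ingester would likely need broader, vendor-specific detection than two regular expressions.

```python
import re

# Illustrative patterns only: key/value style secrets and bearer tokens.
PATTERNS = [
    (re.compile(r"(?i)\b(password|passwd|pwd|secret|token|api[_-]?key)\b(\s*[:=]\s*)\S+"),
     r"\1\2[REDACTED]"),
    (re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+"),
     "Bearer [REDACTED]"),
]


def scrub(text: str) -> str:
    """Replace secret-looking values with a placeholder before the text is embedded."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text


cleaned = scrub("Set password: Hunter2! then retry the login")
header = scrub("Authorization: Bearer abc.def-123")
```

Scrubbing runs on the chunk text itself, so the redacted form is what gets embedded and stored; ordinary runbook prose passes through unchanged.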

Can the Lookup agent be used as a chatbot too?

Yes — the same RAG pipeline runs in a Slack/Teams bot for the Requester role. The Lookup agent's interface is channel-agnostic.

What happens if the runbook is wrong?

Tickets created by the agent capture the runbook chunk(s) used. If the runbook is wrong, agents and admins can flag the chunk in the dashboard, which creates a remediation task. Frequent flags trigger a re-ingestion of the source document.

Automate L1, Keep the Margin

If your MSP is bleeding margin on password resets and printer jams, U Rack IT pays back inside one quarter. Book a demo at /demo and we will run your real ticket categories through the 10-agent stack live.
