Skip to content
Learn Agentic AI
Learn Agentic AI10 min read0 views

Text Classification for Agent Routing: Intent Detection and Topic Categorization

Build robust text classification systems for AI agent routing using intent detection, multi-label classification, and zero-shot approaches with practical Python implementations.

Why Classification Is the Agent's First Decision

Every message that reaches a multi-agent system must be routed to the right handler. A billing question should go to the billing agent. A technical issue should go to the support agent. A sales inquiry should go to the sales agent. Text classification is the mechanism that makes this routing fast and accurate.

Intent detection is a specialized form of text classification where the classes represent user goals: "check_balance," "reset_password," "schedule_appointment." Topic categorization groups messages by subject matter: "billing," "technical," "general." Both are essential for agents that handle diverse user requests.

Building an Intent Classifier with scikit-learn

For agents with a known set of intents and training data, a traditional ML classifier is fast and reliable.

flowchart TD
    START["Text Classification for Agent Routing: Intent Det…"] --> A
    A["Why Classification Is the Agent39s Firs…"]
    A --> B
    B["Building an Intent Classifier with scik…"]
    B --> C
    C["Zero-Shot Classification: No Training D…"]
    C --> D
    D["Multi-Label Classification"]
    D --> E
    E["Confidence-Based Routing"]
    E --> F
    F["Combining Classification Approaches"]
    F --> G
    G["FAQ"]
    G --> DONE["Key Takeaways"]
    style START fill:#4f46e5,stroke:#4338ca,color:#fff
    style DONE fill:#059669,stroke:#047857,color:#fff
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
import joblib

training_data = [
    ("What is my account balance?", "check_balance"),
    ("How much do I owe?", "check_balance"),
    ("I forgot my password", "reset_password"),
    ("Can't log in to my account", "reset_password"),
    ("I want to cancel my subscription", "cancel_subscription"),
    ("How do I unsubscribe?", "cancel_subscription"),
    ("Can I speak to a manager?", "escalate"),
    ("I need to talk to a real person", "escalate"),
]

texts, labels = zip(*training_data)

classifier = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), max_features=5000)),
    ("clf", LogisticRegression(max_iter=1000)),
])

classifier.fit(texts, labels)

def classify_intent(text: str) -> dict:
    prediction = classifier.predict([text])[0]
    probabilities = classifier.predict_proba([text])[0]
    confidence = max(probabilities)
    return {"intent": prediction, "confidence": round(confidence, 3)}

print(classify_intent("I need to check how much money is in my account"))
# {'intent': 'check_balance', 'confidence': 0.82}

Zero-Shot Classification: No Training Data Required

When you cannot collect labeled training data — or when new intents appear frequently — zero-shot classification lets you define categories at inference time.

from transformers import pipeline

zero_shot = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli",
)

def classify_zero_shot(
    text: str,
    candidate_labels: list[str],
    multi_label: bool = False,
) -> dict:
    result = zero_shot(
        text,
        candidate_labels=candidate_labels,
        multi_label=multi_label,
    )
    return {
        label: round(score, 3)
        for label, score in zip(result["labels"], result["scores"])
    }

labels = ["billing", "technical_support", "sales", "general_inquiry"]
message = "My invoice shows a charge I didn't authorize"

print(classify_zero_shot(message, labels))
# {'billing': 0.891, 'general_inquiry': 0.054,
#  'technical_support': 0.033, 'sales': 0.022}

Zero-shot models use natural language inference under the hood. They evaluate whether the text "entails" each candidate label. This means your label names matter — "billing_dispute" will perform differently than "billing" for the same input.

Multi-Label Classification

Users often express multiple intents in a single message: "I need to update my address and also check when my next payment is due." Multi-label classification assigns multiple categories simultaneously.

See AI Voice Agents Handle Real Calls

Book a free demo or calculate how much you can save with AI voice automation.

def classify_multi_intent(text: str, labels: list[str]) -> list[dict]:
    result = zero_shot(
        text,
        candidate_labels=labels,
        multi_label=True,
    )
    return [
        {"label": label, "score": round(score, 3)}
        for label, score in zip(result["labels"], result["scores"])
        if score > 0.5
    ]

labels = ["update_info", "check_payment", "billing", "technical"]
message = "Please change my address and tell me when my next bill is due"

print(classify_multi_intent(message, labels))
# [{'label': 'update_info', 'score': 0.92},
#  {'label': 'check_payment', 'score': 0.87},
#  {'label': 'billing', 'score': 0.61}]

Confidence-Based Routing

Raw classification output needs a decision layer. Here is a routing pattern that handles low-confidence predictions gracefully.

from dataclasses import dataclass
from typing import Optional

@dataclass
class RouteDecision:
    target_agent: str
    confidence: float
    fallback_used: bool

class IntentRouter:
    def __init__(self, classifier, threshold: float = 0.7):
        self.classifier = classifier
        self.threshold = threshold
        self.agent_map = {
            "check_balance": "billing_agent",
            "reset_password": "auth_agent",
            "cancel_subscription": "retention_agent",
            "escalate": "human_agent",
        }

    def route(self, message: str) -> RouteDecision:
        result = self.classifier(message)
        intent = result["intent"]
        confidence = result["confidence"]

        if confidence >= self.threshold and intent in self.agent_map:
            return RouteDecision(
                target_agent=self.agent_map[intent],
                confidence=confidence,
                fallback_used=False,
            )

        return RouteDecision(
            target_agent="general_agent",
            confidence=confidence,
            fallback_used=True,
        )

The threshold is critical. Set it too low and misrouted messages frustrate users. Set it too high and too many messages fall through to the general agent. Start at 0.7 and adjust based on production metrics.

Combining Classification Approaches

Production agents often layer multiple classifiers. A fast keyword-based filter catches obvious cases, a traditional ML model handles most traffic, and a zero-shot model handles the long tail.

class HybridClassifier:
    def __init__(self, keyword_rules, ml_model, zero_shot_model):
        self.keyword_rules = keyword_rules
        self.ml_model = ml_model
        self.zero_shot_model = zero_shot_model

    def classify(self, text: str, labels: list[str]) -> dict:
        # Tier 1: keyword match (microseconds)
        for pattern, label in self.keyword_rules.items():
            if pattern.lower() in text.lower():
                return {"label": label, "confidence": 1.0, "tier": "keyword"}

        # Tier 2: ML model (milliseconds)
        ml_result = self.ml_model(text)
        if ml_result["confidence"] > 0.8:
            return {**ml_result, "tier": "ml"}

        # Tier 3: zero-shot (slower but flexible)
        zs_result = self.zero_shot_model(text, labels)
        top_label = max(zs_result, key=zs_result.get)
        return {
            "label": top_label,
            "confidence": zs_result[top_label],
            "tier": "zero_shot",
        }

FAQ

How many training examples do I need per intent for a reliable classifier?

For traditional ML classifiers like logistic regression with TF-IDF, aim for at least 50 to 100 examples per intent. For fine-tuned transformer models, 20 to 50 examples per class can be sufficient. If you have fewer than 20 examples, use zero-shot classification instead and gradually collect labeled data from production traffic to build a training set.

How do I handle intents that overlap semantically?

Overlapping intents are a design problem, not a model problem. If "check_balance" and "billing_inquiry" are hard to distinguish, consider merging them into a single intent and using a second-stage classifier or rule-based logic to differentiate subtypes. Alternatively, use multi-label classification and let the downstream agent handle both intents.

What is the best way to add new intents without retraining from scratch?

Use a zero-shot model as your primary classifier and maintain a mapping from zero-shot labels to agent routes. Adding a new intent is as simple as adding a new label string. Once you collect enough production examples for the new intent, fine-tune your ML model and redeploy. This gives you immediate coverage with zero-shot and improved accuracy over time with supervised learning.


#TextClassification #IntentDetection #NLP #ZeroShot #AIAgents #Python #AgenticAI #LearnAI #AIEngineering

Share
C

Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.

Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available -- no signup required.

Related Articles You May Like

Use Cases

Automating Client Document Collection: How AI Agents Chase Missing Tax Documents and Reduce Filing Delays

See how AI agents automate tax document collection — chasing missing W-2s, 1099s, and receipts via calls and texts to eliminate the #1 CPA bottleneck.

Technical Guides

Post-Call Analytics with GPT-4o-mini: Sentiment, Lead Scoring, and Intent

Build a post-call analytics pipeline with GPT-4o-mini — sentiment, intent, lead scoring, satisfaction, and escalation detection.

AI Interview Prep

7 AI Coding Interview Questions From Anthropic, Meta & OpenAI (2026 Edition)

Real AI coding interview questions from Anthropic, Meta, and OpenAI in 2026. Includes implementing attention from scratch, Anthropic's progressive coding screens, Meta's AI-assisted round, and vector search — with solution approaches.

Learn Agentic AI

API Design for AI Agent Tool Functions: Best Practices and Anti-Patterns

How to design tool functions that LLMs can use effectively with clear naming, enum parameters, structured responses, informative error messages, and documentation.

Learn Agentic AI

Computer Use in GPT-5.4: Building AI Agents That Navigate Desktop Applications

Technical guide to GPT-5.4's computer use capabilities for building AI agents that interact with desktop UIs, browser automation, and real-world application workflows.

Learn Agentic AI

AI Agents for IT Helpdesk: L1 Automation, Ticket Routing, and Knowledge Base Integration

Build IT helpdesk AI agents with multi-agent architecture for triage, device, network, and security issues. RAG-powered knowledge base, automated ticket creation, routing, and escalation.