Skip to content
Learn Agentic AI
Learn Agentic AI12 min read11 views

Hybrid Agent Orchestration: Combining Handoffs and Tools

Combine the handoff pattern with the agents-as-tools pattern to build hybrid orchestration systems where a triage agent delegates via handoffs and specialist agents use sub-agents as tools.

The Limitation of Single-Pattern Orchestration

The OpenAI Agents SDK provides two orchestration primitives: handoffs (transferring conversation control from one agent to another) and agents as tools (one agent calling another as a function and getting back a result). Most tutorials teach these patterns in isolation. But real-world agent systems almost always need both.

Consider a customer service system. A triage agent determines the customer's intent and hands off to a billing specialist or a technical support specialist. That is a pure handoff pattern. But the billing specialist needs to check the customer's account balance, calculate proration, and verify payment methods — each of which is a focused sub-task that a smaller, cheaper agent could handle. Those sub-tasks are better modeled as tools, not handoffs.

Hybrid orchestration combines both patterns: handoffs for routing (which specialist handles this conversation) and agents as tools for sub-tasks (the specialist delegates focused work to helper agents without giving up conversation control).

Handoffs vs Tools: A Quick Recap

Before building the hybrid, let us clarify the behavioral difference between the two patterns.

flowchart TD
    START["Hybrid Agent Orchestration: Combining Handoffs an…"] --> A
    A["The Limitation of Single-Pattern Orches…"]
    A --> B
    B["Handoffs vs Tools: A Quick Recap"]
    B --> C
    C["Building the Hybrid Architecture"]
    C --> D
    D["Why This Architecture Works"]
    D --> E
    E["Adding Function Tools Alongside Agent T…"]
    E --> F
    F["Handling Escalation in Hybrid Systems"]
    F --> G
    G["Testing Hybrid Systems"]
    G --> H
    H["Design Guidelines"]
    H --> DONE["Key Takeaways"]
    style START fill:#4f46e5,stroke:#4338ca,color:#fff
    style DONE fill:#059669,stroke:#047857,color:#fff

Handoffs transfer conversation control. When Agent A hands off to Agent B, Agent A stops running. Agent B takes over and responds directly to the user. The conversation continues with Agent B as the active agent.

Agents as tools preserve conversation control. When Agent A calls Agent B as a tool, Agent A remains in control. Agent B runs in the background, returns its result to Agent A, and Agent A decides how to use that result in its response to the user.

from agents import Agent, handoff

# HANDOFF: Triage gives up control to Billing
triage_agent = Agent(
    name="Triage",
    instructions="Route to the correct specialist.",
    handoffs=[handoff(billing_agent), handoff(support_agent)],
)

# TOOL: Billing keeps control, uses ProrateCalculator as a tool
billing_agent = Agent(
    name="BillingSpecialist",
    instructions="Handle billing inquiries. Use sub-agents for calculations.",
    tools=[prorate_calculator_agent.as_tool(
        tool_name="calculate_proration",
        tool_description="Calculate prorated charges for plan changes",
    )],
)

Building the Hybrid Architecture

The hybrid architecture has three layers:

  1. Triage layer — a single agent that routes conversations via handoffs
  2. Specialist layer — domain-specific agents that own conversation control for their domain
  3. Helper layer — focused sub-agents that specialists invoke as tools
from agents import Agent, handoff, function_tool

# ─── Helper Layer (agents used as tools) ───

account_lookup_agent = Agent(
    name="AccountLookup",
    instructions="""Look up customer account details. Return a summary
    including account status, current plan, balance, and payment
    method on file. Format the response as a clear text summary.""",
    model="gpt-4o-mini",
)

proration_agent = Agent(
    name="ProrationCalculator",
    instructions="""Calculate prorated charges when a customer changes
    plans mid-billing-cycle. Given the current plan price, new plan
    price, and days remaining in the cycle, compute the prorated
    amount. Show your calculation step by step.""",
    model="gpt-4o-mini",
)

diagnostics_agent = Agent(
    name="TechnicalDiagnostics",
    instructions="""Analyze technical symptoms described by the customer.
    Suggest a ranked list of likely root causes and recommended
    troubleshooting steps. Be specific and actionable.""",
    model="gpt-4o-mini",
)

knowledge_search_agent = Agent(
    name="KnowledgeBaseSearch",
    instructions="""Search the knowledge base for articles relevant to
    the customer's issue. Return the top 3 most relevant articles
    with titles, summaries, and direct links.""",
    model="gpt-4o-mini",
)

# ─── Specialist Layer (handoff targets with tool access) ───

billing_specialist = Agent(
    name="BillingSpecialist",
    instructions="""You are a billing specialist. Handle all billing
    inquiries including charges, refunds, plan changes, and payment
    issues. Use your tools to look up accounts and calculate
    prorations. Be precise with numbers and always confirm amounts
    with the customer before making changes.""",
    model="gpt-4o",
    tools=[
        account_lookup_agent.as_tool(
            tool_name="lookup_account",
            tool_description="Look up customer account details and balance",
        ),
        proration_agent.as_tool(
            tool_name="calculate_proration",
            tool_description="Calculate prorated charges for plan changes",
        ),
    ],
)

technical_specialist = Agent(
    name="TechnicalSpecialist",
    instructions="""You are a technical support specialist. Diagnose and
    resolve technical issues. Use the diagnostics agent for root cause
    analysis and the knowledge base for relevant documentation. Guide
    customers through troubleshooting steps one at a time.""",
    model="gpt-4o",
    tools=[
        diagnostics_agent.as_tool(
            tool_name="run_diagnostics",
            tool_description="Analyze symptoms and suggest root causes",
        ),
        knowledge_search_agent.as_tool(
            tool_name="search_knowledge_base",
            tool_description="Find relevant knowledge base articles",
        ),
    ],
)

# ─── Triage Layer (entry point with handoffs) ───

triage_agent = Agent(
    name="Triage",
    instructions="""You are the first point of contact. Determine the
    customer's intent and route to the appropriate specialist.

    Route to BillingSpecialist for: charges, refunds, plan changes,
    payment issues, invoices, billing disputes.

    Route to TechnicalSpecialist for: bugs, errors, performance
    issues, connectivity problems, feature questions.

    Ask ONE clarifying question if the intent is genuinely ambiguous.
    Do not ask unnecessary questions — route as soon as intent is clear.""",
    model="gpt-4o",
    handoffs=[
        handoff(billing_specialist, description="Billing and payment issues"),
        handoff(technical_specialist, description="Technical problems and troubleshooting"),
    ],
)

Why This Architecture Works

The hybrid pattern plays to each mechanism's strengths.

Handoffs are ideal for routing because the specialist needs full conversation context and direct user interaction. When a customer says "I was overcharged last month," the billing specialist needs to ask follow-up questions, look up the account, and explain the resolution — all as a continuous conversation.

Tools are ideal for sub-tasks because the specialist needs to remain in control of the conversation while delegating focused work. The billing specialist does not want to hand off to the proration calculator and lose conversation control. It wants to call the calculator, get a number, and incorporate that into its response.

See AI Voice Agents Handle Real Calls

Book a free demo or calculate how much you can save with AI voice automation.

Adding Function Tools Alongside Agent Tools

Specialists often need both agent tools (sub-agents) and traditional function tools (API calls, database queries). The OpenAI Agents SDK lets you mix both seamlessly.

@function_tool
def process_refund(customer_id: str, amount: float, reason: str) -> str:
    """Process a refund for the specified customer."""
    # In production, this calls your payment API
    if amount > 500:
        return f"REQUIRES_APPROVAL: Refund of ${amount:.2f} exceeds auto-limit."
    return f"REFUND_PROCESSED: ${amount:.2f} refunded to customer {customer_id}."

@function_tool
def get_invoice(customer_id: str, month: str) -> str:
    """Retrieve a customer's invoice for a given month."""
    # In production, this queries your billing database
    return f"Invoice for {customer_id} ({month}): $149.00 — Plan: Pro, Status: Paid"

billing_specialist_enhanced = Agent(
    name="BillingSpecialist",
    instructions="""Handle billing inquiries. Use lookup_account to get
    account details, calculate_proration for plan changes, get_invoice
    for invoice questions, and process_refund when refunds are needed.
    Always verify amounts with the customer before processing refunds.""",
    model="gpt-4o",
    tools=[
        account_lookup_agent.as_tool(
            tool_name="lookup_account",
            tool_description="Look up customer account details",
        ),
        proration_agent.as_tool(
            tool_name="calculate_proration",
            tool_description="Calculate prorated charges",
        ),
        process_refund,
        get_invoice,
    ],
)

This gives the billing specialist four tools: two are sub-agents that use LLM reasoning to produce results, and two are deterministic functions that call external APIs. The specialist decides which tool to use based on the conversation context.

Handling Escalation in Hybrid Systems

A common requirement is escalating from a specialist back to triage, or to a human agent. In the hybrid pattern, you can add a handoff from the specialist back to triage or to an escalation agent.

escalation_agent = Agent(
    name="HumanEscalation",
    instructions="""This conversation requires human intervention.
    Summarize the issue, what has been tried so far, and why
    escalation is needed. A human agent will take over.""",
    model="gpt-4o",
)

technical_specialist_with_escalation = Agent(
    name="TechnicalSpecialist",
    instructions="""Diagnose and resolve technical issues. If you cannot
    resolve the issue after using your diagnostic tools, escalate to
    a human agent. Do not keep the customer in a loop.""",
    model="gpt-4o",
    tools=[
        diagnostics_agent.as_tool(
            tool_name="run_diagnostics",
            tool_description="Analyze symptoms and suggest root causes",
        ),
        knowledge_search_agent.as_tool(
            tool_name="search_knowledge_base",
            tool_description="Find relevant knowledge base articles",
        ),
    ],
    handoffs=[
        handoff(escalation_agent, description="Escalate to human agent"),
    ],
)

Notice how the technical specialist now has both tools and handoffs. It uses tools for sub-tasks during the conversation and handoffs when it needs to transfer control entirely — either back to triage or forward to a human.

Testing Hybrid Systems

Testing hybrid orchestration requires testing each layer independently and then testing the integrated system.

Unit test the helpers. Each helper agent should produce correct results for known inputs. These are the easiest to test because they have no conversation state.

Integration test the specialists. Given a conversation history, does the specialist call the right tools in the right order? Does it incorporate tool results correctly?

End-to-end test the triage flow. Given a user message, does triage route to the correct specialist? Does the specialist handle the conversation through to resolution?

async def test_billing_flow():
    """Test that billing questions route correctly and tools are used."""
    result = await Runner.run(
        triage_agent,
        input="I was charged twice for my subscription last month",
    )

    # Verify the conversation was handled by the billing specialist
    agent_names = [
        item.agent.name
        for item in result.all_items
        if hasattr(item, "agent")
    ]
    assert "BillingSpecialist" in agent_names

    # Verify account lookup was called
    tool_names = [
        item.tool_name
        for item in result.all_items
        if hasattr(item, "tool_name")
    ]
    assert "lookup_account" in tool_names

Design Guidelines

Keep helpers stateless and cheap. Helper agents should use gpt-4o-mini and have narrow, specific instructions. They run once and return a result. No conversation memory needed.

Keep specialists stateful and capable. Specialist agents use gpt-4o and have rich instructions. They maintain the conversation with the user and decide when to call helpers.

Keep triage minimal. The triage agent should make a routing decision as quickly as possible. One clarifying question maximum. Long triage conversations frustrate users.

Limit tool count per specialist. More than 5-6 tools per agent degrades routing accuracy. If a specialist needs more sub-capabilities, consider splitting it into two specialists with a handoff between them.

Hybrid orchestration is how production multi-agent systems actually work. Pure handoff or pure tool-based architectures hit limitations quickly. Combining them gives you the routing flexibility of handoffs with the sub-task precision of tools.

Share
C

Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.

Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available -- no signup required.