Nested Handoff History and Conversation Management in Multi-Agent Systems

Learn how to manage conversation history across agent boundaries using nest_handoff_history, per-handoff overrides, CONVERSATION HISTORY blocks, and handoff_history_mapper in the OpenAI Agents SDK.

The Context Challenge in Multi-Agent Systems

When multiple agents collaborate on a task, conversation history management becomes critical. Each handoff creates a decision point: should the target agent see everything that happened before, a filtered subset, or a restructured view of the history?

The OpenAI Agents SDK provides several mechanisms for controlling how conversation history flows across agent boundaries. Understanding these mechanisms is the difference between a multi-agent system that works reliably and one that confuses itself with irrelevant context.

nest_handoff_history in RunConfig

The nest_handoff_history flag in RunConfig controls the fundamental structure of how history is presented to target agents after a handoff. When enabled, it wraps the pre-handoff conversation in a clearly delimited block rather than flattening it into the target agent's message stream.

Default Behavior (nest_handoff_history=False)

By default, the target agent receives the full conversation history as a flat sequence of messages. This means the target agent sees all previous messages as if they were part of its own conversation:

from agents import Agent, Runner, handoff, RunConfig
import asyncio

agent_b = Agent(
    name="AgentB",
    instructions="You are Agent B. Continue the conversation.",
    model="gpt-4o",
)

agent_a = Agent(
    name="AgentA",
    instructions="Greet the user, then hand off to Agent B.",
    model="gpt-4o",
    handoffs=[handoff(agent_b, tool_description_override="Transfer to Agent B")],
)

async def main():
    # Default: flat history
    config = RunConfig()
    result = await Runner.run(
        agent_a,
        input="Hello, I need help with my account.",
        run_config=config,
    )
    print(result.final_output)

asyncio.run(main())

With flat history, Agent B sees something like:

User: Hello, I need help with my account.
Assistant (AgentA): Hi there! Let me transfer you to the right specialist.
[handoff to AgentB]

Agent B cannot easily tell which assistant messages came from Agent A rather than from itself. This can lead to confusion, especially when Agent A gave instructions or made promises that Agent B should not be bound by.

Nested Behavior (nest_handoff_history=True)

When you enable nested history, the pre-handoff conversation is wrapped in a CONVERSATION HISTORY block:

from agents import Agent, Runner, handoff, RunConfig
import asyncio

agent_b = Agent(
    name="AgentB",
    instructions="You are Agent B. Review the conversation history and continue helping the user.",
    model="gpt-4o",
)

agent_a = Agent(
    name="AgentA",
    instructions="Greet the user, then hand off to Agent B.",
    model="gpt-4o",
    handoffs=[handoff(agent_b, tool_description_override="Transfer to Agent B")],
)

async def main():
    config = RunConfig(nest_handoff_history=True)
    result = await Runner.run(
        agent_a,
        input="Hello, I need help with my account.",
        run_config=config,
    )
    print(result.final_output)

asyncio.run(main())

With nested history, Agent B sees something structured like:

--- CONVERSATION HISTORY ---
User: Hello, I need help with my account.
Assistant (AgentA): Hi there! Let me transfer you to the right specialist.
--- END CONVERSATION HISTORY ---

This clear demarcation helps Agent B understand:

  • What was said before it joined
  • Which messages are from the user versus previous agents
  • That the conversation is a continuation, not a fresh start
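
To make the idea concrete, here is a minimal sketch of how a flat transcript could be collapsed into one delimited message. The function name render_nested_history, the marker strings, and the role/name/content message shape are illustrative assumptions, not the SDK's internal implementation.

```python
# Illustrative only: not the SDK's implementation. The marker text and the
# message shape (dicts with role/name/content) are assumptions for this sketch.
def render_nested_history(messages: list[dict]) -> dict:
    """Collapse prior turns into a single delimited assistant message."""
    lines = ["--- CONVERSATION HISTORY ---"]
    for msg in messages:
        speaker = msg["role"]
        # Label assistant turns with the agent that produced them
        if msg.get("name"):
            speaker = f"{speaker} ({msg['name']})"
        lines.append(f"{speaker}: {msg['content']}")
    lines.append("--- END CONVERSATION HISTORY ---")
    return {"role": "assistant", "content": "\n".join(lines)}


flat = [
    {"role": "user", "content": "Hello, I need help with my account."},
    {
        "role": "assistant",
        "name": "AgentA",
        "content": "Hi there! Let me transfer you to the right specialist.",
    },
]
nested = render_nested_history(flat)
print(nested["content"])
```

The point of the single wrapper message is that the target agent reads the earlier turns as quoted material rather than as its own prior output.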

Per-Handoff History Overrides

You can override the global nest_handoff_history setting on individual handoffs. This lets you use different strategies for different handoff targets:

from agents import Agent, handoff, RunConfig

escalation_agent = Agent(
    name="EscalationAgent",
    instructions="""You are a senior escalation manager. Review the
    full conversation history carefully to understand what has already
    been tried before you intervene.""",
    model="gpt-4o",
)

faq_agent = Agent(
    name="FAQAgent",
    instructions="""You answer frequently asked questions. You do not
    need prior conversation context — just answer the question directly.""",
    model="gpt-4o",
)

triage_agent = Agent(
    name="TriageAgent",
    instructions="Route to the right department.",
    model="gpt-4o",
    handoffs=[
        # Escalation gets nested history so prior turns are clearly delimited
        handoff(
            escalation_agent,
            tool_description_override="Escalate complex issues",
            nest_handoff_history=True,
        ),
        # FAQ opts out of nesting and receives the default flat history
        handoff(
            faq_agent,
            tool_description_override="Answer common questions",
            nest_handoff_history=False,
        ),
    ],
)

The per-handoff override takes precedence over the global RunConfig setting. This gives you fine-grained control:

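| Handoff Target  | Global Setting | Per-Handoff Override | Effective Behavior       |
|-----------------|----------------|----------------------|--------------------------|
| EscalationAgent | False          | True                 | Nested                   |
| FAQAgent        | True           | False                | Flat                     |
| SupportAgent    | True           | (none)               | Nested (inherits global) |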
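
The precedence rule in the table can be written as a one-line resolution function. This is a sketch of the rule as described here, with a hypothetical helper name (effective_nesting); it is not code taken from the SDK.

```python
from typing import Optional


def effective_nesting(global_setting: bool, override: Optional[bool]) -> bool:
    """Per-handoff override wins; None means 'inherit the global setting'."""
    return global_setting if override is None else override


# The three rows from the table: (target, global setting, per-handoff override)
rows = [
    ("EscalationAgent", False, True),
    ("FAQAgent", True, False),
    ("SupportAgent", True, None),
]
for name, global_setting, override in rows:
    mode = "Nested" if effective_nesting(global_setting, override) else "Flat"
    print(f"{name}: {mode}")
# EscalationAgent: Nested
# FAQAgent: Flat
# SupportAgent: Nested
```

Using None rather than False as the "not set" value is what lets a handoff explicitly opt out of nesting when the global setting is on.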

The CONVERSATION HISTORY Block

When nest_handoff_history is enabled, the SDK wraps the prior conversation in a structured block. The target agent receives this block as a single assistant message ahead of any new turns.

The format is designed to be unambiguous to the LLM. The exact markers vary by SDK version; a block that names the source agent might look like:

[CONVERSATION HISTORY FROM PREVIOUS AGENT: AgentA]
User: I need to cancel my subscription.
AgentA: I understand you want to cancel. Let me transfer you to our retention team.
[END CONVERSATION HISTORY]

Why This Matters for Agent Quality

Without nesting, a common failure mode occurs when the target agent "adopts" the previous agent's persona. If Agent A said "I'll look into that for you," Agent B might continue as if it made that promise. With nested history, Agent B clearly sees this was a different agent's statement.

Another failure mode is tool confusion. If Agent A called tools and the results are in the flat history, Agent B might try to reference those tool results as if they were its own. Nesting makes the boundary explicit.
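
When a delimited block is not enough, tool activity can be dropped from the history before it is forwarded. The sketch below is a hand-rolled filter over plain dict items. It assumes tool activity shows up as items typed function_call and function_call_output (the Responses API item names), and the helper name strip_tool_items is hypothetical.

```python
# Assumption: tool activity appears as items whose "type" is one of these
# Responses API item types; ordinary messages carry a "role" instead.
TOOL_ITEM_TYPES = {"function_call", "function_call_output"}


def strip_tool_items(history: list[dict]) -> list[dict]:
    """Return the history without tool calls and tool outputs."""
    return [item for item in history if item.get("type") not in TOOL_ITEM_TYPES]


history = [
    {"role": "user", "content": "Is the API down?"},
    {"type": "function_call", "name": "check_health", "arguments": "{}", "call_id": "c1"},
    {"type": "function_call_output", "call_id": "c1", "output": "status=ok"},
    {"role": "assistant", "content": "The API is healthy."},
]
cleaned = strip_tool_items(history)
print([item["role"] for item in cleaned])  # ['user', 'assistant']
```

The assistant's conclusion survives while the raw call and its output do not, so the next agent sees what was decided without inheriting tool results it never requested.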

handoff_history_mapper for Custom Forwarding

For maximum control, set handoff_history_mapper on RunConfig. It is a function that receives the transcript as a list of input items and returns whatever list should reach the target agent:

from agents import Agent, RunConfig, handoff

def summarize_history_mapper(history: list) -> list:
    """Replace full history with a summary message."""
    if not history:
        return history

    # Extract just the user messages (input items are plain dicts)
    user_messages = []
    for msg in history:
        if isinstance(msg, dict) and msg.get("role") == "user":
            content = msg.get("content")
            user_messages.append(content if isinstance(content, str) else str(content))

    summary = "Previous conversation summary:\n"
    for i, msg in enumerate(user_messages, 1):
        summary += f"{i}. User said: {msg}\n"

    # Return a single summary message
    return [{"role": "system", "content": summary}]

specialist_agent = Agent(
    name="Specialist",
    instructions="Help the user based on the conversation summary provided.",
    model="gpt-4o",
)

triage_agent = Agent(
    name="Triage",
    instructions="Route to specialist.",
    model="gpt-4o",
    handoffs=[
        handoff(
            specialist_agent,
            tool_description_override="Specialist for complex issues",
        ),
    ],
)

# The mapper is configured on the run, alongside nested history,
# and passed to Runner.run(..., run_config=run_config)
run_config = RunConfig(
    nest_handoff_history=True,
    handoff_history_mapper=summarize_history_mapper,
)

Advanced History Mapper: Role-Based Filtering

from agents import RunConfig

def role_based_mapper(allowed_roles: list[str]):
    """Create a mapper that only forwards messages from specific roles."""
    def _mapper(history: list) -> list:
        # Input items are plain dicts; items without a role (such as tool calls) are dropped
        return [
            msg for msg in history
            if isinstance(msg, dict) and msg.get("role") in allowed_roles
        ]
    return _mapper

# Only forward user and system messages; strip all assistant responses
triage_agent = Agent(
    name="Triage",
    instructions="Route to specialist.",
    model="gpt-4o",
    handoffs=[
        handoff(
            specialist_agent,
            tool_description_override="Specialist",
        ),
    ],
)

run_config = RunConfig(
    nest_handoff_history=True,
    handoff_history_mapper=role_based_mapper(["user", "system"]),
)

History Mapper with Token Counting

For production systems where context window management is critical:

from agents import RunConfig

def token_budget_mapper(max_tokens: int = 2000):
    """Keep only the most recent messages that fit within a token budget."""
    def _mapper(history: list) -> list:
        # Rough approximation: 4 chars ≈ 1 token
        budget = max_tokens
        kept = []

        # Process from most recent to oldest
        for msg in reversed(history):
            content = msg.get("content", "") if isinstance(msg, dict) else ""
            if not isinstance(content, str):
                content = str(content)
            estimated_tokens = len(content) // 4

            # Stop at the first message that does not fit
            if estimated_tokens > budget:
                break
            kept.append(msg)
            budget -= estimated_tokens

        # Restore chronological order
        kept.reverse()
        return kept
    return _mapper

triage_agent = Agent(
    name="Triage",
    instructions="Route to specialist.",
    model="gpt-4o",
    handoffs=[
        handoff(
            specialist_agent,
            tool_description_override="Specialist",
        ),
    ],
)

run_config = RunConfig(
    nest_handoff_history=True,
    handoff_history_mapper=token_budget_mapper(max_tokens=3000),
)
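
Because every mapper takes a list and returns a list, mappers can be chained, for example filtering by role first and trimming to a budget second. The helper below is a hypothetical convenience (compose_mappers is not an SDK function), and the two stand-in mappers are simplified versions used only for the demo.

```python
from typing import Callable

Mapper = Callable[[list], list]


def compose_mappers(*mappers: Mapper) -> Mapper:
    """Run mappers left to right, feeding each output into the next."""
    def _mapper(history: list) -> list:
        for mapper in mappers:
            history = mapper(history)
        return history
    return _mapper


# Stand-in mappers for the demo (hypothetical, simplified)
def users_only(history: list) -> list:
    return [m for m in history if m.get("role") == "user"]


def last_two(history: list) -> list:
    return history[-2:]


pipeline = compose_mappers(users_only, last_two)
history = [
    {"role": "user", "content": "a"},
    {"role": "assistant", "content": "b"},
    {"role": "user", "content": "c"},
    {"role": "user", "content": "d"},
]
result = pipeline(history)
print([m["content"] for m in result])  # ['c', 'd']
```

Order matters: filtering before trimming spends the budget only on messages that would be forwarded anyway.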

Managing Context Across Agent Boundaries

Beyond history manipulation, there are patterns for managing shared state across agents using the context parameter:

from agents import Agent, Runner, handoff, RunContextWrapper
import asyncio

# Shared context type
class ConversationContext:
    def __init__(self):
        self.customer_id: str | None = None
        self.verified: bool = False
        self.notes: list[str] = []
        self.handoff_chain: list[str] = []

async def track_handoff(context: RunContextWrapper[ConversationContext]) -> None:
    context.context.handoff_chain.append("billing")
    context.context.notes.append("Handed off to billing")

billing_agent = Agent(
    name="BillingAgent",
    instructions="Handle billing. Check context.verified before making changes.",
    model="gpt-4o",
)

verification_agent = Agent(
    name="VerificationAgent",
    instructions="""Verify the customer's identity by asking for their
    account email and last 4 digits of payment method.""",
    model="gpt-4o",
    handoffs=[
        handoff(
            billing_agent,
            tool_description_override="Transfer to billing after verification",
            on_handoff=track_handoff,
        ),
    ],
)

async def main():
    ctx = ConversationContext()
    ctx.customer_id = "cust_12345"

    result = await Runner.run(
        verification_agent,
        input="I need to dispute a charge on my account",
        context=ctx,
    )

    print(f"Handoff chain: {ctx.handoff_chain}")
    print(f"Notes: {ctx.notes}")

asyncio.run(main())
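
One caveat with the example above: the context object is local state and is not sent to the model, so an instruction such as "check context.verified" cannot be enforced by the prompt alone. A safer pattern is to check the flag in code. The sketch below uses a plain function, issue_refund, as a hypothetical tool body; the function name and refund logic are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConversationContext:
    customer_id: Optional[str] = None
    verified: bool = False
    notes: list = field(default_factory=list)


def issue_refund(ctx: ConversationContext, amount: float) -> str:
    """Hypothetical tool body: refuse unless verification already happened."""
    if not ctx.verified:
        ctx.notes.append("Refund blocked: customer not verified")
        return "Cannot issue refund before identity verification."
    ctx.notes.append(f"Refund of {amount:.2f} issued")
    return f"Refund of {amount:.2f} issued to {ctx.customer_id}."


ctx = ConversationContext(customer_id="cust_12345")
print(issue_refund(ctx, 25.0))  # blocked: ctx.verified is still False
ctx.verified = True
print(issue_refund(ctx, 25.0))  # allowed once the flag is set
```

The guard lives in deterministic code, so it holds regardless of how the model reads the conversation history.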

Best Practices

1. Use nested history for chains longer than 2 agents. When conversations pass through 3 or more agents, flat history becomes confusing. Nesting makes boundaries explicit.

2. Strip tool calls when handing to non-technical agents. If a diagnostic agent ran API health checks, the billing agent does not need to see those tool calls. Pass handoff_filters.remove_all_tools (from agents.extensions) as the handoff's input_filter, or write a custom filter.

3. Budget your context window. Each handoff accumulates history. For long-running multi-agent conversations, use handoff_history_mapper with token budgets to keep history within limits.

4. Use the context object for state, not history. Do not rely on conversation history to pass structured state between agents. Use the context parameter on Runner.run() for typed, reliable state sharing.

5. Log handoff history transformations. In production, log what was filtered out so you can debug cases where the target agent lacked necessary context.
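
For the logging practice in item 5, one low-effort approach is to wrap any mapper so it reports how many items went in and came out. This is a minimal sketch: the wrapper name logged and the logger name handoff_history are assumptions, and a production version would likely record which items were dropped, not only the counts.

```python
import logging
from typing import Callable

logger = logging.getLogger("handoff_history")


def logged(mapper: Callable[[list], list], name: str) -> Callable[[list], list]:
    """Wrap a mapper so every transformation leaves a log line."""
    def _mapper(history: list) -> list:
        result = mapper(history)
        logger.info(
            "mapper=%s items_in=%d items_out=%d dropped=%d",
            name, len(history), len(result), len(history) - len(result),
        )
        return result
    return _mapper


logging.basicConfig(level=logging.INFO)
keep_last = logged(lambda history: history[-1:], "keep_last")
trimmed = keep_last([
    {"role": "user", "content": "first"},
    {"role": "assistant", "content": "second"},
    {"role": "user", "content": "third"},
])
print([m["content"] for m in trimmed])  # ['third']
```

Since the wrapper returns a function with the same signature, it can be dropped in anywhere a mapper is expected without changing the surrounding configuration.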

Summary

Conversation history management is the unsexy but essential infrastructure of multi-agent systems. Use nest_handoff_history to create clear boundaries between agent conversations. Use per-handoff overrides for different strategies per target. Use handoff_history_mapper for complete control over what gets forwarded. And use the context object for reliable state sharing that does not depend on the LLM interpreting conversation history correctly.

Written by

CallSphere Team
