Nested Handoff History and Conversation Management in Multi-Agent Systems
Learn how to manage conversation history across agent boundaries using nest_handoff_history, per-handoff overrides, CONVERSATION HISTORY blocks, and handoff_history_mapper in the OpenAI Agents SDK.
The Context Challenge in Multi-Agent Systems
When multiple agents collaborate on a task, conversation history management becomes critical. Each handoff creates a decision point: should the target agent see everything that happened before, a filtered subset, or a restructured view of the history?
The OpenAI Agents SDK provides several mechanisms for controlling how conversation history flows across agent boundaries. Understanding these mechanisms is the difference between a multi-agent system that works reliably and one that confuses itself with irrelevant context.
nest_handoff_history in RunConfig
The nest_handoff_history flag in RunConfig controls the fundamental structure of how history is presented to target agents after a handoff. When enabled, it wraps the pre-handoff conversation in a clearly delimited block rather than flattening it into the target agent's message stream.
Default Behavior (nest_handoff_history=False)
By default, the target agent receives the full conversation history as a flat sequence of messages. This means the target agent sees all previous messages as if they were part of its own conversation:
from agents import Agent, Runner, handoff, RunConfig
import asyncio

agent_b = Agent(
    name="AgentB",
    instructions="You are Agent B. Continue the conversation.",
    model="gpt-4o",
)

agent_a = Agent(
    name="AgentA",
    instructions="Greet the user, then hand off to Agent B.",
    model="gpt-4o",
    # tool_description_override is the description the model sees for the handoff tool
    handoffs=[handoff(agent_b, tool_description_override="Transfer to Agent B")],
)

async def main():
    # Default: flat history
    config = RunConfig()
    result = await Runner.run(
        agent_a,
        input="Hello, I need help with my account.",
        run_config=config,
    )
    print(result.final_output)

asyncio.run(main())
With flat history, Agent B sees something like:
User: Hello, I need help with my account.
Assistant (AgentA): Hi there! Let me transfer you to the right specialist.
[handoff to AgentB]
Agent B cannot easily tell which assistant messages came from Agent A and which are its own, because the earlier turns read as part of its own conversation. This can lead to confusion, especially when Agent A gave instructions or made promises that Agent B should not be bound by.
Nested Behavior (nest_handoff_history=True)
When you enable nested history, the pre-handoff conversation is wrapped in a CONVERSATION HISTORY block:
from agents import Agent, Runner, handoff, RunConfig
import asyncio

agent_b = Agent(
    name="AgentB",
    instructions="You are Agent B. Review the conversation history and continue helping the user.",
    model="gpt-4o",
)

agent_a = Agent(
    name="AgentA",
    instructions="Greet the user, then hand off to Agent B.",
    model="gpt-4o",
    # tool_description_override is the description the model sees for the handoff tool
    handoffs=[handoff(agent_b, tool_description_override="Transfer to Agent B")],
)

async def main():
    # Wrap the pre-handoff transcript in a CONVERSATION HISTORY block
    config = RunConfig(nest_handoff_history=True)
    result = await Runner.run(
        agent_a,
        input="Hello, I need help with my account.",
        run_config=config,
    )
    print(result.final_output)

asyncio.run(main())
With nested history, Agent B sees something structured like the following (an illustration; the exact delimiters depend on the SDK version):
--- CONVERSATION HISTORY ---
User: Hello, I need help with my account.
Assistant (AgentA): Hi there! Let me transfer you to the right specialist.
--- END CONVERSATION HISTORY ---
This clear demarcation helps Agent B understand:
- What was said before it joined
- Which messages are from the user versus previous agents
- That the conversation is a continuation, not a fresh start
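To make the wrapping step concrete, here is a minimal sketch in plain Python. It is not the SDK's implementation: the `nest_history` helper, the dict-shaped messages with an `agent` key, and the delimiter strings are all assumptions chosen to mirror the illustration above.

```python
def nest_history(messages,
                 start="--- CONVERSATION HISTORY ---",
                 end="--- END CONVERSATION HISTORY ---"):
    """Collapse a transcript into one delimited block (illustrative only)."""
    lines = [start]
    for msg in messages:
        role = msg["role"]
        # Label assistant turns with the agent that produced them, if known
        if role == "assistant" and msg.get("agent"):
            speaker = f"Assistant ({msg['agent']})"
        else:
            speaker = role.capitalize()
        lines.append(f"{speaker}: {msg['content']}")
    lines.append(end)
    # The whole transcript becomes a single message for the next agent
    return [{"role": "assistant", "content": "\n".join(lines)}]


transcript = [
    {"role": "user", "content": "Hello, I need help with my account."},
    {"role": "assistant", "agent": "AgentA",
     "content": "Hi there! Let me transfer you to the right specialist."},
]
nested = nest_history(transcript)
print(nested[0]["content"])
```

The point of the sketch is the shape of the result: many prior turns go in, and one clearly delimited message comes out, so the target agent can treat everything inside the markers as background rather than as its own statements.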
Per-Handoff History Overrides
You can override the global nest_handoff_history setting on individual handoffs. This lets you use different strategies for different handoff targets:
from agents import Agent, handoff, RunConfig

escalation_agent = Agent(
    name="EscalationAgent",
    instructions="""You are a senior escalation manager. Review the
    full conversation history carefully to understand what has already
    been tried before you intervene.""",
    model="gpt-4o",
)

faq_agent = Agent(
    name="FAQAgent",
    instructions="""You answer frequently asked questions. You do not
    need prior conversation context. Just answer the question directly.""",
    model="gpt-4o",
)

triage_agent = Agent(
    name="TriageAgent",
    instructions="Route to the right department.",
    model="gpt-4o",
    handoffs=[
        # Escalation needs full nested history to review what happened
        handoff(
            escalation_agent,
            tool_description_override="Escalate complex issues",
            nest_handoff_history=True,
        ),
        # FAQ opts out of nesting and receives the plain, flat transcript
        handoff(
            faq_agent,
            tool_description_override="Answer common questions",
            nest_handoff_history=False,
        ),
    ],
)
The per-handoff override takes precedence over the global RunConfig setting. This gives you fine-grained control:
| Handoff Target | Global Setting | Per-Handoff Override | Effective Behavior |
|---|---|---|---|
| EscalationAgent | False | True | Nested |
| FAQAgent | True | False | Flat |
| SupportAgent | True | (none) | Nested (inherits global) |
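The precedence rule in the table can be written as a one-line function. This is a sketch of the rule as described here, not SDK code; `effective_nesting` is a hypothetical helper, and `None` stands in for "no per-handoff override".

```python
from typing import Optional


def effective_nesting(global_setting: bool, override: Optional[bool] = None) -> bool:
    """Per-handoff override wins; None means inherit the global setting."""
    return global_setting if override is None else override


# The three rows of the table: (target, global setting, per-handoff override)
rows = [
    ("EscalationAgent", False, True),
    ("FAQAgent", True, False),
    ("SupportAgent", True, None),
]
for name, global_setting, override in rows:
    mode = "Nested" if effective_nesting(global_setting, override) else "Flat"
    print(f"{name}: {mode}")
```

Using `None` rather than `False` for "not set" is what lets an explicit `False` override a global `True`.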
The CONVERSATION HISTORY Block
When nest_handoff_history is enabled, the SDK wraps the prior conversation in a structured block. The target agent receives this block as a single message ahead of the turns that follow, rather than as a sequence of individual messages.
The block is designed to be unambiguous to the LLM. The markers below are illustrative, and the exact delimiters depend on the SDK version:
[CONVERSATION HISTORY FROM PREVIOUS AGENT: AgentA]
User: I need to cancel my subscription.
AgentA: I understand you want to cancel. Let me transfer you to our retention team.
[END CONVERSATION HISTORY]
Why This Matters for Agent Quality
Without nesting, a common failure mode occurs when the target agent "adopts" the previous agent's persona. If Agent A said "I'll look into that for you," Agent B might continue as if it made that promise. With nested history, Agent B clearly sees this was a different agent's statement.
Another failure mode is tool confusion. If Agent A called tools and the results are in the flat history, Agent B might try to reference those tool results as if they were its own. Nesting makes the boundary explicit.
handoff_history_mapper for Custom Forwarding
For maximum control, use handoff_history_mapper, a function that transforms the conversation history into whatever format you want before it reaches the target agent. It is configured on RunConfig and takes effect when nest_handoff_history is enabled, so confirm the exact signature against your installed SDK version:
from agents import Agent, RunConfig, handoff

def _field(item, name):
    """Read a field from a history item, whether it is a dict or an object."""
    return item.get(name) if isinstance(item, dict) else getattr(item, name, None)

def summarize_history_mapper(history: list) -> list:
    """Replace full history with a summary message."""
    if not history:
        return history
    # Extract just the user messages
    user_messages = []
    for msg in history:
        if _field(msg, "role") == "user":
            content = _field(msg, "content")
            user_messages.append(content if isinstance(content, str) else str(content))
    summary = "Previous conversation summary:\n"
    for i, text in enumerate(user_messages, 1):
        summary += f"{i}. User said: {text}\n"
    # Return a single summary message
    return [{"role": "system", "content": summary}]

specialist_agent = Agent(
    name="Specialist",
    instructions="Help the user based on the conversation summary provided.",
    model="gpt-4o",
)

triage_agent = Agent(
    name="Triage",
    instructions="Route to specialist.",
    model="gpt-4o",
    handoffs=[
        handoff(
            specialist_agent,
            tool_description_override="Specialist for complex issues",
        ),
    ],
)

# Pass this config as run_config to Runner.run so the mapper is applied
config = RunConfig(
    nest_handoff_history=True,
    handoff_history_mapper=summarize_history_mapper,
)
Advanced History Mapper: Role-Based Filtering
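from agents import Agent, RunConfig, handoff

def role_based_mapper(allowed_roles: list[str]):
    """Create a mapper that only forwards messages from specific roles."""
    def _mapper(history: list) -> list:
        filtered = []
        for msg in history:
            # History items may be dicts or objects, so read the role both ways
            role = msg.get("role") if isinstance(msg, dict) else getattr(msg, "role", None)
            if role in allowed_roles:
                filtered.append(msg)
        return filtered
    return _mapper

triage_agent = Agent(
    name="Triage",
    instructions="Route to specialist.",
    model="gpt-4o",
    handoffs=[
        handoff(
            specialist_agent,
            tool_description_override="Specialist",
        ),
    ],
)

# Only forward user and system messages, stripping all assistant responses.
# The mapper is set on RunConfig and runs when nested history is enabled.
config = RunConfig(
    nest_handoff_history=True,
    handoff_history_mapper=role_based_mapper(["user", "system"]),
)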
History Mapper with Token Counting
For production systems where context window management is critical:
from agents import Agent, RunConfig, handoff

def token_budget_mapper(max_tokens: int = 2000):
    """Keep only the most recent messages that fit within a token budget."""
    def _mapper(history: list) -> list:
        # Rough approximation: 4 chars ≈ 1 token
        budget = max_tokens
        result = []
        # Process from most recent to oldest
        for msg in reversed(history):
            # History items may be dicts or objects, so read content both ways
            content = msg.get("content") if isinstance(msg, dict) else getattr(msg, "content", None)
            if content is None:
                content = ""
            text = content if isinstance(content, str) else str(content)
            estimated_tokens = len(text) // 4
            if estimated_tokens <= budget:
                result.insert(0, msg)
                budget -= estimated_tokens
            else:
                break
        return result
    return _mapper

triage_agent = Agent(
    name="Triage",
    instructions="Route to specialist.",
    model="gpt-4o",
    handoffs=[
        handoff(
            specialist_agent,
            tool_description_override="Specialist",
        ),
    ],
)

# The mapper is set on RunConfig and runs when nested history is enabled
config = RunConfig(
    nest_handoff_history=True,
    handoff_history_mapper=token_budget_mapper(max_tokens=3000),
)
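One edge case is worth knowing about: a recency-based budget stops at the first message that does not fit, so a single oversized final message can leave the target agent with no history at all. The variant below is a sketch of one way to handle that, assuming history items are plain dicts with a `content` key. The name `token_budget_keep_latest` and the `chars_per_token` parameter are illustrative, not part of the SDK.

```python
def token_budget_keep_latest(max_tokens=2000, chars_per_token=4):
    """Recency-based budget that always keeps the newest message."""
    def _mapper(history):
        budget = max_tokens
        kept = []
        for index, msg in enumerate(reversed(history)):
            content = msg.get("content", "")
            text = content if isinstance(content, str) else str(content)
            cost = len(text) // chars_per_token
            # index 0 is the newest message: keep it even if it is over budget
            if index > 0 and cost > budget:
                break
            kept.append(msg)
            budget -= cost
        kept.reverse()  # restore chronological order
        return kept
    return _mapper


history = [
    {"role": "user", "content": "a" * 40},       # about 10 tokens
    {"role": "assistant", "content": "b" * 40},  # about 10 tokens
    {"role": "user", "content": "c" * 400},      # about 100 tokens
]
tight = token_budget_keep_latest(max_tokens=50)(history)
roomy = token_budget_keep_latest(max_tokens=115)(history)
print(len(tight), len(roomy))
```

With a 50-token budget only the newest message survives. With 115 tokens the newest two fit and the oldest is dropped. The 4-characters-per-token figure is the same rough estimate used above, so swap in a real tokenizer if the budget needs to be exact.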
Managing Context Across Agent Boundaries
Beyond history manipulation, there are patterns for managing shared state across agents using the context parameter:
from agents import Agent, Runner, handoff, RunContextWrapper
import asyncio

# Shared context type
class ConversationContext:
    def __init__(self):
        self.customer_id: str | None = None
        self.verified: bool = False
        self.notes: list[str] = []
        self.handoff_chain: list[str] = []

async def track_handoff(context: RunContextWrapper[ConversationContext]) -> None:
    # Record the handoff on the shared context object
    context.context.handoff_chain.append("billing")
    context.context.notes.append("Handed off to billing")

billing_agent = Agent(
    name="BillingAgent",
    instructions="Handle billing. Check context.verified before making changes.",
    model="gpt-4o",
)

verification_agent = Agent(
    name="VerificationAgent",
    instructions="""Verify the customer's identity by asking for their
    account email and last 4 digits of payment method.""",
    model="gpt-4o",
    handoffs=[
        handoff(
            billing_agent,
            tool_description_override="Transfer to billing after verification",
            on_handoff=track_handoff,
        ),
    ],
)

async def main():
    ctx = ConversationContext()
    ctx.customer_id = "cust_12345"
    result = await Runner.run(
        verification_agent,
        input="I need to dispute a charge on my account",
        context=ctx,
    )
    print(f"Handoff chain: {ctx.handoff_chain}")
    print(f"Notes: {ctx.notes}")

asyncio.run(main())
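The context object lives in your code and is not sent to the model, so a check such as "verified before making changes" is most reliable when code enforces it instead of a prompt. The sketch below shows that idea in plain Python. The dataclass mirrors the context class above, while `apply_refund` is a hypothetical billing action invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConversationContext:
    customer_id: Optional[str] = None
    verified: bool = False
    notes: list = field(default_factory=list)
    handoff_chain: list = field(default_factory=list)


def apply_refund(ctx: ConversationContext, amount: float) -> str:
    """Hypothetical billing action that refuses to run for unverified customers."""
    if not ctx.verified:
        ctx.notes.append("Refund blocked: customer not verified")
        return "Please verify the customer's identity first."
    ctx.notes.append(f"Refund of {amount:.2f} applied")
    return f"Refunded {amount:.2f} to {ctx.customer_id}."


ctx = ConversationContext(customer_id="cust_12345")
blocked = apply_refund(ctx, 20.0)   # verified is still False
ctx.verified = True                 # set by the verification step
allowed = apply_refund(ctx, 20.0)
print(blocked)
print(allowed)
```

In a real system the guard would sit inside whatever tool the billing agent calls, so the check holds however the model interprets its instructions.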
Best Practices
1. Use nested history for chains longer than 2 agents. When conversations pass through 3 or more agents, flat history becomes confusing. Nesting makes boundaries explicit.
2. Strip tool calls when handing to non-technical agents. If a diagnostic agent ran API health checks, the billing agent does not need to see those tool calls. Use handoff_filters.remove_all_tools or a custom filter.
3. Budget your context window. Each handoff accumulates history. For long-running multi-agent conversations, use handoff_history_mapper with token budgets to keep history within limits.
4. Use the context object for state, not history. Do not rely on conversation history to pass structured state between agents. Use the context parameter on Runner.run() for typed, reliable state sharing.
5. Log handoff history transformations. In production, log what was filtered out so you can debug cases where the target agent lacked necessary context.
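Practices 2 and 5 combine naturally: strip tool items, and log what was removed. The sketch below assumes history items are plain dicts and that tool activity is marked with a `type` of `function_call` or `function_call_output`. The helper names `strip_tool_items` and `logged` are illustrative, not SDK functions.

```python
import logging

logger = logging.getLogger("handoff_history")

# Item types treated as tool activity (an assumption about the item shape)
TOOL_ITEM_TYPES = {"function_call", "function_call_output"}


def strip_tool_items(history):
    """Drop items whose 'type' marks them as tool calls or tool outputs."""
    return [item for item in history if item.get("type") not in TOOL_ITEM_TYPES]


def logged(mapper):
    """Wrap a mapper so every transformation records what it removed."""
    def _wrapped(history):
        result = mapper(history)
        kept_ids = {id(item) for item in result}
        dropped = [item for item in history if id(item) not in kept_ids]
        logger.info(
            "handoff history: %d items in, %d out, dropped=%s",
            len(history),
            len(result),
            [item.get("type", item.get("role")) for item in dropped],
        )
        return result
    return _wrapped


history = [
    {"role": "user", "content": "The API keeps timing out."},
    {"type": "function_call", "name": "check_health", "arguments": "{}", "call_id": "c1"},
    {"type": "function_call_output", "call_id": "c1", "output": "status=degraded"},
    {"role": "assistant", "content": "The service looks degraded."},
]
cleaned = logged(strip_tool_items)(history)
print([item.get("role") for item in cleaned])
```

Because `logged` wraps any function that maps a list to a list, the same wrapper can sit around a summary or token-budget mapper without changing it.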
Summary
Conversation history management is the unsexy but essential infrastructure of multi-agent systems. Use nest_handoff_history to create clear boundaries between agent conversations. Use per-handoff overrides for different strategies per target. Use handoff_history_mapper for complete control over what gets forwarded. And use the context object for reliable state sharing that does not depend on the LLM interpreting conversation history correctly.
Written by
CallSphere Team
Expert insights on AI voice agents and customer communication automation.