---
title: "Nested Handoff History and Conversation Management in Multi-Agent Systems"
description: "Learn how to manage conversation history across agent boundaries using nest_handoff_history, per-handoff overrides, CONVERSATION HISTORY blocks, and handoff_history_mapper in the OpenAI Agents SDK."
canonical: https://callsphere.ai/blog/nested-handoff-history-conversation-management-multi-agent-systems
category: "Learn Agentic AI"
tags: ["OpenAI", "Handoffs", "History", "Conversation", "Multi-Agent"]
author: "CallSphere Team"
published: 2026-03-14T00:00:00.000Z
updated: 2026-05-06T03:24:39.823Z
---

# Nested Handoff History and Conversation Management in Multi-Agent Systems

> Learn how to manage conversation history across agent boundaries using nest_handoff_history, per-handoff overrides, CONVERSATION HISTORY blocks, and handoff_history_mapper in the OpenAI Agents SDK.

## The Context Challenge in Multi-Agent Systems

When multiple agents collaborate on a task, conversation history management becomes critical. Each handoff creates a decision point: should the target agent see everything that happened before, a filtered subset, or a restructured view of the history?

The OpenAI Agents SDK provides several mechanisms for controlling how conversation history flows across agent boundaries. Understanding these mechanisms is the difference between a multi-agent system that works reliably and one that confuses itself with irrelevant context.

## nest_handoff_history in RunConfig

The `nest_handoff_history` flag in `RunConfig` controls the fundamental structure of how history is presented to target agents after a handoff. When enabled, it wraps the pre-handoff conversation in a clearly delimited block rather than flattening it into the target agent's message stream.

```mermaid
flowchart TD
    INPUT(["Task input"])
    SUPER["Supervisor agent
plans plus monitors"]
    W1["Worker 1
research"]
    W2["Worker 2
code"]
    W3["Worker 3
writing"]
    CRITIC{"Output meets
rubric?"}
    REWORK["Rework or
retry path"]
    SHARED[("Shared scratchpad
and memory")]
    OUT(["Final result"])
    INPUT --> SUPER
    SUPER --> W1 --> CRITIC
    SUPER --> W2 --> CRITIC
    SUPER --> W3 --> CRITIC
    W1 --> SHARED
    W2 --> SHARED
    W3 --> SHARED
    SHARED --> SUPER
    CRITIC -->|Pass| OUT
    CRITIC -->|Fail| REWORK --> SUPER
    style SUPER fill:#4f46e5,stroke:#4338ca,color:#fff
    style CRITIC fill:#f59e0b,stroke:#d97706,color:#1f2937
    style OUT fill:#059669,stroke:#047857,color:#fff
    style SHARED fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
```

### Default Behavior (nest_handoff_history=False)

By default, the target agent receives the full conversation history as a flat sequence of messages. This means the target agent sees all previous messages as if they were part of its own conversation:

```python
from agents import Agent, Runner, handoff, RunConfig
import asyncio

agent_b = Agent(
    name="AgentB",
    instructions="You are Agent B. Continue the conversation.",
    model="gpt-4o",
)

agent_a = Agent(
    name="AgentA",
    instructions="Greet the user, then hand off to Agent B.",
    model="gpt-4o",
    handoffs=[handoff(agent_b, tool_description_override="Transfer to Agent B")],
)

async def main():
    # Default: flat history
    config = RunConfig()
    result = await Runner.run(
        agent_a,
        input="Hello, I need help with my account.",
        run_config=config,
    )
    print(result.final_output)

asyncio.run(main())
```

With flat history, Agent B sees something like:

```
User: Hello, I need help with my account.
Assistant (AgentA): Hi there! Let me transfer you to the right specialist.
[handoff to AgentB]
```

Agent B cannot easily tell which assistant messages came from Agent A and which are its own. This can lead to confusion, especially when Agent A gave instructions or made promises that Agent B should not be bound by.

### Nested Behavior (nest_handoff_history=True)

When you enable nested history, the pre-handoff conversation is wrapped in a CONVERSATION HISTORY block:

```python
from agents import Agent, Runner, handoff, RunConfig
import asyncio

agent_b = Agent(
    name="AgentB",
    instructions="You are Agent B. Review the conversation history and continue helping the user.",
    model="gpt-4o",
)

agent_a = Agent(
    name="AgentA",
    instructions="Greet the user, then hand off to Agent B.",
    model="gpt-4o",
    handoffs=[handoff(agent_b, tool_description_override="Transfer to Agent B")],
)

async def main():
    config = RunConfig(nest_handoff_history=True)
    result = await Runner.run(
        agent_a,
        input="Hello, I need help with my account.",
        run_config=config,
    )
    print(result.final_output)

asyncio.run(main())
```

With nested history, Agent B sees something structured like:

```
<CONVERSATION HISTORY>
User: Hello, I need help with my account.
Assistant (AgentA): Hi there! Let me transfer you to the right specialist.
</CONVERSATION HISTORY>
```

This clear demarcation helps Agent B understand:

- What was said before it joined
- Which messages are from the user versus previous agents
- That the conversation is a continuation, not a fresh start
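
As a rough illustration of what nesting does conceptually, the sketch below collapses a transcript into one delimited message. The function name `nest_transcript`, the marker strings, and the line format are assumptions made for this example; it is not the SDK's internal implementation.

```python
def nest_transcript(
    transcript: list[dict],
    start_marker: str = "<CONVERSATION HISTORY>",
    end_marker: str = "</CONVERSATION HISTORY>",
) -> dict:
    """Collapse prior turns into a single delimited message (illustrative only)."""
    # One line per prior turn, prefixed with the speaker's role
    lines = [f"{item['role']}: {item['content']}" for item in transcript]
    body = "\n".join([start_marker, *lines, end_marker])
    return {"role": "assistant", "content": body}


transcript = [
    {"role": "user", "content": "Hello, I need help with my account."},
    {"role": "assistant", "content": "Let me transfer you to the right specialist."},
]

nested = nest_transcript(transcript)
print(nested["content"])
```

The point of the single wrapped message is that everything inside the markers reads as quoted background, so the target agent is less likely to treat earlier assistant turns as its own.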

## Per-Handoff History Overrides

You can override the global `nest_handoff_history` setting on individual handoffs. This lets you use different strategies for different handoff targets:

```python
from agents import Agent, handoff, RunConfig

escalation_agent = Agent(
    name="EscalationAgent",
    instructions="""You are a senior escalation manager. Review the
    full conversation history carefully to understand what has already
    been tried before you intervene.""",
    model="gpt-4o",
)

faq_agent = Agent(
    name="FAQAgent",
    instructions="""You answer frequently asked questions. You do not
    need prior conversation context — just answer the question directly.""",
    model="gpt-4o",
)

triage_agent = Agent(
    name="TriageAgent",
    instructions="Route to the right department.",
    model="gpt-4o",
    handoffs=[
        # Escalation reviews prior turns inside a delimited history block
        handoff(
            escalation_agent,
            tool_description_override="Escalate complex issues",
            nest_handoff_history=True,
        ),
        # FAQ keeps the flat history; to drop history entirely, use an input_filter
        handoff(
            faq_agent,
            tool_description_override="Answer common questions",
            nest_handoff_history=False,
        ),
    ],
)
```

The per-handoff override takes precedence over the global `RunConfig` setting. This gives you fine-grained control:

| Handoff Target | Global Setting | Per-Handoff Override | Effective Behavior |
| --- | --- | --- | --- |
| EscalationAgent | False | True | Nested |
| FAQAgent | True | False | Flat |
| SupportAgent | True | (none) | Nested (inherits global) |
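
The precedence rule in the table can be expressed in a few lines. This is a minimal sketch of the resolution logic only; `effective_nesting` is a hypothetical helper written for this example, not an SDK function.

```python
from typing import Optional


def effective_nesting(global_setting: bool, override: Optional[bool]) -> bool:
    """Per-handoff override wins; None means inherit the global setting."""
    return global_setting if override is None else override


# (target, global setting, per-handoff override) rows from the table above
rows = [
    ("EscalationAgent", False, True),
    ("FAQAgent", True, False),
    ("SupportAgent", True, None),
]

for name, global_setting, override in rows:
    mode = "Nested" if effective_nesting(global_setting, override) else "Flat"
    print(f"{name}: {mode}")
```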

## The CONVERSATION HISTORY Block

When `nest_handoff_history` is enabled, the SDK wraps the prior conversation in a structured block and delivers it to the target agent as a single summary message ahead of any new turns.

The format is designed to be unambiguous to the LLM:

```
<CONVERSATION HISTORY>
User: I need to cancel my subscription.
AgentA: I understand you want to cancel. Let me transfer you to our retention team.
</CONVERSATION HISTORY>
```

### Why This Matters for Agent Quality

Without nesting, a common failure mode occurs when the target agent "adopts" the previous agent's persona. If Agent A said "I'll look into that for you," Agent B might continue as if it made that promise. With nested history, Agent B clearly sees this was a different agent's statement.

Another failure mode is tool confusion. If Agent A called tools and the results are in the flat history, Agent B might try to reference those tool results as if they were its own. Nesting makes the boundary explicit.

## handoff_history_mapper for Custom Forwarding

For maximum control, supply a `handoff_history_mapper`: a function that transforms the conversation history into whatever format you want before it reaches the target agent. In current SDK releases the mapper is set on `RunConfig` and runs when nested history is enabled and no `input_filter` is supplied:

```python
from agents import Agent, RunConfig, handoff

def _field(msg, name):
    """Read a field from a history item, whether it is a dict or an object."""
    return msg.get(name) if isinstance(msg, dict) else getattr(msg, name, None)

def summarize_history_mapper(history: list) -> list:
    """Replace full history with a summary message."""
    if not history:
        return history

    # Extract just the user messages
    user_messages = []
    for msg in history:
        if _field(msg, "role") == "user":
            content = _field(msg, "content")
            user_messages.append(content if isinstance(content, str) else str(content))

    summary = "Previous conversation summary:\n"
    for i, msg in enumerate(user_messages, 1):
        summary += f"{i}. User said: {msg}\n"

    # Return a single summary message
    return [{"role": "system", "content": summary}]

specialist_agent = Agent(
    name="Specialist",
    instructions="Help the user based on the conversation summary provided.",
    model="gpt-4o",
)

triage_agent = Agent(
    name="Triage",
    instructions="Route to specialist.",
    model="gpt-4o",
    handoffs=[
        handoff(
            specialist_agent,
            tool_description_override="Specialist for complex issues",
        ),
    ],
)

# Pass this to Runner.run(..., run_config=config) so the mapper is applied
config = RunConfig(
    nest_handoff_history=True,
    handoff_history_mapper=summarize_history_mapper,
)
```

### Advanced History Mapper: Role-Based Filtering

```python
from agents import Agent, RunConfig, handoff

def role_based_mapper(allowed_roles: list[str]):
    """Create a mapper that only forwards messages from specific roles."""
    def _mapper(history: list) -> list:
        filtered = []
        for msg in history:
            # History items are usually dicts; fall back to attribute access
            role = msg.get("role") if isinstance(msg, dict) else getattr(msg, "role", None)
            if role in allowed_roles:
                filtered.append(msg)
        return filtered
    return _mapper

triage_agent = Agent(
    name="Triage",
    instructions="Route to specialist.",
    model="gpt-4o",
    handoffs=[
        handoff(specialist_agent, tool_description_override="Specialist"),
    ],
)

# Only forward user and system messages; strip all assistant responses
config = RunConfig(
    nest_handoff_history=True,
    handoff_history_mapper=role_based_mapper(["user", "system"]),
)
```
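
Since a mapper is an ordinary function, it can be checked against a hand-built transcript before it is wired into a run. The sketch below assumes history items are plain dicts with `role` and `content` keys; `keep_roles` and the sample transcript are hypothetical names used only for this example.

```python
def keep_roles(allowed_roles: set[str]):
    """Build a mapper that keeps only items whose role is allowed."""
    def _mapper(history: list[dict]) -> list[dict]:
        # Items without a "role" key (such as tool calls) are dropped as well
        return [item for item in history if item.get("role") in allowed_roles]
    return _mapper


sample_history = [
    {"role": "system", "content": "Customer is on the Pro plan."},
    {"role": "user", "content": "Why was I charged twice?"},
    {"role": "assistant", "content": "Let me route you to billing."},
    {"type": "function_call", "name": "lookup_invoice", "arguments": "{}", "call_id": "call_1"},
]

forwarded = keep_roles({"user", "system"})(sample_history)
for item in forwarded:
    print(f"{item['role']}: {item['content']}")
```

Running a mapper this way makes it obvious when a filter silently removes more than intended, for example tool call items that carry no `role` at all.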

### History Mapper with Token Counting

For production systems where context window management is critical:

```python
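def token_budget_mapper(max_tokens: int = 2000):
    """Keep only the most recent messages that fit within a token budget."""
    def _mapper(history: list) -> list:
        # Rough approximation: 4 chars ≈ 1 token
        budget = max_tokens
        result = []

        # Process from most recent to oldest
        for msg in reversed(history):
            # History items are usually dicts; fall back to attribute access
            if isinstance(msg, dict):
                content = msg.get("content", "")
            else:
                content = getattr(msg, "content", "")
            if not isinstance(content, str):
                content = str(content)
            estimated_tokens = len(content) // 4

            if estimated_tokens <= budget:
                result.append(msg)
                budget -= estimated_tokens
            else:
                # Stop at the first message that no longer fits
                break

        # Restore chronological order
        return list(reversed(result))
    return _mapper
```

## Sharing State with the Context Object

Conversation history is not the only thing that crosses a handoff. The `context` object passed to `Runner.run()` carries typed state that an `on_handoff` callback can read and update:

```python
import asyncio
from dataclasses import dataclass, field

from agents import Agent, Runner, RunContextWrapper, handoff

@dataclass
class ConversationContext:
    # Shared, typed state for the run (fields match their use below)
    customer_id: str = ""
    verified: bool = False
    handoff_chain: list[str] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)

def track_handoff(context: RunContextWrapper[ConversationContext]) -> None: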
    context.context.handoff_chain.append("billing")
    context.context.notes.append("Handed off to billing")

billing_agent = Agent(
    name="BillingAgent",
    instructions="Handle billing. Check context.verified before making changes.",
    model="gpt-4o",
)

verification_agent = Agent(
    name="VerificationAgent",
    instructions="""Verify the customer's identity by asking for their
    account email and last 4 digits of payment method.""",
    model="gpt-4o",
    handoffs=[
        handoff(
            billing_agent,
            tool_description_override="Transfer to billing after verification",
            on_handoff=track_handoff,
        ),
    ],
)

async def main():
    ctx = ConversationContext()
    ctx.customer_id = "cust_12345"

    result = await Runner.run(
        verification_agent,
        input="I need to dispute a charge on my account",
        context=ctx,
    )

    print(f"Handoff chain: {ctx.handoff_chain}")
    print(f"Notes: {ctx.notes}")

asyncio.run(main())
```

## Best Practices

**1. Use nested history for chains longer than 2 agents.** When conversations pass through 3 or more agents, flat history becomes confusing. Nesting makes boundaries explicit.

**2. Strip tool calls when handing to non-technical agents.** If a diagnostic agent ran API health checks, the billing agent does not need to see those tool calls. Use `handoff_filters.remove_all_tools` or a custom filter.

**3. Budget your context window.** Each handoff accumulates history. For long-running multi-agent conversations, use `handoff_history_mapper` with token budgets to keep history within limits.

**4. Use the context object for state, not history.** Do not rely on conversation history to pass structured state between agents. Use the `context` parameter on `Runner.run()` for typed, reliable state sharing.

**5. Log handoff history transformations.** In production, log what was filtered out so you can debug cases where the target agent lacked necessary context.
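
One way to approach the logging practice is to wrap any mapper in a small decorator-style helper. This is a minimal sketch under the assumption that a mapper takes a list and returns a list; `with_drop_logging`, `last_two`, and the logger name are hypothetical and not part of the SDK.

```python
import logging

logger = logging.getLogger("handoff_history")


def with_drop_logging(mapper):
    """Wrap a mapper so it reports how many history items it removed."""
    def _wrapped(history: list) -> list:
        mapped = mapper(history)
        dropped = len(history) - len(mapped)
        if dropped > 0:
            logger.info("handoff mapper dropped %d of %d items", dropped, len(history))
        return mapped
    return _wrapped


def last_two(history: list) -> list:
    """Example mapper: keep only the two most recent items."""
    return history[-2:]


logged_mapper = with_drop_logging(last_two)
result = logged_mapper([{"role": "user", "content": f"message {i}"} for i in range(5)])
```

Because the wrapper returns a function with the same shape as the mapper it wraps, it can be used anywhere the original mapper would be.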

## Summary

Conversation history management is the unsexy but essential infrastructure of multi-agent systems. Use `nest_handoff_history` to create clear boundaries between agent conversations. Use per-handoff overrides for different strategies per target. Use `handoff_history_mapper` for complete control over what gets forwarded. And use the context object for reliable state sharing that does not depend on the LLM interpreting conversation history correctly.

---

Source: https://callsphere.ai/blog/nested-handoff-history-conversation-management-multi-agent-systems