
The Agent Loop Explained: How OpenAI Agents Process Tasks Step-by-Step

Understand the internal agent loop that powers the OpenAI Agents SDK. Learn how agents cycle through LLM calls, tool execution, handoffs, and final output generation.

What Is the Agent Loop?

The agent loop is the core execution engine of the OpenAI Agents SDK. When you call Runner.run(), the SDK does not simply send your input to the LLM and return the response. Instead, it enters an iterative loop that orchestrates LLM calls, tool executions, and agent handoffs until a final output is produced.

Understanding this loop is critical for debugging agent behavior, setting appropriate max_turns values, and designing effective multi-agent workflows.
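Before diving into the details, it helps to see the loop's skeleton. Here is a heavily simplified, hypothetical sketch of the control flow (not the SDK's actual source — the real loop also handles handoffs, structured output, and streaming):

```python
# Hypothetical sketch of the agent loop's control flow, not the SDK's code.
def run_agent_loop(call_llm, tools, max_turns=10):
    messages = []
    for turn in range(max_turns):
        response = call_llm(messages)            # one LLM call per turn
        if response["type"] == "final":          # plain text: we are done
            return response["content"]
        if response["type"] == "tool_call":      # execute, record, loop again
            result = tools[response["name"]](**response["args"])
            messages.append({"role": "tool", "content": str(result)})
    raise RuntimeError("MaxTurnsExceeded")       # turn budget exhausted
```

Every concept in this article — turns, tool execution, the turn limit — maps onto one line of this skeleton.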

The Loop Step by Step

Here is the complete flow of the agent loop:

START
  |
  v
[1] Prepare messages (system prompt + conversation history)
  |
  v
[2] Call the LLM with messages + tool definitions
  |
  v
[3] Receive LLM response
  |
  v
[4] Check response type:
  |
  |---> [Final text output] --> RETURN RunResult
  |
  |---> [Structured output] --> Validate with Pydantic --> RETURN RunResult
  |
  |---> [Tool calls] --> Execute tools --> Add results to messages --> GOTO [2]
  |
  |---> [Handoff request] --> Switch to target agent --> GOTO [1]

[5] Before each new turn: if max_turns exceeded --> RAISE MaxTurnsExceeded

Let us walk through each step in detail.

Step 1: Prepare Messages

At the start of each iteration, the SDK assembles the message list:

  1. System message: The agent's instructions (or the result of calling the instructions function)
  2. Conversation history: All previous messages, including user inputs, assistant responses, tool calls, and tool results from prior iterations
  3. New user input: Your original query (on the first iteration only — subsequent iterations use the accumulated history)

If the agent was reached via a handoff, the SDK includes the handoff context in the message history so the new agent understands why it was called.
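The assembled list looks roughly like this partway through a run (illustrative only — the field names below follow the Chat Completions convention, and the Responses API uses a similar but not identical item structure):

```python
# Illustrative message list at the start of turn 2, after one tool call.
messages = [
    {"role": "system", "content": "You help with math."},         # instructions
    {"role": "user", "content": "What is 17 * 23?"},              # original input
    {"role": "assistant", "tool_calls": [                         # prior turn
        {"id": "call_1", "function": {"name": "calculate",
                                      "arguments": '{"expression": "17 * 23"}'}}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "391"}, # tool result
]
```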

Step 2: Call the LLM

The assembled messages are sent to the configured language model along with:

  • Tool definitions: JSON schemas for all tools in the agent's tools list, plus any handoff tools
  • Output format: If output_type is set, the model is instructed to respond in the specified JSON schema
  • Model settings: Temperature, top_p, max_tokens, and other generation parameters

The SDK uses the OpenAI Responses API by default, though it can be configured to use the Chat Completions API for compatibility with other providers.
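Putting those pieces together, a single model call carries a payload shaped roughly like this (an assumption about the general structure for illustration, not the SDK's exact wire format):

```python
# Rough shape of one model call's payload (illustrative, not the exact format).
request = {
    "model": "gpt-4o",
    "input": [{"role": "user", "content": "What's the weather in Tokyo?"}],
    "tools": [{                        # JSON schema generated from the tool
        "type": "function",
        "name": "get_weather",
        "description": "Get the weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    "temperature": 0.7,                # from the agent's model settings
}
```

Note that the tool schemas are regenerated from your Python function signatures, so the model always sees parameter names and types that match your code.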

Step 3: Receive and Parse the Response

The LLM response is parsed into one of several types:

  • Text output: A plain text response with no tool calls
  • Structured output: A JSON response matching the output_type schema
  • Tool calls: One or more requests to execute tools
  • Handoff: A special tool call that transfers control to another agent

Step 4a: Final Output (Loop Ends)

If the response is a text or structured output with no tool calls, the loop ends. The SDK creates a RunResult containing:

RunResult(
    input=original_input,
    new_items=[...all generated items...],
    final_output="The agent's response text",
    last_agent=current_agent,
)

For structured outputs, the SDK validates the JSON against the Pydantic model before returning. If validation fails, it can optionally retry by feeding the validation error back to the model.
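The validate-or-retry step can be sketched like this (a hedged illustration — the helper names are hypothetical, and whether the SDK retries automatically depends on configuration):

```python
import json

# Hypothetical sketch of the validate-or-retry step for structured output.
def parse_structured_output(raw, validate, retry_llm, max_retries=1):
    for _ in range(max_retries + 1):
        try:
            return validate(json.loads(raw))     # parse, then schema-check
        except Exception as exc:
            # Feed the failure back so the model can correct itself
            raw = retry_llm(f"Invalid output ({exc}); please try again.")
    raise ValueError("structured output failed validation")
```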

Step 4b: Tool Calls (Loop Continues)

If the response contains tool calls, the SDK:

  1. Extracts each tool call with its name and arguments
  2. Looks up the corresponding tool function
  3. Executes the tool (with timeout protection)
  4. Collects the tool result (or error message)
  5. Adds the tool call and result to the message history
  6. Returns to Step 2 for another LLM call

When parallel_tool_calls is enabled (the default), the SDK executes all tool calls concurrently:

# The model requests two tool calls in one response:
# 1. get_weather("Tokyo")
# 2. get_weather("London")
# Both execute simultaneously, then results are fed back together
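Conceptually, concurrent execution looks like fanning the calls out with `asyncio.gather` (a sketch of the idea — the SDK's internals may differ):

```python
import asyncio

# Sketch of concurrent tool execution: each requested call becomes a task,
# and gather() collects every result before the next model call.
async def run_tool_calls(tool_calls, tools):
    async def run_one(call):
        return await tools[call["name"]](**call["args"])
    return await asyncio.gather(*(run_one(c) for c in tool_calls))
```

This is why slow tools benefit from being async: two five-second lookups cost roughly five seconds total, not ten.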

Step 4c: Handoff (Agent Switch)

If the response is a handoff, the SDK:

  1. Identifies the target agent from the handoff tool call
  2. Switches the current agent to the target
  3. Restarts the loop at Step 1 with the new agent's instructions

The conversation history carries over, so the new agent has full context of the prior conversation.


A Concrete Example

Let us trace through a real scenario. Consider an agent with a calculator tool:

from agents import Agent, Runner, function_tool

@function_tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        result = eval(expression)  # Unsafe on untrusted input; simplified for this example
        return str(result)
    except Exception as e:
        return f"Error: {e}"

agent = Agent(
    name="Math Helper",
    instructions="You help with math. Use the calculate tool for any computation.",
    tools=[calculate],
)

result = Runner.run_sync(agent, "What is (17 * 23) + (45 / 9)?")

Here is what happens inside the agent loop:

Turn 1:

  • Messages: system prompt + user message "What is (17 * 23) + (45 / 9)?"
  • LLM response: tool call calculate("(17 * 23) + (45 / 9)")
  • SDK executes tool, gets "396.0"
  • Tool result added to messages

Turn 2:

  • Messages: system prompt + user message + tool call + tool result "396.0"
  • LLM response: "The result of (17 x 23) + (45 / 9) is 396."
  • This is a final text output — loop ends

Total turns: 2. If you had set max_turns=1, the loop would have raised MaxTurnsExceeded because the first turn produced a tool call, not a final output.

MaxTurnsExceeded: The Safety Net

The max_turns parameter prevents infinite loops. If an agent keeps making tool calls without producing a final output, the loop will terminate:

from agents import Agent, Runner, MaxTurnsExceeded

try:
    result = await Runner.run(agent, "Research everything about quantum computing", max_turns=5)
except MaxTurnsExceeded:
    print("Agent hit the turn limit without producing a final output.")
    # Recover by retrying with a higher max_turns or simplifying the task

Common reasons for hitting max_turns:

  • Tool loops: The agent calls the same tool repeatedly without making progress
  • Ambiguous instructions: The agent is not sure when to stop and keeps gathering information
  • Complex tasks: The task genuinely requires many tool calls
  • Model confusion: The model misunderstands the tools and calls them incorrectly
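The first failure mode — tool loops — can often be caught before the turn budget runs out. Here is a hypothetical helper (not part of the SDK) that flags a run of identical tool calls in the accumulated history:

```python
# Hypothetical helper: flag a run of identical tool calls (a "tool loop").
def detect_tool_loop(history, window=3):
    """Return True if the last `window` tool calls are all identical."""
    recent = [h for h in history if h["type"] == "tool_call"][-window:]
    return len(recent) == window and all(c == recent[0] for c in recent)
```

A check like this could run between turns and abort early with a clearer error than a generic turn-limit exception.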

Error Handling in the Loop

The SDK handles tool failures without crashing the run: when a tool call raises an exception, the error is converted to a text message and fed back to the model, giving the agent a chance to recover (the exact error message shown to the model can be customized per tool):

@function_tool
def fetch_data(url: str) -> str:
    """Fetch data from a URL."""
    import httpx
    response = httpx.get(url, timeout=5)
    response.raise_for_status()
    return response.text

# If fetch_data raises an exception, the SDK catches it,
# sends the error message back to the model, and the agent
# can decide to retry with a different URL or report the error.

This self-healing behavior is one of the key advantages of the agent loop over a simple LLM call. The agent can reason about errors and adapt its strategy.
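The catch-and-feed-back behavior amounts to wrapping each tool call like this (a sketch of the idea — the SDK's actual wrapper differs):

```python
# Sketch of catch-and-feed-back: a failed tool returns an error string
# instead of raising, so the model sees the failure as a tool result.
def call_tool_safely(tool, **kwargs):
    try:
        return tool(**kwargs)
    except Exception as exc:
        # This text becomes the tool result the model reads next turn
        return f"Tool '{tool.__name__}' failed: {exc}"
```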

Handoff Flow in Detail

When an agent hands off to another agent, the loop essentially restarts with the new agent. Here is a trace of a multi-agent handoff:

from agents import Agent, Runner

spanish_agent = Agent(
    name="Spanish Speaker",
    instructions="You only speak Spanish. Respond to all queries in Spanish.",
)

english_agent = Agent(
    name="English Speaker",
    instructions="You only speak English. Respond to all queries in English.",
)

triage = Agent(
    name="Language Router",
    instructions="""Determine the language of the user's message.
    Hand off to the appropriate language specialist.""",
    handoffs=[spanish_agent, english_agent],
)

result = Runner.run_sync(triage, "Hola, ¿cómo estás?")
print(result.final_output)      # Response in Spanish
print(result.last_agent.name)   # "Spanish Speaker"

Turn 1 (triage agent):

  • LLM determines the message is Spanish
  • Issues handoff to spanish_agent

Turn 2 (spanish_agent):

  • New system prompt: "You only speak Spanish..."
  • Full conversation history available
  • Responds in Spanish — final output

The result.last_agent tells you which agent actually produced the final response, which is essential for logging and analytics.

Monitoring the Loop

For debugging and observability, enable verbose logging:

from agents import enable_verbose_stdout_logging

enable_verbose_stdout_logging()

# Now all agent loop iterations, tool calls, and handoffs
# are printed to stdout with timestamps

In production, use the built-in tracing integration to send agent loop telemetry to your observability platform.

Design Implications

Understanding the agent loop shapes how you design agents:

  1. Keep tools focused. Each tool should do one thing well. The agent can call multiple tools across turns to compose complex behavior.

  2. Set max_turns based on expected complexity. Count the maximum number of tool calls your agent might need, add a buffer, and set that as your limit.

  3. Use handoffs for specialization. Instead of one agent with 20 tools, create specialized agents with 3-5 tools each and let a triage agent route.

  4. Test the loop, not just the output. Inspect result.new_items to verify the agent took the expected path through tools and handoffs.

  5. Design for recovery. Tools will fail. Instructions should tell the agent how to handle errors gracefully.


Source: OpenAI Agents SDK — Agent Loop


Written by

CallSphere Team
