
The Agent Loop Explained: How OpenAI Agents Process Tasks Step-by-Step

Understand the internal agent loop that powers the OpenAI Agents SDK. Learn how agents cycle through LLM calls, tool execution, handoffs, and final output generation.

What Is the Agent Loop?

The agent loop is the core execution engine of the OpenAI Agents SDK. When you call Runner.run(), the SDK does not simply send your input to the LLM and return the response. Instead, it enters an iterative loop that orchestrates LLM calls, tool executions, and agent handoffs until a final output is produced.

Understanding this loop is critical for debugging agent behavior, setting appropriate max_turns values, and designing effective multi-agent workflows.
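Before diving into the details, it helps to see the loop's skeleton. Here is a heavily simplified, hypothetical sketch of the control flow (not the SDK's actual source — the real loop also handles handoffs, structured output, and streaming):

```python
# Hypothetical sketch of the agent loop's control flow, not the SDK's code.
def run_agent_loop(call_llm, tools, max_turns=10):
    messages = []
    for turn in range(max_turns):
        response = call_llm(messages)            # one LLM call per turn
        if response["type"] == "final":          # plain text: we are done
            return response["content"]
        if response["type"] == "tool_call":      # execute, record, loop again
            result = tools[response["name"]](**response["args"])
            messages.append({"role": "tool", "content": str(result)})
    raise RuntimeError("MaxTurnsExceeded")       # turn budget exhausted
```

Every concept in this article — turns, tool execution, the turn limit — maps onto one line of this skeleton.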

The Loop Step by Step

Here is the complete flow of the agent loop:

START
  |
  v
[1] Prepare messages (system prompt + conversation history)
  |
  v
[2] Call the LLM with messages + tool definitions
  |
  v
[3] Receive LLM response
  |
  v
[4] Check response type:
  |
  |---> [Final text output] --> RETURN RunResult
  |
  |---> [Structured output] --> Validate with Pydantic --> RETURN RunResult
  |
  |---> [Tool calls] --> Execute tools --> Add results to messages --> GOTO [2]
  |
  |---> [Handoff request] --> Switch to target agent --> GOTO [1]

[5] Before each new turn: if max_turns exceeded --> RAISE MaxTurnsExceeded

Let us walk through each step in detail.

Step 1: Prepare Messages

At the start of each iteration, the SDK assembles the message list:

  1. System message: The agent's instructions (or the result of calling the instructions function)
  2. Conversation history: All previous messages, including user inputs, assistant responses, tool calls, and tool results from prior iterations
  3. New user input: Your original query (on the first iteration only — subsequent iterations use the accumulated history)

If the agent was reached via a handoff, the SDK includes the handoff context in the message history so the new agent understands why it was called.
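The assembled list looks roughly like this partway through a run (illustrative only — the field names below follow the Chat Completions convention, and the Responses API uses a similar but not identical item structure):

```python
# Illustrative message list at the start of turn 2, after one tool call.
messages = [
    {"role": "system", "content": "You help with math."},         # instructions
    {"role": "user", "content": "What is 17 * 23?"},              # original input
    {"role": "assistant", "tool_calls": [                         # prior turn
        {"id": "call_1", "function": {"name": "calculate",
                                      "arguments": '{"expression": "17 * 23"}'}}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "391"}, # tool result
]
```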

Step 2: Call the LLM

The assembled messages are sent to the configured language model along with:

  • Tool definitions: JSON schemas for all tools in the agent's tools list, plus any handoff tools
  • Output format: If output_type is set, the model is instructed to respond in the specified JSON schema
  • Model settings: Temperature, top_p, max_tokens, and other generation parameters

The SDK uses the OpenAI Responses API by default, though it can be configured to use the Chat Completions API for compatibility with other providers.
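Putting those pieces together, a single model call carries a payload shaped roughly like this (an assumption about the general structure for illustration, not the SDK's exact wire format):

```python
# Rough shape of one model call's payload (illustrative, not the exact format).
request = {
    "model": "gpt-4o",
    "input": [{"role": "user", "content": "What's the weather in Tokyo?"}],
    "tools": [{                        # JSON schema generated from the tool
        "type": "function",
        "name": "get_weather",
        "description": "Get the weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    "temperature": 0.7,                # from the agent's model settings
}
```

Note that the tool schemas are regenerated from your Python function signatures, so the model always sees parameter names and types that match your code.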

Step 3: Receive and Parse the Response

The LLM response is parsed into one of several types:

  • Text output: A plain text response with no tool calls
  • Structured output: A JSON response matching the output_type schema
  • Tool calls: One or more requests to execute tools
  • Handoff: A special tool call that transfers control to another agent

Step 4a: Final Output (Loop Ends)

If the response is a text or structured output with no tool calls, the loop ends. The SDK creates a RunResult containing:

RunResult(
    input=original_input,
    new_items=[...all generated items...],
    final_output="The agent's response text",
    last_agent=current_agent,
)

For structured outputs, the SDK validates the JSON against the Pydantic model before returning. If validation fails, it can optionally retry by feeding the validation error back to the model.
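The validate-or-retry step can be sketched like this (a hedged illustration — the helper names are hypothetical, and whether the SDK retries automatically depends on configuration):

```python
import json

# Hypothetical sketch of the validate-or-retry step for structured output.
def parse_structured_output(raw, validate, retry_llm, max_retries=1):
    for _ in range(max_retries + 1):
        try:
            return validate(json.loads(raw))     # parse, then schema-check
        except Exception as exc:
            # Feed the failure back so the model can correct itself
            raw = retry_llm(f"Invalid output ({exc}); please try again.")
    raise ValueError("structured output failed validation")
```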

Step 4b: Tool Calls (Loop Continues)

If the response contains tool calls, the SDK:

  1. Extracts each tool call with its name and arguments
  2. Looks up the corresponding tool function
  3. Executes the tool (with timeout protection)
  4. Collects the tool result (or error message)
  5. Adds the tool call and result to the message history
  6. Returns to Step 2 for another LLM call

When parallel_tool_calls is enabled (the default), the SDK executes all tool calls concurrently:

# The model requests two tool calls in one response:
# 1. get_weather("Tokyo")
# 2. get_weather("London")
# Both execute simultaneously, then results are fed back together
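Conceptually, concurrent execution looks like fanning the calls out with `asyncio.gather` (a sketch of the idea — the SDK's internals may differ):

```python
import asyncio

# Sketch of concurrent tool execution: each requested call becomes a task,
# and gather() collects every result before the next model call.
async def run_tool_calls(tool_calls, tools):
    async def run_one(call):
        return await tools[call["name"]](**call["args"])
    return await asyncio.gather(*(run_one(c) for c in tool_calls))
```

This is why slow tools benefit from being async: two five-second lookups cost roughly five seconds total, not ten.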

Step 4c: Handoff (Agent Switch)

If the response is a handoff, the SDK:

  1. Identifies the target agent from the handoff tool call
  2. Switches the current agent to the target
  3. Restarts the loop at Step 1 with the new agent's instructions

The conversation history carries over, so the new agent has full context of the prior conversation.


A Concrete Example

Let us trace through a real scenario. Consider an agent with a calculator tool:

from agents import Agent, Runner, function_tool

@function_tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        result = eval(expression)  # Unsafe on untrusted input; simplified for this example
        return str(result)
    except Exception as e:
        return f"Error: {e}"

agent = Agent(
    name="Math Helper",
    instructions="You help with math. Use the calculate tool for any computation.",
    tools=[calculate],
)

result = Runner.run_sync(agent, "What is (17 * 23) + (45 / 9)?")

Here is what happens inside the agent loop:

Turn 1:

  • Messages: system prompt + user message "What is (17 * 23) + (45 / 9)?"
  • LLM response: tool call calculate("(17 * 23) + (45 / 9)")
  • SDK executes tool, gets "396.0"
  • Tool result added to messages

Turn 2:

  • Messages: system prompt + user message + tool call + tool result "396.0"
  • LLM response: "The result of (17 x 23) + (45 / 9) is 396."
  • This is a final text output — loop ends

Total turns: 2. If you had set max_turns=1, the loop would have raised MaxTurnsExceeded because the first turn produced a tool call, not a final output.

MaxTurnsExceeded: The Safety Net

The max_turns parameter prevents infinite loops. If an agent keeps making tool calls without producing a final output, the loop will terminate:

from agents import Agent, Runner, MaxTurnsExceeded

try:
    result = await Runner.run(agent, "Research everything about quantum computing", max_turns=5)
except MaxTurnsExceeded:
    print("Agent hit the turn limit without producing a final output.")
    # Recover by retrying with a higher max_turns or simplifying the task

Common reasons for hitting max_turns:

  • Tool loops: The agent calls the same tool repeatedly without making progress
  • Ambiguous instructions: The agent is not sure when to stop and keeps gathering information
  • Complex tasks: The task genuinely requires many tool calls
  • Model confusion: The model misunderstands the tools and calls them incorrectly
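The first failure mode — tool loops — can often be caught before the turn budget runs out. Here is a hypothetical helper (not part of the SDK) that flags a run of identical tool calls in the accumulated history:

```python
# Hypothetical helper: flag a run of identical tool calls (a "tool loop").
def detect_tool_loop(history, window=3):
    """Return True if the last `window` tool calls are all identical."""
    recent = [h for h in history if h["type"] == "tool_call"][-window:]
    return len(recent) == window and all(c == recent[0] for c in recent)
```

A check like this could run between turns and abort early with a clearer error than a generic turn-limit exception.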

Error Handling in the Loop

The SDK handles tool failures without crashing the run: when a tool call raises an exception, the error is converted to a text message and fed back to the model, giving the agent a chance to recover (the exact error message shown to the model can be customized per tool):

@function_tool
def fetch_data(url: str) -> str:
    """Fetch data from a URL."""
    import httpx
    response = httpx.get(url, timeout=5)
    response.raise_for_status()
    return response.text

# If fetch_data raises an exception, the SDK catches it,
# sends the error message back to the model, and the agent
# can decide to retry with a different URL or report the error.

This self-healing behavior is one of the key advantages of the agent loop over a simple LLM call. The agent can reason about errors and adapt its strategy.
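The catch-and-feed-back behavior amounts to wrapping each tool call like this (a sketch of the idea — the SDK's actual wrapper differs):

```python
# Sketch of catch-and-feed-back: a failed tool returns an error string
# instead of raising, so the model sees the failure as a tool result.
def call_tool_safely(tool, **kwargs):
    try:
        return tool(**kwargs)
    except Exception as exc:
        # This text becomes the tool result the model reads next turn
        return f"Tool '{tool.__name__}' failed: {exc}"
```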

Handoff Flow in Detail

When an agent hands off to another agent, the loop essentially restarts with the new agent. Here is a trace of a multi-agent handoff:

from agents import Agent, Runner

spanish_agent = Agent(
    name="Spanish Speaker",
    instructions="You only speak Spanish. Respond to all queries in Spanish.",
)

english_agent = Agent(
    name="English Speaker",
    instructions="You only speak English. Respond to all queries in English.",
)

triage = Agent(
    name="Language Router",
    instructions="""Determine the language of the user's message.
    Hand off to the appropriate language specialist.""",
    handoffs=[spanish_agent, english_agent],
)

result = Runner.run_sync(triage, "Hola, ¿cómo estás?")
print(result.final_output)      # Response in Spanish
print(result.last_agent.name)   # "Spanish Speaker"

Turn 1 (triage agent):

  • LLM determines the message is Spanish
  • Issues handoff to spanish_agent

Turn 2 (spanish_agent):

  • New system prompt: "You only speak Spanish..."
  • Full conversation history available
  • Responds in Spanish — final output

The result.last_agent tells you which agent actually produced the final response, which is essential for logging and analytics.

Monitoring the Loop

For debugging and observability, enable verbose logging:

from agents import enable_verbose_stdout_logging

enable_verbose_stdout_logging()

# Now all agent loop iterations, tool calls, and handoffs
# are printed to stdout with timestamps

In production, use the built-in tracing integration to send agent loop telemetry to your observability platform.

Design Implications

Understanding the agent loop shapes how you design agents:

  1. Keep tools focused. Each tool should do one thing well. The agent can call multiple tools across turns to compose complex behavior.

  2. Set max_turns based on expected complexity. Count the maximum number of tool calls your agent might need, add a buffer, and set that as your limit.

  3. Use handoffs for specialization. Instead of one agent with 20 tools, create specialized agents with 3-5 tools each and let a triage agent route.

  4. Test the loop, not just the output. Inspect result.new_items to verify the agent took the expected path through tools and handoffs.

  5. Design for recovery. Tools will fail. Instructions should tell the agent how to handle errors gracefully.


Source: OpenAI Agents SDK — Agent Loop


Written by

CallSphere Team
