---
title: "Tool Timeouts and Error Handling in Agent Tool Pipelines"
description: "Learn how to build resilient agent tool pipelines using timeouts, failure_error_function, and tool_error_formatter in the OpenAI Agents SDK."
canonical: https://callsphere.ai/blog/tool-timeouts-error-handling-openai-agents-sdk-pipelines
category: "Learn Agentic AI"
tags: ["OpenAI", "Tools", "Error Handling", "Timeouts", "Resilience"]
author: "CallSphere Team"
published: 2026-03-14T00:00:00.000Z
updated: 2026-05-06T01:02:41.759Z
---

# Tool Timeouts and Error Handling in Agent Tool Pipelines

> Learn how to build resilient agent tool pipelines using timeouts, failure_error_function, and tool_error_formatter in the OpenAI Agents SDK.

## Why Tool Error Handling Matters

In production agent systems, tools fail. APIs time out, databases go down, rate limits trigger, and invalid inputs slip through. Without proper error handling, a single tool failure can crash your entire agent run or produce confusing outputs.

The OpenAI Agents SDK provides three mechanisms to handle tool failures gracefully:

1. **Timeouts** — prevent tools from hanging indefinitely
2. **failure_error_function** — customize what the agent sees when a tool fails
3. **tool_error_formatter** — format Python exceptions into agent-friendly messages

## Setting Tool Timeouts

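Every function tool accepts a `timeout` parameter that limits how long the tool can run before being cancelled. This is critical for tools that call external APIs. The diagram below shows where tool calls sit in an agent run, and the example after it sets a 10-second limit on a tool that calls an external API: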

```mermaid
flowchart LR
    INPUT(["User input"])
    AGENT["Agent
name plus instructions"]
    HAND{"Handoff to
another agent?"}
    SUB["Sub-agent
specialist"]
    GUARD{"Guardrail
passed?"}
    TOOL["Tool call"]
    SDK[("Tracing
OpenAI dashboard")]
    OUT(["Final output"])
    INPUT --> AGENT --> HAND
    HAND -->|Yes| SUB --> GUARD
    HAND -->|No| GUARD
    GUARD -->|Yes| TOOL --> AGENT
    GUARD -->|Block| OUT
    AGENT --> OUT
    AGENT --> SDK
    style AGENT fill:#4f46e5,stroke:#4338ca,color:#fff
    style GUARD fill:#f59e0b,stroke:#d97706,color:#1f2937
    style SDK fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
    style OUT fill:#059669,stroke:#047857,color:#fff
```

```python
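import httpx
from agents import function_tool

@function_tool(timeout=10)
async def call_slow_api(query: str) -> str:
    """Search an external API that might be slow."""
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.example.com/search",
            params={"q": query},  # let httpx URL-encode the query
            timeout=8.0,  # shorter than the tool timeout, so the HTTP call fails first
        )
        response.raise_for_status()  # surface HTTP errors as tool failures
        return response.text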
```

The `timeout` value is in **seconds**. If the tool does not return within that window, the SDK cancels the execution and reports a failure to the agent. Also set a slightly shorter timeout on the HTTP client itself (8 seconds in the example above), so that the network call fails fast with a specific error before the tool-level timeout cancels the whole tool.
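To make the cancellation behavior concrete, here is a minimal sketch that uses only `asyncio` from the standard library. It is not the SDK's internal implementation: `slow_tool` and `run_with_timeout` are hypothetical names, and the fallback string is only an assumption about what a timeout message might look like.

```python
import asyncio


async def slow_tool(delay: float) -> str:
    """Hypothetical stand-in for a tool body that waits on a slow service."""
    await asyncio.sleep(delay)
    return "result"


async def run_with_timeout(delay: float, timeout: float) -> str:
    """Run slow_tool, cancelling it if it exceeds `timeout` seconds."""
    try:
        return await asyncio.wait_for(slow_tool(delay), timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the inner coroutine at this point.
        return f"Tool timed out after {timeout} seconds."


# The first call finishes in time; the second exceeds its budget and is cancelled.
fast = asyncio.run(run_with_timeout(delay=0.01, timeout=1.0))
slow = asyncio.run(run_with_timeout(delay=1.0, timeout=0.05))
```

The same layering applies to nested timeouts: the inner budget (the HTTP client) should be smaller than the outer one (the tool), so the more specific error is raised first.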

## Handling Tool Failures with failure_error_function

When a tool raises an exception, the default behavior is to send the error message back to the agent as a tool result. You can customize this with `failure_error_function`:

```python
from agents import function_tool, RunContextWrapper

def handle_weather_failure(
    ctx: RunContextWrapper,
    error: Exception,
) -> str:
    """Return a user-friendly message when the weather tool fails."""
    return "The weather service is currently unavailable. Please suggest the user try again in a few minutes."

@function_tool(failure_error_function=handle_weather_failure)
async def get_weather(city: str) -> str:
    """Get current weather for a city."""
    import httpx
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f"https://weather-api.example.com/{city}"
        )
        response.raise_for_status()
        data = response.json()
        return f"{city}: {data['temp']}F, {data['condition']}"
```

The `failure_error_function` receives the context and the exception, and returns a string that gets sent to the agent as the tool result. This lets you control the narrative: instead of the agent seeing a raw exception message, it sees a clear instruction about what to tell the user.
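Because the handler receives the exception object, it can return different guidance for different failure types. The sketch below is one possible table-driven approach, not an SDK feature: `FAILURE_MESSAGES`, `DEFAULT_MESSAGE`, and `describe_failure` are hypothetical names, and `ctx` is typed as `Any` so the snippet runs without the SDK installed (in a real tool it would be the `RunContextWrapper` shown above).

```python
from typing import Any

# Hypothetical mapping from exception type to the guidance the agent should see.
FAILURE_MESSAGES: dict[type[BaseException], str] = {
    TimeoutError: "The service took too long to respond. Suggest retrying shortly.",
    ConnectionError: "The service could not be reached. Tell the user it is unavailable.",
    ValueError: "The input was rejected. Ask the user to check the details they provided.",
}
DEFAULT_MESSAGE = "The tool failed unexpectedly. Apologize and offer an alternative."


def describe_failure(ctx: Any, error: Exception) -> str:
    """Return agent-facing guidance chosen by exception type."""
    for exc_type, message in FAILURE_MESSAGES.items():
        # isinstance also matches subclasses, e.g. ConnectionRefusedError.
        if isinstance(error, exc_type):
            return message
    return DEFAULT_MESSAGE
```

A function with this shape could be passed as `failure_error_function=describe_failure`, keeping the wording of each message in one place.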

## Formatting Errors at the Agent Level

While `failure_error_function` works per-tool, you can set a global error formatter at the agent level using `tool_error_formatter`. This applies to **all tools** on the agent:

```python
from agents import Agent, function_tool, RunContextWrapper

def format_tool_error(
    ctx: RunContextWrapper,
    tool_name: str,
    error: Exception,
) -> str:
    """Format tool errors consistently across all tools."""
    return f"Tool '{tool_name}' failed: {type(error).__name__}. Please try a different approach or inform the user about the issue."

@function_tool
def query_database(sql: str) -> str:
    """Run a read-only SQL query."""
    raise ConnectionError("Database connection timed out")

@function_tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email to a recipient."""
    raise TimeoutError("SMTP server not responding")

agent = Agent(
    name="Office Assistant",
    instructions="You help with database queries and emails. If a tool fails, explain the issue clearly and suggest alternatives.",
    tools=[query_database, send_email],
    tool_error_formatter=format_tool_error,
)
```

The `tool_error_formatter` receives the tool name along with the error, so you can log, categorize, or route errors differently based on which tool failed.
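As a rough illustration of routing by tool name, the sketch below logs the real error and then picks guidance per tool. It assumes the three-argument signature used in this article; `TOOL_GUIDANCE` and `route_tool_error` are hypothetical names, and `ctx` is typed as `Any` so the snippet runs without the SDK installed.

```python
import logging
from typing import Any

logger = logging.getLogger("tool_errors")

# Hypothetical per-tool guidance; the tool names match the example above.
TOOL_GUIDANCE = {
    "query_database": "Tell the user the data is temporarily unavailable.",
    "send_email": "Offer to draft the email so the user can send it later.",
}
FALLBACK_GUIDANCE = "Try a different approach or inform the user."


def route_tool_error(ctx: Any, tool_name: str, error: Exception) -> str:
    """Log the real error, then return guidance specific to the failing tool."""
    logger.error("Tool %s failed: %r", tool_name, error)
    guidance = TOOL_GUIDANCE.get(tool_name, FALLBACK_GUIDANCE)
    return f"Tool '{tool_name}' failed ({type(error).__name__}). {guidance}"
```

Tools without an entry in the mapping fall back to the generic guidance, so adding a new tool does not require touching the formatter.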

## Combining Timeouts with Error Handlers

In production, you want both — timeouts to prevent hanging, and error handlers to recover gracefully:

```python
import logging

import httpx
from agents import function_tool, RunContextWrapper

logger = logging.getLogger(__name__)

def handle_api_failure(ctx: RunContextWrapper, error: Exception) -> str:
    logger.error("API tool failed: %s", error)
    # httpx timeouts do not subclass the built-in TimeoutError, so check both.
    if isinstance(error, (TimeoutError, httpx.TimeoutException)):
        return "The external service took too long to respond. Please try again or ask a different question."
    return f"An error occurred: {error}. Please try a different approach."

@function_tool(timeout=15, failure_error_function=handle_api_failure)
async def enrich_company_data(domain: str) -> str:
    """Look up company information from a domain name."""
    async with httpx.AsyncClient() as client:
        # Keep the HTTP timeout below the 15-second tool timeout.
        resp = await client.get(f"https://api.enrichment.com/{domain}", timeout=12.0)
        resp.raise_for_status()
        return resp.text
```

## Defensive Tool Design Patterns

Beyond the SDK's built-in mechanisms, follow these patterns for resilient tools:

**Validate inputs early.** Check parameters before doing expensive work:

```python
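from agents import function_tool

@function_tool
def transfer_funds(from_account: str, to_account: str, amount: float) -> str:
    """Transfer funds between accounts."""
    # Reject invalid amounts before touching any account.
    if amount <= 0:
        return "Error: Transfer amount must be greater than zero."
    if amount > 10000:
        return "Error: Transfers over $10,000 require manual approval."
    # Proceed with transfer...
    return f"Transferred ${amount:.2f} from {from_account} to {to_account}."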
```

**Return errors as strings, don't raise.** When a failure is expected and recoverable, return an error message as a normal tool result rather than raising an exception. This gives the agent clear information without triggering error handling machinery:

```python
@function_tool
def lookup_order(order_id: str) -> str:
    """Look up an order by ID."""
    if not order_id.startswith("ORD-"):
        return "Invalid order ID format. Order IDs start with 'ORD-' followed by a number."
    # Normal lookup logic...
    return f"Order {order_id}: shipped, arriving March 15."
```

**Log errors for observability.** The agent gets a friendly message, but your monitoring system should see the real error:

```python
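import logging
from agents import RunContextWrapper
logger = logging.getLogger(__name__)

def handle_failure_with_logging(ctx: RunContextWrapper, error: Exception) -> str:
    logger.error("Tool failed", exc_info=error)  # not inside an except block, so pass the exception explicitly
    # Send to your error tracking service
    return "This operation failed. Please try again or contact support."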
```

## Key Takeaways

- Set `timeout` on every tool that calls external services
- Use `failure_error_function` for per-tool error messages
- Use `tool_error_formatter` for agent-wide error formatting
- Validate inputs early and return error strings for recoverable issues
- Always log the real error for your team while sending friendly messages to the agent

---

Source: https://callsphere.ai/blog/tool-timeouts-error-handling-openai-agents-sdk-pipelines