
Tool Timeouts and Error Handling in Agent Tool Pipelines

Learn how to build resilient agent tool pipelines using timeouts, failure_error_function, and tool_error_formatter in the OpenAI Agents SDK.

Why Tool Error Handling Matters

In production agent systems, tools fail. APIs time out, databases go down, rate limits trigger, and invalid inputs slip through. Without proper error handling, a single tool failure can crash your entire agent run or produce confusing outputs.

The OpenAI Agents SDK provides three mechanisms to handle tool failures gracefully:

  1. Timeouts — prevent tools from hanging indefinitely
  2. failure_error_function — customize what the agent sees when a tool fails
  3. tool_error_formatter — format Python exceptions into agent-friendly messages

Setting Tool Timeouts

Every function tool accepts a timeout parameter that limits how long the tool can run before being cancelled. This is critical for tools that call external APIs:

import httpx
from agents import function_tool

@function_tool(timeout=10)
async def call_slow_api(query: str) -> str:
    """Search an external API that might be slow."""
    async with httpx.AsyncClient() as client:
        # Pass the query via params so it is URL-encoded correctly.
        response = await client.get(
            "https://api.example.com/search",
            params={"q": query},
            timeout=8.0,  # shorter than the tool timeout so the HTTP call fails first
        )
        return response.text

The timeout value is in seconds. If the tool does not return within that window, the SDK cancels the execution and reports a failure to the agent. Also set a timeout on the HTTP client itself, as shown above, and keep it shorter than the tool timeout (8 seconds versus 10 here) so the network call fails first with a specific error instead of being cut off by the tool-level limit.
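
To make the cancellation behavior concrete, here is a minimal sketch that uses plain asyncio.wait_for rather than the SDK. It is not how the SDK implements timeout internally; slow_tool and run_with_timeout are hypothetical names used only for illustration. It shows that a coroutine which overruns its budget is cancelled, that its cleanup code still runs, and that the caller can turn the timeout into a readable message.

```python
import asyncio


async def slow_tool(delay: float, log: list[str]) -> str:
    """Hypothetical tool body that sleeps to simulate a slow API."""
    try:
        await asyncio.sleep(delay)
        return "done"
    finally:
        # Runs on normal completion and when the task is cancelled.
        log.append("cleanup")


async def run_with_timeout(delay: float, timeout: float, log: list[str]) -> str:
    """Run slow_tool under a time budget and convert a timeout into a message."""
    try:
        return await asyncio.wait_for(slow_tool(delay, log), timeout=timeout)
    except asyncio.TimeoutError:
        return f"Tool timed out after {timeout} seconds."
```

Because cancellation is delivered at an await point, a tool that blocks the event loop with synchronous work cannot be interrupted this way, which is one more reason to give the underlying client its own timeout.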

Handling Tool Failures with failure_error_function

When a tool raises an exception, the default behavior is to send the error message back to the agent as a tool result. You can customize this with failure_error_function:

import httpx
from agents import function_tool, RunContextWrapper

def handle_weather_failure(
    ctx: RunContextWrapper,
    error: Exception,
) -> str:
    """Return a user-friendly message when the weather tool fails."""
    return "The weather service is currently unavailable. Please suggest the user try again in a few minutes."

@function_tool(failure_error_function=handle_weather_failure)
async def get_weather(city: str) -> str:
    """Get current weather for a city."""
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f"https://weather-api.example.com/{city}"
        )
        # Raise on 4xx/5xx so the failure handler runs instead of parsing an error body.
        response.raise_for_status()
        data = response.json()
        return f"{city}: {data['temp']}F, {data['condition']}"

The failure_error_function receives the context and the exception, and returns a string that is sent to the agent as the tool result. This lets you control what the agent reads: instead of a raw exception message, it sees a clear instruction about what to tell the user.
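
A handler can also branch on the exception type so that the agent gets guidance matching the failure. The sketch below is independent of the SDK: ctx is typed as Any to stand in for the RunContextWrapper argument, and describe_failure and its messages are hypothetical.

```python
from typing import Any


def describe_failure(ctx: Any, error: Exception) -> str:
    """Map common failure types to guidance for the agent (illustrative only)."""
    if isinstance(error, TimeoutError):
        return "The service took too long to respond. Suggest retrying in a few minutes."
    if isinstance(error, ConnectionError):
        return "The service could not be reached. Tell the user it is unavailable."
    if isinstance(error, (KeyError, ValueError)):
        return "The service returned unexpected data. Do not guess a result."
    # Fallback: avoid echoing raw exception text, which may contain internals.
    return "The tool failed for an unknown reason. Apologize and offer alternatives."
```

A function with this shape matches the two-argument signature used by handle_weather_failure, so it could be passed as failure_error_function in the same way.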


Formatting Errors at the Agent Level

While failure_error_function works per-tool, you can set a global error formatter at the agent level using tool_error_formatter. This applies to all tools on the agent:

from agents import Agent, function_tool, RunContextWrapper

def format_tool_error(
    ctx: RunContextWrapper,
    tool_name: str,
    error: Exception,
) -> str:
    """Format tool errors consistently across all tools."""
    return f"Tool '{tool_name}' failed: {type(error).__name__}. Please try a different approach or inform the user about the issue."

@function_tool
def query_database(sql: str) -> str:
    """Run a read-only SQL query."""
    raise ConnectionError("Database connection timed out")

@function_tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email to a recipient."""
    raise TimeoutError("SMTP server not responding")

agent = Agent(
    name="Office Assistant",
    instructions="You help with database queries and emails. If a tool fails, explain the issue clearly and suggest alternatives.",
    tools=[query_database, send_email],
    tool_error_formatter=format_tool_error,
)

The tool_error_formatter receives the tool name along with the error, so you can log, categorize, or route errors differently based on which tool failed.
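
As a rough illustration of routing by tool name, the sketch below logs the failure and appends a per-tool hint. It assumes the three-argument signature used in this article's example; ctx is typed as Any so the snippet runs without the SDK, and TOOL_HINTS and format_tool_error_by_name are hypothetical names.

```python
import logging
from typing import Any

logger = logging.getLogger("tool_errors")

# Hypothetical per-tool guidance; the tool names match the example above.
TOOL_HINTS = {
    "query_database": "Ask the user whether to retry the query later.",
    "send_email": "Offer to draft the email so the user can send it manually.",
}


def format_tool_error_by_name(ctx: Any, tool_name: str, error: Exception) -> str:
    """Log the failure, then return a message with tool-specific guidance."""
    logger.warning(
        "tool=%s error_type=%s detail=%s", tool_name, type(error).__name__, error
    )
    hint = TOOL_HINTS.get(tool_name, "Inform the user about the issue.")
    return f"Tool '{tool_name}' failed: {type(error).__name__}. {hint}"
```

Keeping the hints in a dictionary means a new tool only needs a new entry, and unknown tools still fall back to a generic instruction.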

Combining Timeouts with Error Handlers

In production, you want both — timeouts to prevent hanging, and error handlers to recover gracefully:

import logging

import httpx
from agents import function_tool, RunContextWrapper

logger = logging.getLogger(__name__)

def handle_api_failure(ctx: RunContextWrapper, error: Exception) -> str:
    logger.error("API tool failed: %s", error)
    # httpx timeouts do not subclass the built-in TimeoutError, so check both.
    if isinstance(error, (TimeoutError, httpx.TimeoutException)):
        return "The external service took too long to respond. Please try again or ask a different question."
    return f"An error occurred: {error}. Please try a different approach."

@function_tool(timeout=15, failure_error_function=handle_api_failure)
async def enrich_company_data(domain: str) -> str:
    """Look up company information from a domain name."""
    async with httpx.AsyncClient() as client:
        # Keep the client timeout below the 15-second tool timeout.
        resp = await client.get(f"https://api.enrichment.com/{domain}", timeout=12.0)
        resp.raise_for_status()
        return resp.text

Defensive Tool Design Patterns

Beyond the SDK's built-in mechanisms, follow these patterns for resilient tools:

Validate inputs early. Check parameters before doing expensive work:

from agents import function_tool

@function_tool
def transfer_funds(from_account: str, to_account: str, amount: float) -> str:
    """Transfer funds between accounts."""
    if amount <= 0:
        return "Error: Transfer amount must be positive."
    if amount > 10000:
        return "Error: Transfers over $10,000 require manual approval."
    # Proceed with transfer...
    return f"Transferred ${amount:.2f} from {from_account} to {to_account}."

Return errors as strings instead of raising. When a failure is expected and recoverable, return an error message as a normal tool result rather than raising an exception. The agent gets clear information and the error-handling machinery is never triggered:

@function_tool
def lookup_order(order_id: str) -> str:
    """Look up an order by ID."""
    if not order_id.startswith("ORD-"):
        return "Invalid order ID format. Order IDs start with 'ORD-' followed by a number."
    # Normal lookup logic...
    return f"Order {order_id}: shipped, arriving March 15."

Log errors for observability. The agent gets a friendly message, but your monitoring system should see the real error:

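import logging

from agents import RunContextWrapper

logger = logging.getLogger(__name__)

def handle_failure_with_logging(ctx: RunContextWrapper, error: Exception) -> str:
    # Pass the exception explicitly: the handler runs outside an except block.
    logger.error("Tool failed", exc_info=error)
    # Send to your error tracking service
    return "This operation failed. Please try again or contact support."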
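
One possible refinement, sketched here with the standard library only, is to attach a short reference ID to both the log entry and the friendly message so that a support request can be matched to the underlying error. The reference scheme and the name handle_failure_with_reference are assumptions for illustration, and ctx is typed as Any so the snippet runs without the SDK.

```python
import logging
import uuid
from typing import Any

logger = logging.getLogger("tool_failures")


def handle_failure_with_reference(ctx: Any, error: Exception) -> str:
    """Log the full error under a short reference ID and return only the ID."""
    reference = uuid.uuid4().hex[:8]
    # exc_info=error attaches the exception and its traceback to the log record.
    logger.error("Tool failed [ref=%s]", reference, exc_info=error)
    return (
        f"This operation failed (reference {reference}). "
        "Please try again or contact support."
    )
```

The agent never sees the exception text, but anyone reading the logs can search for the reference the user reports.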

Key Takeaways

  • Set timeout on every tool that calls external services
  • Use failure_error_function for per-tool error messages
  • Use tool_error_formatter for agent-wide error formatting
  • Validate inputs early and return error strings for recoverable issues
  • Always log the real error for your team while sending friendly messages to the agent

Written by

CallSphere Team
