---
title: "Tool Guardrails: Protecting Function Execution"
description: "Learn how to implement tool input and output guardrails in the OpenAI Agents SDK to validate function arguments, skip dangerous calls, and replace tool outputs before they reach the agent."
canonical: https://callsphere.ai/blog/tool-guardrails-protecting-function-execution-openai-agents-sdk
category: "Learn Agentic AI"
tags: ["OpenAI", "Tool Guardrails", "Security", "Function Protection"]
author: "CallSphere Team"
published: 2026-03-14T00:00:00.000Z
updated: 2026-05-06T07:29:39.190Z
---

# Tool Guardrails: Protecting Function Execution

> Learn how to implement tool input and output guardrails in the OpenAI Agents SDK to validate function arguments, skip dangerous calls, and replace tool outputs before they reach the agent.

## Why Tool Execution Needs Its Own Guardrails

Input guardrails catch bad user messages. Output guardrails catch bad agent responses. But between those two checkpoints, the agent calls tools — and tool calls are where the real damage happens. A malformed tool call can delete database records, send emails to the wrong recipient, charge a credit card for the wrong amount, or leak internal data through an API.

Tool guardrails in the OpenAI Agents SDK intercept tool execution at two points: before the function runs (tool input guardrails) and after it returns (tool output guardrails). They give you the ability to validate arguments, skip dangerous calls entirely, or replace tool outputs with sanitized versions.

## Tool Input Guardrails: Validating Before Execution

A tool input guardrail inspects the arguments that the agent has decided to pass to a function. It runs after the LLM has generated the tool call but before the actual function executes.

```mermaid
flowchart LR
    INPUT(["User input"])
    AGENT["Agent
name plus instructions"]
    HAND{"Handoff to
another agent?"}
    SUB["Sub-agent
specialist"]
    GUARD{"Guardrail
passed?"}
    TOOL["Tool call"]
    SDK[("Tracing
OpenAI dashboard")]
    OUT(["Final output"])
    INPUT --> AGENT --> HAND
    HAND -->|Yes| SUB --> GUARD
    HAND -->|No| GUARD
    GUARD -->|Yes| TOOL --> AGENT
    GUARD -->|Block| OUT
    AGENT --> OUT
    AGENT --> SDK
    style AGENT fill:#4f46e5,stroke:#4338ca,color:#fff
    style GUARD fill:#f59e0b,stroke:#d97706,color:#1f2937
    style SDK fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
    style OUT fill:#059669,stroke:#047857,color:#fff
```

```python
from agents import Agent, Runner, function_tool
from pydantic import BaseModel
import asyncio

@function_tool
def transfer_funds(from_account: str, to_account: str, amount: float) -> str:
    """Transfer funds between customer accounts."""
    # In production, this calls your banking API
    return f"Transferred ${amount:.2f} from {from_account} to {to_account}"

@function_tool
def get_account_balance(account_id: str) -> str:
    """Get the current balance for an account."""
    # Simulated lookup
    balances = {"ACC001": 5420.50, "ACC002": 12300.00}
    balance = balances.get(account_id, 0.0)
    return f"Account {account_id} balance: ${balance:.2f}"
```

Now define a tool input guardrail that validates transfer amounts:

```python
import json

async def transfer_amount_guardrail(ctx, agent, tool_call):
    """Block transfers above the auto-approval limit."""
    if tool_call.function.name != "transfer_funds":
        return None  # Only check transfer_funds calls

    args = json.loads(tool_call.function.arguments)
    amount = args.get("amount", 0)

    if amount > 10000:
        return {
            "skip": True,
            "replacement_output": (
                "Transfer blocked: amounts over $10,000 require "
                "manager approval. Please escalate this request."
            ),
        }

    if amount <= 0:
        return {
            "skip": True,
            "replacement_output": (
                "Transfer blocked: amount must be a positive number."
            ),
        }

    return None  # Allow the call to proceed
```

When the guardrail returns `None`, the tool call proceeds normally. When it returns a dictionary with `skip: True`, the actual function is never called, and the `replacement_output` is fed back to the agent as if the tool had returned that value.
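This contract is easy to exercise outside the SDK. A minimal sketch, assuming the guardrail signature shown above; `SimpleNamespace` is a stand-in for the SDK's tool-call object, which exposes `.function.name` and `.function.arguments` (a JSON string):

```python
import asyncio
import json
from types import SimpleNamespace

async def transfer_amount_guardrail(ctx, agent, tool_call):
    """Same checks as above, restated so this sketch is self-contained."""
    if tool_call.function.name != "transfer_funds":
        return None
    args = json.loads(tool_call.function.arguments)
    amount = args.get("amount", 0)
    if amount > 10000:
        return {"skip": True,
                "replacement_output": "Transfer blocked: requires manager approval."}
    if amount <= 0:
        return {"skip": True,
                "replacement_output": "Transfer blocked: amount must be positive."}
    return None

def fake_tool_call(name, **kwargs):
    # Mimics the shape the guardrail reads: .function.name and
    # .function.arguments as a JSON string.
    return SimpleNamespace(
        function=SimpleNamespace(name=name, arguments=json.dumps(kwargs))
    )

blocked = asyncio.run(transfer_amount_guardrail(
    None, None, fake_tool_call("transfer_funds", from_account="ACC001",
                               to_account="ACC002", amount=25000)))
allowed = asyncio.run(transfer_amount_guardrail(
    None, None, fake_tool_call("transfer_funds", from_account="ACC001",
                               to_account="ACC002", amount=500)))
print(blocked["skip"])   # True: call skipped, replacement fed to the agent
print(allowed)           # None: call proceeds
```

Testing the guardrail in isolation like this catches logic errors before they interact with a live model.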

### Attaching Tool Input Guardrails to an Agent

```python
banking_agent = Agent(
    name="BankingAgent",
    instructions="""You are a banking support agent. You can check
    account balances and transfer funds between accounts. Always
    confirm the details with the customer before executing a transfer.""",
    model="gpt-4o",
    tools=[transfer_funds, get_account_balance],
    tool_use_guardrails=[
        {
            "type": "input",
            "guardrail_function": transfer_amount_guardrail,
        },
    ],
)
```

This is a fundamentally different safety model from relying on prompt instructions. The prompt says "confirm before transferring," but the guardrail enforces a hard limit regardless of what the model decides to do.

## Tool Output Guardrails: Sanitizing After Execution

Tool output guardrails run after the function returns but before the result is passed back to the agent. They are useful for redacting sensitive data, normalizing formats, or adding warnings to tool results.

```python
import re

async def redact_tool_output_guardrail(ctx, agent, tool_call, tool_output):
    """Redact sensitive fields from tool outputs before the agent sees them."""
    output_str = str(tool_output)

    # Redact SSNs (keep last 4)
    output_str = re.sub(
        r"\d{3}-\d{2}-(\d{4})",
        r"***-**-\1",
        output_str,
    )

    # Redact credit card numbers (keep last 4)
    output_str = re.sub(
        r"\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?(\d{4})",
        r"****-****-****-\1",
        output_str,
    )

    if output_str != str(tool_output):
        return {"replacement_output": output_str}

    return None  # No modification needed
```

The agent sees the redacted version. It can still reference "the card ending in 4242" or "the last four of your SSN" without the full sensitive data ever appearing in the conversation context. This is critical because the full conversation context is often logged, cached, or sent to other services.
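The substitution patterns can be verified in isolation. A standalone sketch of the same two regexes, separated from the guardrail plumbing:

```python
import re

def redact(text: str) -> str:
    """Apply the SSN and card-number substitutions, keeping the last four digits."""
    # SSNs in NNN-NN-NNNN form
    text = re.sub(r"\d{3}-\d{2}-(\d{4})", r"***-**-\1", text)
    # 16-digit card numbers, with optional hyphen or space separators
    text = re.sub(
        r"\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?(\d{4})",
        r"****-****-****-\1",
        text,
    )
    return text

sample = "SSN 123-45-6789, card 4242 4242 4242 4242"
print(redact(sample))
# SSN ***-**-6789, card ****-****-****-4242
```

Note the raw strings (`r"..."`): without them, `\d` would be mangled by Python's string escaping and the patterns would silently fail to match.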

### Attaching Output Guardrails

```python
customer_agent = Agent(
    name="CustomerAgent",
    instructions="Help customers with account inquiries.",
    model="gpt-4o",
    tools=[lookup_customer, get_transactions],
    tool_use_guardrails=[
        {
            "type": "output",
            "guardrail_function": redact_tool_output_guardrail,
        },
    ],
)
```

## Combining Input and Output Tool Guardrails

For maximum protection, apply both input and output guardrails to the same agent. Input guardrails prevent dangerous calls. Output guardrails sanitize the results of allowed calls.

```python
secure_agent = Agent(
    name="SecureAgent",
    instructions="You are a secure financial assistant.",
    model="gpt-4o",
    tools=[transfer_funds, get_account_balance, lookup_customer],
    tool_use_guardrails=[
        {
            "type": "input",
            "guardrail_function": transfer_amount_guardrail,
        },
        {
            "type": "input",
            "guardrail_function": block_after_hours_guardrail,
        },
        {
            "type": "output",
            "guardrail_function": redact_tool_output_guardrail,
        },
    ],
)
```

## Skipping Calls vs Replacing Output

Tool guardrails give you two distinct intervention strategies, and choosing the right one depends on the scenario.

**Skipping the call** means the function never executes. Use this when the tool call itself is dangerous — transferring too much money, deleting data, or calling an external API with invalid parameters.

**Replacing the output** means the function executes normally, but its return value is modified before the agent sees it. Use this when the function is safe to call but its output contains sensitive data that should not enter the conversation context.

```python
import json

async def selective_guardrail(ctx, agent, tool_call):
    """Example showing both skip and allow-with-modification patterns."""

    if tool_call.function.name == "delete_record":
        # SKIP: Never allow deletion through the agent
        return {
            "skip": True,
            "replacement_output": (
                "Record deletion is not available through this interface. "
                "Please submit a deletion request through the admin portal."
            ),
        }

    if tool_call.function.name == "search_users":
        args = json.loads(tool_call.function.arguments)
        query = args.get("query", "")
        if len(query) < 3:
            # SKIP: Prevent overly broad searches
            return {
                "skip": True,
                "replacement_output": (
                    "Search query must be at least 3 characters. "
                    "Please ask the customer for more specific information."
                ),
            }

    return None  # Allow all other calls
```

## Real-World Pattern: Audit Logging Through Tool Guardrails

Tool guardrails are an excellent place to implement audit logging because they see every tool call the agent makes, including the arguments and outputs.

```python
import json
from datetime import datetime, timezone

async def audit_log_guardrail(ctx, agent, tool_call):
    """Log every tool call for audit purposes. Never skip or modify."""
    args = json.loads(tool_call.function.arguments)

    audit_entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_name": agent.name,
        "tool_name": tool_call.function.name,
        "arguments": args,
        "session_id": getattr(ctx, "session_id", "unknown"),
    }

    # Write to your audit log (database, file, or external service)
    await write_audit_log(audit_entry)

    # Always return None — this guardrail never blocks
    return None
```

This guardrail observes without interfering. Every tool call is logged with full context, giving you a complete audit trail of what the agent did, when, and with what parameters. This is invaluable for compliance, debugging, and understanding agent behavior in production.
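The `write_audit_log` call above is a placeholder. One minimal, hypothetical implementation appends JSON lines to a local file without blocking the event loop; a production system would target a database or log pipeline instead:

```python
import asyncio
import json

def _append_line(path: str, line: str) -> None:
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")

async def write_audit_log(entry: dict, path: str = "audit.jsonl") -> None:
    # Serialize non-JSON types (e.g. datetimes) with str(); run the
    # blocking file write in a worker thread so the agent loop is not stalled.
    line = json.dumps(entry, default=str)
    await asyncio.to_thread(_append_line, path, line)

asyncio.run(write_audit_log(
    {"tool_name": "transfer_funds", "arguments": {"amount": 500}},
    path="audit_demo.jsonl",
))
```

JSON Lines works well here because each tool call appends one self-contained record that downstream tools can stream without parsing the whole file.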

## Best Practices

**Fail closed, not open.** If your guardrail encounters an error during evaluation (network timeout, parsing failure), skip the tool call rather than allowing it. An errored guardrail should block, not pass.
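One way to make fail-closed the default is a wrapper that converts any guardrail exception into a skip. A sketch, assuming the dict-based return contract used throughout this post:

```python
import asyncio
import functools

def fail_closed(guardrail_fn):
    """Wrap a guardrail so any internal error blocks the tool call."""
    @functools.wraps(guardrail_fn)
    async def wrapper(ctx, agent, tool_call):
        try:
            return await guardrail_fn(ctx, agent, tool_call)
        except Exception:
            # An errored guardrail blocks rather than silently allowing.
            return {
                "skip": True,
                "replacement_output": (
                    "Tool call blocked: safety check could not be completed."
                ),
            }
    return wrapper

@fail_closed
async def flaky_guardrail(ctx, agent, tool_call):
    raise TimeoutError("policy service unreachable")

result = asyncio.run(flaky_guardrail(None, None, None))
print(result["skip"])  # True
```

Applying the wrapper at registration time means every guardrail gets the same failure behavior without repeating try/except in each one.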

**Keep guardrail logic simple.** Tool guardrails add latency to every tool call. Use fast checks — argument validation, threshold comparisons, regex matching. Reserve LLM-based evaluation for input and output guardrails where the overhead is amortized across the full request.

**Test with adversarial tool calls.** Craft test cases where the model generates edge-case arguments: negative amounts, empty strings, SQL injection in search queries, extremely long inputs. Your guardrails should handle all of these gracefully.
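Factoring the validation into a pure function (a hypothetical `validate_transfer_amount` helper) makes these edge cases trivial to sweep without any SDK plumbing:

```python
def validate_transfer_amount(amount) -> str:
    """Classify a transfer amount; anything unexpected is rejected."""
    if not isinstance(amount, (int, float)) or amount != amount:  # NaN fails !=
        return "invalid"
    if amount <= 0:
        return "non-positive"
    if amount > 10000:
        return "over-limit"
    return "ok"

cases = [
    (-50, "non-positive"),
    (0, "non-positive"),
    (10000.01, "over-limit"),
    ("50", "invalid"),          # stringly-typed argument from the model
    (float("nan"), "invalid"),  # NaN compares unequal to itself
    (250.0, "ok"),
]
for amount, expected in cases:
    assert validate_transfer_amount(amount) == expected
print("all adversarial cases handled")
```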

**Separate concerns.** Use one guardrail per concern — one for amount limits, one for audit logging, one for PII redaction. This makes them independently testable and easy to enable or disable per environment.
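A small combinator (hypothetical, matching the call signature used in this post) keeps each concern in its own function while running them in sequence; the first guardrail to return a non-None decision wins:

```python
import asyncio

def compose_guardrails(*guardrails):
    """Run guardrails in order; the first non-None decision short-circuits."""
    async def combined(ctx, agent, tool_call):
        for guardrail in guardrails:
            result = await guardrail(ctx, agent, tool_call)
            if result is not None:
                return result
        return None
    return combined

async def always_allow(ctx, agent, tool_call):
    return None

async def always_block(ctx, agent, tool_call):
    return {"skip": True, "replacement_output": "Blocked by policy."}

combined = compose_guardrails(always_allow, always_block)
decision = asyncio.run(combined(None, None, None))
print(decision["skip"])  # True: the blocking guardrail decided
```

Per-environment configuration then becomes a matter of which functions you pass to the combinator, not edits inside a monolithic guardrail.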

---

Source: https://callsphere.ai/blog/tool-guardrails-protecting-function-execution-openai-agents-sdk
