Building a Tool Approval System with OpenAI Agents SDK: Human-in-the-Loop for Sensitive Actions

Why Human-in-the-Loop Matters

Some agent actions are irreversible: sending an email, executing a database migration, processing a payment, or modifying user accounts. No matter how good your LLM is, these operations need a human checkpoint. A tool approval system lets agents operate autonomously for safe operations while pausing for human review on sensitive ones.

Designing the Approval Framework

The framework has three components: an approval request, a decision store, and a wrapper that intercepts tool calls.

flowchart LR
    INPUT(["User input"])
    AGENT["Agent<br/>name plus instructions"]
    HAND{"Handoff to<br/>another agent?"}
    SUB["Sub-agent<br/>specialist"]
    GUARD{"Guardrail<br/>passed?"}
    TOOL["Tool call"]
    SDK[("Tracing<br/>OpenAI dashboard")]
    OUT(["Final output"])
    INPUT --> AGENT --> HAND
    HAND -->|Yes| SUB --> GUARD
    HAND -->|No| GUARD
    GUARD -->|Yes| TOOL --> AGENT
    GUARD -->|Block| OUT
    AGENT --> OUT
    AGENT --> SDK
    style AGENT fill:#4f46e5,stroke:#4338ca,color:#fff
    style GUARD fill:#f59e0b,stroke:#d97706,color:#1f2937
    style SDK fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
    style OUT fill:#059669,stroke:#047857,color:#fff

from pydantic import BaseModel
from enum import Enum
from datetime import datetime, timedelta
from typing import Any
import uuid
import asyncio

class ApprovalStatus(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    TIMED_OUT = "timed_out"
    AUTO_APPROVED = "auto_approved"

class ApprovalRequest(BaseModel):
    id: str
    tool_name: str
    arguments: dict[str, Any]
    agent_name: str
    reason: str | None = None
    status: ApprovalStatus = ApprovalStatus.PENDING
    created_at: datetime = datetime.utcnow()
    decided_at: datetime | None = None
    decided_by: str | None = None
    timeout_seconds: int = 300

class ApprovalStore:
    """In-memory approval store. Replace with Redis/DB for production."""

    def __init__(self):
        self._requests: dict[str, ApprovalRequest] = {}

    async def create_request(
        self, tool_name: str, arguments: dict, agent_name: str, timeout: int = 300
    ) -> ApprovalRequest:
        request = ApprovalRequest(
            id=str(uuid.uuid4()),
            tool_name=tool_name,
            arguments=arguments,
            agent_name=agent_name,
            timeout_seconds=timeout,
        )
        self._requests[request.id] = request
        return request

    async def get_request(self, request_id: str) -> ApprovalRequest | None:
        return self._requests.get(request_id)

    async def decide(self, request_id: str, approved: bool, decided_by: str) -> ApprovalRequest:
        request = self._requests[request_id]
        request.status = ApprovalStatus.APPROVED if approved else ApprovalStatus.REJECTED
        request.decided_at = datetime.utcnow()
        request.decided_by = decided_by
        return request

    async def get_pending(self) -> list[ApprovalRequest]:
        return [r for r in self._requests.values() if r.status == ApprovalStatus.PENDING]

Auto-Approve Rules

Not every invocation of a sensitive tool needs manual review. Define rules that auto-approve low-risk invocations.

Hear it before you finish reading

Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.

Try Live →

Try Live Demo →

from dataclasses import dataclass

@dataclass
class AutoApproveRule:
    tool_name: str
    condition: str  # Human-readable description
    check: callable  # Function that returns True to auto-approve

class ApprovalPolicy:
    def __init__(self):
        self._sensitive_tools: set[str] = set()
        self._auto_approve_rules: list[AutoApproveRule] = []

    def mark_sensitive(self, *tool_names: str):
        self._sensitive_tools.update(tool_names)

    def add_auto_approve_rule(self, rule: AutoApproveRule):
        self._auto_approve_rules.append(rule)

    def needs_approval(self, tool_name: str, arguments: dict) -> bool:
        if tool_name not in self._sensitive_tools:
            return False
        # Check auto-approve rules
        for rule in self._auto_approve_rules:
            if rule.tool_name == tool_name and rule.check(arguments):
                return False  # Auto-approved
        return True

# Configure policy
policy = ApprovalPolicy()
policy.mark_sensitive("send_email", "delete_record", "process_payment")

# Auto-approve emails to internal domains
policy.add_auto_approve_rule(AutoApproveRule(
    tool_name="send_email",
    condition="Emails to @company.com are auto-approved",
    check=lambda args: args.get("to", "").endswith("@company.com"),
))

# Auto-approve payments under $10
policy.add_auto_approve_rule(AutoApproveRule(
    tool_name="process_payment",
    condition="Payments under $10 are auto-approved",
    check=lambda args: float(args.get("amount", 999)) < 10.0,
))

Building the Approval Gate

The gate intercepts tool calls that need approval, waits for a decision, and either proceeds or blocks.

from agents import function_tool, RunContextWrapper

approval_store = ApprovalStore()

def requires_approval(policy: ApprovalPolicy, store: ApprovalStore, timeout: int = 300):
    """Decorator that adds an approval gate to a tool function."""
    def decorator(func):
        original_name = func.__name__

        async def wrapper(ctx: RunContextWrapper, **kwargs):
            if not policy.needs_approval(original_name, kwargs):
                return await func(ctx, **kwargs)

            # Create approval request
            request = await store.create_request(
                tool_name=original_name,
                arguments=kwargs,
                agent_name="agent",
                timeout=timeout,
            )

            # Notify (implement your notification channel)
            print(f"APPROVAL NEEDED: {request.id} for {original_name}({kwargs})")

            # Wait for decision with timeout
            deadline = datetime.utcnow() + timedelta(seconds=timeout)
            while datetime.utcnow() < deadline:
                req = await store.get_request(request.id)
                if req.status == ApprovalStatus.APPROVED:
                    return await func(ctx, **kwargs)
                elif req.status == ApprovalStatus.REJECTED:
                    return f"Action '{original_name}' was rejected by {req.decided_by}."
                await asyncio.sleep(2)

            request.status = ApprovalStatus.TIMED_OUT
            return f"Action '{original_name}' timed out waiting for approval."

        wrapper.__name__ = original_name
        wrapper.__doc__ = func.__doc__
        return wrapper
    return decorator

Defining Sensitive Tools

@function_tool
@requires_approval(policy, approval_store, timeout=120)
async def send_email(ctx: RunContextWrapper, to: str, subject: str, body: str) -> str:
    """Send an email to the specified recipient."""
    # Actual email sending logic
    return f"Email sent to {to} with subject '{subject}'"

@function_tool
@requires_approval(policy, approval_store, timeout=60)
async def delete_record(ctx: RunContextWrapper, table: str, record_id: str) -> str:
    """Delete a record from the database."""
    return f"Record {record_id} deleted from {table}"

@function_tool
async def search_records(ctx: RunContextWrapper, query: str) -> str:
    """Search records — no approval needed."""
    return f"Found 5 records matching '{query}'"

Approval Dashboard Endpoint

Expose pending approvals via an API so reviewers can approve or reject actions.

from fastapi import FastAPI

app = FastAPI()

@app.get("/approvals/pending")
async def list_pending():
    pending = await approval_store.get_pending()
    return [r.model_dump() for r in pending]

@app.post("/approvals/{request_id}/decide")
async def decide_approval(request_id: str, approved: bool, reviewer: str):
    request = await approval_store.decide(request_id, approved, reviewer)
    return request.model_dump()

FAQ

How do I notify reviewers when approval is needed?

Integrate your notification channel (Slack, email, PagerDuty) in the approval gate. When a request is created, send a message with the tool name, arguments, and a link to the approval endpoint. Include a direct approve/reject URL for one-click decisions from the notification.

Still reading? Stop comparing — try CallSphere live.

CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.

Try Live Demo → Book 30-min Walkthrough See Pricing

What happens to the agent while waiting for approval?

The agent's tool call is blocked on the async wait loop. The Runner keeps the agent's state alive. From the user's perspective, the agent is "thinking." For long waits, use streaming to send a progress message like "Waiting for approval from your administrator" so the user is not left without feedback.

How do I handle approval for multi-agent systems with handoffs?

Each agent can have its own approval policy. When a handoff occurs, the receiving agent's policy governs its tool calls independently. Store the originating agent name in the approval request so reviewers have full context about which agent in the chain requested the action.

#OpenAIAgentsSDK #HumanintheLoop #ToolApproval #Safety #Python #Production #AgenticAI #LearnAI #AIEngineering

Building a Tool Approval System with OpenAI Agents SDK: Human-in-the-Loop for Sensitive Actions

Why Human-in-the-Loop Matters

Designing the Approval Framework

Auto-Approve Rules

Building the Approval Gate

Defining Sensitive Tools

Approval Dashboard Endpoint

FAQ

How do I notify reviewers when approval is needed?

What happens to the agent while waiting for approval?

How do I handle approval for multi-agent systems with handoffs?

Try CallSphere AI Voice Agents

Related Articles You May Like

Building an HVAC After-Hours Emergency Escalation System: A Complete Engineering Guide

GPT-Realtime-2 128K Context: What It Unlocks for Voice Agents

Human-in-the-Loop Hybrid Agents: 73% Fewer Errors in 2026

Streaming Agent Responses with OpenAI Agents SDK and LangChain in 2026

Token-Level Evaluation of Streaming Agents: TTFT, Stream Smoothness, and Mid-Stream Hallucination Detection

Tool Selection Accuracy: The Eval Most Teams Skip — and Should Not (2026)