OpenAI Agents SDK Lifecycle Hooks: Before/After Agent Run, Tool Call, and Handoff Events

Master the lifecycle hook system in OpenAI Agents SDK to add custom logging, metrics collection, request modification, and observability at every stage of agent execution.

Why Lifecycle Hooks Matter

In production, you need to know what your agents are doing. Not just the final output — every tool call, every handoff, every model invocation. Lifecycle hooks give you insertion points at every stage of agent execution without modifying your agent logic.

Use hooks for logging, metrics, cost tracking, latency measurement, auditing, and even modifying behavior at runtime.

The AgentHooks Interface

The SDK provides the AgentHooks class with methods you can override. Each method fires at a specific point in the agent lifecycle.

flowchart LR
    INPUT(["User input"])
    AGENT["Agent<br/>name plus instructions"]
    HAND{"Handoff to<br/>another agent?"}
    SUB["Sub-agent<br/>specialist"]
    GUARD{"Guardrail<br/>passed?"}
    TOOL["Tool call"]
    SDK[("Tracing<br/>OpenAI dashboard")]
    OUT(["Final output"])
    INPUT --> AGENT --> HAND
    HAND -->|Yes| SUB --> GUARD
    HAND -->|No| GUARD
    GUARD -->|Yes| TOOL --> AGENT
    GUARD -->|Block| OUT
    AGENT --> OUT
    AGENT --> SDK
    style AGENT fill:#4f46e5,stroke:#4338ca,color:#fff
    style GUARD fill:#f59e0b,stroke:#d97706,color:#1f2937
    style SDK fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
    style OUT fill:#059669,stroke:#047857,color:#fff

import logging
import time
from typing import Any

from agents import Agent, AgentHooks, Runner, RunContextWrapper, Tool

logger = logging.getLogger("agent_hooks")

class ObservabilityHooks(AgentHooks):
    """Comprehensive hooks for logging and metrics.

    Assumes the run is started with a mutable dict as its context,
    e.g. Runner.run(agent, user_input, context={}).
    """

    async def on_start(self, context: RunContextWrapper, agent: Agent) -> None:
        """Fires each time this agent becomes the active agent."""
        # on_start receives only the context and the agent, not the run input.
        context.context["_start_time"] = time.monotonic()
        context.context["_tool_calls"] = 0
        logger.info(f"Agent '{agent.name}' started")

    async def on_end(
        self, context: RunContextWrapper, agent: Agent, output: Any
    ) -> None:
        """Fires when the agent produces its final output."""
        elapsed = time.monotonic() - context.context.get("_start_time", 0)
        tool_count = context.context.get("_tool_calls", 0)
        # output can be any type (plain text or a structured object), so measure its string form.
        logger.info(
            f"Agent '{agent.name}' completed | "
            f"duration={elapsed:.2f}s | "
            f"tool_calls={tool_count} | "
            f"output_length={len(str(output))}"
        )

    async def on_tool_start(
        self, context: RunContextWrapper, agent: Agent, tool: Tool
    ) -> None:
        """Fires before a tool executes."""
        context.context["_tool_start"] = time.monotonic()
        context.context["_tool_calls"] += 1
        logger.info(f"Tool '{tool.name}' starting on agent '{agent.name}'")

    async def on_tool_end(
        self, context: RunContextWrapper, agent: Agent, tool: Tool, result: str
    ) -> None:
        """Fires after a tool completes."""
        tool_elapsed = time.monotonic() - context.context.get("_tool_start", 0)
        logger.info(
            f"Tool '{tool.name}' completed | "
            f"duration={tool_elapsed:.2f}s | "
            f"result_length={len(str(result))}"
        )

    async def on_handoff(
        self, context: RunContextWrapper, agent: Agent, source: Agent
    ) -> None:
        """Fires on the agent receiving control; `source` is the agent handing off."""
        logger.info(f"Handoff: '{source.name}' -> '{agent.name}'")

Attaching Hooks to Agents

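Assign hooks when creating an agent. The hooks above keep their timers in context.context, so start the run with a mutable dict as the context, for example Runner.run(monitored_agent, user_input, context={}), where user_input is your prompt. If no context is passed, context.context is None and the first hook call fails.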

hooks = ObservabilityHooks()

monitored_agent = Agent(
    name="support_agent",
    instructions="You are a helpful support agent.",
    hooks=hooks,
)

Building a Metrics Collector

Go beyond logging by collecting structured metrics you can push to Prometheus, Datadog, or any metrics backend.

import time
from collections import defaultdict
from dataclasses import dataclass, field

from agents import AgentHooks

@dataclass
class AgentMetrics:
    total_runs: int = 0
    total_tool_calls: int = 0
    total_handoffs: int = 0
    total_errors: int = 0
    latency_samples: list[float] = field(default_factory=list)
    tool_call_counts: dict[str, int] = field(default_factory=lambda: defaultdict(int))
    handoff_counts: dict[str, int] = field(default_factory=lambda: defaultdict(int))

    @property
    def avg_latency(self) -> float:
        if not self.latency_samples:
            return 0.0
        return sum(self.latency_samples) / len(self.latency_samples)

    @property
    def p95_latency(self) -> float:
        if not self.latency_samples:
            return 0.0
        sorted_samples = sorted(self.latency_samples)
        idx = int(len(sorted_samples) * 0.95)
        return sorted_samples[min(idx, len(sorted_samples) - 1)]

    def to_dict(self) -> dict:
        return {
            "total_runs": self.total_runs,
            "total_tool_calls": self.total_tool_calls,
            "total_handoffs": self.total_handoffs,
            "avg_latency_ms": round(self.avg_latency * 1000, 2),
            "p95_latency_ms": round(self.p95_latency * 1000, 2),
            "top_tools": dict(sorted(
                self.tool_call_counts.items(),
                key=lambda x: x[1], reverse=True
            )[:10]),
        }

metrics = AgentMetrics()

class MetricsHooks(AgentHooks):
    def __init__(self, metrics_store: AgentMetrics):
        self.metrics = metrics_store

    async def on_start(self, context, agent):
        self.metrics.total_runs += 1
        # Requires a dict context, e.g. Runner.run(..., context={})
        context.context["_run_start"] = time.monotonic()

    async def on_end(self, context, agent, output):
        # Record a sample only if a start time exists, so a missing key
        # cannot add a bogus multi-hour latency.
        start = context.context.get("_run_start")
        if start is not None:
            self.metrics.latency_samples.append(time.monotonic() - start)

    async def on_tool_start(self, context, agent, tool):
        self.metrics.total_tool_calls += 1
        self.metrics.tool_call_counts[tool.name] += 1

    async def on_handoff(self, context, agent, source):
        # Fires on the agent receiving control; `source` is the agent handing off.
        self.metrics.total_handoffs += 1
        key = f"{source.name}->{agent.name}"
        self.metrics.handoff_counts[key] += 1

Exposing Metrics via an API

If you are using FastAPI, expose a metrics endpoint.

from fastapi import FastAPI

app = FastAPI()

@app.get("/metrics")
async def get_metrics():
    return metrics.to_dict()

@app.get("/metrics/tools")
async def get_tool_metrics():
    return dict(metrics.tool_call_counts)
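
The JSON endpoints above are handy for dashboards, but a Prometheus server scrapes a plain-text exposition format instead. The following is a minimal sketch of that conversion, not part of the SDK or of FastAPI: to_prometheus_text and the agent_ name prefix are hypothetical, and every value is reported as a gauge for simplicity even though totals such as total_runs would normally be counters.

```python
def to_prometheus_text(snapshot: dict, prefix: str = "agent") -> str:
    """Render the numeric entries of a metrics snapshot as Prometheus gauge lines."""
    lines = []
    for key, value in snapshot.items():
        # Skip nested values such as the per-tool breakdown. bool is excluded
        # explicitly because it is a subclass of int.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        name = f"{prefix}_{key}"
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Illustrative snapshot shaped like the output of AgentMetrics.to_dict()
sample = {"total_runs": 3, "avg_latency_ms": 120.5, "top_tools": {"search": 2}}
print(to_prometheus_text(sample))
```

To serve it, return the string from a route with a plain-text response type rather than the default JSON response.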

Audit Logging with Hooks

For compliance-heavy industries, log every agent action to an audit trail.

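import json
from datetime import datetime, timezone

from agents import AgentHooks

class AuditHooks(AgentHooks):
    def __init__(self, audit_file: str = "audit.jsonl"):
        self.audit_file = audit_file

    def _write_event(self, event_type: str, data: dict):
        event = {
            # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12.
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "event_type": event_type,
            **data,
        }
        # Synchronous append: simple, but it blocks the event loop while writing.
        with open(self.audit_file, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")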

    async def on_start(self, context, agent):
        # on_start receives only the context and the agent, not the run input.
        self._write_event("agent_start", {
            "agent_name": agent.name,
        })

    async def on_tool_start(self, context, agent, tool):
        self._write_event("tool_call", {
            "agent_name": agent.name,
            "tool_name": tool.name,
        })

    async def on_handoff(self, context, agent, source):
        # Fires on the agent receiving control; `source` is the agent handing off.
        self._write_event("handoff", {
            "from_agent": source.name,
            "to_agent": agent.name,
        })

FAQ

Can I have multiple hooks on a single agent?

Each agent holds a single AgentHooks instance. To combine behaviors, build a CompositeHooks class that holds a list of AgentHooks instances and forwards each lifecycle call to every one of them.
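
One way to sketch that composite is shown below. CompositeHooks and Recorder are hypothetical names, not SDK classes, and the try/except fallback exists only so the sketch runs where the SDK is not installed. Arguments are passed through untouched, so the composite does not depend on the exact callback signatures.

```python
import asyncio

try:
    from agents import AgentHooks
except ImportError:
    class AgentHooks:  # stand-in so the sketch runs without the SDK installed
        pass


class CompositeHooks(AgentHooks):
    """Forwards every lifecycle callback to a list of child hooks, in order."""

    def __init__(self, *children):
        self.children = list(children)

    async def _fan_out(self, method: str, *args, **kwargs) -> None:
        # Children are awaited one after another, so a slow child
        # delays the ones that come after it.
        for child in self.children:
            handler = getattr(child, method, None)
            if handler is not None:
                await handler(*args, **kwargs)

    async def on_start(self, *args, **kwargs):
        await self._fan_out("on_start", *args, **kwargs)

    async def on_end(self, *args, **kwargs):
        await self._fan_out("on_end", *args, **kwargs)

    async def on_tool_start(self, *args, **kwargs):
        await self._fan_out("on_tool_start", *args, **kwargs)

    async def on_tool_end(self, *args, **kwargs):
        await self._fan_out("on_tool_end", *args, **kwargs)

    async def on_handoff(self, *args, **kwargs):
        await self._fan_out("on_handoff", *args, **kwargs)


class Recorder(AgentHooks):
    """Hypothetical child hook that records which callbacks fired."""

    def __init__(self, label, log):
        self.label = label
        self.log = log

    async def on_start(self, *args, **kwargs):
        self.log.append(f"{self.label}:on_start")


async def demo():
    log = []
    hooks = CompositeHooks(Recorder("logging", log), Recorder("metrics", log))
    # Simulate the SDK firing on_start; a real run passes a context wrapper and an agent.
    await hooks.on_start(None, None)
    return log


print(asyncio.run(demo()))
```

In real use you would pass the composite as hooks= when creating the agent, with the logging, metrics, and audit hook classes as its children.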

Do hooks affect agent performance?

Hook methods are awaited during execution, so slow hooks will slow your agent. Keep hook logic fast — log to a buffer, push metrics to an async queue, or write to a non-blocking sink. Avoid making network calls inside hooks unless you run them in a fire-and-forget task.
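
The buffering advice above can be sketched with an asyncio.Queue and a background task. QueueSink is a hypothetical helper, not an SDK class. It assumes Python 3.9 or later for asyncio.to_thread, and that dropping events when the buffer is full is acceptable for your use case.

```python
import asyncio
import contextlib
import json


class QueueSink:
    """Buffers events in memory and writes them from a background task."""

    def __init__(self, path: str, maxsize: int = 1000):
        self.path = path
        self.maxsize = maxsize
        self.dropped = 0
        self.queue = None
        self._task = None

    def start(self) -> None:
        # Call from inside a running event loop, before the first emit().
        self.queue = asyncio.Queue(maxsize=self.maxsize)
        self._task = asyncio.create_task(self._drain())

    def emit(self, event: dict) -> None:
        # Never blocks the hook: if the buffer is full, drop the event and count it.
        try:
            self.queue.put_nowait(event)
        except asyncio.QueueFull:
            self.dropped += 1

    async def _drain(self) -> None:
        while True:
            event = await self.queue.get()
            try:
                # File I/O runs in a worker thread so the event loop stays free.
                await asyncio.to_thread(self._append, event)
            except OSError:
                self.dropped += 1  # count write failures instead of killing the worker
            finally:
                self.queue.task_done()

    def _append(self, event: dict) -> None:
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")

    async def close(self) -> None:
        await self.queue.join()  # wait until every buffered event has been handled
        self._task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await self._task
```

Inside a hook, a call such as sink.emit({"event_type": "tool_call", "tool_name": tool.name}) returns immediately. Call await sink.close() during shutdown so buffered events are flushed before the process exits.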

Can hooks modify the agent's behavior at runtime?

Indirectly, yes. Hooks receive the run's context wrapper, and context.context is the object you passed to the run, so a hook can write data there that the agent's tools read later, effectively influencing behavior. Hooks are not meant for rewriting the agent's instructions or tool list during a run. For that level of control, create a new agent instance with a different configuration.
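
As an illustration of that pattern, the sketch below has a hook set a flag in the context once a tool budget is used up, and a tool body that checks the flag. Everything here is hypothetical: FakeContextWrapper stands in for the SDK's RunContextWrapper (only its .context attribute is modeled), ToolBudgetHooks would subclass AgentHooks in real code, and the loop in simulate imitates the SDK calling on_tool_start before each tool.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeContextWrapper:
    """Stand-in for RunContextWrapper; only the .context attribute is modeled."""
    context: dict = field(default_factory=dict)


class ToolBudgetHooks:
    """Hook-shaped class; in real code it would subclass agents.AgentHooks."""

    def __init__(self, max_calls: int):
        self.max_calls = max_calls

    async def on_start(self, context, agent):
        context.context["tool_calls"] = 0
        context.context["budget_exhausted"] = False

    async def on_tool_start(self, context, agent, tool):
        context.context["tool_calls"] += 1
        # Set the flag before the call that would exceed the budget runs.
        if context.context["tool_calls"] > self.max_calls:
            context.context["budget_exhausted"] = True


def search_tool(ctx: FakeContextWrapper, query: str) -> str:
    """Hypothetical tool body that checks the flag the hook wrote."""
    if ctx.context.get("budget_exhausted"):
        return "Tool budget exhausted; answer from what you already have."
    return f"results for {query!r}"


async def simulate(queries, max_calls=2):
    ctx = FakeContextWrapper()
    hooks = ToolBudgetHooks(max_calls)
    await hooks.on_start(ctx, None)
    results = []
    for query in queries:
        # Imitates the SDK firing on_tool_start before each tool call.
        await hooks.on_tool_start(ctx, None, None)
        results.append(search_tool(ctx, query))
    return results


print(asyncio.run(simulate(["a", "b", "c"])))
```

The agent's configuration never changes here. The tool simply returns a different result once the hook has flipped the flag, which is the kind of indirect influence described above.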

