Custom Spans and Trace Visualization for Complex Workflows
Learn how to use the trace() context manager, custom_span(), and manual span lifecycle to build detailed, hierarchical trace visualizations for complex multi-step agent workflows.
When Built-in Tracing Is Not Enough
The OpenAI Agents SDK auto-traces agent runs, LLM calls, and tool invocations. For simple single-agent workflows, that is usually sufficient. But real production systems have complexity that lives outside the SDK's automatic instrumentation: database queries inside tools, preprocessing pipelines that transform user input before the agent sees it, postprocessing steps that validate and format agent output, and business logic that determines which agent to invoke in the first place.
Custom spans let you extend the trace hierarchy with your own instrumentation points, giving you a complete picture of every step in your workflow — not just the agent parts.
The trace() Context Manager
The trace() context manager creates a top-level trace that wraps your entire workflow. While Runner.run() creates traces automatically, using trace() explicitly gives you control over the trace name, grouping, and metadata:
from agents import Agent, Runner, trace

agent = Agent(
    name="Support Agent",
    instructions="You help customers with technical issues.",
)

with trace("customer-support-workflow", metadata={"channel": "web", "tier": "premium"}):
    # Preprocessing outside the agent
    sanitized_input = sanitize_user_message(raw_input)
    customer_context = await fetch_customer_profile(user_id)

    # Agent run — automatically nested inside our trace
    result = await Runner.run(
        agent,
        f"Customer context: {customer_context}\nQuery: {sanitized_input}",
    )

    # Postprocessing outside the agent
    formatted = format_response(result.final_output)
    await log_interaction(user_id, sanitized_input, formatted)
Every span created by the Runner.run() call is automatically nested under your custom trace. The metadata dictionary appears in the dashboard alongside the trace, enabling you to filter by channel, tier, customer segment, or any other dimension relevant to your application.
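Dashboard filters tend to behave most predictably when metadata stays flat and string-valued. A minimal helper along those lines might look like this (the function name and dotted-key flattening scheme are illustrative conventions, not part of the SDK):

```python
from typing import Any


def flatten_metadata(context: dict[str, Any], prefix: str = "") -> dict[str, str]:
    """Flatten nested context into string-valued keys suitable for trace metadata."""
    flat: dict[str, str] = {}
    for key, value in context.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse with a dotted prefix: {"customer": {"tier": "premium"}} -> "customer.tier"
            flat.update(flatten_metadata(value, prefix=f"{full_key}."))
        else:
            flat[full_key] = str(value)
    return flat


# Usage: trace("customer-support-workflow", metadata=flatten_metadata(context))
print(flatten_metadata({"channel": "web", "customer": {"tier": "premium", "id": 42}}))
# → {'channel': 'web', 'customer.tier': 'premium', 'customer.id': '42'}
```

Flattening up front means a nested customer profile still shows up as individually filterable keys in the dashboard rather than one opaque blob.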
Creating Custom Spans
Within a trace, you can create custom spans to instrument specific blocks of code. The custom_span() context manager is the primary tool for this:
from agents import Runner, trace, custom_span

with trace("document-processing-pipeline"):
    with custom_span("input_validation"):
        validated = validate_and_parse(raw_document)

    with custom_span("embedding_generation"):
        chunks = chunk_document(validated, max_tokens=500)
        embeddings = await generate_embeddings(chunks)

    with custom_span("vector_store_upsert"):
        await vector_db.upsert(embeddings)

    with custom_span("agent_analysis"):
        result = await Runner.run(analysis_agent, f"Analyze: {validated.summary}")

    with custom_span("result_persistence"):
        await save_analysis(result.final_output)
This produces a trace with five top-level custom spans, with the agent and generation spans nested under "agent_analysis." The dashboard timeline view shows exactly how much time was spent in each phase — embedding generation, database operations, agent reasoning, and persistence.
Nested Span Hierarchies
Custom spans can be nested to create rich hierarchies that reflect the logical structure of your workflow:
from agents import Runner, trace, custom_span

async def process_order(order_id: str):
    with trace("order-processing", metadata={"order_id": order_id}):
        # Load the order record (helper assumed to exist elsewhere, like the others here)
        order = await load_order(order_id)

        with custom_span("validation"):
            with custom_span("schema_check"):
                validate_order_schema(order)
            with custom_span("inventory_check"):
                available = await check_inventory(order.items)
            with custom_span("fraud_screening"):
                fraud_score = await screen_for_fraud(order)

        with custom_span("agent_review"):
            if fraud_score > 0.7:
                result = await Runner.run(
                    fraud_review_agent,
                    f"Review order {order_id} with fraud score {fraud_score}",
                )

        with custom_span("fulfillment"):
            with custom_span("payment_capture"):
                await capture_payment(order)
            with custom_span("shipping_label"):
                label = await generate_shipping_label(order)
            with custom_span("notification"):
                await send_confirmation_email(order, label)
The resulting trace hierarchy:
Trace: "order-processing" (order_id: ORD-12345)
+-- validation
|   +-- schema_check (12ms)
|   +-- inventory_check (145ms)
|   +-- fraud_screening (890ms)
+-- agent_review
|   +-- agent_span: Fraud Review Agent
|       +-- generation_span: gpt-4o
+-- fulfillment
    +-- payment_capture (234ms)
    +-- shipping_label (567ms)
    +-- notification (89ms)
This hierarchical structure makes it immediately obvious that fraud screening dominates the validation phase and shipping label generation is the bottleneck in fulfillment.
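Reading a timeline like this by eye works for one trace; across many traces you may want to find the dominant child span programmatically. A sketch over exported span records (the flat record shape here is hypothetical, not the SDK's export format):

```python
def dominant_child(spans: list[dict], parent: str) -> tuple[str, float]:
    """Return the (name, duration_ms) of the longest child span under `parent`."""
    children = [s for s in spans if s["parent"] == parent]
    longest = max(children, key=lambda s: s["duration_ms"])
    return longest["name"], longest["duration_ms"]


spans = [
    {"name": "schema_check", "parent": "validation", "duration_ms": 12},
    {"name": "inventory_check", "parent": "validation", "duration_ms": 145},
    {"name": "fraud_screening", "parent": "validation", "duration_ms": 890},
]
print(dominant_child(spans, "validation"))  # → ('fraud_screening', 890)
```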
Manual Span Lifecycle
Sometimes you need more control than a context manager provides. The SDK supports manual span start and finish for cases where the span boundaries do not align with a Python with block — for example, when a span starts in one callback and finishes in another:
from agents import custom_span

class StreamProcessor:
    def __init__(self):
        self.active_span = None

    async def on_stream_start(self, stream_id: str):
        # Start a span manually and mark it as the current span so children nest under it
        self.active_span = custom_span("stream_processing")
        self.active_span.start(mark_as_current=True)

    async def on_chunk_received(self, chunk: bytes):
        # Create child spans within the active span
        with custom_span("chunk_processing"):
            processed = await self.process_chunk(chunk)
            await self.buffer.append(processed)

    async def on_stream_end(self):
        # Finish the span manually and restore the previous current span
        if self.active_span:
            self.active_span.finish(reset_current=True)
            self.active_span = None
Manual lifecycle management should be used sparingly. Context managers are safer because they guarantee the span is closed even if an exception occurs. Reserve manual management for event-driven or callback-based architectures where context managers are impractical.
Adding Data to Spans
Spans can carry structured data that appears in the dashboard when you inspect them:
from agents import custom_span

with custom_span("database_query", data={"table": "customers", "filter": "premium"}) as span:
    results = await db.query("SELECT * FROM customers WHERE tier = 'premium'")

    # Update the span's data dict after the operation completes
    span.span_data.data.update({
        "row_count": len(results),
        "duration_ms": query_duration,
    })
Attaching data to spans transforms traces from simple timing records into rich debugging artifacts. When a query returns zero rows unexpectedly, the span data shows you the exact filter that was applied without requiring you to reproduce the issue.
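A lightweight way to capture those fields consistently is a timing helper that stamps the duration for you. This sketch is deliberately SDK-agnostic: it yields a plain dict that you then hand to the span, so nothing here depends on a particular span method name.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_data(base: dict):
    """Yield a mutable data dict and stamp `duration_ms` into it on exit."""
    data = dict(base)
    start = time.perf_counter()
    try:
        yield data
    finally:
        data["duration_ms"] = round((time.perf_counter() - start) * 1000, 2)


# Usage inside a custom span:
with timed_data({"table": "customers", "filter": "premium"}) as data:
    rows = ["alice", "bob"]          # stand-in for the real query
    data["row_count"] = len(rows)
# `data` now holds table, filter, row_count, and duration_ms — attach it to the span
```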
Instrumenting Tool Functions with Custom Spans
While the SDK auto-traces tool invocations, you might want finer granularity inside complex tools:
from agents import function_tool, custom_span

@function_tool
async def analyze_document(document_url: str) -> str:
    """Download, parse, and analyze a document."""
    with custom_span("document_download"):
        content = await download_document(document_url)

    with custom_span("text_extraction"):
        text = extract_text(content)

    with custom_span("entity_extraction"):
        entities = await extract_entities(text)

    with custom_span("sentiment_analysis"):
        sentiment = await analyze_sentiment(text)

    return (
        f"Document contains {len(entities)} entities. "
        f"Overall sentiment: {sentiment.label} ({sentiment.score:.2f})"
    )
Now when you view the trace, the function_span for analyze_document contains four child spans showing exactly where time was spent inside the tool. This is invaluable when a tool that "usually takes 500ms" suddenly takes 10 seconds — the child spans pinpoint whether the download, extraction, or analysis is the culprit.
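If several tools repeat this wrap-each-step pattern, a small decorator can apply it automatically. To keep the sketch self-contained and testable, the span factory is a parameter (defaulting to a no-op context manager); in a real tool you would pass the SDK's `custom_span`. The decorator itself is illustrative, not part of the SDK.

```python
import functools
from contextlib import nullcontext


def spanned(name: str, span_factory=nullcontext):
    """Wrap an async function so each call runs inside span_factory(name)."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            with span_factory(name):
                return await fn(*args, **kwargs)
        return wrapper
    return decorator


@spanned("text_extraction")  # pass span_factory=custom_span when using the SDK
async def extract_text(content: bytes) -> str:
    return content.decode("utf-8")
```

Each sub-step then becomes a one-line decoration instead of an explicit `with` block, which keeps long tools readable.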
Correlating Traces Across Services
In microservice architectures, an agent workflow might call external APIs that have their own tracing. You can propagate trace context to enable end-to-end correlation:
import httpx

from agents import Runner, trace

with trace("cross-service-workflow") as current_trace:
    trace_id = current_trace.trace_id

    # Pass trace_id to downstream services via headers
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://internal-api/process",
            headers={"X-Trace-Id": trace_id},
            json={"data": payload},
        )

    # The downstream service can include this trace_id in its own logs
    result = await Runner.run(agent, response.text)
This pattern lets you follow a request from the user through your agent system and into backend services, creating a unified debugging experience across your entire infrastructure.
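The receiving side needs a consistent way to read that header back out. Here is a pair of helpers for the convention used above; note that `X-Trace-Id` is this article's own header name rather than a standard, and the `trace_<32 hex>` shape is an assumption mirroring the SDK's generated trace IDs.

```python
import re
import uuid

TRACE_ID_PATTERN = re.compile(r"^trace_[0-9a-f]{32}$")


def inject_trace_header(headers: dict[str, str], trace_id: str) -> dict[str, str]:
    """Return a copy of `headers` with the trace id attached."""
    return {**headers, "X-Trace-Id": trace_id}


def extract_trace_id(headers: dict[str, str]) -> str:
    """Read a propagated trace id, minting a fresh one if missing or malformed."""
    candidate = headers.get("X-Trace-Id", "")
    if TRACE_ID_PATTERN.match(candidate):
        return candidate
    return f"trace_{uuid.uuid4().hex}"
```

Validating the incoming value before trusting it means a malformed or missing header degrades gracefully into a new trace instead of polluting your logs with garbage IDs.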
Visualization Best Practices
Name spans after operations, not implementations — Use "fraud_screening" not "call_sift_api." Operation names remain stable even when you swap providers.
Keep hierarchies shallow — Three to four levels of nesting is ideal. Deeper hierarchies become difficult to navigate in the dashboard.
Attach business context as metadata — Include customer IDs, order IDs, and feature flags so you can filter traces by business dimensions.
Use consistent naming conventions — Adopt snake_case for all span names and stick to it. Inconsistent naming makes dashboard filters unreliable.
Instrument the boundaries — The most valuable custom spans are at I/O boundaries: database calls, HTTP requests, file operations, and message queue publishes. These are where latency hides.
Custom spans and the trace() context manager turn the Agents SDK's built-in tracing from a useful default into a comprehensive observability layer for your entire application.