Debug Logging and Configuration Best Practices for OpenAI Agents
Configure the OpenAI Agents SDK for development and production. Covers API keys, model defaults, verbose logging, sensitive data protection, and a production readiness checklist.
Getting Configuration Right
Configuration is where development convenience meets production security. The OpenAI Agents SDK provides multiple configuration mechanisms — environment variables, programmatic settings, and per-run overrides. Getting these right from the start saves hours of debugging and prevents security incidents.
API Key Configuration
Environment Variable (Recommended)
The SDK automatically reads OPENAI_API_KEY from the environment:
export OPENAI_API_KEY="sk-proj-your-key-here"
This is the recommended approach because:
- Keys stay out of source code
- Different environments (dev, staging, prod) use different keys
- Key rotation does not require code changes
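To make a missing key fail loudly, check for it once at startup rather than waiting for the first agent run to error out. A minimal sketch:

import os

# Fail fast at startup if the key is missing, instead of failing
# on the first agent run.
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set")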
Programmatic Key Setting
For cases where environment variables are not practical:
from agents import set_default_openai_key
set_default_openai_key("sk-proj-your-key-here")
This sets the key for all subsequent agent runs in the process. Call this once at application startup, not before every run.
Per-Run Key Override
For multi-tenant applications where different requests use different API keys:
from openai import AsyncOpenAI
from agents import Agent, Runner, OpenAIChatCompletionsModel
# Create a client with a specific key
client = AsyncOpenAI(api_key="sk-proj-tenant-specific-key")
agent = Agent(
    name="Tenant Agent",
    instructions="Help the user.",
    model=OpenAIChatCompletionsModel(
        model="gpt-4o",
        openai_client=client,
    ),
)
result = await Runner.run(agent, "Hello")
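Creating a fresh AsyncOpenAI client on every request adds avoidable overhead. A sketch of one common pattern — the client_for_tenant helper and its cache are hypothetical names, not SDK APIs — caches one client per tenant:

# Hypothetical helper: reuse one client per tenant instead of
# constructing a new AsyncOpenAI on every request.
_clients: dict[str, AsyncOpenAI] = {}

def client_for_tenant(tenant_id: str, api_key: str) -> AsyncOpenAI:
    if tenant_id not in _clients:
        _clients[tenant_id] = AsyncOpenAI(api_key=api_key)
    return _clients[tenant_id]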
Custom OpenAI Client Configuration
For advanced scenarios — proxies, custom base URLs, or organization IDs — configure the underlying OpenAI client:
from openai import AsyncOpenAI
from agents import set_default_openai_client
client = AsyncOpenAI(
    api_key="sk-proj-your-key",
    organization="org-your-org-id",
    base_url="https://your-proxy.example.com/v1",
    timeout=60.0,
    max_retries=3,
)
set_default_openai_client(client)
This is useful for:
- API proxies: Route traffic through a logging proxy or gateway
- Azure OpenAI: Use a custom base URL for Azure-hosted models (see the sketch below)
- Organization isolation: Set the organization ID for billing separation
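For the Azure OpenAI case, a minimal sketch — the endpoint, key, and API version below are placeholders for your Azure resource's values:

from openai import AsyncAzureOpenAI
from agents import set_default_openai_api, set_default_openai_client

client = AsyncAzureOpenAI(
    azure_endpoint="https://your-resource.openai.azure.com",
    api_key="your-azure-key",
    api_version="2024-06-01",
)
set_default_openai_client(client)
# Azure deployments may not expose the Responses API; fall back to
# Chat Completions if yours does not.
set_default_openai_api("chat_completions")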
Model Configuration
Default Model
The SDK defaults to gpt-4o. Override globally with an environment variable:
export OPENAI_DEFAULT_MODEL="gpt-4o-mini"
Or programmatically:
from agents import Agent
# Per-agent model selection
fast_agent = Agent(
    name="Fast Agent",
    instructions="Respond quickly.",
    model="gpt-4o-mini",
)

smart_agent = Agent(
    name="Smart Agent",
    instructions="Analyze deeply.",
    model="gpt-4o",
)

reasoning_agent = Agent(
    name="Reasoning Agent",
    instructions="Solve complex problems step by step.",
    model="o3-mini",
)
Responses API vs Chat Completions API
By default, the SDK uses the OpenAI Responses API, which is newer and supports features like built-in tools (web search, file search) and constrained JSON output.
For compatibility with non-OpenAI providers or older setups, you can switch to the Chat Completions API:
from agents import Agent
from agents.models.openai_chatcompletions import OpenAIChatCompletionsModel
from openai import AsyncOpenAI
# Use Chat Completions API with any OpenAI-compatible provider
client = AsyncOpenAI(
    base_url="https://api.together.xyz/v1",
    api_key="your-together-api-key",
)

agent = Agent(
    name="Together Agent",
    instructions="You are helpful.",
    model=OpenAIChatCompletionsModel(
        model="meta-llama/Llama-3-70b-chat-hf",
        openai_client=client,
    ),
)
This makes the SDK work with any provider that exposes an OpenAI-compatible Chat Completions endpoint — Together AI, Anyscale, vLLM, Ollama, and more.
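For a concrete local example, here is a minimal sketch against Ollama, assuming a server on its default port with a model already pulled (the model tag is a placeholder):

from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, set_tracing_disabled

# Ollama exposes an OpenAI-compatible endpoint at /v1 and ignores
# the API key, but the client still requires one to be set.
local_client = AsyncOpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Without an OpenAI key in the environment, disable tracing so the
# SDK does not try to export traces to OpenAI.
set_tracing_disabled(True)

local_agent = Agent(
    name="Local Agent",
    instructions="You are helpful.",
    model=OpenAIChatCompletionsModel(model="llama3.1", openai_client=local_client),
)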
Debug Logging
Verbose Stdout Logging
The fastest way to see what the agent loop is doing:
from agents import enable_verbose_stdout_logging
enable_verbose_stdout_logging()
# Now every agent run prints detailed information:
# - Each LLM call with the full message list
# - Tool calls and their results
# - Handoff events
# - Timing information
This is invaluable during development. Never enable this in production — it prints potentially sensitive data including full prompts and responses.
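One way to keep it out of production is to gate it on an environment flag (the ENVIRONMENT variable here is an assumed convention, matching the pattern used later in this guide):

import os
from agents import enable_verbose_stdout_logging

# Verbose logging can never ship to production by accident if it
# is gated on the deployment environment.
if os.getenv("ENVIRONMENT", "development") == "development":
    enable_verbose_stdout_logging()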
Python Logging Integration
For more control, use Python's standard logging:
import logging
# Set the agents logger to DEBUG
logging.getLogger("agents").setLevel(logging.DEBUG)
# Configure a handler
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    "%(asctime)s [%(name)s] %(levelname)s: %(message)s"
))
logging.getLogger("agents").addHandler(handler)
In production, route these logs to your observability stack (Datadog, CloudWatch, etc.) at INFO level or above.
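A minimal production-leaning sketch — a real deployment would use a proper structured-logging formatter from your logging library rather than this hand-rolled format string:

import logging

# INFO level, one-line key/value records, ready for a log forwarder
# to ship to Datadog, CloudWatch, etc.
prod_logger = logging.getLogger("agents")
prod_logger.setLevel(logging.INFO)
prod_handler = logging.StreamHandler()
prod_handler.setFormatter(logging.Formatter(
    "%(asctime)s logger=%(name)s level=%(levelname)s msg=%(message)s"
))
prod_logger.addHandler(prod_handler)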
What Gets Logged
At DEBUG level, the SDK logs:
| Log Entry | Contains |
|---|---|
| LLM request | Model name, message count, tool count |
| LLM response | Response type, token usage |
| Tool execution | Tool name, execution time |
| Tool error | Tool name, error message |
| Handoff | Source agent, target agent |
| Loop iteration | Turn number, current agent |
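Some of these entries can carry sensitive payloads. The SDK's configuration docs describe two environment variables that suppress them — if your SDK version supports these flags, set them before the process starts:

export OPENAI_AGENTS_DONT_LOG_MODEL_DATA=1   # suppress LLM inputs/outputs
export OPENAI_AGENTS_DONT_LOG_TOOL_DATA=1    # suppress tool inputs/outputs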
Tracing
The SDK includes built-in tracing that captures the full execution flow of every agent run. Tracing is enabled by default:
from agents import Agent, Runner, RunConfig
result = await Runner.run(
    agent,
    "User query here",
    run_config=RunConfig(
        workflow_name="customer-support",
        trace_id="req-12345-abc",
        group_id="session-67890",
        tracing_disabled=False,  # Default: enabled
    ),
)
Traces capture:
- The complete agent loop execution with timing
- All LLM calls with input/output
- Tool calls with arguments and results
- Handoff events
- Error events
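To group several related runs into a single trace — for example, all the turns of one user session — the SDK provides a trace() context manager. A minimal sketch:

from agents import Agent, Runner, trace

async def handle_session(agent: Agent, messages: list[str]) -> None:
    # Every run inside this block is recorded under one workflow.
    with trace("customer-support-session"):
        for message in messages:
            result = await Runner.run(agent, message)
            print(result.final_output)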
Disabling Tracing
For sensitive workloads or to reduce overhead:
result = await Runner.run(
    agent,
    "Sensitive query",
    run_config=RunConfig(tracing_disabled=True),
)
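To turn tracing off for the whole process instead of per run, the SDK also exposes a global switch:

from agents import set_tracing_disabled

# Disables tracing for every subsequent run in this process.
set_tracing_disabled(True)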
Sensitive Data Protection
What to Protect
In production, be conscious of what data flows through the agent system:
- User PII: Names, emails, phone numbers, addresses
- Financial data: Credit card numbers, bank accounts
- Authentication tokens: API keys, session tokens, passwords
- Health information: Medical records, diagnoses
Protection Strategies
1. Scrub inputs before sending to the agent:
import re
def scrub_pii(text: str) -> str:
    # Mask email addresses
    text = re.sub(r'[\w.-]+@[\w.-]+\.\w+', '[EMAIL]', text)
    # Mask phone numbers
    text = re.sub(r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b', '[PHONE]', text)
    # Mask credit card numbers
    text = re.sub(r'\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b', '[CARD]', text)
    return text
result = await Runner.run(agent, scrub_pii(user_input))
2. Use context for sensitive data instead of conversation messages:
from agents import RunContextWrapper, function_tool

@function_tool
async def process_payment(
    context: RunContextWrapper[PaymentContext],
    amount: float,
) -> str:
    """Process a payment for the current user.

    Args:
        amount: Payment amount in USD.
    """
    # Access payment info from context, not from the conversation
    card = context.context.payment_method
    # Process payment...
    return f"Payment of ${amount} processed successfully."
The payment details live in the context object, which is never sent to the LLM; the model only sees the amount and the returned confirmation string. The sketch below shows the wiring end to end.
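PaymentContext is the context type the tool above references, defined here for illustration:

from dataclasses import dataclass
from agents import Agent, Runner

@dataclass
class PaymentContext:
    payment_method: str  # e.g. a vaulted token, never a raw card number

payment_agent = Agent[PaymentContext](
    name="Payment Agent",
    instructions="Process payments when asked.",
    tools=[process_payment],
)

result = await Runner.run(
    payment_agent,
    "Pay my $42 invoice",
    context=PaymentContext(payment_method="tok_visa_1234"),
)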
3. Disable tracing for sensitive operations:
result = await Runner.run(
    payment_agent,
    "Process my payment",
    run_config=RunConfig(tracing_disabled=True),
)
Production Configuration Checklist
Before deploying agents to production, verify each item:
Security
- API keys are loaded from environment variables or a secrets manager
- No API keys are hardcoded in source code
- PII scrubbing is applied to user inputs where appropriate
- Sensitive data flows through context, not conversation messages
- Output guardrails are configured to catch unsafe responses
- Tracing is disabled or filtered for sensitive workflows
Reliability
- max_turns is set on every Runner.run() call
- Tool timeouts are configured for all I/O tools
- Retry policies are configured for transient failures
- MaxTurnsExceeded and other exceptions are caught and handled
- Circuit breakers are in place for external service calls
Observability
- Logging is configured at INFO level (not DEBUG in production)
- Tracing is enabled with meaningful workflow names and trace IDs
- Trace IDs are correlated with your application's request IDs
- Token usage is tracked for cost monitoring
- Error rates are monitored with alerting
Performance
- Model selection matches the task complexity (do not use gpt-4o for simple classification)
- max_tokens is set to prevent unnecessarily long responses
- WebSocket transport is used for high-frequency streaming applications
- Connection pooling is configured on custom OpenAI clients
- Async Runner.run() is used in async contexts (not run_sync())
Cost Control
- Token usage is logged and monitored
- max_turns prevents runaway loops
- max_tokens is set appropriately per agent
- Cheaper models are used for simple tasks
- Rate limiting is implemented at the application level
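To support the token-usage items in the checklists above, the run result exposes aggregated usage — a minimal sketch, assuming a recent SDK version where usage is available on the run's context wrapper:

result = await Runner.run(agent, "User query")

# Aggregated across every LLM call in the run.
usage = result.context_wrapper.usage
print(
    f"requests={usage.requests} input={usage.input_tokens} "
    f"output={usage.output_tokens} total={usage.total_tokens}"
)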
Environment-Specific Configuration Pattern
A clean pattern for managing configuration across environments:
import os
from dataclasses import dataclass
@dataclass
class AgentConfig:
    openai_api_key: str
    default_model: str
    max_turns: int
    enable_tracing: bool
    log_level: str

    @classmethod
    def from_env(cls) -> "AgentConfig":
        env = os.getenv("ENVIRONMENT", "development")
        if env == "production":
            return cls(
                openai_api_key=os.environ["OPENAI_API_KEY"],
                default_model="gpt-4o",
                max_turns=10,
                enable_tracing=True,
                log_level="INFO",
            )
        elif env == "staging":
            return cls(
                openai_api_key=os.environ["OPENAI_API_KEY"],
                default_model="gpt-4o-mini",
                max_turns=15,
                enable_tracing=True,
                log_level="DEBUG",
            )
        else:  # development
            return cls(
                openai_api_key=os.getenv("OPENAI_API_KEY", ""),
                default_model="gpt-4o-mini",
                max_turns=25,
                enable_tracing=False,
                log_level="DEBUG",
            )
config = AgentConfig.from_env()
This keeps all environment-specific decisions in one place and makes it easy to audit what each environment uses.
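Applying the config at startup might look like this — a sketch; the agent name and instructions are placeholders:

import logging
from agents import Agent, Runner, set_default_openai_key, set_tracing_disabled

# Wire the config in once, at application startup.
set_default_openai_key(config.openai_api_key)
set_tracing_disabled(not config.enable_tracing)
logging.getLogger("agents").setLevel(config.log_level)

agent = Agent(
    name="Support Agent",
    instructions="Help the user.",
    model=config.default_model,
)
result = await Runner.run(agent, "Hello", max_turns=config.max_turns)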