
Python Logging for AI Applications: Structured Logs with structlog and loguru

Configure production-grade logging for AI applications using structlog and loguru with structured JSON output, context binding, correlation IDs, and cost-aware filtering.

Why Standard Logging Fails for AI Applications

AI agent applications produce complex, nested execution traces. A single user query might trigger five tool calls, three LLM completions, and two database lookups — each with different latencies, token counts, and costs. The standard Python logging module's flat string messages cannot capture this structure in a way that is queryable in production.

Structured logging emits JSON objects instead of formatted strings. Each log entry is a dictionary with typed fields that log aggregation systems like Datadog, Elasticsearch, and Loki can index and query. This transforms debugging from "grep through text files" to "query for all requests that exceeded 500 tokens and cost more than $0.05."
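As a sketch of that shift (field names here are illustrative, not tied to any particular aggregator): once each entry is one JSON object per line, that query becomes a few lines of Python instead of a grep.

```python
import json

# Three sample JSON log lines, as a structured logger might emit them
raw_logs = """\
{"event": "llm_completion", "tokens": 1523, "cost": 0.045}
{"event": "llm_completion", "tokens": 310, "cost": 0.002}
{"event": "llm_completion", "tokens": 890, "cost": 0.051}
"""

entries = [json.loads(line) for line in raw_logs.splitlines()]

# "All requests that exceeded 500 tokens and cost more than $0.05"
expensive = [e for e in entries if e["tokens"] > 500 and e["cost"] > 0.05]

for e in expensive:
    print(e["event"], e["tokens"], e["cost"])  # llm_completion 890 0.051
```

In production the log aggregator runs this query for you, but the principle is identical: typed fields in, filters out.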

structlog: The Production Standard

structlog wraps Python's standard logging with a processor pipeline that builds structured context incrementally.

import logging

import structlog

# Configure once at application startup
structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    context_class=dict,
    logger_factory=structlog.PrintLoggerFactory(),
)

log = structlog.get_logger()

# Structured fields are first-class
log.info("llm_completion", model="gpt-4o", tokens=1523, latency_ms=2100, cost=0.045)
# Output: {"event": "llm_completion", "model": "gpt-4o", "tokens": 1523,
#          "latency_ms": 2100, "cost": 0.045, "level": "info",
#          "timestamp": "2026-03-17T10:30:00Z"}

Context Binding for Agent Traces

The most powerful structlog feature for AI applications is context binding. Bind context once and every subsequent log entry includes it.

import structlog
from uuid import uuid4

async def handle_agent_request(user_id: str, query: str):
    request_id = str(uuid4())

    # Bind context that persists across all log calls in this scope
    log = structlog.get_logger().bind(
        request_id=request_id,
        user_id=user_id,
    )

    log.info("agent_request_started", query=query)

    # These logs automatically include request_id and user_id
    tools = await select_tools(query, log)
    log.info("tools_selected", tool_count=len(tools))

    result = await run_agent(query, tools, log)
    log.info("agent_request_completed", response_length=len(result))

    return result

async def select_tools(query: str, log):
    log.debug("tool_selection_started")
    # log output includes request_id and user_id from parent binding
    return ["web_search", "calculator"]

loguru: Simple and Powerful

loguru takes a different approach — one global logger with a fluent API. It is excellent for smaller projects and prototyping.


from loguru import logger
import sys

# Remove default handler and add structured JSON output
logger.remove()
logger.add(
    sys.stdout,
    format="{message}",
    serialize=True,  # JSON output
    level="INFO",
)

# Add file rotation for production
logger.add(
    "logs/agent_{time}.log",
    rotation="100 MB",
    retention="7 days",
    compression="gz",
    serialize=True,
)

# Context binding with loguru: contextualize() scopes fields to a block,
# bind() attaches per-call structured fields. Note that keyword arguments
# passed directly to info() are message-format arguments, not extra fields.
def process_tool_call(tool_name: str, args: dict):
    with logger.contextualize(tool=tool_name):
        logger.bind(arguments=args).info("tool_call_started")
        result = execute_tool(tool_name, args)  # your tool dispatcher
        logger.bind(result_length=len(str(result))).info("tool_call_completed")
        return result

Cost Tracking with Custom Processors

AI applications need cost observability. Build a custom structlog processor that calculates and attaches cost data.

import structlog

COST_PER_1K_TOKENS = {
    "gpt-4o": {"input": 0.0025, "output": 0.01},
    "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
    "claude-3-5-sonnet": {"input": 0.003, "output": 0.015},
}

def add_cost_estimate(logger, method_name, event_dict):
    model = event_dict.get("model")
    input_tokens = event_dict.get("input_tokens", 0)
    output_tokens = event_dict.get("output_tokens", 0)

    if model and model in COST_PER_1K_TOKENS:
        rates = COST_PER_1K_TOKENS[model]
        cost = (input_tokens / 1000 * rates["input"]) + (
            output_tokens / 1000 * rates["output"]
        )
        event_dict["estimated_cost_usd"] = round(cost, 6)

    return event_dict

# Add to processor chain
structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        add_cost_estimate,  # custom processor
        structlog.processors.JSONRenderer(),
    ],
)

log = structlog.get_logger()
log.info("llm_call", model="gpt-4o", input_tokens=500, output_tokens=1200)
# Automatically includes: "estimated_cost_usd": 0.01325

Filtering Sensitive Data

AI logs often contain user queries and model responses that may include PII. Filter these before they reach storage.

import re

SENSITIVE_PATTERNS = [
    (re.compile(r"sk-[a-zA-Z0-9]{20,}"), "sk-***REDACTED***"),
    (re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"), "***EMAIL***"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "***SSN***"),
]

def redact_sensitive(logger, method_name, event_dict):
    for key, value in event_dict.items():
        if isinstance(value, str):
            for pattern, replacement in SENSITIVE_PATTERNS:
                value = pattern.sub(replacement, value)
            event_dict[key] = value
    return event_dict

FAQ

Should I use structlog or loguru for production AI applications?

structlog is better for production systems because it integrates with Python's standard logging ecosystem, supports asyncio-safe context variables, and works well in multi-service architectures. loguru is better for single-service applications, scripts, and rapid prototyping where its simpler API saves setup time.

How do I correlate logs across multiple agent steps?

Generate a unique request ID at the entry point and bind it to the logger context. Every downstream function receives the bound logger or uses structlog's contextvars integration, which automatically propagates context across async boundaries. This lets you filter all logs for a single agent execution in your log aggregation tool.

How much logging is too much in an AI application?

Log every LLM call with model, tokens, and latency at INFO level. Log tool calls and their results at INFO. Log internal decision-making at DEBUG. Never log full prompt contents at INFO in production — they consume storage rapidly and may contain sensitive data. Use DEBUG level for full prompt logging during development.


#Python #Logging #Observability #Structlog #AgenticAI #LearnAI #AIEngineering

Written by

CallSphere Team
