---
title: "Python Logging for AI Applications: Structured Logs with structlog and loguru"
description: "Configure production-grade logging for AI applications using structlog and loguru with structured JSON output, context binding, correlation IDs, and cost-aware filtering."
canonical: https://callsphere.ai/blog/python-logging-ai-applications-structured-logs-structlog-loguru
category: "Learn Agentic AI"
tags: ["Python", "Logging", "Observability", "structlog", "Agentic AI"]
author: "CallSphere Team"
published: 2026-03-17T00:00:00.000Z
updated: 2026-05-06T14:29:53.410Z
---

# Python Logging for AI Applications: Structured Logs with structlog and loguru

> Configure production-grade logging for AI applications using structlog and loguru with structured JSON output, context binding, correlation IDs, and cost-aware filtering.

## Why Standard Logging Fails for AI Applications

AI agent applications produce complex, nested execution traces. A single user query might trigger five tool calls, three LLM completions, and two database lookups — each with different latencies, token counts, and costs. The standard Python `logging` module's flat string messages cannot capture this structure in a way that is queryable in production.

Structured logging emits JSON objects instead of formatted strings. Each log entry is a dictionary with typed fields that log aggregation systems like Datadog, Elasticsearch, and Loki can index and query. This transforms debugging from "grep through text files" to "query for all requests that exceeded 500 tokens and cost more than $0.05."
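The contrast is easy to see with a stdlib-only sketch (field names and values here are illustrative): the same event as a flat message versus a JSON object that a backend can filter on with typed comparisons.

```python
import json

# Flat string: the token count and cost are trapped inside the message text
flat = "LLM call finished: model=gpt-4o tokens=1523 cost=$0.045"

# Structured: the same event as typed, queryable fields
structured = {"event": "llm_completion", "model": "gpt-4o", "tokens": 1523, "cost": 0.045}
line = json.dumps(structured)

# A log backend can now filter on fields instead of substring-matching strings
record = json.loads(line)
if record["tokens"] > 500 and record["cost"] > 0.01:
    print("expensive call:", record["model"])
```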

## structlog: The Production Standard

structlog wraps Python's standard logging with a processor pipeline that builds structured context incrementally.

```mermaid
flowchart LR
    EV(["log call with
key-value pairs"])
    P1["merge_contextvars"]
    P2["add_log_level"]
    P3["TimeStamper"]
    P4["format_exc_info"]
    P5["JSONRenderer"]
    OUT[("JSON line
on stdout")]
    EV --> P1 --> P2 --> P3 --> P4 --> P5 --> OUT
    style P5 fill:#4f46e5,stroke:#4338ca,color:#fff
    style OUT fill:#f59e0b,stroke:#d97706,color:#1f2937
```

```python
import logging

import structlog

# Configure once at application startup
structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    context_class=dict,
    logger_factory=structlog.PrintLoggerFactory(),
)

log = structlog.get_logger()

# Structured fields are first-class
log.info("llm_completion", model="gpt-4o", tokens=1523, latency_ms=2100, cost=0.045)
# Output: {"event": "llm_completion", "model": "gpt-4o", "tokens": 1523,
#          "latency_ms": 2100, "cost": 0.045, "level": "info",
#          "timestamp": "2026-03-17T10:30:00Z"}
```

## Context Binding for Agent Traces

The most powerful structlog feature for AI applications is context binding. Bind context once and every subsequent log entry includes it.

```python
import structlog
from uuid import uuid4

async def handle_agent_request(user_id: str, query: str):
    request_id = str(uuid4())

    # Bind context that persists across all log calls in this scope
    log = structlog.get_logger().bind(
        request_id=request_id,
        user_id=user_id,
    )

    log.info("agent_request_started", query=query)

    # These logs automatically include request_id and user_id
    tools = await select_tools(query, log)
    log.info("tools_selected", tool_count=len(tools))

    result = await run_agent(query, tools, log)
    log.info("agent_request_completed", response_length=len(result))

    return result

async def select_tools(query: str, log):
    log.debug("tool_selection_started")
    # log output includes request_id and user_id from parent binding
    return ["web_search", "calculator"]
```

## loguru: Simple and Powerful

loguru takes a different approach — one global logger with a fluent API. It is excellent for smaller projects and prototyping.

```python
from loguru import logger
import sys

# Remove default handler and add structured JSON output
logger.remove()
logger.add(
    sys.stdout,
    format="{message}",
    serialize=True,  # JSON output
    level="INFO",
)

# Add file rotation for production
logger.add(
    "logs/agent_{time}.log",
    rotation="100 MB",
    retention="7 days",
    compression="gz",
    serialize=True,
)

# Context binding with loguru. Attach structured fields with bind() or
# contextualize(); extra keyword arguments to info() are consumed by
# str.format() interpolation, not stored as data fields.
def process_tool_call(tool_name: str, args: dict):
    with logger.contextualize(tool=tool_name):
        logger.bind(arguments=args).info("tool_call_started")
        result = execute_tool(tool_name, args)
        logger.bind(result_length=len(str(result))).info("tool_call_completed")
        return result
```

## Cost Tracking with Custom Processors

AI applications need cost observability. Build a custom structlog processor that calculates and attaches cost data.

```python
import structlog

COST_PER_1K_TOKENS = {
    "gpt-4o": {"input": 0.0025, "output": 0.01},
    "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
    "claude-3-5-sonnet": {"input": 0.003, "output": 0.015},
}

def add_cost_estimate(logger, method_name, event_dict):
    model = event_dict.get("model")
    input_tokens = event_dict.get("input_tokens", 0)
    output_tokens = event_dict.get("output_tokens", 0)

    if model in COST_PER_1K_TOKENS:
        rates = COST_PER_1K_TOKENS[model]
        cost = (input_tokens / 1000 * rates["input"]) + (
            output_tokens / 1000 * rates["output"]
        )
        event_dict["estimated_cost_usd"] = round(cost, 6)

    return event_dict

# Add to processor chain
structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        add_cost_estimate,  # custom processor
        structlog.processors.JSONRenderer(),
    ],
)

log = structlog.get_logger()
log.info("llm_call", model="gpt-4o", input_tokens=500, output_tokens=1200)
# Automatically includes: "estimated_cost_usd": 0.01325
```
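Once each entry carries its own cost, aggregate spend becomes a simple fold over the log stream. A minimal sketch that sums a batch of JSON log lines, skipping entries without a cost field (the sample values are illustrative):

```python
import json

log_lines = [
    '{"event": "llm_call", "model": "gpt-4o", "estimated_cost_usd": 0.01325}',
    '{"event": "llm_call", "model": "gpt-4o-mini", "estimated_cost_usd": 0.000795}',
    '{"event": "tools_selected", "tool_count": 2}',  # no cost field
]

# Entries without the field contribute 0.0
total = sum(
    json.loads(line).get("estimated_cost_usd", 0.0) for line in log_lines
)
print(f"total estimated spend: ${total:.6f}")  # total estimated spend: $0.014045
```

In production the same query runs in the aggregation backend itself, but the stdlib version is handy for spot-checking exported log files.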

## Filtering Sensitive Data

AI logs often contain user queries and model responses that may include PII. Filter these before they reach storage.

```python
import re

SENSITIVE_PATTERNS = [
    (re.compile(r"sk-[a-zA-Z0-9]{20,}"), "sk-***REDACTED***"),
    (re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"), "***EMAIL***"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "***SSN***"),
]

def redact_sensitive(logger, method_name, event_dict):
    for key, value in event_dict.items():
        if isinstance(value, str):
            for pattern, replacement in SENSITIVE_PATTERNS:
                value = pattern.sub(replacement, value)
            event_dict[key] = value
    return event_dict
```
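Redaction pairs well with truncation: long prompt and response fields inflate storage costs even when they contain nothing sensitive. A sketch of a companion processor in the same shape (the 500-character limit is an arbitrary choice):

```python
MAX_FIELD_LEN = 500

def truncate_long_fields(logger, method_name, event_dict):
    # Clip oversized string values, keeping a marker with the original length
    for key, value in event_dict.items():
        if isinstance(value, str) and len(value) > MAX_FIELD_LEN:
            event_dict[key] = value[:MAX_FIELD_LEN] + f"...[truncated {len(value)} chars]"
    return event_dict

event = {"event": "llm_call", "prompt": "x" * 2000}
out = truncate_long_fields(None, "info", event)
print(len(out["prompt"]))  # 500 plus the length of the truncation marker
```

Run truncation after redaction in the processor chain so a secret is never preserved intact merely because it sat past the cutoff.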

## FAQ

### Should I use structlog or loguru for production AI applications?

structlog is better for production systems because it integrates with Python's standard logging ecosystem, supports asyncio-safe context variables, and works well in multi-service architectures. loguru is better for single-service applications, scripts, and rapid prototyping where its simpler API saves setup time.

### How do I correlate logs across multiple agent steps?

Generate a unique request ID at the entry point and bind it to the logger context. Every downstream function receives the bound logger or uses structlog's contextvars integration, which automatically propagates context across async boundaries. This lets you filter all logs for a single agent execution in your log aggregation tool.

### How much logging is too much in an AI application?

Log every LLM call with model, tokens, and latency at INFO level. Log tool calls and their results at INFO. Log internal decision-making at DEBUG. Never log full prompt contents at INFO in production — they consume storage rapidly and may contain sensitive data. Use DEBUG level for full prompt logging during development.

---

#Python #Logging #Observability #Structlog #AgenticAI #LearnAI #AIEngineering

---

Source: https://callsphere.ai/blog/python-logging-ai-applications-structured-logs-structlog-loguru
