---
title: "OpenTelemetry GenAI Conventions for AI Agents in 2026"
description: "The OTel GenAI semantic conventions exited experimental for client spans in early 2026. Here's how CallSphere instruments 37 voice and chat agents with gen_ai.* attributes that work across Datadog, Honeycomb, and Grafana."
canonical: https://callsphere.ai/blog/vw3c-opentelemetry-genai-conventions-ai-agents-2026
category: "AI Infrastructure"
tags: ["OpenTelemetry", "Observability", "GenAI", "Tracing"]
author: "CallSphere Team"
published: 2026-03-19T00:00:00.000Z
updated: 2026-05-07T09:59:38.159Z
---

# OpenTelemetry GenAI Conventions for AI Agents in 2026

> The OTel GenAI semantic conventions exited experimental for client spans in early 2026. Here's how CallSphere instruments 37 voice and chat agents with gen_ai.* attributes that work across Datadog, Honeycomb, and Grafana.

> **TL;DR** — In 2026 you don't write custom span attributes for "model name" anymore. You use `gen_ai.request.model` and your traces work in every backend that supports OTel.

## What goes wrong

```mermaid
flowchart LR
  Browser["Browser / Phone"] -- "WebSocket /ws" --> LB["Load Balancer<br/>sticky session"]
  LB --> Pod1["Node A · Socket.IO"]
  LB --> Pod2["Node B · Socket.IO"]
  Pod1 -- "pub/sub" --> Redis[("Redis cluster")]
  Pod2 -- "pub/sub" --> Redis
  Pod1 --> AI["AI Worker · OpenAI Realtime"]
  Pod2 --> AI
```

*CallSphere reference architecture*

For two years every team rolled its own LLM-tracing schema. `model`, `llm.model`, `openai.model`, `anthropic.model` — all meant the same thing, none queried the same way. A platform team that wanted to chart "tokens spent per model per service" had to write a per-vendor adapter for every framework. By late 2025, the OTel GenAI SIG stabilized client spans and metrics, and most agent frameworks (OpenAI Agents SDK, LangChain, LlamaIndex, AutoGen) shipped emitters by Q1 2026.

The trap is that the *agent* spec is still experimental, and most production deployments are agents, not single LLM calls. If you only instrument the chat-completions span, you miss the tool-call planning, the handoffs between sub-agents, and the agent loop itself. You end up with a trace that looks fast and an experience that feels slow.

## How to monitor

Use three layers of OTel GenAI conventions:

1. **`gen_ai.client` spans** (stable) — one per LLM round-trip. Attributes: `gen_ai.request.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `gen_ai.response.finish_reasons`.
2. **`gen_ai.agent` spans** (experimental) — one per agent invocation. Attributes: `gen_ai.agent.name`, `gen_ai.agent.id`, `gen_ai.agent.description`.
3. **`gen_ai.tool.*` events** — attached to agent spans. Captures every tool call the agent makes and its result.
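Under these conventions a single agent turn nests the layers: one parent agent span with client spans (and tool events) as children. A minimal sketch of that trace shape in plain Python dicts — attribute names come from the semconv, but the values here are hypothetical:

```python
# Illustrative shape of one agent turn as a trace.
# Attribute names follow the OTel GenAI semconv; values are made up.
trace_turn = {
    "name": "gen_ai.agent.invoke",
    "attributes": {
        "gen_ai.agent.name": "healthcare_intake",  # experimental layer
        "gen_ai.agent.id": "hc-intake-v3",
    },
    "children": [
        {
            "name": "chat gpt-4o",  # gen_ai.client span, stable layer
            "attributes": {
                "gen_ai.request.model": "gpt-4o",
                "gen_ai.usage.input_tokens": 812,
                "gen_ai.usage.output_tokens": 64,
                "gen_ai.response.finish_reasons": ["tool_calls"],
            },
        },
    ],
}

def total_tokens(span):
    """Sum input + output tokens over a span tree."""
    attrs = span.get("attributes", {})
    own = (attrs.get("gen_ai.usage.input_tokens", 0)
           + attrs.get("gen_ai.usage.output_tokens", 0))
    return own + sum(total_tokens(c) for c in span.get("children", []))
```

Because the attribute keys are standardized, a helper like `total_tokens` works on traces from any instrumented framework, not just one vendor's SDK.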

Standard metrics in 2026: `gen_ai.client.token.usage` (histogram), `gen_ai.client.operation.duration` (histogram). Datadog, Honeycomb, Grafana, and OpenObserve all auto-detect these.
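What a backend does with `gen_ai.client.token.usage` is essentially a group-by on the standard attributes. A rough sketch of that aggregation over a batch of data points (the values are hypothetical; `gen_ai.token.type` distinguishes input from output counts in the semconv):

```python
from collections import defaultdict

# Hypothetical data points as an exporter might batch them: each carries
# the standard semconv attributes plus the recorded token count.
datapoints = [
    {"gen_ai.request.model": "gpt-4o", "gen_ai.token.type": "input", "value": 812},
    {"gen_ai.request.model": "gpt-4o", "gen_ai.token.type": "output", "value": 64},
    {"gen_ai.request.model": "claude-3-5-sonnet", "gen_ai.token.type": "input", "value": 300},
]

def tokens_per_model(points):
    """Sum token usage grouped by model, as a dashboard panel would."""
    totals = defaultdict(int)
    for p in points:
        totals[p["gen_ai.request.model"]] += p["value"]
    return dict(totals)
```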

## CallSphere stack

We run 37 agents across six verticals on k3s with Cloudflare Tunnel. Every agent emits OTel GenAI spans through an OpenTelemetry Collector deployed as a DaemonSet. The collector tail-samples to 5% (100% for errors and slow turns) and forwards to two backends:

- **Honeycomb** for tracing (developer ergonomics on agent traces)
- **Prometheus + Grafana** for SLO dashboards

Every agent follows the same pattern, adapted per deployment:

- The Healthcare FastAPI service on `:8084` decorates each route with our `@trace_genai_agent` decorator, which emits the parent agent span and child client spans automatically.
- The Real Estate six-container pod sends spans across NATS subjects and propagates the trace context header, so a single call shows up as one trace across all six containers.
- Sales WebSocket workers (PM2) batch-export every 5 seconds.
- The After-hours Bull/Redis queue worker emits one trace per job; Bull's job ID becomes the trace ID prefix.
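The NATS propagation above amounts to carrying the W3C `traceparent` header in the message payload. A simplified sketch with hand-rolled helpers for illustration — in production the OTel propagators API does this for you:

```python
import re

# W3C trace context header: version-traceid-spanid-flags
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

def inject(trace_id: str, span_id: str, sampled: bool = True) -> dict:
    """Build message headers carrying the trace context (illustrative)."""
    flags = "01" if sampled else "00"
    return {"traceparent": f"00-{trace_id}-{span_id}-{flags}"}

def extract(headers: dict):
    """Parse the trace context back out on the consuming container."""
    m = TRACEPARENT_RE.match(headers.get("traceparent", ""))
    if not m:
        return None
    return {"trace_id": m.group(1), "span_id": m.group(2),
            "sampled": m.group(3) == "01"}
```

As long as every container re-injects the same `trace_id` when it starts its own spans, the backend stitches all six containers into one trace.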

Plans on [/pricing](/pricing) include trace export to your own OTel collector at the $499 tier; $1499 enterprise gets a dedicated tenant in our Honeycomb. Try it on the [14-day trial](/trial).

## Implementation

1. **Install the OTel SDK** for your framework. For Python:

```bash
pip install opentelemetry-distro \
  opentelemetry-instrumentation-openai \
  opentelemetry-exporter-otlp
```

2. **Wrap your agent loop** with explicit agent spans:

```python
from opentelemetry import trace
tracer = trace.get_tracer("callsphere.healthcare")

def run_agent(user_input: str):
    with tracer.start_as_current_span(
        "gen_ai.agent.invoke",
        attributes={
            "gen_ai.agent.name": "healthcare_intake",
            "gen_ai.agent.id": "hc-intake-v3",
            "gen_ai.system": "openai",
        },
    ) as span:
        # tool calls and llm calls inside here
        # auto-instrument adds gen_ai.client spans
        result = agent_loop(user_input)
        span.set_attribute("gen_ai.completion.text", result.text[:512])
        return result
```

3. **Configure the collector** to keep only the standard semconv attribute keys (a cardinality guard, not full schema validation):

```yaml
processors:
  transform:
    metric_statements:
      - context: datapoint
        statements:
          - keep_keys(attributes, ["gen_ai.request.model","gen_ai.system"])
```

4. **Build dashboards on the standard names.** A "tokens per model per route" panel that uses `gen_ai.request.model` works for OpenAI, Anthropic, and Cohere with no code changes.
5. **Tail-sample.** Keep 100% of error traces, 100% of traces with first-token latency (FTL) above 1500 ms, and 5% of everything else. Tail-sampling at the collector cuts trace storage by roughly 95%.
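The tail-sampling policy above maps onto the collector's `tail_sampling` processor roughly like this. This is a sketch against opentelemetry-collector-contrib; policy field names can shift between releases, so check your collector version:

```yaml
processors:
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: keep-errors
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: keep-slow-turns        # span duration as a latency proxy
        type: latency
        latency: {threshold_ms: 1500}
      - name: baseline
        type: probabilistic
        probabilistic: {sampling_percentage: 5}
```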

## FAQ

**Q: Are GenAI agent spans stable yet?**
A: Client spans and metrics are stable. Agent and framework spans are experimental but have been very stable in practice through Q1 2026.

**Q: Do I need a vendor SDK on top of OTel?**
A: No. OTel + auto-instrumentation covers 80% of needs. Add a vendor SDK (Langfuse, LangSmith) if you want their UI on top — they all consume OTel.

**Q: How do I keep PII out of the spans?**
A: Use the collector's `redaction` processor or run Microsoft Presidio in a sidecar before export. Our [/industries/healthcare](/industries/healthcare) build does this in the collector.
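As a sketch, the `redaction` processor from opentelemetry-collector-contrib can mask obvious PII patterns before export. The field names below follow that processor's config and should be verified against your collector version; the regexes are illustrative:

```yaml
processors:
  redaction:
    allow_all_keys: true
    blocked_values:                    # regexes masked in attribute values
      - "\\b\\d{3}-\\d{2}-\\d{4}\\b"   # US SSN-like patterns
      - "\\b\\d{16}\\b"                # bare 16-digit card numbers
```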

**Q: Will my Datadog APM see this?**
A: Yes. Datadog LLM Observability natively maps OTel GenAI semconv to its product UI as of late 2025.

**Q: What about voice-specific attributes?**
A: We add `callsphere.audio.first_token_ms` and `callsphere.audio.barge_in_count` as custom attributes — namespaced so they don't collide with future OTel additions.

## Sources

- [OpenTelemetry — Semantic conventions for generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/)
- [OTel — GenAI agent and framework spans](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/)
- [Datadog — Native support for OTel GenAI semconv](https://www.datadoghq.com/blog/llm-otel-semantic-convention/)
- [OpenObserve — OpenTelemetry for LLMs Complete SRE Guide](https://openobserve.ai/blog/opentelemetry-for-llms/)

---

Source: https://callsphere.ai/blog/vw3c-opentelemetry-genai-conventions-ai-agents-2026
