---
title: "LiteLLM Integration: Using Non-OpenAI Models with Agents SDK"
description: "Integrate Anthropic, Google, Mistral, and other LLM providers into OpenAI's Agents SDK using LiteLLM's unified interface with LitellmModel, provider prefix notation, and cross-provider tracing."
canonical: https://callsphere.ai/blog/litellm-integration-non-openai-models-agents-sdk
category: "Learn Agentic AI"
tags: ["OpenAI", "LiteLLM", "Multi-Provider", "Anthropic"]
author: "CallSphere Team"
published: 2026-03-14T00:00:00.000Z
updated: 2026-05-25T16:03:26.252Z
---

# LiteLLM Integration: Using Non-OpenAI Models with Agents SDK

> Integrate Anthropic, Google, Mistral, and other LLM providers into OpenAI's Agents SDK using LiteLLM's unified interface with LitellmModel, provider prefix notation, and cross-provider tracing.

## Why Use Non-OpenAI Models with the Agents SDK

The OpenAI Agents SDK is a framework for building multi-agent systems, with structured outputs, handoffs, guardrails, and tracing built in. Sometimes, though, you need a different model provider: a contract may require Anthropic for certain workloads, a Mistral model may outperform on a specific language task, or you may want redundancy across providers for reliability.

LiteLLM provides a unified interface to 100+ LLM providers using the OpenAI API format. The Agents SDK's `LitellmModel` adapter lets you plug any LiteLLM-supported model into your agents while keeping the full SDK feature set.

## Installing LiteLLM

LiteLLM is an optional extension of the Agents SDK:

```bash
pip install "openai-agents[litellm]"
```

This installs the `LitellmModel` adapter alongside the base SDK.

## Basic LiteLLM Usage

The simplest way to use a non-OpenAI model is with the `LitellmModel` class:

```python
from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel
import asyncio

# Use Anthropic's Claude
claude_agent = Agent(
    name="ClaudeAgent",
    model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),
    instructions="You are a helpful research assistant.",
)

# Use Google's Gemini
gemini_agent = Agent(
    name="GeminiAgent",
    model=LitellmModel(model="gemini/gemini-2.5-pro"),
    instructions="You are a creative writing assistant.",
)

# Use Mistral
mistral_agent = Agent(
    name="MistralAgent",
    model=LitellmModel(model="mistral/mistral-large-latest"),
    instructions="You are a code review assistant.",
)

async def main():
    result = await Runner.run(claude_agent, input="Summarize recent advances in robotics.")
    print(result.final_output)

asyncio.run(main())
```

The provider prefix notation (`anthropic/`, `gemini/`, `mistral/`) tells LiteLLM which provider to route to. LiteLLM handles the API translation automatically.
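
To make the naming convention concrete, here is a minimal sketch of how a `provider/model` string breaks into its two parts. The `split_model_id` helper and its `default_provider` argument are hypothetical and exist only for illustration; they are not part of LiteLLM or the Agents SDK, and LiteLLM's own resolution logic is more involved.

```python
def split_model_id(model_id: str, default_provider: str = "openai") -> tuple[str, str]:
    """Split 'provider/model' into (provider, model).

    Hypothetical helper for illustration only; LiteLLM performs its own,
    more elaborate resolution internally.
    """
    # Split on the first slash only, so any later slashes stay in the model name.
    provider, sep, model = model_id.partition("/")
    if not sep:
        # No prefix: treat the whole string as a model of the default provider.
        return default_provider, model_id
    return provider, model


for model_id in (
    "anthropic/claude-sonnet-4-20250514",
    "gemini/gemini-2.5-pro",
    "mistral/mistral-large-latest",
    "gpt-4.1",
):
    provider, model = split_model_id(model_id)
    print(f"{provider:<10} {model}")
```

A string without a prefix, such as `gpt-4.1`, is treated here as an OpenAI model, which mirrors how the examples in this article pass plain model names to native SDK agents.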

## Setting Up API Keys

Each provider needs its own API key. Set them as environment variables:

```bash
# OpenAI (for native SDK agents)
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# Google
export GEMINI_API_KEY="..."

# Mistral
export MISTRAL_API_KEY="..."

# Azure OpenAI
export AZURE_API_KEY="..."
export AZURE_API_BASE="https://your-resource.openai.azure.com/"
export AZURE_API_VERSION="2024-12-01-preview"
```

LiteLLM reads these environment variables automatically, so no additional configuration is needed.
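
A missing key usually surfaces only when the first request to that provider fails, so a quick pre-flight check can save a debugging round trip. The sketch below assumes the variable names listed above; the `REQUIRED_KEYS` mapping and the `missing_keys` helper are hypothetical names introduced for this example.

```python
import os

# Assumed mapping from provider prefix to the environment variable shown above.
REQUIRED_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}


def missing_keys(providers, env=None):
    """Return the env var names that are unset or empty for the given providers."""
    env = os.environ if env is None else env
    return [REQUIRED_KEYS[p] for p in providers if not env.get(REQUIRED_KEYS[p])]


# Check only the providers this application actually uses.
missing = missing_keys(["anthropic", "gemini"])
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("All required API keys are set.")
```

If you would rather not rely on environment variables, `LitellmModel` also accepts an `api_key` argument, so a key loaded from a secrets manager can be passed in directly.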

## Mixing Providers in a Multi-Agent Workflow

The real power comes from mixing providers. Use each model where it excels:

```python
import asyncio

from agents import Agent, Runner, function_tool
from agents.extensions.models.litellm_model import LitellmModel

# Placeholder tools so the example runs; replace them with real implementations.
@function_tool
def run_linter(path: str) -> str:
    """Run the linter on a file and return its report."""
    return f"No lint issues found in {path}"

@function_tool
def run_tests(path: str) -> str:
    """Run the test suite for a path and return a summary."""
    return f"All tests passed for {path}"

@function_tool
def search_codebase(query: str) -> str:
    """Search the codebase and return matching locations."""
    return f"No matches for: {query}"

# Triage with a fast, cheap OpenAI model (native SDK)
triage_agent = Agent(
    name="TriageAgent",
    model="gpt-4.1-mini",
    instructions=(
        "Classify the request as: research, creative, code, or general. "
        "Hand off to the appropriate specialist."
    ),
)

# Research with Claude (strong at analysis and long documents)
research_agent = Agent(
    name="ResearchAgent",
    model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),
    instructions=(
        "Conduct thorough research on the given topic. "
        "Cite sources and provide balanced analysis."
    ),
)

# Creative writing with Gemini (strong generative capabilities)
creative_agent = Agent(
    name="CreativeAgent",
    model=LitellmModel(model="gemini/gemini-2.5-pro"),
    instructions="Write creative, engaging content based on the brief.",
)

# Code review with GPT-4.1 (native SDK model with code tools attached)
code_agent = Agent(
    name="CodeAgent",
    model="gpt-4.1",
    instructions="Review code, identify issues, and suggest improvements.",
    tools=[run_linter, run_tests, search_codebase],
)

# Wire up handoffs: the triage agent can delegate to any specialist
triage_agent.handoffs = [research_agent, creative_agent, code_agent]

async def main():
    result = await Runner.run(
        triage_agent,
        input="Research the latest developments in Rust async runtimes.",
    )
    print(result.final_output)

asyncio.run(main())
```

The triage agent runs on GPT-4.1-mini for speed and cost. Research goes to Claude. Creative tasks go to Gemini. Code analysis stays on GPT-4.1, chosen here for its tool-calling reliability; treat that assignment as a starting point and confirm it with your own evaluations.

## Tool Calling Across Providers

One important consideration: tool calling support varies by provider. LiteLLM translates the OpenAI tool format to each provider's native format, but some providers handle complex tool schemas better than others.

```python
from agents import Agent, function_tool
from agents.extensions.models.litellm_model import LitellmModel

@function_tool
def search_database(query: str, limit: int = 10) -> str:
    """Search the product database."""
    # Implementation here
    return f"Found {limit} results for: {query}"

@function_tool
def get_user_profile(user_id: str) -> str:
    """Retrieve a user profile by ID."""
    return f"Profile for user {user_id}: Premium tier, joined 2024"

# Claude handles tool calling well
claude_with_tools = Agent(
    name="ClaudeToolAgent",
    model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),
    instructions="Help users find products and manage their accounts.",
    tools=[search_database, get_user_profile],
)
```

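If you encounter tool calling issues with a specific provider, you can implement a fallback pattern. The example below catches `AgentsException`, the base class for SDK-level errors such as a model emitting a malformed tool call. Provider API errors (rate limits, authentication failures) are raised by the underlying client and will likely need their own `except` clause: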

```python
from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel
from agents.exceptions import AgentsException

async def run_with_fallback(agent_input: str, tools: list):
    """Try the primary provider, fall back to OpenAI if tool calling fails."""
    primary = Agent(
        name="PrimaryAgent",
        model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),
        instructions="Process the request using available tools.",
        tools=tools,
    )

    fallback = Agent(
        name="FallbackAgent",
        model="gpt-4.1",
        instructions="Process the request using available tools.",
        tools=tools,
    )

    try:
        result = await Runner.run(primary, input=agent_input)
        return result.final_output
    except AgentsException:
        result = await Runner.run(fallback, input=agent_input)
        return result.final_output
```

## Tracing Across Providers

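Tracing works across providers: the Agents SDK captures spans regardless of which model backend is used. Traces are exported to OpenAI by default, so an OpenAI API key must still be available even when every agent runs on another provider (set one with `set_tracing_export_api_key`, or turn tracing off with `set_tracing_disabled(True)`). A traced multi-provider workflow looks like this: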

```python
from agents import Agent, Runner, trace
from agents.extensions.models.litellm_model import LitellmModel

async def multi_provider_workflow(query: str):
    with trace(workflow_name="multi-provider-research"):
        # Step 1: Classify with GPT-4.1-mini
        classifier = Agent(
            name="Classifier",
            model="gpt-4.1-mini",
            instructions="Classify the query topic into one category.",
        )
        classification = await Runner.run(classifier, input=query)

        # Step 2: Research with Claude
        researcher = Agent(
            name="Researcher",
            model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),
            instructions="Research the topic thoroughly.",
        )
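        # Pass the classifier's category along so step 1 informs the research
        research = await Runner.run(
            researcher,
            input=f"Category: {classification.final_output}\nQuery: {query}",
        )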

        # Step 3: Synthesize with GPT-5
        synthesizer = Agent(
            name="Synthesizer",
            model="gpt-5",
            instructions="Synthesize the research into a clear summary.",
        )
        result = await Runner.run(
            synthesizer,
            input=f"Research findings: {research.final_output}",
        )
        return result.final_output
```

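The trace in the OpenAI dashboard shows all three agent spans with their respective models and latency, giving you a complete picture of the cross-provider workflow. Token usage for LiteLLM-backed agents may only appear when usage reporting is enabled, for example with `ModelSettings(include_usage=True)`.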

## Cost Comparison Across Providers

Track costs across providers to optimize your model mix:

```python
# Prices in USD per 1M tokens; verify against each provider's current pricing.
PROVIDER_PRICING = {
    "gpt-4.1": {"input": 2.00, "output": 8.00},
    "gpt-4.1-mini": {"input": 0.40, "output": 1.60},
    "anthropic/claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
    "gemini/gemini-2.5-pro": {"input": 1.25, "output": 10.00},
    "mistral/mistral-large-latest": {"input": 2.00, "output": 6.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    # Unknown models fall back to a deliberately conservative default price.
    pricing = PROVIDER_PRICING.get(model, {"input": 5.0, "output": 15.0})
    return (
        (input_tokens / 1_000_000) * pricing["input"] +
        (output_tokens / 1_000_000) * pricing["output"]
    )
```
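
As a worked example, the sketch below ranks the five models for a single request of 10,000 input tokens and 2,000 output tokens. It copies the per-million-token prices from the table above into its own `PRICES` dictionary so that it runs standalone; `cost_usd` and `rank_by_cost` are hypothetical helper names, and the prices themselves should be checked against current provider pricing.

```python
# Prices in USD per 1M tokens, copied from the table above for a standalone example.
PRICES = {
    "gpt-4.1": {"input": 2.00, "output": 8.00},
    "gpt-4.1-mini": {"input": 0.40, "output": 1.60},
    "anthropic/claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
    "gemini/gemini-2.5-pro": {"input": 1.25, "output": 10.00},
    "mistral/mistral-large-latest": {"input": 2.00, "output": 6.00},
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request in USD, given token counts and per-1M-token prices."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = [(model, cost_usd(model, input_tokens, output_tokens)) for model in PRICES]
    return sorted(costs, key=lambda pair: pair[1])


for model, cost in rank_by_cost(10_000, 2_000):
    print(f"{model:<40} ${cost:.4f}")
```

For this workload the output-heavy pricing of some models matters: a model with a cheaper input rate can still cost more overall once output tokens are counted, which is why the ranking should be recomputed for your own input/output ratio.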

LiteLLM integration turns the Agents SDK from an OpenAI-only framework into a provider-agnostic agent platform. Use it to leverage each provider's strengths, build redundancy into your systems, and route each task to the most cost-effective model. The key is to measure: run evaluations across providers for your specific use cases and let the data drive your model selection.

