LiteLLM Integration: Using Non-OpenAI Models with Agents SDK
Integrate Anthropic, Google, Mistral, and other LLM providers into OpenAI's Agents SDK using LiteLLM's unified interface with LitellmModel, provider prefix notation, and cross-provider tracing.
Why Use Non-OpenAI Models with the Agents SDK
The OpenAI Agents SDK provides an excellent framework for building multi-agent systems — structured outputs, handoffs, guardrails, and tracing. But sometimes you need a different model provider. Maybe your contract requires using Anthropic for certain workloads. Maybe a Mistral model outperforms on a specific language task. Maybe you want redundancy across providers for reliability.
LiteLLM provides a unified interface to 100+ LLM providers using the OpenAI API format. The Agents SDK's LitellmModel adapter lets you plug any LiteLLM-supported model into your agents while keeping the full SDK feature set.
Installing LiteLLM
LiteLLM support ships as an optional dependency of the Agents SDK:
pip install "openai-agents[litellm]"
This installs the LitellmModel adapter alongside the base SDK.
Basic LiteLLM Usage
The simplest way to use a non-OpenAI model is with the LitellmModel class:
from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel
import asyncio
# Use Anthropic's Claude
claude_agent = Agent(
name="ClaudeAgent",
model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),
instructions="You are a helpful research assistant.",
)
# Use Google's Gemini
gemini_agent = Agent(
name="GeminiAgent",
model=LitellmModel(model="gemini/gemini-2.5-pro"),
instructions="You are a creative writing assistant.",
)
# Use Mistral
mistral_agent = Agent(
name="MistralAgent",
model=LitellmModel(model="mistral/mistral-large-latest"),
instructions="You are a code review assistant.",
)
async def main():
result = await Runner.run(claude_agent, input="Summarize recent advances in robotics.")
print(result.final_output)
asyncio.run(main())
The provider prefix notation (anthropic/, gemini/, mistral/) tells LiteLLM which provider to route to. LiteLLM handles the API translation automatically.
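The convention is easy to see in plain Python. The `split_provider` helper below is purely illustrative (it is not a LiteLLM API); it mirrors how a `provider/model` string divides at the first slash, with bare model names treated as OpenAI models:

```python
def split_provider(model: str) -> tuple[str, str]:
    """Split a LiteLLM-style model string into (provider, model_name).

    A missing prefix defaults to "openai", mirroring how bare model
    names are treated as OpenAI models.
    """
    provider, sep, name = model.partition("/")
    if not sep:
        return ("openai", model)
    return (provider, name)

print(split_provider("anthropic/claude-sonnet-4-20250514"))
print(split_provider("gpt-4.1-mini"))
```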
Setting Up API Keys
Each provider needs its own API key. Set them as environment variables:
# OpenAI (for native SDK agents)
export OPENAI_API_KEY="sk-..."
# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
# Google
export GEMINI_API_KEY="..."
# Mistral
export MISTRAL_API_KEY="..."
# Azure OpenAI
export AZURE_API_KEY="..."
export AZURE_API_BASE="https://your-resource.openai.azure.com/"
export AZURE_API_VERSION="2024-12-01-preview"
LiteLLM reads these environment variables automatically — no additional configuration needed.
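A small preflight check can catch a missing key before an agent run fails mid-workflow. This is a sketch, not an SDK feature; adjust the variable names to the providers you actually use:

```python
import os

# Environment variable expected for each provider prefix (extend as needed)
REQUIRED_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}

def missing_keys(providers: list[str]) -> list[str]:
    """Return the environment variable names that are not set."""
    return [
        REQUIRED_KEYS[p]
        for p in providers
        if p in REQUIRED_KEYS and not os.environ.get(REQUIRED_KEYS[p])
    ]

# Example: warn before launching a Claude + Gemini workflow
for var in missing_keys(["anthropic", "gemini"]):
    print(f"Warning: {var} is not set")
```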
Mixing Providers in a Multi-Agent Workflow
The real power comes from mixing providers. Use each model where it excels:
from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel
# Triage with a fast, cheap OpenAI model (native SDK)
triage_agent = Agent(
name="TriageAgent",
model="gpt-4.1-mini",
instructions=(
"Classify the request as: research, creative, code, or general. "
"Hand off to the appropriate specialist."
),
)
# Research with Claude (strong at analysis and long documents)
research_agent = Agent(
name="ResearchAgent",
model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),
instructions=(
"Conduct thorough research on the given topic. "
"Cite sources and provide balanced analysis."
),
)
# Creative writing with Gemini (strong generative capabilities)
creative_agent = Agent(
name="CreativeAgent",
model=LitellmModel(model="gemini/gemini-2.5-pro"),
instructions="Write creative, engaging content based on the brief.",
)
# Code review with GPT-4.1 (best tool-calling for code tools)
code_agent = Agent(
name="CodeAgent",
model="gpt-4.1",
instructions="Review code, identify issues, and suggest improvements.",
tools=[run_linter, run_tests, search_codebase],
)
# Wire up handoffs
triage_agent.handoffs = [research_agent, creative_agent, code_agent]
import asyncio

async def main():
    result = await Runner.run(
        triage_agent,
        input="Research the latest developments in Rust async runtimes.",
    )
    print(result.final_output)

asyncio.run(main())
The triage agent runs on GPT-4.1-mini for speed and cost. Research goes to Claude. Creative tasks go to Gemini. Code analysis stays on GPT-4.1 because it has the best tool-calling reliability.
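When you don't need LLM-driven handoffs, the same routing can be expressed as a plain dispatch table. The sketch below is ordinary Python, not an SDK feature; the category labels and model strings mirror the workflow above:

```python
# Deterministic alternative to LLM handoffs: route by category label.
ROUTES = {
    "research": "anthropic/claude-sonnet-4-20250514",
    "creative": "gemini/gemini-2.5-pro",
    "code": "gpt-4.1",
}

def pick_model(category: str, default: str = "gpt-4.1-mini") -> str:
    """Return the model for a category, falling back to the cheap default."""
    return ROUTES.get(category.strip().lower(), default)

print(pick_model("research"))  # routes to Claude
print(pick_model("general"))   # falls back to the cheap default
```

In practice you would feed `pick_model`'s result into `LitellmModel` (or use the string directly for native OpenAI models) when constructing the specialist agent.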
Tool Calling Across Providers
One important consideration: tool calling support varies by provider. LiteLLM translates the OpenAI tool format to each provider's native format, but some providers handle complex tool schemas better than others.
from agents import Agent, function_tool
from agents.extensions.models.litellm_model import LitellmModel
@function_tool
def search_database(query: str, limit: int = 10) -> str:
"""Search the product database."""
# Implementation here
return f"Found {limit} results for: {query}"
@function_tool
def get_user_profile(user_id: str) -> str:
"""Retrieve a user profile by ID."""
return f"Profile for user {user_id}: Premium tier, joined 2024"
# Claude handles tool calling well
claude_with_tools = Agent(
name="ClaudeToolAgent",
model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),
instructions="Help users find products and manage their accounts.",
tools=[search_database, get_user_profile],
)
If you encounter tool calling issues with a specific provider, you can implement a fallback pattern:
from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel
from agents.exceptions import AgentsException
async def run_with_fallback(agent_input: str, tools: list):
"""Try the primary provider, fall back to OpenAI if tool calling fails."""
primary = Agent(
name="PrimaryAgent",
model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),
instructions="Process the request using available tools.",
tools=tools,
)
fallback = Agent(
name="FallbackAgent",
model="gpt-4.1",
instructions="Process the request using available tools.",
tools=tools,
)
try:
result = await Runner.run(primary, input=agent_input)
return result.final_output
except AgentsException:
result = await Runner.run(fallback, input=agent_input)
return result.final_output
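The try-then-fall-back idea generalizes to any ordered list of providers. The sketch below is provider-agnostic plain Python; the `runners` argument stands in for async functions that invoke your agents, and the stub runners are hypothetical placeholders:

```python
import asyncio

async def run_with_fallbacks(runners, agent_input: str) -> str:
    """Try each runner in order; return the first successful output.

    `runners` is a list of async callables taking the input string.
    Re-raises the last error if every provider fails.
    """
    last_error = None
    for run in runners:
        try:
            return await run(agent_input)
        except Exception as exc:  # with the SDK, catch AgentsException instead
            last_error = exc
    raise last_error

# Demo with stubs standing in for real agent calls
async def flaky_primary(text: str) -> str:
    raise RuntimeError("provider unavailable")

async def stable_fallback(text: str) -> str:
    return f"handled: {text}"

print(asyncio.run(run_with_fallbacks([flaky_primary, stable_fallback], "hi")))
# → handled: hi
```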
Tracing Across Providers
Tracing works seamlessly across providers. The Agents SDK trace captures spans regardless of which model backend is used:
from agents import Agent, Runner, trace
from agents.extensions.models.litellm_model import LitellmModel
async def multi_provider_workflow(query: str):
with trace(workflow_name="multi-provider-research"):
# Step 1: Classify with GPT-4.1-mini
classifier = Agent(
name="Classifier",
model="gpt-4.1-mini",
instructions="Classify the query topic into one category.",
)
classification = await Runner.run(classifier, input=query)
# Step 2: Research with Claude
researcher = Agent(
name="Researcher",
model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),
instructions="Research the topic thoroughly.",
)
research = await Runner.run(researcher, input=query)
# Step 3: Synthesize with GPT-5
synthesizer = Agent(
name="Synthesizer",
model="gpt-5",
instructions="Synthesize the research into a clear summary.",
)
result = await Runner.run(
synthesizer,
input=f"Research findings: {research.final_output}",
)
return result.final_output
The trace in the OpenAI dashboard shows all three agent spans with their respective models, token usage, and latency — giving you a complete picture of the cross-provider workflow.
Cost Comparison Across Providers
Track costs across providers to optimize your model mix:
PROVIDER_PRICING = {
"gpt-4.1": {"input": 2.00, "output": 8.00},
"gpt-4.1-mini": {"input": 0.40, "output": 1.60},
"anthropic/claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
"gemini/gemini-2.5-pro": {"input": 1.25, "output": 10.00},
"mistral/mistral-large-latest": {"input": 2.00, "output": 6.00},
}
def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
pricing = PROVIDER_PRICING.get(model, {"input": 5.0, "output": 15.0})
return (
(input_tokens / 1_000_000) * pricing["input"] +
(output_tokens / 1_000_000) * pricing["output"]
)
LiteLLM integration transforms the Agents SDK from an OpenAI-only framework into a truly provider-agnostic agent platform. Use it to leverage each provider's strengths, build redundancy into your systems, and optimize costs by routing to the most cost-effective model for each task. The key is to measure — run evaluations across providers for your specific use cases and let the data drive your model selection.
Written by
CallSphere Team
Expert insights on AI voice agents and customer communication automation.