Building Production Chat Agents with OpenAI Agents SDK
Learn how to build production-grade chat agents using the OpenAI Agents SDK with tool integration, session management, resilient error handling, and a FastAPI backend architecture.
From Chatbot to Chat Agent
Most chat interfaces built on top of LLMs are simple request-response wrappers — the user sends a message, the API returns a completion, and the frontend displays it. These are chatbots, not agents. A chat agent is fundamentally different: it can reason about which tools to use, execute multi-step plans, hand off to specialized sub-agents, and maintain state across a conversation session.
The OpenAI Agents SDK provides the primitives to build chat agents that go beyond text generation. In this guide, we build a production chat agent from scratch using the Agents SDK, FastAPI for the backend, and proper session management for multi-user deployments.
Chat Agent Architecture
A production chat agent has four components:
- Agent Definition — the instructions, model, and tools that define the agent's behavior
- Session Layer — tracks conversation history and state per user
- API Layer — FastAPI endpoints that accept messages and return responses
- Tool Layer — functions the agent can call to interact with external systems
┌─────────────┐ HTTP ┌──────────────────┐
│ React Chat │◄────────────────────►│ FastAPI Server │
│ Frontend │ │ │
└─────────────┘ │ ┌─────────────┐ │
│ │ Session Mgr │ │
│ └──────┬──────┘ │
│ │ │
│ ┌──────▼──────┐ │
│ │ Chat Agent │ │
│ │ + Tools │ │
│ └──────┬──────┘ │
│ │ │
│ ┌──────▼──────┐ │
│ │ OpenAI API │ │
│ └─────────────┘ │
└──────────────────┘
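Before wiring in any framework, the request flow implied by the diagram can be sketched as plain Python. All names here are illustrative only; the real implementations follow in the next steps.

```python
# Per-request flow from the architecture diagram, with the agent call
# stubbed out. Names are illustrative, not the actual API.
def handle_chat(session_id, message, sessions, run_agent):
    history = sessions.setdefault(session_id, [])       # Session Layer
    history.append({"role": "user", "content": message})
    reply = run_agent(history)                          # Agent + Tool Layer
    history.append({"role": "assistant", "content": reply})
    return reply                                        # API Layer returns this

sessions = {}
print(handle_chat("s1", "Hi!", sessions, lambda h: "Hello!"))  # Hello!
print(len(sessions["s1"]))  # 2
```

Everything that follows is a production-grade version of this loop: the session dict becomes a managed store with expiry, and the stub becomes a real agent run.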
Step 1: Define the Chat Agent
The agent definition is the core of the system. It specifies the model, system instructions, and available tools.
# support_agent.py
# Keep this module at the top level: a local package named "agents"
# would shadow the OpenAI Agents SDK import below.
from agents import Agent, function_tool
import httpx


@function_tool
async def search_knowledge_base(query: str) -> str:
    """Search the company knowledge base for relevant articles."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "http://localhost:8000/internal/search",
            json={"query": query, "limit": 3},
        )
        results = resp.json()
    if not results["articles"]:
        return "No relevant articles found."
    formatted = []
    for article in results["articles"]:
        formatted.append(
            f"**{article['title']}**\n{article['snippet']}"
        )
    return "\n\n".join(formatted)


@function_tool
async def create_support_ticket(
    subject: str,
    description: str,
    priority: str = "medium",
) -> str:
    """Create a support ticket when the issue cannot be resolved in chat."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "http://localhost:8000/internal/tickets",
            json={
                "subject": subject,
                "description": description,
                "priority": priority,
            },
        )
        ticket = resp.json()
    return f"Ticket {ticket['id']} created with priority {priority}."


@function_tool
def get_business_hours() -> str:
    """Return current business hours and support availability."""
    return (
        "Business hours: Monday-Friday 9 AM - 6 PM EST. "
        "Live agent support is available during business hours. "
        "Chat agent support is available 24/7."
    )


support_agent = Agent(
    name="support_agent",
    model="gpt-4o",
    instructions="""You are a helpful customer support agent for Acme Corp.

Your responsibilities:
- Answer questions about products and services using the knowledge base
- Help troubleshoot common issues
- Create support tickets for issues you cannot resolve
- Provide business hours and availability information

Guidelines:
- Always search the knowledge base before answering product questions
- Be concise but thorough in your responses
- If you cannot resolve an issue, create a ticket and let the user know
- Never make up information — if you do not know, say so
- Use a friendly, professional tone""",
    tools=[search_knowledge_base, create_support_ticket, get_business_hours],
)
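Under the hood, @function_tool turns each function's signature and docstring into a JSON schema the model can see when deciding which tool to call. The following is a rough stdlib approximation of that idea; tool_schema is a hypothetical helper for illustration, not the SDK's actual code.

```python
import inspect

def tool_schema(fn):
    """Approximate what @function_tool derives from a function:
    name, docstring description, and required parameters."""
    sig = inspect.signature(fn)
    properties, required = {}, []
    for name, param in sig.parameters.items():
        properties[name] = {"type": "string"}  # simplified: every param as a string
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the model must supply it
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }

def create_support_ticket(subject: str, description: str, priority: str = "medium") -> str:
    """Create a support ticket when the issue cannot be resolved in chat."""

schema = tool_schema(create_support_ticket)
print(schema["name"])                    # create_support_ticket
print(schema["parameters"]["required"])  # ['subject', 'description']
```

This is why clear docstrings and sensible defaults matter: they are not just documentation, they are the interface the model reasons over.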
Step 2: Build the Session Manager
In production, multiple users chat simultaneously. Each conversation needs its own history. The session manager stores conversation state and converts it to the format the Agents SDK expects.
# session_manager.py
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChatMessage:
    role: str  # "user" or "assistant"
    content: str
    timestamp: float = field(default_factory=time.time)


@dataclass
class ChatSession:
    session_id: str
    messages: list[ChatMessage] = field(default_factory=list)
    created_at: float = field(default_factory=time.time)
    last_active: float = field(default_factory=time.time)
    result: Optional[object] = None  # stores last Runner result

    def add_message(self, role: str, content: str):
        self.messages.append(ChatMessage(role=role, content=content))
        self.last_active = time.time()

    def to_input_list(self) -> list[dict]:
        """Convert session history to Agents SDK input format."""
        if self.result is not None:
            return self.result.to_input_list()
        return [
            {"role": msg.role, "content": msg.content}
            for msg in self.messages
        ]


class SessionManager:
    def __init__(self, max_sessions: int = 10000, ttl_seconds: int = 3600):
        self._sessions: dict[str, ChatSession] = {}
        self._max_sessions = max_sessions  # capacity bound; enforce with LRU eviction if needed
        self._ttl = ttl_seconds

    def get_or_create(self, session_id: str) -> ChatSession:
        self._cleanup_expired()
        if session_id not in self._sessions:
            self._sessions[session_id] = ChatSession(session_id=session_id)
        return self._sessions[session_id]

    def _cleanup_expired(self):
        now = time.time()
        expired = [
            sid for sid, s in self._sessions.items()
            if now - s.last_active > self._ttl
        ]
        for sid in expired:
            del self._sessions[sid]
The key method is to_input_list(). When a previous Runner.run() result exists, we call result.to_input_list() to get the full conversation history including tool calls and their results. This preserves the agent's complete context across turns.
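The lazy TTL eviction in _cleanup_expired can be exercised in isolation with a minimal stand-in. This is a simplified mirror of the SessionManager logic above, with the clock passed in explicitly so the behavior is deterministic.

```python
class TTLStore:
    """Simplified mirror of SessionManager's TTL eviction: expired
    entries are dropped lazily on the next access."""
    def __init__(self, ttl_seconds):
        self._last_active = {}
        self._ttl = ttl_seconds

    def touch(self, session_id, now):
        # Evict any session idle longer than the TTL, then record activity.
        expired = [
            sid for sid, t in self._last_active.items()
            if now - t > self._ttl
        ]
        for sid in expired:
            del self._last_active[sid]
        self._last_active[session_id] = now

store = TTLStore(ttl_seconds=10)
store.touch("a", now=0)
store.touch("b", now=5)
store.touch("b", now=12)           # "a" has been idle 12s > 10s, so it is evicted
print(sorted(store._last_active))  # ['b']
```

Because cleanup only runs on access, memory for stale sessions is reclaimed on the next request rather than by a background task, which keeps the implementation dependency-free at the cost of slightly delayed reclamation.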
Step 3: FastAPI Integration
The API layer connects the frontend to the agent. It handles session routing, input validation, and response formatting.
# main.py
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from agents import Runner

from support_agent import support_agent
from session_manager import SessionManager

app = FastAPI(title="Chat Agent API")
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # lock this down to your frontend origin in production
    allow_methods=["*"],
    allow_headers=["*"],
)
sessions = SessionManager()


class ChatRequest(BaseModel):
    session_id: str
    message: str


class ChatResponse(BaseModel):
    session_id: str
    response: str
    tools_used: list[str]


@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    if not request.message.strip():
        raise HTTPException(status_code=422, detail="Message cannot be empty")
    session = sessions.get_or_create(request.session_id)
    # Build the model input first: the stored result already carries the
    # prior turns (including tool calls), so only the new user message is
    # appended on top of it.
    input_list = session.to_input_list() + [
        {"role": "user", "content": request.message}
    ]
    session.add_message("user", request.message)
    result = await Runner.run(
        support_agent,
        input=input_list,
    )
    # Store the result so the next turn can recover full context
    session.result = result
    session.add_message("assistant", result.final_output)
    # Extract the names of any tools the agent called during this run
    tools_used = []
    for item in result.new_items:
        if getattr(item, "type", None) == "tool_call_item":
            tools_used.append(item.raw_item.name)
    return ChatResponse(
        session_id=request.session_id,
        response=result.final_output,
        tools_used=tools_used,
    )


@app.delete("/chat/{session_id}")
async def end_session(session_id: str):
    sessions._sessions.pop(session_id, None)
    return {"status": "session_ended"}
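The run items the SDK returns are typed objects: tool calls are distinguished by a type field and carry the called function's name on raw_item.name (shapes assumed from the Agents SDK). The filtering logic can be checked with simple stand-ins:

```python
from types import SimpleNamespace

# Stand-ins mimicking the shape of result.new_items: one tool call,
# one tool output, one final message.
new_items = [
    SimpleNamespace(type="tool_call_item",
                    raw_item=SimpleNamespace(name="search_knowledge_base")),
    SimpleNamespace(type="tool_call_output_item", raw_item=None),
    SimpleNamespace(type="message_output_item", raw_item=None),
]

# Keep only tool-call items; everything else is skipped.
tools_used = [
    item.raw_item.name
    for item in new_items
    if getattr(item, "type", None) == "tool_call_item"
]
print(tools_used)  # ['search_knowledge_base']
```

Surfacing tools_used in the API response is cheap and makes frontend debugging much easier: you can show a "searched the knowledge base" indicator, or log which tools correlate with escalations.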
Step 4: Error Handling and Resilience
Production chat agents must handle failures gracefully. Network errors, API rate limits, and tool failures should not crash the server or leave the user hanging.
# error_handling.py
import logging

from agents import Runner
from agents.exceptions import MaxTurnsExceeded, ModelBehaviorError

logger = logging.getLogger(__name__)

FALLBACK_MESSAGE = (
    "I apologize, but I am experiencing a temporary issue. "
    "Please try again in a moment, or I can create a support "
    "ticket for you."
)


async def safe_agent_run(agent, input_list, max_retries=2):
    """Run the agent with error handling and retry logic.

    Returns the run result, or None when the caller should fall
    back to FALLBACK_MESSAGE.
    """
    for attempt in range(max_retries + 1):
        try:
            return await Runner.run(
                agent,
                input=input_list,
                max_turns=15,
            )
        except MaxTurnsExceeded:
            # The agent looped too long; retrying will not help
            logger.warning("Agent exceeded max turns")
            return None
        except ModelBehaviorError as e:
            logger.error(f"Model behavior error: {e}")
            if attempt < max_retries:
                continue
            return None
        except Exception:
            logger.exception(f"Unexpected error on attempt {attempt + 1}")
            if attempt < max_retries:
                continue
            return None
    return None
Integrate the safe runner into the chat endpoint by replacing the direct Runner.run() call with safe_agent_run(). When it returns None, respond with the fallback message and offer to create a support ticket.
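The wiring itself is a small mapping from a possibly-None result to the reply text. Here is a self-contained sketch of that fallback path, with respond as an illustrative name and a stand-in object in place of a real run result:

```python
import asyncio
from types import SimpleNamespace

FALLBACK_MESSAGE = (
    "I apologize, but I am experiencing a temporary issue. "
    "Please try again in a moment, or I can create a support ticket for you."
)

async def respond(run_result):
    """Map a safe_agent_run-style result to the reply text:
    None means the run failed, so return the fallback."""
    if run_result is None:
        return FALLBACK_MESSAGE
    return run_result.final_output

print(asyncio.run(respond(SimpleNamespace(final_output="All set!"))))   # All set!
print(asyncio.run(respond(None)) == FALLBACK_MESSAGE)                   # True
```

The important property is that the endpoint always returns a well-formed ChatResponse; failure is expressed as content, never as a 500 that strands the chat UI.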
Step 5: Observability and Logging
Every production chat agent needs structured logging that captures the full lifecycle of each request — from user message to tool calls to final response. This is essential for debugging issues, measuring quality, and understanding user behavior.
# middleware.py
import logging
import time
import uuid

from fastapi import Request

logger = logging.getLogger("chat_agent")


async def log_chat_request(request: Request, call_next):
    """Log one structured line per request.

    Register in main.py with: app.middleware("http")(log_chat_request)
    """
    request_id = str(uuid.uuid4())[:8]
    start_time = time.time()
    response = await call_next(request)
    duration_ms = (time.time() - start_time) * 1000
    logger.info(
        "chat_request",
        extra={
            "request_id": request_id,
            "method": request.method,
            "path": request.url.path,
            "status": response.status_code,
            "duration_ms": round(duration_ms, 2),
        },
    )
    return response
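The fields passed via extra= only reach your logs if the formatter emits them; the default formatter silently drops them. A minimal stdlib JSON formatter handles this (one sketch among many ways to do structured logging):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON line, including the request
    fields attached via logger.info(..., extra={...})."""
    FIELDS = ("request_id", "method", "path", "status", "duration_ms")

    def format(self, record):
        payload = {"level": record.levelname, "message": record.getMessage()}
        for name in self.FIELDS:
            # extra= values are set as attributes on the record
            if hasattr(record, name):
                payload[name] = getattr(record, name)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("chat_agent")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("chat_request", extra={"request_id": "ab12cd34", "status": 200})
```

One JSON object per line is trivially ingestible by log aggregators, and the request_id field lets you correlate a user complaint with the exact tool calls and latencies of that request.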
The combination of structured agent definitions, proper session management, resilient error handling, and observability gives you a chat agent that can serve real users at scale. The Agents SDK handles the complex orchestration of tool calling and multi-turn reasoning, while the FastAPI layer provides the production infrastructure around it.
Written by
CallSphere Team