
The Chain of Responsibility Pattern: Cascading Agent Attempts Until Success

Implement the Chain of Responsibility pattern for AI agents with fallback chains, capability matching, and cost-optimized ordering to handle requests efficiently.

What Is the Chain of Responsibility?

The Chain of Responsibility pattern passes a request along a chain of handlers. Each handler examines the request and either processes it or passes it to the next handler in the chain. The request travels down the chain until a handler successfully processes it, or the chain is exhausted.

In AI agent systems, this pattern is invaluable for building fallback chains. You might try a fast, cheap model first, fall back to a more capable model if the first one fails, and escalate to a specialized agent or human as a last resort. Each link in the chain can also check whether it has the right capabilities before attempting to handle the request.

Core Implementation

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Request:
    content: str
    required_capabilities: set[str]
    metadata: dict


@dataclass
class Response:
    content: str
    handler_name: str
    success: bool
    cost: float  # estimated cost in USD


class AgentHandler(ABC):
    def __init__(self, name: str, capabilities: set[str],
                 cost_per_call: float):
        self.name = name
        self.capabilities = capabilities
        self.cost_per_call = cost_per_call
        self._next: AgentHandler | None = None

    def set_next(self, handler: "AgentHandler") -> "AgentHandler":
        self._next = handler
        return handler

    def can_handle(self, request: Request) -> bool:
        return request.required_capabilities.issubset(
            self.capabilities
        )

    def handle(self, request: Request) -> Response | None:
        if self.can_handle(request):
            try:
                result = self.process(request)
                if result.success:
                    return result
            except Exception as e:
                print(f"{self.name} failed: {e}")

        if self._next:
            print(f"{self.name} passing to {self._next.name}")
            return self._next.handle(request)

        return None

    @abstractmethod
    def process(self, request: Request) -> Response:
        pass

Building Concrete Handlers

import openai


class LightweightAgent(AgentHandler):
    def __init__(self):
        super().__init__(
            name="GPT-4o-mini",
            capabilities={"text_generation", "summarization",
                          "classification"},
            cost_per_call=0.001,
        )
        self.client = openai.OpenAI()

    def process(self, request: Request) -> Response:
        response = self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": request.content}],
        )
        content = response.choices[0].message.content or ""
        # Simple quality check: treat empty or very short replies as failures
        if len(content) < 20:
            return Response(content, self.name, success=False,
                            cost=self.cost_per_call)
        return Response(content, self.name, success=True,
                        cost=self.cost_per_call)


class PowerfulAgent(AgentHandler):
    def __init__(self):
        super().__init__(
            name="GPT-4o",
            capabilities={"text_generation", "summarization",
                          "classification", "reasoning",
                          "code_generation"},
            cost_per_call=0.01,
        )
        self.client = openai.OpenAI()

    def process(self, request: Request) -> Response:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": request.content}],
        )
        return Response(
            response.choices[0].message.content,
            self.name, success=True,
            cost=self.cost_per_call,
        )


class HumanEscalation(AgentHandler):
    def __init__(self):
        super().__init__(
            name="Human Reviewer",
            capabilities={"text_generation", "summarization",
                          "classification", "reasoning",
                          "code_generation", "human_judgment"},
            cost_per_call=5.0,
        )

    def process(self, request: Request) -> Response:
        # In production, this would create a ticket or send
        # a notification to a human review queue
        return Response(
            content="[Escalated to human review queue]",
            handler_name=self.name,
            success=True,
            cost=self.cost_per_call,
        )

Assembling the Chain

def build_cost_optimized_chain() -> AgentHandler:
    lightweight = LightweightAgent()
    powerful = PowerfulAgent()
    human = HumanEscalation()

    # Chain: cheap -> expensive -> human
    lightweight.set_next(powerful)
    powerful.set_next(human)

    return lightweight


chain = build_cost_optimized_chain()

# Simple request — handled by lightweight agent
simple = Request(
    content="Summarize this paragraph in one sentence.",
    required_capabilities={"summarization"},
    metadata={},
)
result = chain.handle(simple)
print(f"Handled by: {result.handler_name}, Cost: ${result.cost}")

# Complex request — needs reasoning, skips to powerful agent
complex_req = Request(
    content="Analyze the time complexity of this algorithm.",
    required_capabilities={"reasoning", "code_generation"},
    metadata={},
)
result = chain.handle(complex_req)
print(f"Handled by: {result.handler_name}, Cost: ${result.cost}")

The capability check in can_handle means the chain intelligently skips handlers that lack the required capabilities, so a request needing reasoning jumps straight to GPT-4o without wasting a call on GPT-4o-mini.



FAQ

How do I order the handlers for cost efficiency?

Place the cheapest handler first and the most expensive last. This ensures simple requests are handled cheaply while complex requests still get resolved. Track the percentage of requests handled at each level to monitor whether your chain ordering is optimal.
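One lightweight way to get that visibility is a counter keyed by handler name. This is a sketch under the article's `Response` shape (`handler_name`, `cost`); `ChainMetrics` is a hypothetical helper, and the 90/10 split below is simulated data, not a measurement:

```python
from collections import Counter
from types import SimpleNamespace


class ChainMetrics:
    """Tracks which handler resolved each request and the spend per handler."""

    def __init__(self):
        self.resolved_by = Counter()
        self.cost_by = Counter()

    def record(self, response):
        # response exposes .handler_name and .cost, matching the
        # Response dataclass in the article
        self.resolved_by[response.handler_name] += 1
        self.cost_by[response.handler_name] += response.cost

    def report(self):
        """Fraction of requests resolved at each level of the chain."""
        total = sum(self.resolved_by.values()) or 1
        return {name: round(count / total, 3)
                for name, count in self.resolved_by.items()}


# Simulated outcomes: 90 cheap resolutions, 10 escalations.
m = ChainMetrics()
for _ in range(90):
    m.record(SimpleNamespace(handler_name="GPT-4o-mini", cost=0.001))
for _ in range(10):
    m.record(SimpleNamespace(handler_name="GPT-4o", cost=0.01))

print(m.report())  # {'GPT-4o-mini': 0.9, 'GPT-4o': 0.1}
```

If the first handler is resolving only a small fraction of traffic, it is adding latency and cost without payoff and probably belongs later in the chain, or nowhere.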

What if I want to try all handlers and pick the best result?

That is a different pattern — closer to Map-Reduce or an ensemble. The Chain of Responsibility is specifically designed for "first success wins" semantics. If you need to compare outputs from multiple agents, use a fan-out approach and a separate evaluator to pick the best.
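A minimal fan-out sketch looks like this: every handler is called in parallel, and a scoring function picks the winner. The stub handlers and the length-based `score` heuristic here are purely illustrative:

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(handlers, request, score):
    """Call every handler, then let `score` pick the best successful response."""
    with ThreadPoolExecutor() as pool:
        responses = list(pool.map(lambda h: h(request), handlers))
    successes = [r for r in responses if r["success"]]
    return max(successes, key=score) if successes else None


# Stub handlers standing in for agents; responses are plain dicts here.
handlers = [
    lambda req: {"content": "short", "success": True},
    lambda req: {"content": "a much longer, detailed answer", "success": True},
    lambda req: {"content": "", "success": False},
]

# Toy evaluator: prefer longer answers. In practice this would be an
# LLM judge or a task-specific quality metric.
best = fan_out(handlers, "question", score=lambda r: len(r["content"]))
print(best["content"])  # a much longer, detailed answer
```

Note the cost profile is inverted relative to the chain: fan-out pays for every handler on every request, which is only worth it when answer quality matters more than spend.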

How do I handle the case where no handler in the chain can process a request?

The handle method returns None when the chain is exhausted. Wrap the chain call in logic that detects this and returns a graceful error to the user, such as "We could not process your request. A support ticket has been created."
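That wrapper can be a few lines. In this sketch, `ExhaustedChain` is a stub standing in for a chain whose every handler declined, and `handle_with_fallback` is a hypothetical helper name:

```python
class ExhaustedChain:
    """Stub whose handle() always returns None, like a fully exhausted chain."""

    def handle(self, request):
        return None


def handle_with_fallback(chain, request, fallback_message):
    """Run the chain; turn an exhausted chain (None) into a graceful reply."""
    result = chain.handle(request)
    if result is None:
        # In production you might also open a support ticket or page
        # an on-call reviewer here.
        return {"success": False, "message": fallback_message}
    return {"success": True, "message": result.content}


reply = handle_with_fallback(
    ExhaustedChain(),
    request="anything",
    fallback_message="We could not process your request. "
                     "A support ticket has been created.",
)
print(reply["success"])  # False
```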


#AgentDesignPatterns #ChainOfResponsibility #Python #AgenticAI #FaultTolerance #LearnAI #AIEngineering

Written by

CallSphere Team

