---
title: "MRKL Architecture: Modular Reasoning, Knowledge, and Language for Expert Systems"
description: "Understand the MRKL (Modular Reasoning, Knowledge, and Language) architecture that combines LLMs with specialized expert modules, intelligent routing, and structured knowledge retrieval for building powerful AI systems."
canonical: https://callsphere.ai/blog/mrkl-architecture-modular-reasoning-knowledge-language
category: "Learn Agentic AI"
tags: ["MRKL", "Expert Systems", "Modular AI", "Knowledge Retrieval", "Python"]
author: "CallSphere Team"
published: 2026-03-18T00:00:00.000Z
updated: 2026-05-06T01:02:46.088Z
---

# MRKL Architecture: Modular Reasoning, Knowledge, and Language for Expert Systems

> Understand the MRKL (Modular Reasoning, Knowledge, and Language) architecture that combines LLMs with specialized expert modules, intelligent routing, and structured knowledge retrieval for building powerful AI systems.

## What Is MRKL?

MRKL (pronounced "miracle") stands for **Modular Reasoning, Knowledge, and Language**. Introduced by Karpas et al. (2022), the MRKL architecture starts from the observation that no single neural model excels at everything. Instead, it pairs a large language model, acting as a central router, with a collection of specialized **expert modules** (calculators, databases, APIs, symbolic reasoners), each handling the tasks it does best.

Think of it like a hospital: the triage nurse (the LLM) evaluates your symptoms and routes you to the right specialist (an expert module). The nurse does not perform surgery, and the surgeon does not do triage.

## Core Components

A MRKL system has three layers:

```mermaid
flowchart LR
    Q(["User query"])
    ROUTER["Router
LLM"]
    CALC["Calculator
expert"]
    DB["Database
expert"]
    SEARCH["Search API
expert"]
    CHAIN["Reasoning chain
synthesis"]
    ANS(["Final answer"])
    Q --> ROUTER
    ROUTER --> CALC
    ROUTER --> DB
    ROUTER --> SEARCH
    CALC --> CHAIN
    DB --> CHAIN
    SEARCH --> CHAIN
    CHAIN --> ANS
    style ROUTER fill:#4f46e5,stroke:#4338ca,color:#fff
    style CHAIN fill:#059669,stroke:#047857,color:#fff
```

1. **Router** — the LLM that interprets user queries and decides which expert to invoke
2. **Expert Modules** — specialized tools or models (calculator, SQL engine, search API, etc.)
3. **Reasoning Chain** — the logic that combines expert outputs into a coherent final answer

```python
import json
from dataclasses import dataclass
from typing import Callable
from openai import OpenAI

client = OpenAI()

@dataclass
class ExpertModule:
    name: str
    description: str
    execute: Callable[[str], str]

class MRKLSystem:
    def __init__(self, experts: list[ExpertModule]):
        self.experts = {e.name: e for e in experts}

    def route(self, query: str) -> tuple[str, str]:
        """Use LLM to select the right expert and extract the sub-query."""
        expert_descriptions = "\n".join(
            f"- {e.name}: {e.description}" for e in self.experts.values()
        )

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": (
                    "You are a routing agent. Given a user query, select "
                    "the best expert module and extract the sub-query "
                    "for that expert.\n\n"
                    f"Available experts:\n{expert_descriptions}\n\n"
                    "Return JSON: {expert, sub_query}"
                )},
                {"role": "user", "content": query},
            ],
            response_format={"type": "json_object"},
        )
        data = json.loads(response.choices[0].message.content)
        return data["expert"], data["sub_query"]
```

## Building Expert Modules

Each module handles a narrow domain. Here are some practical examples:

```python
def calculator_expert(expression: str) -> str:
    """Safely evaluate basic arithmetic expressions."""
    allowed = set("0123456789+-*/(). ")
    # Translate caret exponents to Python's ** before validating
    cleaned = expression.replace("^", "**")
    if not all(c in allowed for c in cleaned):
        return "Error: invalid characters in expression"
    try:
        # Empty builtins and no names: only literals and operators can run
        result = eval(cleaned, {"__builtins__": {}}, {})
        return str(result)
    except Exception as e:
        return f"Calculation error: {e}"

def database_expert(sql_description: str) -> str:
    """Convert natural language to SQL and execute."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Convert the description to a PostgreSQL query. "
                "Only SELECT queries are allowed."
            )},
            {"role": "user", "content": sql_description},
        ],
    )
    sql = response.choices[0].message.content.strip()
    # Enforce the SELECT-only policy in code, not just in the prompt
    if not sql.lower().startswith("select"):
        return "Error: only SELECT queries are permitted"
    # Execute against an actual DB connection in production
    return f"Generated SQL: {sql}"

experts = [
    ExpertModule("calculator", "Performs math calculations", calculator_expert),
    ExpertModule("database", "Queries structured data", database_expert),
]
```
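
With the modules registered, routing becomes a single call. A quick sketch of what routed output might look like (the exact sub-query the router extracts will vary from run to run):

```python
system = MRKLSystem(experts)

# Arithmetic should land on the calculator expert
expert_name, sub_query = system.route("What is 15% of 2840?")
print(expert_name, "->", sub_query)  # e.g. calculator -> 2840 * 0.15

# Data questions should land on the database expert
expert_name, sub_query = system.route("How many orders shipped last week?")
print(expert_name, "->", sub_query)  # e.g. database -> orders shipped in the last 7 days
```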

## The Reasoning Chain

After routing and execution, the system synthesizes the expert output into a final response:

```python
# answer() belongs on MRKLSystem, alongside route() defined above
def answer(self, query: str) -> str:
    expert_name, sub_query = self.route(query)
    expert = self.experts.get(expert_name)

    if not expert:
        return "No suitable expert found for this query."

    expert_output = expert.execute(sub_query)

    # Synthesize final answer
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Combine the expert's output with the original "
                "question to provide a clear, complete answer."
            )},
            {"role": "user", "content": (
                f"Question: {query}\n"
                f"Expert ({expert_name}) output: {expert_output}"
            )},
        ],
    )
    return response.choices[0].message.content
```
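
End to end, a hypothetical run looks like this (assuming `answer` is attached to `MRKLSystem` as noted above; the phrasing of the final answer will vary):

```python
system = MRKLSystem(experts)
print(system.answer("What is 2^10 divided by 8?"))
# Routes to the calculator expert, which evaluates 2**10 / 8 = 128.0;
# the synthesis step then phrases it, e.g. "2^10 divided by 8 is 128."
```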

## Multi-Expert Chaining

Complex queries often require multiple experts in sequence. For example, "What percentage of our revenue comes from customers in California?" needs the database expert first (to query revenue by state), then the calculator expert (to compute the percentage). The router must recognize this and chain calls accordingly.
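
One way to support this, shown below as a minimal sketch rather than the paper's prescription, is to have the router emit an ordered plan instead of a single expert, then execute the steps in sequence, splicing each step's output into the next sub-query (the `$PREV` placeholder convention is an assumption of this sketch):

```python
import json

def answer_chained(self, query: str) -> str:
    """Hypothetical MRKLSystem method: plan first, then execute step by step."""
    expert_descriptions = "\n".join(
        f"- {e.name}: {e.description}" for e in self.experts.values()
    )
    plan = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Break the query into ordered steps. Each step names an "
                "expert and a sub-query; write $PREV wherever a step "
                "needs the previous step's output.\n\n"
                f"Available experts:\n{expert_descriptions}\n\n"
                'Return JSON: {"steps": [{"expert": "...", "sub_query": "..."}]}'
            )},
            {"role": "user", "content": query},
        ],
        response_format={"type": "json_object"},
    )
    steps = json.loads(plan.choices[0].message.content)["steps"]

    output = ""
    for step in steps:
        # Splice the previous expert's output into this step's sub-query
        sub_query = step["sub_query"].replace("$PREV", output)
        output = self.experts[step["expert"]].execute(sub_query)
    return output
```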

## MRKL vs Tool-Use Agents

Modern tool-use agents (like those built with OpenAI function calling) are essentially MRKL systems with a standardized interface. The MRKL paper laid the conceptual foundation — tools as expert modules, the LLM as the router. Understanding the MRKL framing helps you design better tool interfaces and routing logic.
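
To make the correspondence concrete, here is how the two expert modules above might be exposed through OpenAI function calling; each `ExpertModule` becomes a tool schema, and the model's tool choice replaces the hand-rolled `route` step (a sketch of the mapping, not the paper's formulation):

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": e.name,
            "description": e.description,
            "parameters": {
                "type": "object",
                "properties": {"sub_query": {"type": "string"}},
                "required": ["sub_query"],
            },
        },
    }
    for e in experts
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 15% of 2840?"}],
    tools=tools,
)
# The model's tool call plays the role of MRKL's router
call = response.choices[0].message.tool_calls[0]
print(call.function.name, call.function.arguments)
```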

## FAQ

### How is MRKL different from RAG?

RAG (Retrieval-Augmented Generation) is a specific pattern where the expert module is a document retriever. MRKL is a broader architecture — RAG is one possible expert within a MRKL system, alongside calculators, APIs, databases, and other specialists.
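
In code, that relationship is just one more registration. A sketch, with a hypothetical `retrieve_docs` helper standing in for your actual vector-store lookup:

```python
def retrieval_expert(question: str) -> str:
    """RAG as one expert among many."""
    docs = retrieve_docs(question, top_k=3)  # hypothetical helper, not defined here
    return "\n\n".join(docs)

experts.append(
    ExpertModule("retriever", "Searches internal documents", retrieval_expert)
)
```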

### How do you handle routing errors?

Implement a fallback chain. If the selected expert returns an error or low-confidence result, route to the next most likely expert. You can also ask the LLM to select its top 3 experts ranked by relevance, then try them in order.
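
A minimal version of that fallback, assuming the router is extended to return a ranked list of expert names (the ranked-list interface is an assumption of this sketch):

```python
def answer_with_fallback(self, query: str, ranked_experts: list[str]) -> str:
    """Try experts in ranked order until one returns a usable result."""
    for name in ranked_experts:
        expert = self.experts.get(name)
        if expert is None:
            continue
        output = expert.execute(query)
        if not output.lower().startswith("error"):
            return output  # first non-error result wins
    return "No expert could handle this query."
```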

### Can you use different LLMs for routing vs synthesis?

Absolutely. A smaller, faster model (GPT-4o-mini) can handle routing since the task is classification-like. Reserve the larger model for the synthesis step where nuanced reasoning matters most.
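
In the code above, that split is a two-constant change (model names current as of this writing):

```python
ROUTING_MODEL = "gpt-4o-mini"  # cheap and fast; routing is classification-like
SYNTHESIS_MODEL = "gpt-4o"     # stronger model for the nuanced final answer

# Pass model=ROUTING_MODEL in MRKLSystem.route()
# and model=SYNTHESIS_MODEL in MRKLSystem.answer()
```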

---

#MRKL #ModularAI #ExpertSystems #AIArchitecture #AgenticAI #KnowledgeRetrieval #PythonAI #ToolUse

---

Source: https://callsphere.ai/blog/mrkl-architecture-modular-reasoning-knowledge-language
