
Building a Custom MCP Server for Your REST API

Build a production-ready MCP server that wraps your existing REST API endpoints as callable tools, using FastAPI and the MCP Python SDK to expose your business logic to AI agents.

Why Build a Custom MCP Server?

Most MCP tutorials use pre-built servers — the filesystem server, the Git server, the Postgres server. These cover common use cases. But every company has its own REST APIs: inventory systems, billing platforms, CRM endpoints, internal dashboards. To let an AI agent interact with your specific business logic, you need to wrap those APIs as MCP tools.

A custom MCP server sits between the AI agent and your REST API. The agent calls tools defined by your server, and your server translates those tool calls into HTTP requests against your existing endpoints. Your API does not need to change at all. The MCP server is an adapter layer.

In this post, we will build a complete custom MCP server using the official MCP Python SDK and FastAPI, exposing a sample e-commerce REST API as a set of agent-callable tools.

MCP Server Architecture

An MCP server has three responsibilities:

  1. Declare tools — Expose a list of tools with names, descriptions, and input schemas so agents know what they can call.
  2. Execute tools — When an agent invokes a tool, run the associated logic (in our case, an HTTP request to your API) and return the result.
  3. Communicate via protocol — Speak the MCP protocol over either stdio (for local subprocess servers) or HTTP with SSE (for remote servers).

The architecture looks like this:

Agent (OpenAI SDK)
    |
    | MCP Protocol (stdio or HTTP+SSE)
    v
Custom MCP Server
    |
    | HTTP requests
    v
Your REST API (FastAPI, Express, Rails, etc.)
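Under the hood, the MCP protocol is JSON-RPC 2.0, with methods such as tools/list and tools/call. The payloads below are illustrative (the field values are made up, not output from a real server), but they show the shape of a tool-call exchange:

```python
import json

# Illustrative JSON-RPC 2.0 payloads for an MCP tools/call exchange.
# Values are examples only, not captured from a real server.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "list_products",
        "arguments": {"category": "electronics", "limit": 5},
    },
}

response = {
    "jsonrpc": "2.0",
    "id": 1,  # matches the request id
    "result": {
        "content": [
            {"type": "text", "text": "[{\"id\": \"p1\", \"name\": \"Laptop\"}]"}
        ]
    },
}

print(json.dumps(request, indent=2))
```

The SDK handles this framing for you; you only implement the tool list and the tool handlers.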

Setting Up the Project

Start by installing the MCP Python SDK:

pip install mcp httpx pydantic

Create a project structure:

my-mcp-server/
  server.py        # MCP server definition
  api_client.py    # HTTP client for your REST API
  config.py        # Configuration and environment variables
  requirements.txt
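The config.py module in that layout is not shown later in the post; a minimal sketch (variable names here are assumptions) would centralize the environment variables the server reads and fail fast when one is missing:

```python
# config.py -- a minimal sketch; the variable names are assumptions
import os

API_URL = os.environ.get("ECOMMERCE_API_URL", "http://localhost:8000")
API_KEY = os.environ.get("ECOMMERCE_API_KEY", "")
REQUEST_TIMEOUT = float(os.environ.get("ECOMMERCE_TIMEOUT", "10"))

def validate() -> None:
    # Raise at startup instead of failing on the first tool call
    if not API_KEY:
        raise RuntimeError("ECOMMERCE_API_KEY is not set")
```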

The REST API We Are Wrapping

For this tutorial, assume we have an e-commerce API with these endpoints:


GET    /api/products              List all products
GET    /api/products/{id}         Get product details
POST   /api/orders                Create an order
GET    /api/orders/{id}           Get order status
GET    /api/customers/{id}        Get customer profile
POST   /api/customers/{id}/notes  Add a note to a customer

This is a standard CRUD API. The goal is to make every endpoint callable by an AI agent through MCP tools.
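Before writing any code, it helps to fix a one-to-one mapping from endpoints to tool names; the names below match the tools defined later in this post:

```python
# Planned endpoint -> MCP tool name mapping, one tool per endpoint
ENDPOINT_TOOLS = {
    ("GET", "/api/products"): "list_products",
    ("GET", "/api/products/{id}"): "get_product",
    ("POST", "/api/orders"): "create_order",
    ("GET", "/api/orders/{id}"): "get_order_status",
    ("GET", "/api/customers/{id}"): "get_customer",
    ("POST", "/api/customers/{id}/notes"): "add_customer_note",
}
```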

Building the API Client

First, create a typed HTTP client for your API. This keeps the MCP server code clean and separates protocol logic from HTTP logic:

# api_client.py
import httpx
from typing import Optional

class EcommerceAPIClient:
    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url.rstrip("/")
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        }

    async def list_products(
        self, category: Optional[str] = None, limit: int = 20
    ) -> dict:
        params = {"limit": limit}
        if category:
            params["category"] = category
        async with httpx.AsyncClient() as client:
            resp = await client.get(
                f"{self.base_url}/api/products",
                headers=self.headers,
                params=params,
            )
            resp.raise_for_status()
            return resp.json()

    async def create_order(
        self, customer_id: str, product_ids: list[str], shipping_address: str
    ) -> dict:
        async with httpx.AsyncClient() as client:
            resp = await client.post(
                f"{self.base_url}/api/orders",
                headers=self.headers,
                json={
                    "customer_id": customer_id,
                    "product_ids": product_ids,
                    "shipping_address": shipping_address,
                },
            )
            resp.raise_for_status()
            return resp.json()

    async def get_order(self, order_id: str) -> dict:
        async with httpx.AsyncClient() as client:
            resp = await client.get(
                f"{self.base_url}/api/orders/{order_id}",
                headers=self.headers,
            )
            resp.raise_for_status()
            return resp.json()

    # Additional methods follow the same pattern:
    # get_product(), get_customer(), add_customer_note()

Defining the MCP Server

Now create the MCP server that registers each API method as a tool:

# server.py
import json
import os
from mcp.server import Server
from mcp.types import Tool, TextContent
from api_client import EcommerceAPIClient

# Initialize
api = EcommerceAPIClient(
    base_url=os.environ["ECOMMERCE_API_URL"],
    api_key=os.environ["ECOMMERCE_API_KEY"],
)
server = Server("ecommerce-mcp")

@server.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="list_products",
            description="List available products, optionally filtered by category",
            inputSchema={
                "type": "object",
                "properties": {
                    "category": {
                        "type": "string",
                        "description": "Filter by category (e.g. electronics, clothing)",
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Max results to return (default 20)",
                        "default": 20,
                    },
                },
            },
        ),
        Tool(
            name="create_order",
            description="Place a new order for a customer",
            inputSchema={
                "type": "object",
                "properties": {
                    "customer_id": {"type": "string"},
                    "product_ids": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "List of product IDs to order",
                    },
                    "shipping_address": {"type": "string"},
                },
                "required": ["customer_id", "product_ids", "shipping_address"],
            },
        ),
        # Additional tools: get_product, get_order_status,
        # get_customer, add_customer_note follow the same pattern
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    try:
        if name == "list_products":
            result = await api.list_products(
                category=arguments.get("category"),
                limit=arguments.get("limit", 20),
            )
        elif name == "create_order":
            result = await api.create_order(
                customer_id=arguments["customer_id"],
                product_ids=arguments["product_ids"],
                shipping_address=arguments["shipping_address"],
            )
        # ... handle remaining tools with the same pattern
        else:
            return [TextContent(type="text", text=f"Unknown tool: {name}")]
        return [TextContent(type="text", text=json.dumps(result, indent=2))]
    except Exception as e:
        return [TextContent(type="text", text=f"Error: {str(e)}")]

Running as a Stdio Server

The simplest deployment is stdio — the agent SDK spawns your server as a subprocess:

# At the bottom of server.py
import asyncio
from mcp.server.stdio import stdio_server

async def main():
    async with stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            server.create_initialization_options(),
        )

if __name__ == "__main__":
    asyncio.run(main())

Connect it from the agent side:

import asyncio

from agents import Agent, Runner
from agents.mcp import MCPServerStdio

ecommerce = MCPServerStdio(
    name="Ecommerce",
    params={
        "command": "python",
        "args": ["server.py"],
        "env": {
            "ECOMMERCE_API_URL": "https://api.myshop.com",
            "ECOMMERCE_API_KEY": "sk-...",
        },
    },
    cache_tools_list=True,
)

agent = Agent(
    name="Shop Assistant",
    instructions="You help customers browse products, place orders, and check order status.",
    mcp_servers=[ecommerce],
)

async def main():
    async with ecommerce:
        result = await Runner.run(agent, "What electronics do you have in stock?")
        print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())

Running as an HTTP Server

For production, you often want the MCP server to run as a standalone service. Use the Streamable HTTP transport:

# http_server.py -- requires starlette and uvicorn
import contextlib
from starlette.applications import Starlette
from starlette.routing import Mount
from mcp.server.streamable_http_manager import StreamableHTTPSessionManager
from server import server

# Wrap the Server in a session manager that speaks Streamable HTTP
session_manager = StreamableHTTPSessionManager(app=server)

@contextlib.asynccontextmanager
async def lifespan(app):
    # Start and stop the session manager with the ASGI app
    async with session_manager.run():
        yield

app = Starlette(
    routes=[Mount("/mcp", app=session_manager.handle_request)],
    lifespan=lifespan,
)

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8001)

Then connect from the agent:

from agents.mcp import MCPServerStreamableHTTP

ecommerce = MCPServerStreamableHTTP(
    name="Ecommerce",
    params={"url": "http://ecommerce-mcp:8001/mcp"},
    cache_tools_list=True,
)

Error Handling Best Practices

Your MCP server must handle errors gracefully. API failures should return informative messages, not crash the server:

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    try:
        result = await dispatch_tool(name, arguments)
        return [TextContent(type="text", text=json.dumps(result, indent=2))]
    except httpx.HTTPStatusError as e:
        error_msg = f"API returned {e.response.status_code}"
        if e.response.status_code == 404:
            error_msg = f"Resource not found: {arguments}"
        elif e.response.status_code == 403:
            error_msg = "Permission denied for this operation"
        return [TextContent(type="text", text=error_msg)]
    except httpx.ConnectError:
        return [TextContent(
            type="text",
            text="Cannot reach the API server. Please try again later.",
        )]
    except Exception as e:
        return [TextContent(type="text", text=f"Unexpected error: {str(e)}")]
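The dispatch_tool helper above is assumed rather than shown. One way to implement it is a registry dict mapping tool names to client coroutines, which also replaces the if/elif chain from earlier. Stub handlers stand in for the real EcommerceAPIClient methods in this sketch:

```python
import asyncio

# Stub handlers standing in for EcommerceAPIClient methods
async def list_products(category=None, limit=20):
    return {"products": [], "category": category, "limit": limit}

async def create_order(customer_id, product_ids, shipping_address):
    return {"order_id": "demo", "customer_id": customer_id}

# Registry: tool name -> async handler
TOOL_HANDLERS = {
    "list_products": list_products,
    "create_order": create_order,
}

async def dispatch_tool(name: str, arguments: dict) -> dict:
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    return await handler(**arguments)

result = asyncio.run(dispatch_tool("list_products", {"limit": 5}))
print(result)  # {'products': [], 'category': None, 'limit': 5}
```

Adding a new tool then means writing one client method and one registry entry, with no changes to the dispatch logic.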

Testing Your MCP Server

Test tools individually before connecting them to an agent. The MCP SDK provides a test client:

import pytest
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

@pytest.mark.asyncio
async def test_list_products():
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            tool_names = [t.name for t in tools.tools]
            assert "list_products" in tool_names
            result = await session.call_tool("list_products", {"limit": 5})
            assert len(result.content) > 0

Building a custom MCP server is the bridge between your existing APIs and the world of AI agents. The pattern is always the same: define tools with schemas, map tool calls to API requests, and handle errors cleanly. Once your first server is working, adding new tools takes minutes.


Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
