Learn Agentic AI

MCP Security Best Practices for Production Agents

Secure your MCP-powered agents for production with authentication, network policies, tool approval workflows, audit logging, rate limiting, and defense-in-depth strategies.

Why MCP Security Matters

MCP servers give AI agents the ability to take real actions — read files, query databases, send emails, modify records. A misconfigured MCP server is not just a bug. It is a security vulnerability that an adversary or a hallucinating model can exploit to access data or modify systems.

The default configuration of most MCP servers is designed for development convenience, not production security. Moving to production requires deliberately layering security controls at every level. This post covers five essential layers: authentication, network policies, tool approval, audit logging, and rate limiting.

Layer 1: Authentication and Authorization

Server-to-Server Authentication

Every MCP server should require authentication. For HTTP-based MCP servers (Streamable HTTP transport), use bearer tokens or mutual TLS:

from mcp.server import Server
from starlette.middleware import Middleware
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse
import os

# Filter out empty strings so an unset env var cannot yield a valid "" token
VALID_TOKENS = {t for t in os.environ.get("MCP_AUTH_TOKENS", "").split(",") if t}

class AuthMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        auth = request.headers.get("Authorization", "")
        if not auth.startswith("Bearer "):
            return JSONResponse({"error": "Unauthorized"}, status_code=401)
        if auth[7:] not in VALID_TOKENS:
            return JSONResponse({"error": "Forbidden"}, status_code=403)
        return await call_next(request)

server = Server("secure-server")
app = StreamableHTTPServer(server, middleware=[Middleware(AuthMiddleware)])

On the agent side, pass auth headers when connecting:

from agents.mcp import MCPServerStreamableHTTP

secure_server = MCPServerStreamableHTTP(
    name="SecureDB",
    params={
        "url": "http://db-mcp:8001/mcp",
        "headers": {
            "Authorization": f"Bearer {os.environ['MCP_DB_TOKEN']}",
        },
    },
    cache_tools_list=True,
)

Per-User Authorization

Not every user should have access to every tool. Implement per-user authorization by passing user context (role, user ID) through the MCP arguments and checking a permissions map server-side. Map each tool to a list of allowed roles (e.g., "delete_records" requires "admin", while "read_records" allows "viewer"). Reject calls from unauthorized roles with a clear permission denied message.

Layer 2: Network Policies

Principle of Least Network Access

MCP servers should only be accessible from the agent service, never from the public internet. In Kubernetes, use NetworkPolicy to restrict traffic:

# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-server-policy
  namespace: agents
spec:
  podSelector:
    matchLabels:
      app: mcp-database-server
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: agent-service
      ports:
        - port: 8001
          protocol: TCP

This ensures only the agent service pod can reach the MCP database server. No other pod, and no external traffic, can connect.

Stdio Server Isolation

For stdio-based MCP servers, the security boundary is the subprocess environment. Limit what the subprocess can access:

import os

from agents.mcp import MCPServerStdio

# Restrict environment variables passed to the subprocess
safe_env = {
    "PATH": "/usr/local/bin:/usr/bin:/bin",
    "HOME": "/tmp/mcp-sandbox",
    "ECOMMERCE_API_URL": os.environ["ECOMMERCE_API_URL"],
    "ECOMMERCE_API_KEY": os.environ["ECOMMERCE_API_KEY"],
}

server = MCPServerStdio(
    name="EcommerceTools",
    params={
        "command": "python",
        "args": ["ecommerce_server.py"],
        "env": safe_env,  # Only these env vars are visible
    },
)

Never pass the full os.environ to a subprocess. This could leak database passwords, cloud credentials, or API keys that the MCP server does not need.

Layer 3: Tool Approval Workflows

Even with authentication and network controls, you may want human approval before certain tools execute. The Agents SDK supports approval policies for this purpose.


Static Approval for Dangerous Tools

Use a static tool filter combined with an approval callback:

from agents.mcp.util import create_static_tool_filter

# Allow read tools freely, require approval for writes
read_tools = {"list_products", "get_product", "get_order", "get_customer"}
write_tools = {"create_order", "update_order", "delete_order", "add_note"}

async def approval_callback(tool_name: str, arguments: dict) -> bool:
    """In production, this sends a Slack message or UI prompt."""
    if tool_name in read_tools:
        return True

    print(f"APPROVAL REQUIRED: {tool_name}")
    print(f"Arguments: {arguments}")
    # In production: send to approval queue, wait for response
    # For demo: auto-approve
    return True

tool_filter = create_static_tool_filter(
    allowed_tool_names=sorted(read_tools | write_tools)  # expects a list of names
)

You can extend this pattern with context-aware approval that checks argument values — auto-approving small orders while requiring human sign-off for large ones, or always requiring approval for delete operations.

Layer 4: Audit Logging

Every tool invocation should be logged with enough context to reconstruct what happened, when, and why. This is essential for compliance, debugging, and incident response.

Structured Audit Logs

import time

import structlog
from mcp.types import TextContent

# `server` and `execute_tool` are the ones defined in the earlier examples
audit = structlog.get_logger("mcp.audit")
SENSITIVE = {"password", "token", "api_key", "secret", "ssn"}

def sanitize(args: dict) -> dict:
    return {k: "***" if k.lower() in SENSITIVE else v for k, v in args.items()}

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    user_id = arguments.pop("_user_id", "unknown")
    start = time.perf_counter()
    try:
        result = await execute_tool(name, arguments)
        audit.info("tool_ok", tool=name, user=user_id,
                    args=sanitize(arguments),
                    ms=round((time.perf_counter() - start) * 1000, 2))
        return result
    except Exception as e:
        audit.error("tool_fail", tool=name, user=user_id,
                     args=sanitize(arguments), error=str(e),
                     ms=round((time.perf_counter() - start) * 1000, 2))
        raise

For compliance, persist audit logs to a durable store such as PostgreSQL or a dedicated logging service rather than stdout alone. The sanitize step above redacts sensitive fields (passwords, tokens, API keys) before anything is written to the log.
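A minimal sketch of durable persistence, with SQLite standing in for PostgreSQL so the example runs anywhere; the table layout is an assumption, not a fixed schema:

```python
import json
import sqlite3
from datetime import datetime, timezone

# Use a file path (or a PostgreSQL connection) in production; in-memory here
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS audit_log "
    "(ts TEXT, event TEXT, tool TEXT, user_id TEXT, args TEXT)"
)

def persist_audit(event: str, tool: str, user_id: str, args: dict) -> None:
    """Append one audit record; args should already be sanitized."""
    conn.execute(
        "INSERT INTO audit_log VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), event, tool, user_id,
         json.dumps(args)),
    )
    conn.commit()
```

Calling `persist_audit` from the success and failure branches of `call_tool` gives you a queryable record alongside the structlog stream.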

Layer 5: Rate Limiting

Without rate limiting, a runaway agent loop could hammer your MCP servers with thousands of tool calls per minute. This can exhaust database connections, trigger API rate limits on downstream services, or simply consume excessive resources.

Per-Tool Rate Limiting

from collections import defaultdict
from datetime import datetime, timedelta

from mcp.types import TextContent

class ToolRateLimiter:
    def __init__(self):
        self.call_timestamps = defaultdict(list)
        self.limits = {
            "create_order": {"max_calls": 10, "window_seconds": 60},
            "delete_order": {"max_calls": 5, "window_seconds": 60},
            "_default": {"max_calls": 100, "window_seconds": 60},
        }

    def check(self, tool_name: str) -> bool:
        config = self.limits.get(tool_name, self.limits["_default"])
        cutoff = datetime.now() - timedelta(seconds=config["window_seconds"])
        self.call_timestamps[tool_name] = [
            ts for ts in self.call_timestamps[tool_name] if ts > cutoff
        ]
        if len(self.call_timestamps[tool_name]) >= config["max_calls"]:
            return False
        self.call_timestamps[tool_name].append(datetime.now())
        return True

rate_limiter = ToolRateLimiter()

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if not rate_limiter.check(name):
        return [TextContent(
            type="text",
            text=f"Rate limit exceeded for '{name}'. Try again later.",
        )]
    return await execute_tool(name, arguments)

In addition to per-tool limits, add a global per-session rate limit that caps the total number of tool calls any single agent session can make. This prevents runaway loops from exhausting resources even if individual tool limits are not exceeded.
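A sketch of that session-wide cap, layered alongside the per-tool limiter above; the 200-call ceiling is an illustrative number, and session IDs are assumed to arrive with each request:

```python
from collections import defaultdict

MAX_CALLS_PER_SESSION = 200  # illustrative lifetime budget per agent session

class SessionRateLimiter:
    def __init__(self, max_calls: int = MAX_CALLS_PER_SESSION):
        self.max_calls = max_calls
        self.counts: dict[str, int] = defaultdict(int)

    def check(self, session_id: str) -> bool:
        """Return False once a session has exhausted its total call budget."""
        if self.counts[session_id] >= self.max_calls:
            return False
        self.counts[session_id] += 1
        return True
```

Check this before the per-tool limiter: a looping agent that rotates across many cheap tools stays under every per-tool window but still hits the session ceiling.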

Defense in Depth: Putting It All Together

No single security layer is sufficient. Production MCP deployments should combine all five: authenticated connections on the agent side (Layer 1), NetworkPolicies restricting server access in Kubernetes (Layer 2), tool approval callbacks for write operations (Layer 3), structured audit logging inside the server (Layer 4), and per-tool and per-session rate limiting (Layer 5). Each layer catches threats that others miss.

Security Checklist for Production MCP

Before deploying an MCP-powered agent to production, verify each item:

  • All HTTP MCP servers require bearer tokens or mTLS, rotated every 90 days
  • Stdio servers receive only the environment variables they need — no secrets hardcoded in source
  • MCP servers are not exposed to the public internet; Kubernetes NetworkPolicies restrict ingress to the agent service only
  • Write and delete tools require explicit approval; tool filters block unnecessary tools
  • Every tool call is logged with user ID, tool name, sanitized arguments, status, and duration
  • Per-tool and per-session rate limits are configured, with violations logged and alerted
  • You have a runbook for revoking tokens and disabling tools without redeployment

MCP security is not a one-time setup. It requires ongoing attention as new servers are added, tools are modified, and agents are given new capabilities. Treat every MCP tool like an API endpoint — because that is exactly what it is. Apply the same security rigor you would to any production API surface.


Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
