MCP Security Best Practices for Production Agents
Secure your MCP-powered agents for production with authentication, network policies, tool approval workflows, audit logging, rate limiting, and defense-in-depth strategies.
Why MCP Security Matters
MCP servers give AI agents the ability to take real actions — read files, query databases, send emails, modify records. A misconfigured MCP server is not just a bug. It is a security vulnerability that an adversary or a hallucinating model can exploit to access data or modify systems.
The default configuration of most MCP servers is designed for development convenience, not production security. Moving to production requires deliberately layering security controls at every level. This post covers five essential layers: authentication, network policies, tool approval, audit logging, and rate limiting.
Layer 1: Authentication and Authorization
Server-to-Server Authentication
Every MCP server should require authentication. For HTTP-based MCP servers (Streamable HTTP transport), use bearer tokens or mutual TLS:
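import hmac
import os

from mcp.server import Server
from starlette.middleware import Middleware
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse

# Drop empty entries so an unset MCP_AUTH_TOKENS never accepts an empty token.
VALID_TOKENS = {
    t.strip() for t in os.environ.get("MCP_AUTH_TOKENS", "").split(",") if t.strip()
}

class AuthMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        auth = request.headers.get("Authorization", "")
        if not auth.startswith("Bearer "):
            return JSONResponse({"error": "Unauthorized"}, status_code=401)
        token = auth[7:].encode()
        # Constant-time comparison avoids leaking token contents through timing.
        if not any(hmac.compare_digest(token, v.encode()) for v in VALID_TOKENS):
            return JSONResponse({"error": "Forbidden"}, status_code=403)
        return await call_next(request)

server = Server("secure-server")
# StreamableHTTPServer stands in for the ASGI wrapper your MCP SDK version provides.
app = StreamableHTTPServer(server, middleware=[Middleware(AuthMiddleware)])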
On the agent side, pass auth headers when connecting:
import os

from agents.mcp import MCPServerStreamableHttp

secure_server = MCPServerStreamableHttp(
    name="SecureDB",
    params={
        "url": "http://db-mcp:8001/mcp",
        "headers": {
            "Authorization": f"Bearer {os.environ['MCP_DB_TOKEN']}",
        },
    },
    cache_tools_list=True,
)
Per-User Authorization
Not every user should have access to every tool. Implement per-user authorization by passing user context (role, user ID) through the MCP arguments and checking a permissions map server-side. Map each tool to a list of allowed roles (e.g., "delete_records" requires "admin", while "read_records" allows "viewer"). Reject calls from unauthorized roles with a clear permission denied message.
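The permissions map described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the tool names, role names, and the `authorize` and `is_authorized` helpers are hypothetical. It fails closed, so a tool missing from the map is denied for every role.

```python
# Hypothetical role-to-tool permission map; tool and role names are illustrative.
TOOL_PERMISSIONS: dict[str, set[str]] = {
    "read_records": {"viewer", "editor", "admin"},
    "update_records": {"editor", "admin"},
    "delete_records": {"admin"},
}


class PermissionDenied(Exception):
    """Raised when a role is not allowed to call a tool."""


def authorize(tool_name: str, role: str) -> None:
    """Raise PermissionDenied unless `role` may call `tool_name`.

    Tools that are missing from the map are denied by default (fail closed).
    """
    allowed_roles = TOOL_PERMISSIONS.get(tool_name, set())
    if role not in allowed_roles:
        raise PermissionDenied(
            f"Permission denied: role '{role}' may not call '{tool_name}'."
        )


def is_authorized(tool_name: str, role: str) -> bool:
    """Boolean wrapper around authorize() for callers that prefer a flag."""
    try:
        authorize(tool_name, role)
    except PermissionDenied:
        return False
    return True
```

Calling `authorize(name, role)` at the top of the server's tool handler turns an unauthorized call into a clear error message for the agent. Where possible, take the role from a value the server has verified (for example, the authenticated token) rather than from arguments the model can influence.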
Layer 2: Network Policies
Principle of Least Network Access
MCP servers should only be accessible from the agent service, never from the public internet. In Kubernetes, use NetworkPolicy to restrict traffic:
# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-server-policy
  namespace: agents
spec:
  podSelector:
    matchLabels:
      app: mcp-database-server
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: agent-service
      ports:
        - port: 8001
          protocol: TCP
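With this policy, only pods labeled app: agent-service in the agents namespace can reach the MCP database server on port 8001. All other ingress to those pods, whether from other pods or from outside the cluster, is dropped, provided the cluster's network plugin enforces NetworkPolicy.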
Stdio Server Isolation
For stdio-based MCP servers, the security boundary is the subprocess environment. Limit what the subprocess can access:
import os

from agents.mcp import MCPServerStdio

# Restrict environment variables passed to the subprocess
safe_env = {
    "PATH": "/usr/local/bin:/usr/bin:/bin",
    "HOME": "/tmp/mcp-sandbox",
    "ECOMMERCE_API_URL": os.environ["ECOMMERCE_API_URL"],
    "ECOMMERCE_API_KEY": os.environ["ECOMMERCE_API_KEY"],
}

server = MCPServerStdio(
    name="EcommerceTools",
    params={
        "command": "python",
        "args": ["ecommerce_server.py"],
        "env": safe_env,  # Only these env vars are visible
    },
)
Never pass the full os.environ to a subprocess. This could leak database passwords, cloud credentials, or API keys that the MCP server does not need.
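One way to make that rule hard to break is to build the subprocess environment through a small allowlist helper. The sketch below is illustrative only; `build_safe_env` and its parameters are hypothetical names, not part of any SDK. It copies only the named variables and fails loudly if one is missing, so a misconfigured deployment is caught at startup rather than at the first tool call.

```python
from collections.abc import Mapping
from typing import Optional


def build_safe_env(
    required: list[str],
    source: Mapping[str, str],
    fixed: Optional[Mapping[str, str]] = None,
) -> dict[str, str]:
    """Return an environment containing only the allowlisted variables.

    `required` names the variables to copy from `source` (typically os.environ).
    `fixed` holds constant values such as PATH or HOME for the sandbox.
    """
    missing = [name for name in required if name not in source]
    if missing:
        raise KeyError(f"Missing required environment variables: {missing}")
    env = {name: source[name] for name in required}
    env.update(fixed or {})
    return env


# Example with a stand-in for os.environ that also holds an unrelated secret.
fake_environ = {
    "ECOMMERCE_API_URL": "https://shop.example.test/api",
    "ECOMMERCE_API_KEY": "example-key",
    "AWS_SECRET_ACCESS_KEY": "must-not-leak",
}
safe_env = build_safe_env(
    ["ECOMMERCE_API_URL", "ECOMMERCE_API_KEY"],
    fake_environ,
    fixed={"PATH": "/usr/local/bin:/usr/bin:/bin", "HOME": "/tmp/mcp-sandbox"},
)
```

In real use, pass `os.environ` as `source`; the unrelated secret in the example never reaches `safe_env`.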
Layer 3: Tool Approval Workflows
Even with authentication and network controls, you may want human approval before certain tools execute. The Agents SDK supports approval policies for this purpose.
Static Approval for Dangerous Tools
Use a static tool filter combined with an approval callback:
from agents.mcp.util import create_static_tool_filter

# Allow read tools freely, require approval for writes
read_tools = {"list_products", "get_product", "get_order", "get_customer"}
write_tools = {"create_order", "update_order", "delete_order", "add_note"}

async def approval_callback(tool_name: str, arguments: dict) -> bool:
    """In production, this sends a Slack message or UI prompt."""
    if tool_name in read_tools:
        return True
    print(f"APPROVAL REQUIRED: {tool_name}")
    print(f"Arguments: {arguments}")
    # In production: send to approval queue, wait for response
    # For demo only: auto-approve. Replace this before deploying.
    return True

# Tools outside this allowlist are never exposed to the agent.
tool_filter = create_static_tool_filter(
    allowed_tool_names=sorted(read_tools | write_tools)
)
You can extend this pattern with context-aware approval that checks argument values — auto-approving small orders while requiring human sign-off for large ones, or always requiring approval for delete operations.
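A context-aware policy along those lines might look like the sketch below. The `total_amount` argument, the threshold value, and the `ask_human` hook are assumptions made for illustration; substitute whatever your tools and approval channel actually use. Anything the policy does not recognize falls through to human review.

```python
import asyncio
from typing import Awaitable, Callable

READ_TOOLS = {"list_products", "get_product", "get_order", "get_customer"}
ALWAYS_REVIEW = {"delete_order"}   # deletes always need a human
AUTO_APPROVE_LIMIT = 500.00        # hypothetical order-total threshold


def needs_human_approval(tool_name: str, arguments: dict) -> bool:
    """Decide whether a call must be reviewed by a person."""
    if tool_name in READ_TOOLS:
        return False
    if tool_name in ALWAYS_REVIEW:
        return True
    if tool_name == "create_order":
        total = arguments.get("total_amount")  # assumed argument name
        is_number = isinstance(total, (int, float)) and not isinstance(total, bool)
        # Missing, non-numeric, or negative totals are treated as suspicious.
        return not (is_number and 0 <= total <= AUTO_APPROVE_LIMIT)
    # Unknown or unlisted write tools default to review.
    return True


async def approval_callback(
    tool_name: str,
    arguments: dict,
    ask_human: Callable[[str, dict], Awaitable[bool]],
) -> bool:
    """Auto-approve low-risk calls; defer everything else to `ask_human`."""
    if not needs_human_approval(tool_name, arguments):
        return True
    return await ask_human(tool_name, arguments)


async def deny_all(tool_name: str, arguments: dict) -> bool:
    """Stand-in reviewer for local testing: rejects every request."""
    return False
```

Keeping the decision in a pure function (`needs_human_approval`) makes the policy easy to unit test separately from the Slack or UI integration behind `ask_human`.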
Layer 4: Audit Logging
Every tool invocation should be logged with enough context to reconstruct what happened, when, and why. This is essential for compliance, debugging, and incident response.
Structured Audit Logs
import time

import structlog
from mcp.types import TextContent

audit = structlog.get_logger("mcp.audit")
SENSITIVE = {"password", "token", "api_key", "secret", "ssn"}

def sanitize(args: dict) -> dict:
    """Redact values of top-level keys that look sensitive."""
    return {k: "***" if k.lower() in SENSITIVE else v for k, v in args.items()}

# `server` is the MCP Server instance; `execute_tool` is your own dispatch function.
@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    user_id = arguments.pop("_user_id", "unknown")
    start = time.perf_counter()
    try:
        result = await execute_tool(name, arguments)
        audit.info("tool_ok", tool=name, user=user_id,
                   args=sanitize(arguments),
                   ms=round((time.perf_counter() - start) * 1000, 2))
        return result
    except Exception as e:
        audit.error("tool_fail", tool=name, user=user_id,
                    args=sanitize(arguments), error=str(e),
                    ms=round((time.perf_counter() - start) * 1000, 2))
        raise
For compliance, persist audit logs to a durable store such as PostgreSQL or a dedicated logging service rather than only stdout. Always run the sanitize step, which redacts sensitive fields (passwords, tokens, API keys), before anything is written to the log.
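As a rough sketch of durable persistence, the example below writes one row per tool call. It uses the standard-library `sqlite3` module purely as a stand-in for PostgreSQL so that it runs anywhere; the table name, column names, and helper functions are all hypothetical. It also redacts recursively, so a sensitive key nested inside another argument is still masked.

```python
import json
import sqlite3
from datetime import datetime, timezone

SENSITIVE_KEYS = {"password", "token", "api_key", "secret", "ssn"}


def redact(value):
    """Recursively replace values stored under sensitive keys with '***'."""
    if isinstance(value, dict):
        return {
            k: "***" if str(k).lower() in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def open_audit_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the audit database and create the table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tool_audit (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               ts TEXT NOT NULL,
               user_id TEXT NOT NULL,
               tool TEXT NOT NULL,
               args_json TEXT NOT NULL,
               status TEXT NOT NULL,
               duration_ms REAL NOT NULL
           )"""
    )
    return conn


def record_call(conn, user_id: str, tool: str, args: dict,
                status: str, duration_ms: float) -> None:
    """Insert one audit row; arguments are redacted before serialization."""
    conn.execute(
        "INSERT INTO tool_audit (ts, user_id, tool, args_json, status, duration_ms) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            user_id,
            tool,
            json.dumps(redact(args), default=str),
            status,
            duration_ms,
        ),
    )
    conn.commit()
```

Parameterized queries keep model-supplied argument values from being interpreted as SQL, which matters because tool arguments are untrusted input.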
Layer 5: Rate Limiting
Without rate limiting, a runaway agent loop could hammer your MCP servers with thousands of tool calls per minute. This can exhaust database connections, trigger API rate limits on downstream services, or simply consume excessive resources.
Per-Tool Rate Limiting
import time
from collections import defaultdict

from mcp.types import TextContent

class ToolRateLimiter:
    def __init__(self):
        self.call_timestamps = defaultdict(list)
        self.limits = {
            "create_order": {"max_calls": 10, "window_seconds": 60},
            "delete_order": {"max_calls": 5, "window_seconds": 60},
            "_default": {"max_calls": 100, "window_seconds": 60},
        }

    def check(self, tool_name: str) -> bool:
        config = self.limits.get(tool_name, self.limits["_default"])
        # A monotonic clock is unaffected by system clock adjustments.
        now = time.monotonic()
        cutoff = now - config["window_seconds"]
        # Keep only the calls that fall inside the sliding window.
        self.call_timestamps[tool_name] = [
            ts for ts in self.call_timestamps[tool_name] if ts > cutoff
        ]
        if len(self.call_timestamps[tool_name]) >= config["max_calls"]:
            return False
        self.call_timestamps[tool_name].append(now)
        return True

rate_limiter = ToolRateLimiter()

# In a real server, fold this check into your single call_tool handler.
@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if not rate_limiter.check(name):
        return [TextContent(
            type="text",
            text=f"Rate limit exceeded for '{name}'. Try again later.",
        )]
    return await execute_tool(name, arguments)
In addition to per-tool limits, add a global per-session rate limit that caps the total number of tool calls any single agent session can make. This prevents runaway loops from exhausting resources even if individual tool limits are not exceeded.
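A per-session cap can reuse the same sliding-window idea, keyed by session instead of by tool. The sketch below is one possible shape, with hypothetical names and default limits; how you obtain a `session_id` depends on your agent framework. The clock is injectable so the behavior can be tested without sleeping.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SessionBudget:
    """Caps the total number of tool calls per session in a sliding window."""

    def __init__(
        self,
        max_calls: int = 200,             # hypothetical default
        window_seconds: float = 300.0,    # hypothetical default
        clock: Callable[[], float] = time.monotonic,
    ):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._calls: dict[str, deque] = defaultdict(deque)

    def allow(self, session_id: str) -> bool:
        """Record the call and return True, or return False if over budget."""
        now = self._clock()
        calls = self._calls[session_id]
        # Drop timestamps that have aged out of the window.
        while calls and calls[0] <= now - self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True

    def end_session(self, session_id: str) -> None:
        """Free the bookkeeping for a finished session."""
        self._calls.pop(session_id, None)
```

Check the session budget first and the per-tool limit second, so that a looping agent is stopped even when it spreads its calls across many different tools.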
Defense in Depth: Putting It All Together
No single security layer is sufficient. Production MCP deployments should combine all five: authenticated connections on the agent side (Layer 1), NetworkPolicies restricting server access in Kubernetes (Layer 2), tool approval callbacks for write operations (Layer 3), structured audit logging inside the server (Layer 4), and per-tool and per-session rate limiting (Layer 5). Each layer catches threats that others miss.
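To show how the in-process layers might compose, here is a hedged sketch of a single guarded entry point. Every name in it (`CallContext`, `ToolCallRejected`, `guarded_call`, and the injected callables) is hypothetical. Connection authentication and network policy sit outside the process, so only the authorization check, rate limiting, approval, and audit logging appear here.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class CallContext:
    user_id: str
    role: str
    session_id: str


class ToolCallRejected(Exception):
    """Raised when a guard layer blocks a tool call."""

    def __init__(self, layer: str, reason: str):
        super().__init__(f"{layer}: {reason}")
        self.layer = layer


async def guarded_call(name, arguments, ctx, *, is_authorized, within_limits,
                       approve, execute, audit):
    """Run the guard layers in order, execute the tool, and audit the outcome."""
    start = time.perf_counter()

    def elapsed_ms() -> float:
        return round((time.perf_counter() - start) * 1000, 2)

    try:
        if not is_authorized(name, ctx.role):
            raise ToolCallRejected("authorization", "role not permitted")
        if not within_limits(name, ctx.session_id):
            raise ToolCallRejected("rate_limit", "too many calls")
        if not await approve(name, arguments):
            raise ToolCallRejected("approval", "approval denied")
        result = await execute(name, arguments)
    except ToolCallRejected as rejected:
        # Rejections are audited too, so blocked attempts stay visible.
        audit("tool_rejected", name, ctx, layer=rejected.layer, ms=elapsed_ms())
        raise
    except Exception as exc:
        audit("tool_fail", name, ctx, error=str(exc), ms=elapsed_ms())
        raise
    audit("tool_ok", name, ctx, ms=elapsed_ms())
    return result
```

The cheap checks run before the approval step so that a human is never asked to review a call that would have been rejected anyway.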
Security Checklist for Production MCP
Before deploying an MCP-powered agent to production, verify each item:
- All HTTP MCP servers require bearer tokens or mTLS, rotated every 90 days
- Stdio servers receive only the environment variables they need — no secrets hardcoded in source
- MCP servers are not exposed to the public internet; Kubernetes NetworkPolicies restrict ingress to the agent service only
- Write and delete tools require explicit approval; tool filters block unnecessary tools
- Every tool call is logged with user ID, tool name, sanitized arguments, status, and duration
- Per-tool and per-session rate limits are configured, with violations logged and alerted
- You have a runbook for revoking tokens and disabling tools without redeployment
MCP security is not a one-time setup. It requires ongoing attention as new servers are added, tools are modified, and agents are given new capabilities. Treat every MCP tool like an API endpoint — because that is exactly what it is. Apply the same security rigor you would to any production API surface.
Written by
CallSphere Team