API Security Headers for AI Agent Services: CORS, CSP, and Rate Limit Headers

Configure essential security headers for AI agent APIs including CORS policies, Content Security Policy, rate limit communication headers, and other protective headers with FastAPI middleware examples.

Security Headers: Your API's First Line of Defense

HTTP security headers protect your AI agent API from common attack vectors: cross-origin abuse, content injection, information leakage, and protocol downgrade attacks. Unlike authentication and authorization (which verify who is making the request and what they are allowed to do), security headers define how the request and response should be handled by browsers, proxies, and clients.

For AI agent APIs, security headers serve a dual purpose. They protect browser-based agent interfaces from XSS and clickjacking, and they communicate rate limiting information so agents can self-throttle rather than hitting walls.

CORS Configuration

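Cross-Origin Resource Sharing controls which origins (scheme, host, and port) can call your API from a browser. CORS is enforced by browsers only, so it does not stop server-to-server callers; authentication still has to do that. For AI agent APIs, you need to balance accessibility (browser-based agents running on various origins) with security (preventing unauthorized cross-origin requests).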

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Production CORS: restrict to known origins
app.add_middleware(
    CORSMiddleware,
    allow_origins=[
        "https://app.example.com",
        "https://dashboard.example.com",
        "https://playground.example.com",
    ],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "PATCH", "DELETE"],
    allow_headers=[
        "Authorization",
        "Content-Type",
        "X-API-Key",
        "X-Request-ID",
        "Idempotency-Key",
    ],
    expose_headers=[
        "X-Request-ID",
        "X-RateLimit-Limit",
        "X-RateLimit-Remaining",
        "X-RateLimit-Reset",
        "Retry-After",
    ],
    max_age=3600,
)

The expose_headers configuration is often overlooked. For cross-origin requests, browsers expose only a small safelist of response headers (such as Content-Type and Cache-Control) to JavaScript. Without listing your rate limit headers here, browser-based agents cannot read them, even though server-to-server agents can.

Rate Limit Headers

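Rate limiting is essential for AI agent APIs where a single agent can generate hundreds of requests per minute. Communicate limits clearly using the widely adopted X-RateLimit-* headers (a de facto convention rather than a formal standard) together with the standard Retry-After header, so agents can self-regulate. In the middleware below, X-RateLimit-Reset is a Unix timestamp in seconds.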

from starlette.middleware.base import BaseHTTPMiddleware
from fastapi import Request
from fastapi.responses import JSONResponse
import time

class RateLimitMiddleware(BaseHTTPMiddleware):
    def __init__(self, app, requests_per_minute: int = 60):
        super().__init__(app)
        self.rpm = requests_per_minute
        # In production, use Redis with sliding window
        self.buckets: dict[str, dict] = {}

    async def dispatch(self, request: Request, call_next):
        client_id = self._get_client_id(request)
        now = time.time()

        bucket = self.buckets.get(client_id, {
            "count": 0, "reset_at": now + 60,
        })

        if now > bucket["reset_at"]:
            bucket = {"count": 0, "reset_at": now + 60}

        bucket["count"] += 1
        self.buckets[client_id] = bucket

        remaining = max(0, self.rpm - bucket["count"])
        reset_at = int(bucket["reset_at"])

        rate_headers = {
            "X-RateLimit-Limit": str(self.rpm),
            "X-RateLimit-Remaining": str(remaining),
            "X-RateLimit-Reset": str(reset_at),
        }

        if bucket["count"] > self.rpm:
            # Never advertise a zero-second wait when the window is about to reset
            retry_after = max(1, int(bucket["reset_at"] - now))
            return JSONResponse(
                status_code=429,
                content={
                    "type": "https://api.example.com/errors/rate-limit",
                    "title": "Rate Limit Exceeded",
                    "detail": f"Limit: {self.rpm} requests/minute",
                    "retryable": True,
                    "retry_after_seconds": retry_after,
                },
                headers={
                    **rate_headers,
                    "Retry-After": str(retry_after),
                },
            )

        response = await call_next(request)
        for key, value in rate_headers.items():
            response.headers[key] = value
        return response

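    def _get_client_id(self, request: Request) -> str:
        api_key = request.headers.get("X-API-Key", "")
        if api_key:
            return f"key:{api_key}"
        # X-Forwarded-For is client-controlled unless a trusted proxy sets it,
        # and it may hold a comma-separated chain; use only the first entry.
        forwarded = request.headers.get("X-Forwarded-For", "").split(",")[0].strip()
        # request.client can be None (for example under some test transports)
        peer = request.client.host if request.client else "unknown"
        return f"ip:{forwarded or peer}"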

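# Starlette runs the most recently added middleware outermost, so this wraps
# CORSMiddleware: 429s returned here skip the CORS headers. Register
# CORSMiddleware last if browser clients need to read those responses.
app.add_middleware(RateLimitMiddleware, requests_per_minute=100)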
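
The comment in the middleware above recommends a sliding window for production. A fixed one-minute window can let a client send close to twice the limit across a window boundary, which a sliding window avoids. The following is a minimal in-memory sketch of a sliding-window log, not a production implementation: the class name `SlidingWindowLimiter` and its `allow` method are hypothetical, and a multi-process deployment would still need a shared store such as Redis.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """In-memory sliding-window log (illustrative; not shared across processes)."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        # client_id -> timestamps of accepted requests, oldest first
        self.hits: dict[str, deque] = defaultdict(deque)

    def allow(self, client_id: str, now: Optional[float] = None) -> tuple[bool, int]:
        """Return (allowed, remaining) for a request arriving at `now`."""
        now = time.time() if now is None else now
        hits = self.hits[client_id]
        # Drop timestamps that have aged out of the window
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False, 0
        hits.append(now)
        return True, self.limit - len(hits)
```

Only accepted requests are recorded, so a client that keeps retrying while blocked does not push its own reset further into the future.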

Comprehensive Security Headers Middleware

Beyond CORS and rate limiting, add headers that prevent common web attacks and information leakage.

class SecurityHeadersMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        response = await call_next(request)

        # Prevent MIME type sniffing
        response.headers["X-Content-Type-Options"] = "nosniff"

        # Prevent clickjacking
        response.headers["X-Frame-Options"] = "DENY"

        # Control referrer information
        response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"

        # Force HTTPS
        response.headers["Strict-Transport-Security"] = (
            "max-age=31536000; includeSubDomains; preload"
        )

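        # Remove server identification if an inner layer set it.
        # Starlette's MutableHeaders has no pop(), so delete explicitly.
        # uvicorn adds its own Server header after middleware runs; start it
        # with --no-server-header to suppress that one.
        if "Server" in response.headers:
            del response.headers["Server"]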

        # Permissions Policy - disable unused browser features
        response.headers["Permissions-Policy"] = (
            "camera=(), microphone=(), geolocation=(), "
            "payment=(), usb=(), magnetometer=()"
        )

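        # Content Security Policy for any HTML the API serves (JSON responses
        # are skipped). Note: a policy this strict is likely to block FastAPI's
        # default /docs page, which loads its assets from a CDN.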
        if "text/html" in response.headers.get("content-type", ""):
            response.headers["Content-Security-Policy"] = (
                "default-src 'none'; "
                "script-src 'self'; "
                "style-src 'self' 'unsafe-inline'; "
                "img-src 'self' data:; "
                "font-src 'self'; "
                "connect-src 'self'"
            )

        return response

app.add_middleware(SecurityHeadersMiddleware)
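
One way to keep these headers from silently regressing is to assert on them in a test. The sketch below checks a plain mapping of response headers against the values set by the middleware above; the function name `missing_security_headers` is hypothetical, and the set of required headers is an assumption you would adjust to your own policy.

```python
# Expected values mirror the SecurityHeadersMiddleware settings above
REQUIRED_HEADERS = {
    "x-content-type-options": "nosniff",
    "x-frame-options": "DENY",
    "referrer-policy": "strict-origin-when-cross-origin",
}


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return names of required headers that are absent or have another value."""
    # Header names are case-insensitive, so compare in lowercase
    normalized = {name.lower(): value for name, value in headers.items()}
    problems = [
        name for name, expected in REQUIRED_HEADERS.items()
        if normalized.get(name) != expected
    ]
    # HSTS only needs to be present with a max-age directive
    if "max-age=" not in normalized.get("strict-transport-security", ""):
        problems.append("strict-transport-security")
    return problems
```

In a test suite, the headers of a real response can be passed in as `dict(response.headers)` and the result compared with an empty list.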

Request ID Tracking

Assign a unique ID to every request for distributed tracing. If the client sends an X-Request-ID header, propagate it; otherwise, generate one. This is invaluable for debugging agent interactions across multiple services.

import uuid

class RequestIDMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        request_id = request.headers.get(
            "X-Request-ID", str(uuid.uuid4())
        )
        request.state.request_id = request_id

        response = await call_next(request)
        response.headers["X-Request-ID"] = request_id
        return response

app.add_middleware(RequestIDMiddleware)
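
Because an incoming X-Request-ID is client-controlled and ends up in logs and response headers, it may be worth validating it before propagating it. The sketch below shows one possible policy; the helper name `safe_request_id`, the allowed character set, and the 64-character cap are all assumptions rather than part of any standard.

```python
import re
import uuid
from typing import Optional

# Assumed policy: 1-64 characters of letters, digits, '.', '_' or '-'
_REQUEST_ID_RE = re.compile(r"[A-Za-z0-9._-]{1,64}")


def safe_request_id(incoming: Optional[str]) -> str:
    """Return the client's ID if it matches the policy, else a fresh UUID4."""
    if incoming and _REQUEST_ID_RE.fullmatch(incoming):
        return incoming
    return str(uuid.uuid4())
```

Inside the middleware, this could replace the direct header lookup, for example `request_id = safe_request_id(request.headers.get("X-Request-ID"))`.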

FAQ

Should I use wildcard CORS (*) for my AI agent API?

Never use wildcard CORS in production for APIs that use cookies or bearer tokens. Browsers reject a credentialed response whose Access-Control-Allow-Origin is *, so a wildcard cannot work together with allow_credentials=True. For public APIs that use API keys in headers rather than cookies, a wildcard origin is acceptable but still not recommended. List specific allowed origins and use environment variables to configure them per deployment environment.
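
As a rough sketch of the environment-variable approach, the helper below parses a comma-separated list of origins and refuses wildcards. The variable name `CORS_ALLOWED_ORIGINS` and the function `load_allowed_origins` are hypothetical, and the scheme check (HTTPS, or plain HTTP only for localhost) is an assumed policy.

```python
import os


def load_allowed_origins(env_var: str = "CORS_ALLOWED_ORIGINS") -> list[str]:
    """Parse a comma-separated origin list, rejecting wildcards and bad schemes."""
    raw = os.environ.get(env_var, "")
    # Browsers send Origin without a trailing slash, so strip it for matching
    origins = [item.strip().rstrip("/") for item in raw.split(",") if item.strip()]
    for origin in origins:
        if origin == "*" or not origin.startswith(("https://", "http://localhost")):
            raise ValueError(f"Refusing CORS origin: {origin!r}")
    return origins
```

The result can then be passed to the middleware as `allow_origins=load_allowed_origins()`, so a misconfigured deployment fails at startup instead of silently opening the API.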

What is the difference between X-RateLimit headers and the standard Retry-After header?

They serve complementary purposes. The X-RateLimit-* headers are informational and sent on every response, telling the client its current quota status (limit, remaining, reset time). The Retry-After header is directive: it typically accompanies 429 or 503 responses and tells the client how long to wait before retrying, either as a number of seconds or as an HTTP date. Always include both: the rate limit headers for proactive throttling and Retry-After for reactive recovery.
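
On the client side, an agent might combine the two signals roughly as follows. This is a sketch under assumptions: the function name `seconds_to_wait` is hypothetical, X-RateLimit-Reset is taken to be a Unix timestamp as in the middleware in this article, and only the numeric form of Retry-After is handled.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(status: int, headers: Mapping[str, str],
                    now: Optional[float] = None) -> float:
    """How long a client should pause before its next request."""
    now = time.time() if now is None else now
    h = {k.lower(): v for k, v in headers.items()}
    # Reactive: the server rejected the request and said how long to wait
    if status in (429, 503) and h.get("retry-after", "").isdigit():
        return float(h["retry-after"])
    # Proactive: quota exhausted, so wait for the window to reset
    if h.get("x-ratelimit-remaining") == "0" and h.get("x-ratelimit-reset", "").isdigit():
        return max(0.0, float(h["x-ratelimit-reset"]) - now)
    return 0.0
```

An agent loop could call this after every response and sleep for the returned duration before sending the next request.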

Should I apply rate limiting per API key or per IP address?

Apply rate limiting per API key for authenticated requests and per IP for unauthenticated requests. API key-based limiting is more accurate since multiple users may share an IP (corporate NATs, VPNs). Consider tiered rate limits based on the subscription plan: a free tier might get 10 requests per minute while an enterprise tier gets 1000. Always communicate the current tier's limits in the rate limit response headers.
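
A tier lookup of this kind can be sketched in a few lines. The plan names and the helpers `limit_for_plan` and `rate_limit_headers` are hypothetical; the per-minute values simply reuse the free and enterprise figures from the example above, and unknown plans fall back to the free tier as an assumed default.

```python
from typing import Optional

# Hypothetical plans; values are requests per minute from the example above
PLAN_LIMITS = {"free": 10, "enterprise": 1000}
DEFAULT_PLAN = "free"


def limit_for_plan(plan: Optional[str]) -> int:
    """Look up the per-minute limit, falling back to the default plan."""
    return PLAN_LIMITS.get(plan or DEFAULT_PLAN, PLAN_LIMITS[DEFAULT_PLAN])


def rate_limit_headers(plan: Optional[str], used: int, reset_at: int) -> dict[str, str]:
    """Build X-RateLimit-* headers that reflect the caller's own tier."""
    limit = limit_for_plan(plan)
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, limit - used)),
        "X-RateLimit-Reset": str(reset_at),
    }
```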


Written by

CallSphere Team
