Capstone: Building a Multi-Tenant AI Agent SaaS with Usage-Based Billing

Build a production SaaS platform where multiple tenants can create and deploy AI agents with tenant isolation, a visual agent builder, usage tracking, and Stripe-based usage billing.

SaaS Architecture for AI Agents

Building a multi-tenant AI agent platform requires solving four hard problems simultaneously: tenant isolation (one customer's data and agents must never leak to another), dynamic agent configuration (tenants create agents without writing code), usage metering (track every LLM call, tool invocation, and conversation), and billing (charge based on actual consumption).

This capstone builds a platform where each tenant signs up, creates agents through a web-based builder, deploys them to their own endpoints, and pays based on usage. The architecture uses a shared PostgreSQL database with row-level tenant isolation, a FastAPI backend, and Stripe for billing.

Data Model with Tenant Isolation

Every tenant-owned table includes a tenant_id column that references tenants.id, and every query against those tables is filtered by the authenticated tenant's ID.

# models.py
import uuid

from sqlalchemy import (
    Boolean, Column, DateTime, Float, ForeignKey, Integer, String, Text, func,
)
from sqlalchemy.dialects.postgresql import JSONB, UUID
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Tenant(Base):
    __tablename__ = "tenants"
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    name = Column(String(200), nullable=False)
    slug = Column(String(100), unique=True, nullable=False)
    stripe_customer_id = Column(String(100), nullable=True)
    plan = Column(String(50), default="free")  # free, starter, pro, enterprise
    api_key = Column(String(100), unique=True)
    # func.now() renders DEFAULT now(); a plain string would be emitted as a quoted literal
    created_at = Column(DateTime, server_default=func.now())

class AgentConfig(Base):
    __tablename__ = "agent_configs"
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    tenant_id = Column(UUID(as_uuid=True), ForeignKey("tenants.id"), index=True)
    name = Column(String(200))
    instructions = Column(Text)
    model = Column(String(50), default="gpt-4o")
    # list of enabled tool configs; a callable default avoids sharing one mutable list
    tools = Column(JSONB, default=list)
    is_active = Column(Boolean, default=True)
    created_at = Column(DateTime, server_default=func.now())

class UsageRecord(Base):
    __tablename__ = "usage_records"
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    tenant_id = Column(UUID(as_uuid=True), ForeignKey("tenants.id"), index=True)
    agent_id = Column(UUID(as_uuid=True), ForeignKey("agent_configs.id"))
    event_type = Column(String(50))  # "llm_call", "tool_call", "conversation"
    tokens_input = Column(Integer, default=0)
    tokens_output = Column(Integer, default=0)
    cost_cents = Column(Float, default=0)  # fractional cents; single calls often cost less than one cent
    metadata_ = Column(JSONB, default=dict)  # trailing underscore: "metadata" is reserved on declarative models
    created_at = Column(DateTime, server_default=func.now())

Tenant-Scoped Dependency Injection

Use a FastAPI dependency that extracts the tenant from the API key and scopes all database queries.

# core/auth.py
from fastapi import Depends, HTTPException, Security
from fastapi.security import APIKeyHeader

# Assumed project layout: adjust these imports to wherever get_db and Tenant live
from core.database import get_db
from models import Tenant

api_key_header = APIKeyHeader(name="X-API-Key")

async def get_current_tenant(
    api_key: str = Security(api_key_header),
    db=Depends(get_db),
) -> Tenant:
    """Resolve the tenant that owns the API key, or reject the request."""
    tenant = db.query(Tenant).filter(Tenant.api_key == api_key).first()
    if not tenant:
        raise HTTPException(status_code=401, detail="Invalid API key")
    return tenant

class TenantScoped:
    """Utility to scope queries to the current tenant."""
    def __init__(self, db, tenant: Tenant):
        self.db = db
        self.tenant_id = tenant.id

    def query(self, model):
        # model must have a tenant_id column (AgentConfig, UsageRecord, ...)
        return self.db.query(model).filter(model.tenant_id == self.tenant_id)
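To make the isolation guarantee concrete, here is a minimal sketch that uses the standard library's sqlite3 module as a stand-in for PostgreSQL. The table contents and the helper names (make_demo_db, list_agent_names, get_agent_name) are hypothetical and exist only for illustration. The point it shows is that every lookup, including a lookup by primary key, carries the tenant filter.

```python
import sqlite3
from typing import List, Optional

def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with agents owned by two tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE agent_configs ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO agent_configs (tenant_id, name) VALUES (?, ?)",
        [("tenant-a", "Support Bot"), ("tenant-a", "Sales Bot"), ("tenant-b", "HR Bot")],
    )
    return conn

def list_agent_names(conn: sqlite3.Connection, tenant_id: str) -> List[str]:
    """List only the agents that belong to the given tenant."""
    rows = conn.execute(
        "SELECT name FROM agent_configs WHERE tenant_id = ? ORDER BY name",
        (tenant_id,),
    ).fetchall()
    return [row[0] for row in rows]

def get_agent_name(conn: sqlite3.Connection, tenant_id: str, agent_id: int) -> Optional[str]:
    """Look up an agent by ID, but only within the caller's tenant."""
    row = conn.execute(
        "SELECT name FROM agent_configs WHERE id = ? AND tenant_id = ?",
        (agent_id, tenant_id),
    ).fetchone()
    return row[0] if row else None
```

With this shape, a tenant that guesses another tenant's agent ID gets the same "not found" result as for an ID that does not exist, which avoids leaking whether the record is there.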

Dynamic Agent Builder

Tenants configure agents through the admin dashboard. The backend loads agent configurations from the database and instantiates them on demand.

# services/agent_factory.py
from agents import Agent

# Registry of available tools that tenants can enable.
# The tool callables are assumed to be defined elsewhere in the project
# as @function_tool-decorated functions from the agents package.
TOOL_REGISTRY = {
    "search_kb": search_knowledge_base,
    "send_email": send_email_tool,
    "create_ticket": create_ticket_tool,
    "lookup_order": lookup_order_tool,
    "check_calendar": check_calendar_tool,
}

def build_agent_from_config(config: AgentConfig) -> Agent:
    """Dynamically build an Agent from a database configuration."""
    enabled_tools = []
    # config.tools may be NULL for rows created without a tool list
    for tool_config in config.tools or []:
        tool_name = tool_config.get("name")
        # Names that are not in the registry are skipped, never resolved dynamically
        if tool_name in TOOL_REGISTRY:
            enabled_tools.append(TOOL_REGISTRY[tool_name])

    return Agent(
        name=config.name,
        instructions=config.instructions,
        model=config.model,
        tools=enabled_tools,
    )
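Because build_agent_from_config skips unknown tool names without complaint, a typo in a saved configuration would quietly produce an agent with fewer tools than the tenant expects. One option is to validate the tool list when the configuration is saved. The sketch below is an assumption about how that could look: validate_tool_configs is a hypothetical helper, and KNOWN_TOOLS simply mirrors the registry keys shown above.

```python
from typing import Any, Dict, List

# Mirrors the keys of TOOL_REGISTRY; kept as plain strings so the sketch is self-contained
KNOWN_TOOLS = {"search_kb", "send_email", "create_ticket", "lookup_order", "check_calendar"}

def validate_tool_configs(tool_configs: List[Dict[str, Any]]) -> List[str]:
    """Return the enabled tool names in order, without duplicates.

    Raises ValueError if an entry has no name or names a tool that is not registered.
    """
    names: List[str] = []
    for entry in tool_configs:
        name = entry.get("name")
        if name not in KNOWN_TOOLS:
            raise ValueError(f"Unknown tool: {name!r}")
        if name not in names:
            names.append(name)
    return names
```

Calling this from the endpoint that saves an agent configuration would surface the error to the tenant in the builder instead of at chat time.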

Usage Metering

Every LLM call and tool invocation is recorded for billing.

# services/metering.py
from models import UsageRecord

# Prices in USD per 100k tokens
TOKEN_COSTS = {
    "gpt-4o": {"input": 0.25, "output": 1.00},
    "gpt-4o-mini": {"input": 0.015, "output": 0.06},
}

async def record_usage(
    db, tenant_id: str, agent_id: str,
    event_type: str, tokens_in: int, tokens_out: int, model: str
):
    # Models without a price entry fall back to gpt-4o pricing
    costs = TOKEN_COSTS.get(model, TOKEN_COSTS["gpt-4o"])
    cost_usd = (tokens_in * costs["input"] + tokens_out * costs["output"]) / 100_000

    record = UsageRecord(
        tenant_id=tenant_id,
        agent_id=agent_id,
        event_type=event_type,
        tokens_input=tokens_in,
        tokens_output=tokens_out,
        cost_cents=cost_usd * 100,  # store in cents
    )
    db.add(record)
    db.commit()
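As a worked example of the cost formula, the sketch below repeats the price table and computes the cost of a single call. The helper name cost_in_cents is hypothetical; the arithmetic is the same as in record_usage.

```python
# Prices in USD per 100k tokens, copied from the metering module
TOKEN_COSTS = {
    "gpt-4o": {"input": 0.25, "output": 1.00},
    "gpt-4o-mini": {"input": 0.015, "output": 0.06},
}

def cost_in_cents(model: str, tokens_in: int, tokens_out: int) -> float:
    """Compute the cost of one call in cents, falling back to gpt-4o pricing."""
    costs = TOKEN_COSTS.get(model, TOKEN_COSTS["gpt-4o"])
    dollars = (tokens_in * costs["input"] + tokens_out * costs["output"]) / 100_000
    return dollars * 100

# gpt-4o, 1,200 input and 300 output tokens:
#   (1200 * 0.25 + 300 * 1.00) / 100_000 = 600 / 100_000 = 0.006 USD, about 0.6 cents
# gpt-4o-mini, same token counts:
#   (1200 * 0.015 + 300 * 0.06) / 100_000 = 36 / 100_000 = 0.00036 USD, about 0.036 cents
```

A single call usually costs a fraction of a cent, which is why cost_cents is stored as a float and only the aggregated total is converted to whole cents.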

Stripe Billing Integration

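Report usage to Stripe on a schedule using Stripe's metered billing. The sync job sums each tenant's usage over the trailing 24 hours, so it assumes exactly one run per day; Stripe aggregates the reported events when it invoices at the end of the billing period.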

# services/billing.py
import os
from datetime import datetime, timedelta

import stripe
from sqlalchemy import func

from models import Tenant, UsageRecord

stripe.api_key = os.environ["STRIPE_SECRET_KEY"]

async def sync_usage_to_stripe(tenant_id: str, db):
    """Report the last 24 hours of usage to Stripe for metered billing."""
    tenant = db.get(Tenant, tenant_id)
    # Skip unknown tenants and tenants without a Stripe customer
    if tenant is None or not tenant.stripe_customer_id:
        return

    # Trailing 24-hour window: running the job more or less often than
    # once a day would double-count or miss usage
    period_start = datetime.utcnow() - timedelta(days=1)
    total_cost = db.query(func.sum(UsageRecord.cost_cents)).filter(
        UsageRecord.tenant_id == tenant_id,
        UsageRecord.created_at >= period_start,
    ).scalar() or 0

    # Report whole cents to Stripe (int() drops any fractional cent)
    stripe.billing.MeterEvent.create(
        event_name="ai_agent_usage",
        payload={
            "value": str(int(total_cost)),
            "stripe_customer_id": tenant.stripe_customer_id,
        },
    )

async def get_tenant_usage_summary(tenant_id: str, days: int, db) -> dict:
    since = datetime.utcnow() - timedelta(days=days)
    records = db.query(UsageRecord).filter(
        UsageRecord.tenant_id == tenant_id,
        UsageRecord.created_at >= since,
    ).all()
    return {
        "total_cost_cents": sum(r.cost_cents for r in records),
        "total_llm_calls": sum(1 for r in records if r.event_type == "llm_call"),
        "total_tokens_input": sum(r.tokens_input for r in records),
        "total_tokens_output": sum(r.tokens_output for r in records),
        "total_conversations": sum(1 for r in records if r.event_type == "conversation"),
    }
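A retried or accidentally repeated sync job would report the same usage twice. One way to guard against that, sketched below, is to send a deterministic identifier with each meter event. This assumes Stripe's meter event API accepts an optional identifier field and rejects duplicates within a limited window, which should be confirmed against the current Stripe documentation. build_meter_event is a hypothetical helper that only builds the arguments and makes no network call.

```python
from datetime import date
from typing import Any, Dict

def build_meter_event(
    tenant_id: str, stripe_customer_id: str, usage_day: date, total_cents: float
) -> Dict[str, Any]:
    """Build keyword arguments for one tenant's daily meter event.

    The identifier is derived from the tenant and the day, so reporting the
    same day twice produces the same identifier (assumed to be deduplicated
    by Stripe).
    """
    return {
        "event_name": "ai_agent_usage",
        "identifier": f"{tenant_id}:{usage_day.isoformat()}",
        "payload": {
            "value": str(int(total_cents)),  # whole cents, fractional part dropped
            "stripe_customer_id": stripe_customer_id,
        },
    }
```

The result could then be passed on as stripe.billing.MeterEvent.create(**build_meter_event(...)), with the daily total computed for a fixed calendar day instead of a sliding window.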

Tenant API Endpoint

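Tenants reach their agents through a shared /v1/chat endpoint. The API key identifies the tenant, and the agent lookup is scoped to that tenant, so a request can only run agents the caller owns.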

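# routes/agent_api.py
from uuid import UUID

from agents import Runner
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel

# Assumed project layout: core.database is a guess, the other paths follow the file names above
from core.auth import TenantScoped, get_current_tenant
from core.database import get_db
from models import AgentConfig, Tenant
from services.agent_factory import build_agent_from_config
from services.metering import record_usage

router = APIRouter()

class ChatRequest(BaseModel):
    # Assumed request shape, inferred from the fields the handler reads
    agent_id: UUID
    message: str

@router.post("/v1/chat")
async def chat(
    body: ChatRequest,
    tenant: Tenant = Depends(get_current_tenant),
    db=Depends(get_db),
):
    # Look up the agent within the caller's tenant only
    scoped = TenantScoped(db, tenant)
    config = scoped.query(AgentConfig).filter(
        AgentConfig.id == body.agent_id
    ).first()
    if not config:
        raise HTTPException(status_code=404, detail="Agent not found")

    agent = build_agent_from_config(config)
    result = await Runner.run(agent, body.message)

    # Record usage. A run can make several model calls (for example around
    # tool calls), so sum the tokens across all responses, not just the last.
    tokens_in = sum(r.usage.input_tokens for r in result.raw_responses)
    tokens_out = sum(r.usage.output_tokens for r in result.raw_responses)
    await record_usage(
        db, str(tenant.id), str(config.id),
        "llm_call", tokens_in, tokens_out, config.model
    )

    return {"reply": result.final_output, "agent": config.name}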

FAQ

How do I prevent one tenant's heavy usage from affecting others?

Implement per-tenant rate limiting using a Redis-backed token bucket. Each tenant gets requests-per-minute and tokens-per-day limits based on their plan tier. When a tenant exceeds a limit, return a 429 status code with a Retry-After header.
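The token bucket logic itself is small. The sketch below keeps the buckets in process memory purely for illustration; a multi-worker deployment would hold the same state in Redis as described above. The per-plan request limits are hypothetical numbers, and TokenBucket and check_rate_limit are illustrative names.

```python
import time
from typing import Dict, Optional, Tuple

# Hypothetical requests-per-minute limits per plan tier
PLAN_REQUESTS_PER_MINUTE = {"free": 10, "starter": 60, "pro": 300, "enterprise": 1200}

class TokenBucket:
    """A bucket that holds up to `capacity` tokens and refills at a steady rate."""

    def __init__(self, capacity: float, refill_per_second: float, now: Optional[float] = None):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.updated_at = time.monotonic() if now is None else now

    def try_consume(self, now: Optional[float] = None) -> Tuple[bool, float]:
        """Take one token if available. Returns (allowed, retry_after_seconds)."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill for the time that has passed, without exceeding capacity
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True, 0.0
        # Time until one full token is available again
        return False, (1 - self.tokens) / self.refill_per_second

_buckets: Dict[str, TokenBucket] = {}

def check_rate_limit(tenant_id: str, plan: str, now: Optional[float] = None) -> Tuple[bool, float]:
    """Apply the plan's per-minute limit to one request from a tenant."""
    rpm = PLAN_REQUESTS_PER_MINUTE.get(plan, PLAN_REQUESTS_PER_MINUTE["free"])
    bucket = _buckets.get(tenant_id)
    if bucket is None:
        bucket = _buckets[tenant_id] = TokenBucket(rpm, rpm / 60.0, now)
    return bucket.try_consume(now)
```

When check_rate_limit returns (False, retry_after), the endpoint would respond with status 429 and set the Retry-After header to the returned number of seconds, rounded up. Because each tenant has its own bucket, one tenant exhausting its limit does not affect another.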

How do I handle tenant data deletion for compliance?

Implement a cascade delete that removes all tenant data: agent configs, usage records, conversations, and any uploaded knowledge base documents. Use a soft-delete first (mark as deleted with a timestamp) and run a hard-delete job after a 30-day grace period. Log the deletion for audit compliance.
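The two-phase deletion can be sketched as a small selection step plus a fixed deletion order. This is an illustration under assumptions: the deleted_at timestamp per tenant is not part of the data model shown earlier, tenants_due_for_purge is a hypothetical helper, and the purge order lists only the tables defined in this article (conversation and knowledge base tables would be added in front of tenants in the same way).

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

GRACE_PERIOD = timedelta(days=30)

# Child tables first so foreign keys never point at rows that are already gone:
# usage_records references agent_configs, and both reference tenants.
PURGE_ORDER = ["usage_records", "agent_configs", "tenants"]

def tenants_due_for_purge(
    deleted_at_by_tenant: Dict[str, Optional[datetime]], now: datetime
) -> List[str]:
    """Return IDs of tenants that were soft-deleted at least 30 days ago."""
    return sorted(
        tenant_id
        for tenant_id, deleted_at in deleted_at_by_tenant.items()
        if deleted_at is not None and now - deleted_at >= GRACE_PERIOD
    )
```

A scheduled job would call tenants_due_for_purge, delete each returned tenant's rows table by table in PURGE_ORDER, and write an audit log entry for every completed purge.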

How do I let tenants bring their own API keys?

Store tenant-provided API keys encrypted in the database. When building an agent for that tenant, configure the OpenAI client with their key instead of yours. This shifts LLM costs to the tenant while you charge only for platform usage. Validate the key on save by making a minimal API call.


Written by

CallSphere Team
