Learn Agentic AI

Building AI Agents with Next.js API Routes: Full-Stack Agent Applications

Learn how to build full-stack AI agent applications using Next.js API routes. Covers streaming responses, middleware for authentication, edge runtime considerations, conversation persistence, and production patterns for server-side agent logic.

Why Next.js for AI Agent Applications

Next.js provides the rare combination of a React frontend, a server-side API layer, and deployment infrastructure in a single framework. For AI agent applications, this means you can define your agent logic in API routes, stream responses to React components, and deploy everything as one unit — no separate backend service required.

The App Router's route handlers, combined with the Vercel AI SDK or raw streaming APIs, make Next.js one of the fastest paths from idea to deployed agent application.

Basic Agent API Route

Create a route handler that processes messages and returns agent responses:

// app/api/agent/route.ts
import { NextRequest, NextResponse } from "next/server";
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: NextRequest) {
  const { messages, threadId } = await req.json();

  if (!messages || !Array.isArray(messages)) {
    return NextResponse.json(
      { error: "messages array is required" },
      { status: 400 }
    );
  }

  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      ...messages,
    ],
  });

  return NextResponse.json({
    message: completion.choices[0].message,
    usage: completion.usage,
  });
}

Streaming Responses from API Routes

For real-time UIs, stream tokens instead of waiting for the full response:

// app/api/agent/stream/route.ts
import { NextRequest } from "next/server";
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: NextRequest) {
  const { messages } = await req.json();

  const stream = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });

  const encoder = new TextEncoder();

  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        const text = chunk.choices[0]?.delta?.content;
        if (text) {
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({ text })}\n\n`)
          );
        }
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}

This implements Server-Sent Events (SSE) manually. The client connects to this endpoint and receives tokens as they arrive from the LLM.
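On the client side, the stream can be consumed with fetch and a ReadableStream reader. The sketch below is illustrative rather than part of any SDK (parseSSEChunk and streamAgentReply are names introduced here); a production client would also buffer partial lines across chunk boundaries, which this version skips for brevity:

```typescript
// Extract the text payloads from a raw SSE chunk. A chunk may carry
// several "data: ..." events; "[DONE]" marks the end of the stream.
export function parseSSEChunk(chunk: string): string[] {
  const texts: string[] = [];
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice(6).trim();
    if (payload === "[DONE]") continue;
    texts.push(JSON.parse(payload).text);
  }
  return texts;
}

// POST the conversation to the streaming route and invoke onToken
// for each text delta as it arrives.
export async function streamAgentReply(
  messages: { role: string; content: string }[],
  onToken: (text: string) => void
): Promise<void> {
  const res = await fetch("/api/agent/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const text of parseSSEChunk(decoder.decode(value, { stream: true }))) {
      onToken(text);
    }
  }
}
```

In a React component, onToken would typically append each delta to a state variable so the reply renders incrementally.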

Authentication Middleware

Protect your agent endpoints with middleware that validates session tokens:

// middleware.ts
import { NextResponse } from "next/server";
import type { NextRequest } from "next/server";

export function middleware(request: NextRequest) {
  if (request.nextUrl.pathname.startsWith("/api/agent")) {
    const authHeader = request.headers.get("authorization");

    if (!authHeader?.startsWith("Bearer ")) {
      return NextResponse.json(
        { error: "Authentication required" },
        { status: 401 }
      );
    }

    // Validate the token (JWT verification, database lookup, etc.)
    const token = authHeader.slice(7);
    // Add your token validation logic here
  }

  return NextResponse.next();
}

export const config = {
  matcher: "/api/agent/:path*",
};
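What "validate the token" means depends on your auth scheme. As one illustration, here is a minimal HS256 JWT check built on Node's crypto module (verifyHs256Jwt is a name introduced here, not a library function; note that Vercel middleware runs on the Edge Runtime, where you would use a Web Crypto-based library such as jose instead):

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// Decode a base64url JWT segment into a UTF-8 string.
function b64urlDecode(segment: string): string {
  return Buffer.from(segment, "base64url").toString("utf8");
}

// Verify an HS256-signed JWT and return its claims, or null if invalid.
export function verifyHs256Jwt(
  token: string,
  secret: string
): Record<string, unknown> | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;

  // Recompute the signature over header.payload and compare in constant time.
  const expected = createHmac("sha256", secret)
    .update(`${header}.${payload}`)
    .digest("base64url");
  const a = Buffer.from(signature);
  const b = Buffer.from(expected);
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;

  const claims = JSON.parse(b64urlDecode(payload));
  // Reject expired tokens (exp is in seconds since the epoch).
  if (typeof claims.exp === "number" && claims.exp * 1000 < Date.now()) {
    return null;
  }
  return claims;
}
```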

Conversation Persistence

Store conversation history so users can resume sessions:

// app/api/agent/route.ts
import { NextRequest, NextResponse } from "next/server";
import OpenAI from "openai";
import { prisma } from "@/lib/prisma";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: NextRequest) {
  const { message, conversationId } = await req.json();
  // In production, derive userId from the verified session rather than
  // trusting a client-supplied header.
  const userId = req.headers.get("x-user-id")!;

  // Load or create conversation
  let conversation = conversationId
    ? await prisma.conversation.findUnique({
        where: { id: conversationId, userId },
        include: { messages: { orderBy: { createdAt: "asc" } } },
      })
    : await prisma.conversation.create({
        data: { userId },
        include: { messages: true },
      });

  if (!conversation) {
    return NextResponse.json({ error: "Not found" }, { status: 404 });
  }

  // Build messages array from history
  const chatMessages = conversation.messages.map((m) => ({
    role: m.role as "user" | "assistant",
    content: m.content,
  }));
  chatMessages.push({ role: "user", content: message });

  // Call LLM
  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      ...chatMessages,
    ],
  });

  const reply = completion.choices[0].message.content ?? "";

  // Persist both messages
  await prisma.message.createMany({
    data: [
      { conversationId: conversation.id, role: "user", content: message },
      { conversationId: conversation.id, role: "assistant", content: reply },
    ],
  });

  return NextResponse.json({
    conversationId: conversation.id,
    reply,
  });
}
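The queries above assume Conversation and Message models roughly like the following sketch (field names match the code; everything else, such as ID and timestamp defaults, is illustrative):

```prisma
model Conversation {
  id        String    @id @default(cuid())
  userId    String
  createdAt DateTime  @default(now())
  messages  Message[]
}

model Message {
  id             String       @id @default(cuid())
  conversationId String
  conversation   Conversation @relation(fields: [conversationId], references: [id])
  role           String
  content        String
  createdAt      DateTime     @default(now())
}
```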

Rate Limiting

Protect your agent endpoint from abuse. The in-memory counter below works for a single long-running server; on serverless platforms, each instance has its own memory, so accurate limits require a shared store such as Redis:

// lib/rate-limit.ts
const rateLimitMap = new Map<string, { count: number; resetTime: number }>();

export function checkRateLimit(
  userId: string,
  maxRequests: number = 20,
  windowMs: number = 60_000
): boolean {
  const now = Date.now();
  const entry = rateLimitMap.get(userId);

  if (!entry || now > entry.resetTime) {
    rateLimitMap.set(userId, { count: 1, resetTime: now + windowMs });
    return true;
  }

  if (entry.count >= maxRequests) {
    return false;
  }

  entry.count++;
  return true;
}

Use it in your route handler:

if (!checkRateLimit(userId)) {
  return NextResponse.json(
    { error: "Rate limit exceeded. Try again in a minute." },
    { status: 429 }
  );
}
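For serverless deployments, the same fixed-window check can run against a shared counter store instead of a local Map. In this sketch, the CounterStore interface and key scheme are illustrative; a Redis client such as ioredis exposes incr and expire methods compatible with this shape:

```typescript
// Minimal counter-store interface; Redis clients such as ioredis
// expose incr/expire methods compatible with this shape.
interface CounterStore {
  incr(key: string): Promise<number>;
  expire(key: string, seconds: number): Promise<number>;
}

// Fixed-window rate limit backed by shared storage: increment a
// per-user counter and expire it when the window rolls over.
export async function checkRateLimitShared(
  store: CounterStore,
  userId: string,
  maxRequests = 20,
  windowSeconds = 60
): Promise<boolean> {
  const key = `rate:${userId}`;
  const count = await store.incr(key);
  if (count === 1) {
    // First request in this window: start the expiry clock.
    await store.expire(key, windowSeconds);
  }
  return count <= maxRequests;
}
```

Because the store is injected, the handler code stays the same whether the counter lives in Redis, Upstash, or an in-memory fake used in tests.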

Edge Runtime Considerations

Next.js route handlers can run on the Edge Runtime for lower latency. However, agents often need Node.js APIs (database drivers, file system access). Use edge selectively:

// This route can run on edge — it only calls external APIs
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export const runtime = "edge";

export async function POST(req: Request) {
  const { messages } = await req.json();

  // The OpenAI SDK works on the Edge Runtime
  const stream = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });
  // ...stream response
}

For routes that need Prisma, Redis, or other Node.js-dependent libraries, keep the default Node.js runtime.

FAQ

Should I use API routes or Server Actions for AI agents?

Use API routes for agent interactions. Server Actions are designed for form mutations and do not support streaming responses. API route handlers give you full control over the response format, headers, and streaming behavior that AI agents require.

How do I handle long-running agent tasks that exceed the serverless timeout?

For tasks longer than the default timeout (60 seconds on Vercel Hobby, 300 seconds on Pro), use the maxDuration export in your route handler: export const maxDuration = 300;. For even longer tasks, offload to a background job queue (Inngest, Trigger.dev) and poll for results from the client.

Can I deploy a Next.js agent app to platforms other than Vercel?

Yes. Next.js deploys to any platform that supports Node.js: Railway, Fly.io, AWS (via SST or standalone mode), Docker containers, or a traditional VPS. The only features that are Vercel-specific are edge middleware optimizations and some caching behaviors.


#Nextjs #APIRoutes #FullStack #AIAgents #Streaming #EdgeRuntime #AgenticAI #LearnAI #AIEngineering


Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
