---
title: "Building AI Agents with Next.js API Routes: Full-Stack Agent Applications"
description: "Learn how to build full-stack AI agent applications using Next.js API routes. Covers streaming responses, middleware for authentication, edge runtime considerations, conversation persistence, and production patterns for server-side agent logic."
canonical: https://callsphere.ai/blog/nextjs-api-routes-full-stack-ai-agent-applications
category: "Learn Agentic AI"
tags: ["Next.js", "API Routes", "Full-Stack", "AI Agents", "Streaming", "Edge Runtime"]
author: "CallSphere Team"
published: 2026-03-17T00:00:00.000Z
updated: 2026-05-06T01:02:44.077Z
---

# Building AI Agents with Next.js API Routes: Full-Stack Agent Applications

> Learn how to build full-stack AI agent applications using Next.js API routes. Covers streaming responses, middleware for authentication, edge runtime considerations, conversation persistence, and production patterns for server-side agent logic.

## Why Next.js for AI Agent Applications

Next.js provides the rare combination of a React frontend, a server-side API layer, and deployment infrastructure in a single framework. For AI agent applications, this means you can define your agent logic in API routes, stream responses to React components, and deploy everything as one unit — no separate backend service required.

The App Router's route handlers, combined with the Vercel AI SDK or raw streaming APIs, make Next.js one of the fastest paths from idea to deployed agent application.

## Basic Agent API Route

Before the code, the sequence below sketches the full request path for a streamed call: client, server route, LLM provider, and trace storage. The handler that follows is the simplest version. It accepts a messages array and returns one complete response (streaming is covered in the next section).

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant Edge as Edge Worker
    participant LLM as LLM Provider
    participant DB as Logs and Trace
    Client->>Edge: POST /chat (stream=true)
    Edge->>LLM: chat.completions.create(stream=true)
    loop Each token
        LLM-->>Edge: SSE chunk delta
        Edge-->>Client: SSE chunk delta
        Edge->>DB: append token to span
    end
    LLM-->>Edge: finish_reason=stop
    Edge-->>Client: event: done
    Edge->>DB: finalize trace
```

```typescript
// app/api/agent/route.ts
import { NextRequest, NextResponse } from "next/server";
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: NextRequest) {
  const { messages, threadId } = await req.json();

  if (!messages || !Array.isArray(messages)) {
    return NextResponse.json(
      { error: "messages array is required" },
      { status: 400 }
    );
  }

  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      ...messages,
    ],
  });

  return NextResponse.json({
    message: completion.choices[0].message,
    usage: completion.usage,
  });
}
```
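
For completeness, here is a minimal client-side call against this route. This is a sketch that assumes the handler above is mounted at `/api/agent`; the `sendToAgent` helper and `ChatMessage` type are names introduced here, not part of any library:

```typescript
// Hypothetical client helper for the non-streaming route above.
type ChatMessage = { role: "user" | "assistant"; content: string };

export async function sendToAgent(messages: ChatMessage[]): Promise<ChatMessage> {
  const res = await fetch("/api/agent", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages }),
  });

  if (!res.ok) {
    throw new Error(`Agent request failed: ${res.status}`);
  }

  // The route returns { message, usage }; only the message is needed here.
  const { message } = await res.json();
  return message as ChatMessage;
}
```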

## Streaming Responses from API Routes

For real-time UIs, stream tokens instead of waiting for the full response:

```typescript
// app/api/agent/stream/route.ts
import { NextRequest } from "next/server";
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: NextRequest) {
  const { messages } = await req.json();

  const stream = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });

  const encoder = new TextEncoder();

  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        const text = chunk.choices[0]?.delta?.content;
        if (text) {
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({ text })}\n\n`)
          );
        }
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}
```

This implements Server-Sent Events (SSE) manually. The client connects to this endpoint and receives tokens as they arrive from the LLM.
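
On the client, read the stream with `fetch` and a `ReadableStream` reader, splitting on the blank line that separates SSE events. A minimal sketch (the `streamFromAgent` name and `onToken` callback are assumptions):

```typescript
// Hypothetical client-side consumer for the streaming route above.
export async function streamFromAgent(
  messages: { role: string; content: string }[],
  onToken: (text: string) => void
) {
  const res = await fetch("/api/agent/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages }),
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE events are separated by a blank line.
    const events = buffer.split("\n\n");
    buffer = events.pop() ?? "";

    for (const event of events) {
      const data = event.replace(/^data: /, "").trim();
      if (!data || data === "[DONE]") continue;
      onToken(JSON.parse(data).text);
    }
  }
}
```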

## Authentication Middleware

Protect your agent endpoints with middleware that validates session tokens:

```typescript
// middleware.ts
import { NextResponse } from "next/server";
import type { NextRequest } from "next/server";

export function middleware(request: NextRequest) {
  if (request.nextUrl.pathname.startsWith("/api/agent")) {
    const authHeader = request.headers.get("authorization");

    if (!authHeader?.startsWith("Bearer ")) {
      return NextResponse.json(
        { error: "Authentication required" },
        { status: 401 }
      );
    }

    // Validate the token (JWT verification, database lookup, etc.)
    const token = authHeader.slice(7);
    // Add your token validation logic here
  }

  return NextResponse.next();
}

export const config = {
  matcher: "/api/agent/:path*",
};
```
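
The validation step depends on your auth setup. As one example, a JWT check with the `jose` library might look like the sketch below (the `SESSION_SECRET` environment variable is an assumption). `jose` runs on the Edge Runtime, where Next.js middleware executes by default:

```typescript
// Validation sketch using jose; adjust to your auth provider.
import { jwtVerify } from "jose";

const secret = new TextEncoder().encode(process.env.SESSION_SECRET!);

async function isValidToken(token: string): Promise<boolean> {
  try {
    // Throws if the signature is invalid or the token is expired.
    await jwtVerify(token, secret);
    return true;
  } catch {
    return false;
  }
}
```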

## Conversation Persistence

Store conversation history so users can resume sessions:

```typescript
// app/api/agent/route.ts
import { NextRequest, NextResponse } from "next/server";
import OpenAI from "openai";
import { prisma } from "@/lib/prisma";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: NextRequest) {
  const { message, conversationId } = await req.json();
  // Assumes an upstream auth layer injects the verified user id into this header
  const userId = req.headers.get("x-user-id")!;

  // Load or create conversation
  let conversation = conversationId
    ? await prisma.conversation.findUnique({
        where: { id: conversationId, userId },
        include: { messages: { orderBy: { createdAt: "asc" } } },
      })
    : await prisma.conversation.create({
        data: { userId },
        include: { messages: true },
      });

  if (!conversation) {
    return NextResponse.json({ error: "Not found" }, { status: 404 });
  }

  // Build messages array from history
  const chatMessages = conversation.messages.map((m) => ({
    role: m.role as "user" | "assistant",
    content: m.content,
  }));
  chatMessages.push({ role: "user", content: message });

  // Call LLM
  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      ...chatMessages,
    ],
  });

  const reply = completion.choices[0].message.content ?? "";

  // Persist both messages
  await prisma.message.createMany({
    data: [
      { conversationId: conversation.id, role: "user", content: message },
      { conversationId: conversation.id, role: "assistant", content: reply },
    ],
  });

  return NextResponse.json({
    conversationId: conversation.id,
    reply,
  });
}
```
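
This route assumes a Prisma schema roughly like the following, with field names inferred from the queries above. Adapt it to your own models:

```prisma
// prisma/schema.prisma (sketch inferred from the route handler above)
model Conversation {
  id        String    @id @default(cuid())
  userId    String
  createdAt DateTime  @default(now())
  messages  Message[]
}

model Message {
  id             String       @id @default(cuid())
  conversationId String
  conversation   Conversation @relation(fields: [conversationId], references: [id])
  role           String
  content        String
  createdAt      DateTime     @default(now())
}
```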

## Rate Limiting

Protect your agent endpoint from abuse:

```typescript
// lib/rate-limit.ts
// Simple fixed-window limiter. The Map lives in the memory of a single server
// instance, so counts reset on cold starts and are not shared across
// serverless instances.
const rateLimitMap = new Map<string, { count: number; resetTime: number }>();

export function checkRateLimit(
  userId: string,
  maxRequests: number = 20,
  windowMs: number = 60_000
): boolean {
  const now = Date.now();
  const entry = rateLimitMap.get(userId);

  if (!entry || now > entry.resetTime) {
    rateLimitMap.set(userId, { count: 1, resetTime: now + windowMs });
    return true;
  }

  if (entry.count >= maxRequests) {
    return false;
  }

  entry.count++;
  return true;
}
```

Use it in your route handler:

```typescript
if (!checkRateLimit(userId)) {
  return NextResponse.json(
    { error: "Rate limit exceeded. Try again in a minute." },
    { status: 429 }
  );
}
```
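
Because the in-memory map is per-instance, a shared store fits serverless deployments better. One option is `@upstash/ratelimit` backed by Redis; the sketch below assumes the standard Upstash environment variables, and any shared store works equally well:

```typescript
// lib/rate-limit-redis.ts
// Sketch using @upstash/ratelimit; assumes UPSTASH_REDIS_REST_URL and
// UPSTASH_REDIS_REST_TOKEN are set.
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(20, "60 s"), // 20 requests per minute
});

export async function checkRateLimit(userId: string): Promise<boolean> {
  const { success } = await ratelimit.limit(userId);
  return success;
}
```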

## Edge Runtime Considerations

Next.js route handlers can run on the Edge Runtime for lower latency. However, agents often need Node.js APIs (database drivers, file system access). Use edge selectively:

```typescript
// This route can run on edge — it only calls external APIs
import OpenAI from "openai";

export const runtime = "edge";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  const { messages } = await req.json();

  // The OpenAI SDK works on the Edge Runtime
  const stream = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });
  // ...stream the response as in the SSE example above
}
```

For routes that need Prisma, Redis, or other Node.js-dependent libraries, keep the default Node.js runtime.
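
If you want to make that explicit, pin the runtime in the route file. A short sketch, assuming the Prisma client from the persistence example:

```typescript
// Routes that use Node-only dependencies should keep the Node.js runtime.
import { prisma } from "@/lib/prisma";

export const runtime = "nodejs"; // the default; stated here for clarity

export async function GET() {
  const count = await prisma.conversation.count();
  return Response.json({ count });
}
```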

## FAQ

### Should I use API routes or Server Actions for AI agents?

Use API routes for agent interactions. Server Actions are designed for form mutations and do not support streaming responses. API route handlers give you full control over the response format, headers, and streaming behavior that AI agents require.

### How do I handle long-running agent tasks that exceed the serverless timeout?

Vercel's default function timeout is well below your plan's maximum duration (60 seconds on Hobby, 300 seconds on Pro). Raise it with the `maxDuration` export in your route handler: `export const maxDuration = 300;`. For tasks longer than the plan maximum, offload to a background job queue (Inngest, Trigger.dev) and poll for results from the client.

### Can I deploy a Next.js agent app to platforms other than Vercel?

Yes. Next.js deploys to any platform that supports Node.js: Railway, Fly.io, AWS (via SST or standalone mode), Docker containers, or a traditional VPS. The main Vercel-specific features are edge middleware optimizations and some caching behaviors.

---

#Nextjs #APIRoutes #FullStack #AIAgents #Streaming #EdgeRuntime #AgenticAI #LearnAI #AIEngineering

---

Source: https://callsphere.ai/blog/nextjs-api-routes-full-stack-ai-agent-applications
