Learn Agentic AI

Streaming Text Display in React: Typewriter Effect for AI Agent Responses

Implement token-by-token streaming display for AI agent responses using Server-Sent Events, React state, and cursor animation. Includes markdown rendering during streaming.

Why Streaming Matters for Agent UX

When an AI agent takes 3-8 seconds to generate a full response, showing a blank loading spinner creates anxiety. Streaming tokens as they arrive gives users immediate feedback and makes the agent feel responsive. This pattern — used by ChatGPT, Claude, and every major AI interface — is achieved through Server-Sent Events (SSE) on the backend and incremental state updates on the frontend.

Setting Up the SSE Consumer

The browser EventSource API is simple but limited. It only supports GET requests and cannot send custom headers. For agent APIs that require POST bodies and authentication headers, use the Fetch API with a readable stream instead.

async function* streamAgentResponse(
  message: string,
  signal: AbortSignal
): AsyncGenerator<string> {
  const response = await fetch("/api/agent/chat", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${getToken()}`,
    },
    body: JSON.stringify({ message }),
    signal,
  });

  if (!response.ok) {
    throw new Error(`Agent error: ${response.status}`);
  }
  if (!response.body) {
    throw new Error("Agent response has no body to stream");
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = ""; // holds a partial line that spans chunk boundaries

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the incomplete trailing line

    for (const line of lines) {
      if (line.startsWith("data: ")) {
        const data = line.slice(6);
        if (data === "[DONE]") return;
        const parsed = JSON.parse(data);
        if (parsed.token) {
          yield parsed.token;
        }
      }
    }
  }
}

The async generator pattern is ideal here. It produces tokens lazily, handles back-pressure naturally, and composes cleanly with React hooks.
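A subtle hazard with streamed reads is that a network chunk can end in the middle of an SSE line. Pulling the parsing loop out into a small framework-free helper that buffers partial lines makes this easy to unit test. Here `makeSSEParser` is an illustrative sketch, not a library API; it assumes the same `data:` / `[DONE]` wire format as the generator above:

```typescript
// Illustrative sketch: an SSE line parser that buffers partial lines
// across chunk boundaries and emits tokens via a callback.
function makeSSEParser(onToken: (token: string) => void) {
  let buffer = "";
  return (chunk: string) => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // retain the incomplete trailing line
    for (const line of lines) {
      if (!line.startsWith("data: ")) continue;
      const data = line.slice(6);
      if (data === "[DONE]") return;
      const parsed = JSON.parse(data);
      if (parsed.token) onToken(parsed.token);
    }
  };
}

// A chunk boundary falls mid-line; buffering still recovers both tokens.
const tokens: string[] = [];
const feed = makeSSEParser((t) => tokens.push(t));
feed('data: {"token":"Hel');                 // incomplete, buffered
feed('lo"}\ndata: {"token":" world"}\n');
// tokens is now ["Hello", " world"]
```

Without the buffer, the first call would try to `JSON.parse` a truncated payload and throw.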

The Streaming Hook

Wrap the generator in a custom hook that manages accumulated text, streaming state, and cancellation.

import { useState, useRef, useCallback } from "react";

interface StreamState {
  text: string;
  isStreaming: boolean;
  error: string | null;
}

function useStreamingResponse() {
  const [state, setState] = useState<StreamState>({
    text: "",
    isStreaming: false,
    error: null,
  });
  const abortRef = useRef<AbortController | null>(null);

  const startStream = useCallback(async (message: string) => {
    abortRef.current?.abort();
    const controller = new AbortController();
    abortRef.current = controller;

    setState({ text: "", isStreaming: true, error: null });

    try {
      for await (const token of streamAgentResponse(
        message,
        controller.signal
      )) {
        setState((prev) => ({
          ...prev,
          text: prev.text + token,
        }));
      }
      setState((prev) => ({ ...prev, isStreaming: false }));
    } catch (err) {
      if ((err as Error).name !== "AbortError") {
        setState((prev) => ({
          ...prev,
          isStreaming: false,
          error: (err as Error).message,
        }));
      }
    }
  }, []);

  const cancel = useCallback(() => {
    abortRef.current?.abort();
    setState((prev) => ({ ...prev, isStreaming: false }));
  }, []);

  return { ...state, startStream, cancel };
}

Each token appends to the existing text through a state updater function. This avoids stale closure issues that would occur if you read state.text directly inside the loop.
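The hazard is easier to see outside React. In this framework-free sketch, `makeStore` is a hypothetical stand-in for React state: writes that capture a snapshot of the value behave like a stale closure, while updater functions always receive the latest value.

```typescript
// Hypothetical stand-in for React state: set() takes an updater function.
function makeStore() {
  let text = "";
  return {
    set: (updater: (prev: string) => string) => { text = updater(text); },
    get: () => text,
  };
}

// Stale pattern: each write uses a snapshot captured before the loop,
// so tokens overwrite each other instead of accumulating.
const stale = makeStore();
const snapshot = stale.get();                      // "" captured once
["a", "b", "c"].forEach((t) => stale.set(() => snapshot + t));
// stale.get() === "c"

// Functional updates: each write receives the latest value.
const fresh = makeStore();
["a", "b", "c"].forEach((t) => fresh.set((prev) => prev + t));
// fresh.get() === "abc"
```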

Rendering Streaming Markdown

During streaming, partial markdown tokens arrive that may not form complete syntax. A naive markdown renderer would flicker between valid and invalid states. The solution: render markdown on every update but debounce expensive operations like syntax highlighting.


import ReactMarkdown from "react-markdown";

interface StreamingMessageProps {
  text: string;
  isStreaming: boolean;
}

function StreamingMessage({ text, isStreaming }: StreamingMessageProps) {
  return (
    <div className="prose prose-sm max-w-none">
      <ReactMarkdown>{text}</ReactMarkdown>
      {isStreaming && <BlinkingCursor />}
    </div>
  );
}

function BlinkingCursor() {
  return (
    <span
      className="inline-block w-2 h-5 bg-gray-800 ml-0.5 animate-pulse"
      aria-hidden="true"
    />
  );
}

The BlinkingCursor component creates the familiar typing indicator. The aria-hidden attribute prevents screen readers from announcing the cursor element.
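If your renderer does flicker on unclosed code fences during streaming, one common mitigation is to balance the fences yourself before handing the text to the markdown component. The `closeOpenCodeFence` helper below is an illustrative sketch under that assumption, not part of react-markdown:

```typescript
// Illustrative helper: if the streamed text contains an odd number of
// fence lines, a code block is still open, so append a closing fence.
function closeOpenCodeFence(text: string): string {
  // Count lines that begin with a ``` fence marker.
  const fenceCount = (text.match(/^```/gm) ?? []).length;
  return fenceCount % 2 === 1 ? text + "\n```" : text;
}

const partial = closeOpenCodeFence("Here is code:\n```ts\nconst x = 1;");
// partial now ends with a synthetic closing fence

const complete = closeOpenCodeFence("```ts\nconst x = 1;\n```\ndone");
// complete is returned unchanged
```

Call it on the accumulated text before passing it to ReactMarkdown; once the real closing fence streams in, the synthetic one is no longer added.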

Batching Token Updates for Performance

Setting state on every single token can cause excessive re-renders. If the backend streams tokens at high speed, batch them using requestAnimationFrame.

import { useCallback, useEffect, useRef } from "react";

function useTokenBatcher(
  onBatch: (tokens: string) => void
) {
  const bufferRef = useRef("");
  const rafRef = useRef<number | null>(null);

  const addToken = useCallback((token: string) => {
    bufferRef.current += token;

    // Schedule one flush per frame; later tokens join the same buffer.
    if (rafRef.current === null) {
      rafRef.current = requestAnimationFrame(() => {
        onBatch(bufferRef.current);
        bufferRef.current = "";
        rafRef.current = null;
      });
    }
  }, [onBatch]);

  // Cancel any pending frame on unmount so the flush never fires late.
  useEffect(() => {
    return () => {
      if (rafRef.current !== null) cancelAnimationFrame(rafRef.current);
    };
  }, []);

  return addToken;
}

This coalesces all tokens that arrive within a single animation frame into one state update. No matter how fast tokens arrive, you get at most one re-render per frame (roughly 60 per second on most displays), and each render applies several tokens at once.
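The same batching idea can be expressed without React, which makes it easy to unit test. In this sketch, `createTokenBatcher` is a hypothetical framework-free equivalent of the hook, with the scheduler injected so a test can stand in for requestAnimationFrame:

```typescript
// Illustrative, framework-free version of the batching hook: the
// scheduler is injected (requestAnimationFrame in the browser).
type Schedule = (flush: () => void) => void;

function createTokenBatcher(
  onBatch: (tokens: string) => void,
  schedule: Schedule
) {
  let buffer = "";
  let scheduled = false;

  return (token: string) => {
    buffer += token;
    if (!scheduled) {
      scheduled = true; // one flush per frame; later tokens join it
      schedule(() => {
        onBatch(buffer);
        buffer = "";
        scheduled = false;
      });
    }
  };
}

// Three tokens arrive before the frame fires; onBatch sees one string.
const flushes: string[] = [];
let frame: () => void = () => {};
const add = createTokenBatcher(
  (tokens) => flushes.push(tokens),
  (flush) => { frame = flush; }
);
add("Hel");
add("lo ");
add("world");
frame(); // simulate the animation frame firing
// flushes is now ["Hello world"]
```

Injecting the scheduler is the same design choice that makes timers testable: production code passes requestAnimationFrame, tests pass a manual trigger.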

Cancellation and Cleanup

Users must be able to stop a running stream. The AbortController pattern handles this cleanly. Wire a stop button to the cancel function from the hook.

function ChatControls({
  isStreaming,
  onCancel,
}: {
  isStreaming: boolean;
  onCancel: () => void;
}) {
  if (!isStreaming) return null;

  return (
    <button
      onClick={onCancel}
      className="flex items-center gap-1.5 rounded-lg border
                 px-3 py-1.5 text-sm hover:bg-gray-50"
    >
      <span className="w-3 h-3 rounded-sm bg-gray-700" />
      Stop generating
    </button>
  );
}
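What cancellation does under the hood can be sketched without React or the network: aborting the controller rejects the pending work with an error named "AbortError", which is exactly what the hook's catch block filters out. Here `cancellableWork` and `demo` are illustrative stand-ins for the fetch call, not real APIs:

```typescript
// Illustrative stand-in for an abortable fetch: resolves after a delay
// unless the signal fires first, in which case it rejects with an
// AbortError-named DOMException (mirroring fetch's abort behavior).
async function cancellableWork(signal: AbortSignal): Promise<string> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve("finished"), 1_000);
    signal.addEventListener("abort", () => {
      clearTimeout(timer);
      reject(new DOMException("Aborted", "AbortError"));
    });
  });
}

async function demo(): Promise<string> {
  const controller = new AbortController();
  const work = cancellableWork(controller.signal);
  controller.abort(); // the user clicks "Stop generating"
  try {
    await work;
    return "completed";
  } catch (err) {
    // Mirrors the hook: check the name to distinguish user cancellation
    // from real failures.
    return (err as Error).name;
  }
}
```

Because abort() dispatches the signal's "abort" event synchronously, the work is already rejected by the time it is awaited, and demo() resolves to "AbortError".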

FAQ

How do I handle code blocks that arrive partially during streaming?

Most markdown renderers treat an unclosed fence as plain text (or an open code block) until the closing fence arrives, so partial code blocks usually degrade gracefully. If you do see flicker, memoize the message component with React.memo so unrelated state changes do not re-render it, or split the completed portion of the response from the still-streaming tail so only the tail is re-parsed on each token. Libraries like react-markdown generally handle incremental content well out of the box.

What is the difference between SSE and WebSockets for streaming?

SSE is unidirectional (server to client), uses plain HTTP, and reconnects automatically. WebSockets are bidirectional and require a persistent connection. For AI agent streaming where the server sends tokens and the client only listens, SSE is simpler and sufficient. Use WebSockets when you need bidirectional communication, such as real-time collaborative editing or push notifications from the agent.

How do I add a copy button for completed responses?

After streaming finishes (isStreaming is false), render a copy button that calls navigator.clipboard.writeText(text). During streaming, hide the copy button to prevent users from copying incomplete content.


#React #Streaming #ServerSentEvents #TypeScript #AIAgentInterface #AgenticAI #LearnAI #AIEngineering


Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.

