---
title: "Build a Chat Agent with LangChain.js + Ollama (Local, 2026)"
description: "LangChain v1 + LangGraph v1 in JS, paired with Ollama, gives you a fully local chat agent with tools, memory, and structured output. No OpenAI key required."
canonical: https://callsphere.ai/blog/vw4h-build-chat-agent-langchain-js-ollama
category: "AI Engineering"
tags: ["LangChain.js", "Ollama", "Chat Agent", "Local AI", "Tutorial"]
author: "CallSphere Team"
published: 2026-04-26T00:00:00.000Z
updated: 2026-05-07T16:13:47.183Z
---

# Build a Chat Agent with LangChain.js + Ollama (Local, 2026)

> LangChain v1 + LangGraph v1 in JS, paired with Ollama, gives you a fully local chat agent with tools, memory, and structured output. No OpenAI key required.

> **TL;DR** — LangChain v1 (released February 2026) cleaned up the JS API. Combined with the `@langchain/ollama` package and a local Llama 3.1 8B, you get a tool-using chat agent in a Node.js process — no cloud, no API key, no per-token bill.

## What you'll build

A Node.js CLI chat agent: a REPL on the terminal, a persistent in-memory conversation thread, and two tools (web search via Tavily or a local stub, plus a SQLite "notes" store), all running on Ollama through LangGraph's prebuilt `createReactAgent`.

## Prerequisites

1. Node.js 22+, `npm i langchain @langchain/core @langchain/ollama @langchain/langgraph zod better-sqlite3`.
2. Ollama running: `ollama pull llama3.1:8b`.
3. (Optional) Tavily API key for web search.
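
To confirm Ollama is reachable before wiring anything up, you can query its tags endpoint. A minimal check, assuming the default port 11434:

```js
// check-ollama.js: verify the server is up and llama3.1:8b is pulled
const res = await fetch("http://127.0.0.1:11434/api/tags");
const { models } = await res.json();
console.log(models.map((m) => m.name)); // expect "llama3.1:8b" in the list
```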

## Architecture

```mermaid
flowchart LR
  TTY[Terminal] --> AGT[createReactAgent]
  AGT --> LLM[ChatOllama llama3.1:8b]
  AGT --> T1[searchTool]
  AGT --> T2[saveNoteTool SQLite]
  AGT --> MEM[MemorySaver]
```

## Step 1 — Set up the model

```js
// model.js
import { ChatOllama } from "@langchain/ollama";

export const llm = new ChatOllama({
  model: "llama3.1:8b",
  baseUrl: "[http://127.0.0.1:11434](http://127.0.0.1:11434)",
  temperature: 0.4,
});
```

`@langchain/ollama` 0.2+ supports tool calling natively for Llama 3.1.
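
Before wiring up the full agent, it's worth a quick check that the model actually emits structured tool calls. A minimal sketch; `ping` is a throwaway tool invented just for this test:

```js
// tool-check.js: confirm the model produces tool_calls
import { tool } from "@langchain/core/tools";
import { z } from "zod";
import { llm } from "./model.js";

// Throwaway tool; we only care whether the model decides to call it.
const ping = tool(async () => "pong", {
  name: "ping",
  description: "Reply with pong.",
  schema: z.object({}),
});

const msg = await llm.bindTools([ping]).invoke("Call the ping tool.");
console.log(msg.tool_calls); // e.g. [{ name: "ping", args: {}, id: "..." }]
```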

## Step 2 — Define tools

```js
// tools.js
import { tool } from "@langchain/core/tools";
import { z } from "zod";
import Database from "better-sqlite3";
const db = new Database("notes.db");
db.exec("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)");

export const saveNote = tool(async ({ body }) => {
  const r = db.prepare("INSERT INTO notes (body) VALUES (?)").run(body);
  return `Saved note #${r.lastInsertRowid}`;
}, {
  name: "save_note",
  description: "Save a short note to the local notes database.",
  schema: z.object({ body: z.string().describe("Note content") }),
});

export const search = tool(async ({ q }) => {
  // Replace with Tavily / SerpAPI / fetch as you like
  return `Stub search result for: ${q}`;
}, {
  name: "search",
  description: "Search the web for fresh information.",
  schema: z.object({ q: z.string() }),
});
```
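
If you want real search, one drop-in option is the Tavily tool from `@langchain/community` (an extra install; assumes `TAVILY_API_KEY` is set in the environment):

```js
// tools.tavily.js: optional real web search via Tavily
import { TavilySearchResults } from "@langchain/community/tools/tavily_search";

// Swap this in for the stub `search` in the agent's tool list.
export const search = new TavilySearchResults({ maxResults: 3 });
```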

## Step 3 — Build the React-style agent with LangGraph

```js
// agent.js
import { createReactAgent } from "@langchain/langgraph/prebuilt";
import { MemorySaver } from "@langchain/langgraph";
import { llm } from "./model.js";
import { saveNote, search } from "./tools.js";

export const agent = createReactAgent({
  llm,
  tools: [saveNote, search],
  checkpointSaver: new MemorySaver(),  // per-thread memory
  prompt: "You are a concise assistant. Use tools when relevant. Keep replies under 3 sentences.",
});
```

`createReactAgent` from LangGraph 1.x is the recommended way to build tool-using agents in 2026 — it replaces the older `AgentExecutor` pattern.

## Step 4 — REPL with persistent thread

```js
// repl.js
import readline from "node:readline/promises";
import { agent } from "./agent.js";

const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
const threadId = "main";

while (true) {
  const userInput = await rl.question("you> ");
  if (!userInput) continue;
  if (userInput === "/exit") break;
  const out = await agent.invoke(
    { messages: [{ role: "user", content: userInput }] },
    { configurable: { thread_id: threadId } });
  const last = out.messages[out.messages.length - 1];
  console.log("bot>", last.content);
}
rl.close();
```

`thread_id` keeps memory separate per conversation; swap `MemorySaver` for `PostgresSaver` in production.
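
A sketch of that production swap, assuming `@langchain/langgraph-checkpoint-postgres` is installed and `DATABASE_URL` points at a reachable Postgres instance:

```js
// checkpointer.js: durable per-thread memory in Postgres
import { PostgresSaver } from "@langchain/langgraph-checkpoint-postgres";

export const checkpointer = PostgresSaver.fromConnString(process.env.DATABASE_URL);
await checkpointer.setup(); // creates the checkpoint tables on first run

// In agent.js, pass it instead of MemorySaver:
// createReactAgent({ llm, tools, checkpointSaver: checkpointer, ... })
```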

## Step 5 — Stream tokens (better UX)

```js
const stream = await agent.stream(
  { messages: [{ role: "user", content: userInput }] },
  { configurable: { thread_id: threadId }, streamMode: "messages" });

for await (const [chunk] of stream) {
  if (chunk?.content) process.stdout.write(chunk.content);
}
console.log();
```

`streamMode: "messages"` yields token-level events; `updates` yields per-node deltas.
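
For comparison, a sketch of consuming `updates` mode, where each event is keyed by the graph node that produced it (`agent` or `tools` in this prebuilt graph):

```js
const updates = await agent.stream(
  { messages: [{ role: "user", content: userInput }] },
  { configurable: { thread_id: threadId }, streamMode: "updates" });

for await (const update of updates) {
  // update looks like { agent: { messages: [...] } } or { tools: { messages: [...] } }
  for (const [node, value] of Object.entries(update)) {
    console.log(`[${node}]`, value.messages?.at(-1)?.content);
  }
}
```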

## Step 6 — Structured output

```js
import { z } from "zod";
const Lead = z.object({ name: z.string(), email: z.string().email() });
const structured = llm.withStructuredOutput(Lead);
const lead = await structured.invoke("My name is Sagar and email is sagar@callsphere.ai");
console.log(lead);  // { name: 'Sagar', email: 'sagar@callsphere.ai' }
```

## Common pitfalls

- **Ollama tool support varies by model.** Llama 3.1, Mistral, and Qwen 2.5 are reliable; smaller 1B models often hallucinate tool args.
- **`baseUrl` not localhost.** Inside a container, `127.0.0.1` points at the container itself, not the host; use the host gateway or a compose service name (see the sketch after this list).
- **Memory growth.** `MemorySaver` is in-process; long-running services need a real saver.
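
For the Docker case, a minimal override, assuming Docker Desktop's `host.docker.internal` alias (on plain Linux, substitute the bridge gateway IP, often `172.17.0.1`):

```js
// model.docker.js: reach Ollama across the container boundary
import { ChatOllama } from "@langchain/ollama";

export const llm = new ChatOllama({
  model: "llama3.1:8b",
  baseUrl: process.env.OLLAMA_HOST ?? "http://host.docker.internal:11434",
  temperature: 0.4,
});
```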

## How CallSphere does this in production

CallSphere's chat agents use the same checkpointer-based memory architecture as LangGraph, but back it with Postgres + Redis: 37 specialists across 6 verticals, 90+ tools, and 115+ DB tables. Healthcare runs 14 HIPAA tools on FastAPI (port 8084); OneRoof's 10 specialists handle property workflows. Pricing is flat: $149 / $499 / $1499. [14-day trial](/trial) · [22% affiliate](/affiliate) · [/demo](/demo).

## FAQ

**LangChain.js vs Python?** Near-identical APIs; choose whichever matches your team's stack.

**Best Ollama model for tools?** `llama3.1:8b-instruct-q4_K_M` for speed, `qwen2.5:14b-instruct` for quality.

**Production memory store?** `@langchain/langgraph-checkpoint-postgres`.

**Streaming + tools?** Yes — tool events come through the stream too.
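
In `messages` mode the second tuple element is metadata, which includes the emitting node, so you can separate tool results from model tokens. A sketch, reusing the stream from Step 5 (the `langgraph_node` field is assumed here):

```js
for await (const [msg, meta] of stream) {
  if (meta.langgraph_node === "tools") {
    console.log("\n[tool result]", msg.content); // ToolMessage output
  } else if (msg.content) {
    process.stdout.write(msg.content); // model tokens
  }
}
```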

**Multi-agent?** LangGraph supports supervisor, swarm, and hierarchical patterns.

## Sources

- [LangChain.js Ollama integration](https://js.langchain.com/docs/integrations/chat/ollama/)
- [LangChain Ollama complete tutorial](https://latenode.com/blog/ai-frameworks-technical-infrastructure/langchain-setup-tools-agents-memory/langchain-ollama-integration-complete-tutorial-with-examples)
- [LangChain.js + Ollama quickstart](https://raokarthik83.medium.com/crafting-conversations-with-langchain-js-and-ollama-a-quickstart-guide-7b2f47d85659)
- [Building first AI agent with Ollama + LangChain](https://blog.devops.dev/building-your-first-ai-agent-using-ollama-langchain-local-llms-91bfdb0634f3)

