Building a Knowledge Base Chat Agent with OpenAI Vector Stores
Build a knowledge base chat agent using OpenAI's vector stores API for document upload, chunking, semantic search, citation-grounded answers, and automatic knowledge base maintenance.
The Knowledge Base Problem
Every organization has institutional knowledge scattered across documents, wikis, PDFs, and Slack threads. A knowledge base chat agent gives users a natural language interface to this information — ask a question, get a grounded answer with citations pointing to the source documents.
OpenAI's vector stores provide a fully managed RAG (Retrieval-Augmented Generation) pipeline: you upload documents, they are automatically chunked and embedded, and the FileSearchTool retrieves relevant chunks at query time. No Pinecone, no Chroma, no embedding pipeline to maintain.
Creating a Vector Store
from openai import OpenAI

client = OpenAI()

# Create a vector store for your knowledge base
vector_store = client.vector_stores.create(
    name="company-knowledge-base",
    expires_after={"anchor": "last_active_at", "days": 365},
    metadata={"team": "engineering", "version": "v2"},
)

print(f"Vector store created: {vector_store.id}")
# vs_abc123...
Uploading Documents
You can upload files in bulk. OpenAI supports PDF, Markdown, plain text, HTML, JSON, and many more formats. Documents are automatically chunked using a sensible default strategy:
from pathlib import Path

def upload_documents(vector_store_id: str, docs_directory: str):
    """Upload all documents from a directory to the vector store."""
    supported_extensions = {".pdf", ".md", ".txt", ".html", ".json", ".docx"}
    uploaded = []

    for filepath in Path(docs_directory).rglob("*"):
        if not filepath.is_file() or filepath.suffix.lower() not in supported_extensions:
            continue

        # Upload the file
        with open(filepath, "rb") as f:
            file_obj = client.files.create(
                file=f,
                purpose="assistants",
            )

        # Attach the file to the vector store with queryable attributes
        client.vector_stores.files.create(
            vector_store_id=vector_store_id,
            file_id=file_obj.id,
            attributes={
                "source": str(filepath),
                "filename": filepath.name,
            },
        )
        uploaded.append(filepath.name)
        print(f"Uploaded: {filepath.name}")

    return uploaded

# Upload everything in the docs folder
uploaded = upload_documents(vector_store.id, "./company-docs")
print(f"Uploaded {len(uploaded)} documents")
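When a directory contains many files, uploading them one at a time is slow. The SDK also exposes a batch helper that uploads files together and polls until indexing finishes. A sketch, assuming your installed `openai` version provides `client.vector_stores.file_batches.upload_and_poll` (older releases nest vector stores under `client.beta`); `collect_supported` and `upload_batch` are illustrative helper names:

```python
from contextlib import ExitStack
from pathlib import Path

SUPPORTED = {".pdf", ".md", ".txt", ".html", ".json", ".docx"}

def collect_supported(docs_dir: str) -> list[Path]:
    """Return only the files the vector store can ingest, in a stable order."""
    return sorted(
        p for p in Path(docs_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )

def upload_batch(client, vector_store_id: str, docs_dir: str):
    """Upload a whole directory in one batch; `client` is the OpenAI() client from earlier."""
    paths = collect_supported(docs_dir)
    with ExitStack() as stack:
        handles = [stack.enter_context(open(p, "rb")) for p in paths]
        # upload_and_poll blocks until every file in the batch is processed (or fails)
        batch = client.vector_stores.file_batches.upload_and_poll(
            vector_store_id=vector_store_id,
            files=handles,
        )
    print(f"Batch status: {batch.status}, completed: {batch.file_counts.completed}")
    return batch
```

Batching also gives you a single `file_counts` summary to check instead of tracking per-file progress yourself.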
Chunking Configuration
For more control over how documents are split, configure the chunking strategy:
vector_store = client.vector_stores.create(
    name="technical-docs",
    chunking_strategy={
        "type": "static",
        "static": {
            "max_chunk_size_tokens": 800,
            "chunk_overlap_tokens": 400,
        }
    },
)
The overlap ensures that information spanning chunk boundaries is still retrievable. For technical documentation, 800 tokens with 400 overlap works well. For conversational content like FAQs, use smaller chunks (400 tokens, 200 overlap).
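If you maintain several stores for different content types, the guidance above can be captured in a small helper. A minimal sketch; `chunking_for` and its presets are illustrative, not part of the API:

```python
def chunking_for(content_type: str) -> dict:
    """Map a content type to a static chunking strategy (presets follow the text above)."""
    presets = {
        "technical": (800, 400),  # narrative technical documentation
        "faq": (400, 200),        # short conversational entries
    }
    max_tokens, overlap = presets.get(content_type, (800, 400))
    return {
        "type": "static",
        "static": {
            "max_chunk_size_tokens": max_tokens,
            "chunk_overlap_tokens": overlap,
        },
    }
```

You would then pass the result straight through, e.g. `client.vector_stores.create(name="faq-docs", chunking_strategy=chunking_for("faq"))`.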
Building the Knowledge Base Agent
With the vector store ready, create an agent that uses FileSearchTool to answer questions:
from agents import Agent, Runner, FileSearchTool

knowledge_agent = Agent(
    name="Knowledge Base Agent",
    instructions="""You are a helpful knowledge base assistant for Acme Corp.

Answer questions using ONLY information found in the knowledge base.

RULES:
1. Always search the knowledge base before answering
2. If the knowledge base does not contain relevant information,
   say "I don't have information about that in our knowledge base"
3. Cite your sources — mention the document name and section
4. If information from multiple documents conflicts, mention both
   perspectives and note the discrepancy
5. Never make up information or fill gaps with general knowledge
6. For procedural questions, provide step-by-step answers
7. Keep answers concise but complete""",
    tools=[
        FileSearchTool(
            vector_store_ids=[vector_store.id],
            max_num_results=10,
            include_search_results=True,
        ),
    ],
)
Querying with Citations
async def ask_knowledge_base(question: str) -> dict:
    result = await Runner.run(knowledge_agent, input=question)

    # Extract file citations from the raw Responses API output;
    # file_citation annotations carry the file_id and filename
    annotations = []
    if hasattr(result, "raw_responses"):
        for response in result.raw_responses:
            for item in response.output:
                if hasattr(item, "content"):
                    for block in item.content:
                        if hasattr(block, "annotations"):
                            for ann in block.annotations:
                                if ann.type == "file_citation":
                                    annotations.append({
                                        "file_id": ann.file_id,
                                        "filename": ann.filename,
                                    })
    return {
        "answer": result.final_output,
        "citations": annotations,
    }

# Example usage
import asyncio

response = asyncio.run(ask_knowledge_base(
    "What is our policy on remote work for engineering teams?"
))
print(response["answer"])
for cite in response["citations"]:
    print(f"  Source: {cite['filename']} ({cite['file_id']})")
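The citation list often contains duplicates when several retrieved chunks come from the same file. A small deduplicating formatter helps before display; `format_citations` is a hypothetical helper, not an SDK function:

```python
def format_citations(citations: list[dict]) -> list[str]:
    """Deduplicate citations by file_id and number them for display."""
    seen: set[str] = set()
    lines: list[str] = []
    for cite in citations:
        file_id = cite["file_id"]
        if file_id in seen:
            continue  # same source file already listed
        seen.add(file_id)
        lines.append(f"[{len(seen)}] {file_id}")
    return lines
```

Printing `"\n".join(format_citations(response["citations"]))` gives users a compact, numbered source list.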
Answer Grounding and Hallucination Prevention
The FileSearchTool automatically grounds responses in retrieved documents, but you can add an additional verification layer:
from pydantic import BaseModel, Field
from typing import List, Optional

class GroundedAnswer(BaseModel):
    answer: str = Field(description="The answer based on knowledge base documents")
    confidence: str = Field(
        description="high, medium, or low based on source quality"
    )
    sources: List[str] = Field(
        description="List of source document names referenced"
    )
    gaps: Optional[str] = Field(
        default=None,
        description="Any information gaps or areas where the KB was incomplete"
    )

grounded_agent = Agent(
    name="Grounded KB Agent",
    instructions="""You answer questions strictly from the knowledge base.

Rate your confidence based on how directly the sources address the question.
High = sources directly answer the question.
Medium = sources partially address it, some inference needed.
Low = sources are tangentially related at best.

If confidence is low, explicitly state the limitations.""",
    tools=[
        FileSearchTool(
            vector_store_ids=[vector_store.id],
            max_num_results=10,
        ),
    ],
    output_type=GroundedAnswer,
)
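The structured output makes it easy to route weak answers to a human instead of showing them to the user. A minimal sketch under the assumption that `result.final_output` holds the `GroundedAnswer` instance when `output_type` is set; `needs_review` and `route_to_human` are illustrative names:

```python
def needs_review(confidence: str, gaps: "str | None") -> bool:
    """Flag answers that should be escalated to a human reviewer."""
    return confidence.strip().lower() == "low" or bool(gaps)

# Usage sketch:
# result = await Runner.run(grounded_agent, input=question)
# answer = result.final_output  # a GroundedAnswer instance
# if needs_review(answer.confidence, answer.gaps):
#     route_to_human(question, answer)  # hypothetical escalation hook
```

Treating reported gaps the same as low confidence is a deliberately conservative choice: an answer that admits its sources were incomplete deserves a second look.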
Keeping the Knowledge Base Updated
Documents change. You need a process to keep the vector store in sync:
import hashlib

class KnowledgeBaseManager:
    def __init__(self, client: OpenAI, vector_store_id: str):
        self.client = client
        self.vs_id = vector_store_id
        self.file_hashes: dict[str, str] = {}

    def _hash_file(self, filepath: str) -> str:
        with open(filepath, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    async def sync_directory(self, docs_dir: str):
        """Sync a directory of documents with the vector store.

        Adds new files, updates changed files, removes deleted files."""
        current_files = {}
        docs_path = Path(docs_dir)

        # Scan local files
        for fp in docs_path.rglob("*"):
            if fp.is_file():
                rel_path = str(fp.relative_to(docs_path))
                current_files[rel_path] = self._hash_file(str(fp))

        # Find files to add or update
        for rel_path, file_hash in current_files.items():
            old_hash = self.file_hashes.get(rel_path)
            if old_hash is None:
                # New file — upload
                await self._upload_file(docs_path / rel_path, rel_path)
            elif old_hash != file_hash:
                # Changed file — remove old, upload new
                await self._remove_file(rel_path)
                await self._upload_file(docs_path / rel_path, rel_path)

        # Find files to remove
        for rel_path in list(self.file_hashes.keys()):
            if rel_path not in current_files:
                await self._remove_file(rel_path)

        self.file_hashes = current_files

    async def _upload_file(self, filepath: Path, rel_path: str):
        with open(filepath, "rb") as f:
            file_obj = self.client.files.create(file=f, purpose="assistants")
        self.client.vector_stores.files.create(
            vector_store_id=self.vs_id,
            file_id=file_obj.id,
            attributes={"source_path": rel_path},
        )
        print(f"Uploaded: {rel_path}")

    async def _remove_file(self, rel_path: str):
        # List vector store files and match on the source_path attribute
        vs_files = self.client.vector_stores.files.list(
            vector_store_id=self.vs_id,
        )
        for vs_file in vs_files.data:
            if (vs_file.attributes or {}).get("source_path") == rel_path:
                self.client.vector_stores.files.delete(
                    vector_store_id=self.vs_id,
                    file_id=vs_file.id,
                )
                self.client.files.delete(vs_file.id)
                print(f"Removed: {rel_path}")
                break
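A simple way to run the sync continuously is a background task. A sketch; `sync_forever` is an illustrative wrapper, and the `interval_s`/`max_cycles` knobs are assumptions for this example, not SDK features:

```python
import asyncio

async def sync_forever(manager, docs_dir: str,
                       interval_s: float = 900.0,
                       max_cycles: "int | None" = None) -> int:
    """Re-sync `docs_dir` every `interval_s` seconds.

    `max_cycles` bounds the number of runs (useful for testing); None runs forever."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        await manager.sync_directory(docs_dir)
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            await asyncio.sleep(interval_s)
    return cycles

# Production usage sketch:
# asyncio.run(sync_forever(KnowledgeBaseManager(client, vector_store.id), "./company-docs"))
```

A 15-minute interval keeps the store reasonably fresh without hammering the Files API; for faster propagation, trigger `sync_directory` from your docs repository's CI instead.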
Multi-Collection Search
For large knowledge bases, split documents into themed collections and search across them:
# Create separate stores for different document types
product_store = client.vector_stores.create(name="product-docs")
engineering_store = client.vector_stores.create(name="engineering-docs")
hr_store = client.vector_stores.create(name="hr-policies")

# Agent searches across all collections
multi_kb_agent = Agent(
    name="Multi-KB Agent",
    instructions="""You have access to three knowledge bases:
product documentation, engineering documentation, and HR policies.

Search the most relevant knowledge base(s) for each question.
If a question spans multiple domains, search all relevant stores.""",
    tools=[
        FileSearchTool(
            vector_store_ids=[product_store.id, engineering_store.id, hr_store.id],
            max_num_results=15,
        ),
    ],
)
Performance Optimization
- Limit results — Set max_num_results to the minimum needed (5-10 is usually sufficient)
- Use metadata filters — Tag documents with categories and filter at query time
- Chunk size tuning — Smaller chunks for FAQ-style content, larger for narrative documents
- Cache frequent queries — Hash the query and cache results with a short TTL (5-15 minutes)
- Monitor retrieval quality — Log which documents are retrieved for each query and review for relevance
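The caching bullet can be sketched as a tiny in-memory TTL cache; `QueryCache` is an illustrative class for a single-process deployment, and in production you would likely back it with Redis or similar:

```python
import hashlib
import time

class QueryCache:
    """In-memory TTL cache keyed on a hash of the normalized query."""

    def __init__(self, ttl_s: float = 600.0):
        self.ttl_s = ttl_s
        self._store: "dict[str, tuple[float, object]]" = {}

    def _key(self, query: str) -> str:
        # Normalize so trivially different phrasings hit the same entry
        return hashlib.sha256(query.strip().lower().encode()).hexdigest()

    def get(self, query: str):
        """Return the cached answer, or None if absent or expired."""
        entry = self._store.get(self._key(query))
        if entry is not None and time.monotonic() - entry[0] < self.ttl_s:
            return entry[1]
        return None

    def put(self, query: str, answer) -> None:
        self._store[self._key(query)] = (time.monotonic(), answer)
```

Check `cache.get(question)` before calling `Runner.run`, and `cache.put(question, answer)` after; the short TTL keeps stale answers from surviving a knowledge base sync for long.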
This approach gives you a production-ready knowledge base agent with no external vector database to manage.
Written by
CallSphere Team
Expert insights on AI voice agents and customer communication automation.