
Building a Knowledge Base Chat Agent with OpenAI Vector Stores

Build a knowledge base chat agent using OpenAI's vector stores API for document upload, chunking, semantic search, citation-grounded answers, and automatic knowledge base maintenance.

The Knowledge Base Problem

Every organization has institutional knowledge scattered across documents, wikis, PDFs, and Slack threads. A knowledge base chat agent gives users a natural language interface to this information — ask a question, get a grounded answer with citations pointing to the source documents.

OpenAI's vector stores provide a fully managed RAG (Retrieval-Augmented Generation) pipeline: you upload documents, they are automatically chunked and embedded, and the FileSearchTool retrieves relevant chunks at query time. No Pinecone, no Chroma, no embedding pipeline to maintain.

Creating a Vector Store

from openai import OpenAI

client = OpenAI()

# Create a vector store for your knowledge base
vector_store = client.vector_stores.create(
    name="company-knowledge-base",
    expires_after={"anchor": "last_active_at", "days": 365},
    metadata={"team": "engineering", "version": "v2"},
)
print(f"Vector store created: {vector_store.id}")
# vs_abc123...

Uploading Documents

You can upload files in bulk. OpenAI supports PDF, Markdown, plain text, HTML, JSON, and many more formats. Documents are automatically chunked using a sensible default strategy:

from pathlib import Path

def upload_documents(vector_store_id: str, docs_directory: str):
    """Upload all documents from a directory to the vector store."""
    supported_extensions = {".pdf", ".md", ".txt", ".html", ".json", ".docx"}
    uploaded = []

    for filepath in Path(docs_directory).rglob("*"):
        if filepath.suffix.lower() not in supported_extensions:
            continue

        # Upload the file
        with open(filepath, "rb") as f:
            file_obj = client.files.create(
                file=f,
                purpose="assistants",
            )

        # Add file to the vector store; attributes are filterable at query time
        client.vector_stores.files.create(
            vector_store_id=vector_store_id,
            file_id=file_obj.id,
            attributes={
                "source": str(filepath),
                "filename": filepath.name,
            },
        )
        uploaded.append(filepath.name)
        print(f"Uploaded: {filepath.name}")

    return uploaded

# Upload everything in the docs folder
uploaded = upload_documents(vector_store.id, "./company-docs")
print(f"Uploaded {len(uploaded)} documents")

Chunking Configuration

For more control over how documents are split, configure the chunking strategy:

vector_store = client.vector_stores.create(
    name="technical-docs",
    chunking_strategy={
        "type": "static",
        "static": {
            "max_chunk_size_tokens": 800,
            "chunk_overlap_tokens": 400,
        }
    },
)

The overlap ensures that information spanning chunk boundaries is still retrievable. For technical documentation, 800 tokens with 400 overlap works well. For conversational content like FAQs, use smaller chunks (400 tokens, 200 overlap).
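To see why the overlap matters, here is a toy sketch of static chunking over a token sequence. This is only an illustration of the scheme, not OpenAI's actual implementation:

```python
def chunk_tokens(tokens: list[str], max_chunk: int, overlap: int) -> list[list[str]]:
    """Split a token sequence into chunks of at most max_chunk tokens,
    where each chunk repeats the last `overlap` tokens of the previous one.
    Note: overlap must be smaller than max_chunk (the API caps it at half)."""
    step = max_chunk - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_chunk])
        if start + max_chunk >= len(tokens):
            break
    return chunks

tokens = [f"t{i}" for i in range(2000)]
chunks = chunk_tokens(tokens, max_chunk=800, overlap=400)
print(len(chunks))                          # → 4
print(chunks[0][-400:] == chunks[1][:400])  # → True (shared overlap region)
```

Because consecutive chunks share 400 tokens, a sentence that straddles a chunk boundary is fully contained in at least one chunk and remains retrievable.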


Building the Knowledge Base Agent

With the vector store ready, create an agent that uses FileSearchTool to answer questions:

from agents import Agent, Runner, FileSearchTool

knowledge_agent = Agent(
    name="Knowledge Base Agent",
    instructions="""You are a helpful knowledge base assistant for Acme Corp.
    Answer questions using ONLY information found in the knowledge base.

    RULES:
    1. Always search the knowledge base before answering
    2. If the knowledge base does not contain relevant information,
       say "I don't have information about that in our knowledge base"
    3. Cite your sources — mention the document name and section
    4. If information from multiple documents conflicts, mention both
       perspectives and note the discrepancy
    5. Never make up information or fill gaps with general knowledge
    6. For procedural questions, provide step-by-step answers
    7. Keep answers concise but complete""",
    tools=[
        FileSearchTool(
            vector_store_ids=[vector_store.id],
            max_num_results=10,
            include_search_results=True,
        ),
    ],
)

Querying with Citations

async def ask_knowledge_base(question: str) -> dict:
    result = await Runner.run(knowledge_agent, input=question)

    # Extract file citations from the response annotations.
    # In the Responses API, file_citation annotations carry
    # file_id and filename directly on the annotation object.
    citations = []
    for response in result.raw_responses:
        for item in response.output:
            for block in getattr(item, "content", None) or []:
                for ann in getattr(block, "annotations", None) or []:
                    if ann.type == "file_citation":
                        citations.append({
                            "file_id": ann.file_id,
                            "filename": ann.filename,
                        })

    return {
        "answer": result.final_output,
        "citations": citations,
    }

# Example usage
import asyncio

response = asyncio.run(ask_knowledge_base(
    "What is our policy on remote work for engineering teams?"
))
print(response["answer"])
for cite in response["citations"]:
    print(f"  Source: {cite['filename']} ({cite['file_id']})")

Answer Grounding and Hallucination Prevention

The FileSearchTool automatically grounds responses in retrieved documents, but you can add an additional verification layer:

from pydantic import BaseModel, Field
from typing import List, Optional

class GroundedAnswer(BaseModel):
    answer: str = Field(description="The answer based on knowledge base documents")
    confidence: str = Field(
        description="high, medium, or low based on source quality"
    )
    sources: List[str] = Field(
        description="List of source document names referenced"
    )
    gaps: Optional[str] = Field(
        default=None,
        description="Any information gaps or areas where the KB was incomplete"
    )

grounded_agent = Agent(
    name="Grounded KB Agent",
    instructions="""You answer questions strictly from the knowledge base.
    Rate your confidence based on how directly the sources address the question.
    High = sources directly answer the question.
    Medium = sources partially address it, some inference needed.
    Low = sources are tangentially related at best.
    If confidence is low, explicitly state the limitations.""",
    tools=[
        FileSearchTool(
            vector_store_ids=[vector_store.id],
            max_num_results=10,
        ),
    ],
    output_type=GroundedAnswer,
)

Keeping the Knowledge Base Updated

Documents change. You need a process to keep the vector store in sync:

import hashlib
from pathlib import Path

class KnowledgeBaseManager:
    def __init__(self, client: OpenAI, vector_store_id: str):
        self.client = client
        self.vs_id = vector_store_id
        self.file_hashes: dict[str, str] = {}

    def _hash_file(self, filepath: str) -> str:
        with open(filepath, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    async def sync_directory(self, docs_dir: str):
        """Sync a directory of documents with the vector store.
        Adds new files, updates changed files, removes deleted files."""
        current_files = {}
        docs_path = Path(docs_dir)

        # Scan local files
        for fp in docs_path.rglob("*"):
            if fp.is_file():
                rel_path = str(fp.relative_to(docs_path))
                current_files[rel_path] = self._hash_file(str(fp))

        # Find files to add or update
        for rel_path, file_hash in current_files.items():
            old_hash = self.file_hashes.get(rel_path)
            if old_hash is None:
                # New file — upload
                await self._upload_file(docs_path / rel_path, rel_path)
            elif old_hash != file_hash:
                # Changed file — remove old, upload new
                await self._remove_file(rel_path)
                await self._upload_file(docs_path / rel_path, rel_path)

        # Find files to remove
        for rel_path in list(self.file_hashes.keys()):
            if rel_path not in current_files:
                await self._remove_file(rel_path)

        self.file_hashes = current_files

    async def _upload_file(self, filepath: Path, rel_path: str):
        with open(filepath, "rb") as f:
            file_obj = self.client.files.create(file=f, purpose="assistants")
        self.client.vector_stores.files.create(
            vector_store_id=self.vs_id,
            file_id=file_obj.id,
            attributes={"source_path": rel_path},
        )
        print(f"Uploaded: {rel_path}")

    async def _remove_file(self, rel_path: str):
        # List vector store files and find the one tagged with this path
        vs_files = self.client.vector_stores.files.list(
            vector_store_id=self.vs_id,
        )
        for vs_file in vs_files.data:
            if (vs_file.attributes or {}).get("source_path") == rel_path:
                self.client.vector_stores.files.delete(
                    vector_store_id=self.vs_id,
                    file_id=vs_file.id,
                )
                self.client.files.delete(vs_file.id)
                print(f"Removed: {rel_path}")
                break

For large knowledge bases, split documents into themed collections and search across them:

# Create separate stores for different document types
product_store = client.vector_stores.create(name="product-docs")
engineering_store = client.vector_stores.create(name="engineering-docs")
hr_store = client.vector_stores.create(name="hr-policies")

# Agent searches across all collections
multi_kb_agent = Agent(
    name="Multi-KB Agent",
    instructions="""You have access to three knowledge bases:
    product documentation, engineering documentation, and HR policies.
    Search the most relevant knowledge base(s) for each question.
    If a question spans multiple domains, search all relevant stores.""",
    tools=[
        FileSearchTool(
            vector_store_ids=[product_store.id, engineering_store.id, hr_store.id],
            max_num_results=15,
        ),
    ],
)

Performance Optimization

  1. Limit results — Set max_num_results to the minimum needed (5-10 is usually sufficient)
  2. Use metadata filters — Tag documents with categories and filter at query time
  3. Chunk size tuning — Smaller chunks for FAQ-style content, larger for narrative documents
  4. Cache frequent queries — Hash the query and cache results with a short TTL (5-15 minutes)
  5. Monitor retrieval quality — Log which documents are retrieved for each query and review for relevance

This approach gives you a production-ready knowledge base agent with no external vector database to manage.

Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
