Building a Knowledge Base Chat Agent with OpenAI Vector Stores
Build a knowledge base chat agent using OpenAI's vector stores API for document upload, chunking, semantic search, citation-grounded answers, and automatic knowledge base maintenance.
The Knowledge Base Problem
Every organization has institutional knowledge scattered across documents, wikis, PDFs, and Slack threads. A knowledge base chat agent gives users a natural language interface to this information — ask a question, get a grounded answer with citations pointing to the source documents.
OpenAI's vector stores provide a fully managed RAG (Retrieval-Augmented Generation) pipeline: you upload documents, they are automatically chunked and embedded, and the FileSearchTool retrieves relevant chunks at query time. No Pinecone, no Chroma, no embedding pipeline to maintain.
Creating a Vector Store
from openai import OpenAI

client = OpenAI()

# Create a vector store for your knowledge base
vector_store = client.vector_stores.create(
    name="company-knowledge-base",
    expires_after={"anchor": "last_active_at", "days": 365},
    metadata={"team": "engineering", "version": "v2"},
)

print(f"Vector store created: {vector_store.id}")
# vs_abc123...
Uploading Documents
You can upload files in bulk. OpenAI supports PDF, Markdown, plain text, HTML, JSON, and many more formats. Documents are automatically chunked using a sensible default strategy:
from pathlib import Path

def upload_documents(vector_store_id: str, docs_directory: str):
    """Upload all documents from a directory to the vector store."""
    supported_extensions = {".pdf", ".md", ".txt", ".html", ".json", ".docx"}
    uploaded = []

    for filepath in Path(docs_directory).rglob("*"):
        if not filepath.is_file() or filepath.suffix.lower() not in supported_extensions:
            continue

        # Upload the file
        with open(filepath, "rb") as f:
            file_obj = client.files.create(
                file=f,
                purpose="assistants",
            )

        # Attach the file to the vector store with queryable attributes
        client.vector_stores.files.create(
            vector_store_id=vector_store_id,
            file_id=file_obj.id,
            attributes={
                "source": str(filepath),
                "filename": filepath.name,
            },
        )
        uploaded.append(filepath.name)
        print(f"Uploaded: {filepath.name}")

    return uploaded

# Upload everything in the docs folder
uploaded = upload_documents(vector_store.id, "./company-docs")
print(f"Uploaded {len(uploaded)} documents")
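When a directory contains many files, uploading them one at a time is slow. The SDK also exposes a batch helper that uploads files together and polls until indexing finishes. A sketch, assuming your installed `openai` version provides `client.vector_stores.file_batches.upload_and_poll` (older releases nest vector stores under `client.beta`); `collect_supported` and `upload_batch` are illustrative helper names:

```python
from contextlib import ExitStack
from pathlib import Path

SUPPORTED = {".pdf", ".md", ".txt", ".html", ".json", ".docx"}

def collect_supported(docs_dir: str) -> list[Path]:
    """Return only the files the vector store can ingest, in a stable order."""
    return sorted(
        p for p in Path(docs_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )

def upload_batch(client, vector_store_id: str, docs_dir: str):
    """Upload a whole directory in one batch; `client` is the OpenAI() client from earlier."""
    paths = collect_supported(docs_dir)
    with ExitStack() as stack:
        handles = [stack.enter_context(open(p, "rb")) for p in paths]
        # upload_and_poll blocks until every file in the batch is processed (or fails)
        batch = client.vector_stores.file_batches.upload_and_poll(
            vector_store_id=vector_store_id,
            files=handles,
        )
    print(f"Batch status: {batch.status}, completed: {batch.file_counts.completed}")
    return batch
```

Batching also gives you a single `file_counts` summary to check instead of tracking per-file progress yourself.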
Chunking Configuration
For more control over how documents are split, configure the chunking strategy:
vector_store = client.vector_stores.create(
    name="technical-docs",
    chunking_strategy={
        "type": "static",
        "static": {
            "max_chunk_size_tokens": 800,
            "chunk_overlap_tokens": 400,
        }
    },
)
The overlap ensures that information spanning chunk boundaries is still retrievable. For technical documentation, 800 tokens with 400 overlap works well. For conversational content like FAQs, use smaller chunks (400 tokens, 200 overlap).
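If you maintain several stores for different content types, the guidance above can be captured in a small helper. A minimal sketch; `chunking_for` and its presets are illustrative, not part of the API:

```python
def chunking_for(content_type: str) -> dict:
    """Map a content type to a static chunking strategy (presets follow the text above)."""
    presets = {
        "technical": (800, 400),  # narrative technical documentation
        "faq": (400, 200),        # short conversational entries
    }
    max_tokens, overlap = presets.get(content_type, (800, 400))
    return {
        "type": "static",
        "static": {
            "max_chunk_size_tokens": max_tokens,
            "chunk_overlap_tokens": overlap,
        },
    }
```

You would then pass the result straight through, e.g. `client.vector_stores.create(name="faq-docs", chunking_strategy=chunking_for("faq"))`.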
Building the Knowledge Base Agent
With the vector store ready, create an agent that uses FileSearchTool to answer questions:
from agents import Agent, Runner, FileSearchTool

knowledge_agent = Agent(
    name="Knowledge Base Agent",
    instructions="""You are a helpful knowledge base assistant for Acme Corp.

Answer questions using ONLY information found in the knowledge base.

RULES:
1. Always search the knowledge base before answering
2. If the knowledge base does not contain relevant information,
   say "I don't have information about that in our knowledge base"
3. Cite your sources — mention the document name and section
4. If information from multiple documents conflicts, mention both
   perspectives and note the discrepancy
5. Never make up information or fill gaps with general knowledge
6. For procedural questions, provide step-by-step answers
7. Keep answers concise but complete""",
    tools=[
        FileSearchTool(
            vector_store_ids=[vector_store.id],
            max_num_results=10,
            include_search_results=True,
        ),
    ],
)
Querying with Citations
async def ask_knowledge_base(question: str) -> dict:
    result = await Runner.run(knowledge_agent, input=question)

    # Extract file citations from the raw Responses API output;
    # file_citation annotations carry the file_id and filename
    annotations = []
    if hasattr(result, "raw_responses"):
        for response in result.raw_responses:
            for item in response.output:
                if hasattr(item, "content"):
                    for block in item.content:
                        if hasattr(block, "annotations"):
                            for ann in block.annotations:
                                if ann.type == "file_citation":
                                    annotations.append({
                                        "file_id": ann.file_id,
                                        "filename": ann.filename,
                                    })
    return {
        "answer": result.final_output,
        "citations": annotations,
    }

# Example usage
import asyncio

response = asyncio.run(ask_knowledge_base(
    "What is our policy on remote work for engineering teams?"
))
print(response["answer"])
for cite in response["citations"]:
    print(f"  Source: {cite['filename']} ({cite['file_id']})")
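The citation list often contains duplicates when several retrieved chunks come from the same file. A small deduplicating formatter helps before display; `format_citations` is a hypothetical helper, not an SDK function:

```python
def format_citations(citations: list[dict]) -> list[str]:
    """Deduplicate citations by file_id and number them for display."""
    seen: set[str] = set()
    lines: list[str] = []
    for cite in citations:
        file_id = cite["file_id"]
        if file_id in seen:
            continue  # same source file already listed
        seen.add(file_id)
        lines.append(f"[{len(seen)}] {file_id}")
    return lines
```

Printing `"\n".join(format_citations(response["citations"]))` gives users a compact, numbered source list.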
Answer Grounding and Hallucination Prevention
The FileSearchTool automatically grounds responses in retrieved documents, but you can add an additional verification layer:
from pydantic import BaseModel, Field
from typing import List, Optional

class GroundedAnswer(BaseModel):
    answer: str = Field(description="The answer based on knowledge base documents")
    confidence: str = Field(
        description="high, medium, or low based on source quality"
    )
    sources: List[str] = Field(
        description="List of source document names referenced"
    )
    gaps: Optional[str] = Field(
        default=None,
        description="Any information gaps or areas where the KB was incomplete"
    )

grounded_agent = Agent(
    name="Grounded KB Agent",
    instructions="""You answer questions strictly from the knowledge base.

Rate your confidence based on how directly the sources address the question.
High = sources directly answer the question.
Medium = sources partially address it, some inference needed.
Low = sources are tangentially related at best.

If confidence is low, explicitly state the limitations.""",
    tools=[
        FileSearchTool(
            vector_store_ids=[vector_store.id],
            max_num_results=10,
        ),
    ],
    output_type=GroundedAnswer,
)
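The structured output makes it easy to route weak answers to a human instead of showing them to the user. A minimal sketch under the assumption that `result.final_output` holds the `GroundedAnswer` instance when `output_type` is set; `needs_review` and `route_to_human` are illustrative names:

```python
def needs_review(confidence: str, gaps: "str | None") -> bool:
    """Flag answers that should be escalated to a human reviewer."""
    return confidence.strip().lower() == "low" or bool(gaps)

# Usage sketch:
# result = await Runner.run(grounded_agent, input=question)
# answer = result.final_output  # a GroundedAnswer instance
# if needs_review(answer.confidence, answer.gaps):
#     route_to_human(question, answer)  # hypothetical escalation hook
```

Treating reported gaps the same as low confidence is a deliberately conservative choice: an answer that admits its sources were incomplete deserves a second look.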
Keeping the Knowledge Base Updated
Documents change. You need a process to keep the vector store in sync:
import hashlib

class KnowledgeBaseManager:
    def __init__(self, client: OpenAI, vector_store_id: str):
        self.client = client
        self.vs_id = vector_store_id
        self.file_hashes: dict[str, str] = {}

    def _hash_file(self, filepath: str) -> str:
        with open(filepath, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    async def sync_directory(self, docs_dir: str):
        """Sync a directory of documents with the vector store.

        Adds new files, updates changed files, removes deleted files."""
        current_files = {}
        docs_path = Path(docs_dir)

        # Scan local files
        for fp in docs_path.rglob("*"):
            if fp.is_file():
                rel_path = str(fp.relative_to(docs_path))
                current_files[rel_path] = self._hash_file(str(fp))

        # Find files to add or update
        for rel_path, file_hash in current_files.items():
            old_hash = self.file_hashes.get(rel_path)
            if old_hash is None:
                # New file — upload
                await self._upload_file(docs_path / rel_path, rel_path)
            elif old_hash != file_hash:
                # Changed file — remove old, upload new
                await self._remove_file(rel_path)
                await self._upload_file(docs_path / rel_path, rel_path)

        # Find files to remove
        for rel_path in list(self.file_hashes.keys()):
            if rel_path not in current_files:
                await self._remove_file(rel_path)

        self.file_hashes = current_files

    async def _upload_file(self, filepath: Path, rel_path: str):
        with open(filepath, "rb") as f:
            file_obj = self.client.files.create(file=f, purpose="assistants")
        self.client.vector_stores.files.create(
            vector_store_id=self.vs_id,
            file_id=file_obj.id,
            attributes={"source_path": rel_path},
        )
        print(f"Uploaded: {rel_path}")

    async def _remove_file(self, rel_path: str):
        # List vector store files and match on the source_path attribute
        vs_files = self.client.vector_stores.files.list(
            vector_store_id=self.vs_id,
        )
        for vs_file in vs_files.data:
            if (vs_file.attributes or {}).get("source_path") == rel_path:
                self.client.vector_stores.files.delete(
                    vector_store_id=self.vs_id,
                    file_id=vs_file.id,
                )
                self.client.files.delete(vs_file.id)
                print(f"Removed: {rel_path}")
                break
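A simple way to run the sync continuously is a background task. A sketch; `sync_forever` is an illustrative wrapper, and the `interval_s`/`max_cycles` knobs are assumptions for this example, not SDK features:

```python
import asyncio

async def sync_forever(manager, docs_dir: str,
                       interval_s: float = 900.0,
                       max_cycles: "int | None" = None) -> int:
    """Re-sync `docs_dir` every `interval_s` seconds.

    `max_cycles` bounds the number of runs (useful for testing); None runs forever."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        await manager.sync_directory(docs_dir)
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            await asyncio.sleep(interval_s)
    return cycles

# Production usage sketch:
# asyncio.run(sync_forever(KnowledgeBaseManager(client, vector_store.id), "./company-docs"))
```

A 15-minute interval keeps the store reasonably fresh without hammering the Files API; for faster propagation, trigger `sync_directory` from your docs repository's CI instead.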
Multi-Collection Search
For large knowledge bases, split documents into themed collections and search across them:
# Create separate stores for different document types
product_store = client.vector_stores.create(name="product-docs")
engineering_store = client.vector_stores.create(name="engineering-docs")
hr_store = client.vector_stores.create(name="hr-policies")

# Agent searches across all collections
multi_kb_agent = Agent(
    name="Multi-KB Agent",
    instructions="""You have access to three knowledge bases:
product documentation, engineering documentation, and HR policies.

Search the most relevant knowledge base(s) for each question.
If a question spans multiple domains, search all relevant stores.""",
    tools=[
        FileSearchTool(
            vector_store_ids=[product_store.id, engineering_store.id, hr_store.id],
            max_num_results=15,
        ),
    ],
)
Performance Optimization
- Limit results — Set max_num_results to the minimum needed (5-10 is usually sufficient)
- Use metadata filters — Tag documents with categories and filter at query time
- Chunk size tuning — Smaller chunks for FAQ-style content, larger for narrative documents
- Cache frequent queries — Hash the query and cache results with a short TTL (5-15 minutes)
- Monitor retrieval quality — Log which documents are retrieved for each query and review for relevance
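The caching bullet can be sketched as a tiny in-memory TTL cache; `QueryCache` is an illustrative class for a single-process deployment, and in production you would likely back it with Redis or similar:

```python
import hashlib
import time

class QueryCache:
    """In-memory TTL cache keyed on a hash of the normalized query."""

    def __init__(self, ttl_s: float = 600.0):
        self.ttl_s = ttl_s
        self._store: "dict[str, tuple[float, object]]" = {}

    def _key(self, query: str) -> str:
        # Normalize so trivially different phrasings hit the same entry
        return hashlib.sha256(query.strip().lower().encode()).hexdigest()

    def get(self, query: str):
        """Return the cached answer, or None if absent or expired."""
        entry = self._store.get(self._key(query))
        if entry is not None and time.monotonic() - entry[0] < self.ttl_s:
            return entry[1]
        return None

    def put(self, query: str, answer) -> None:
        self._store[self._key(query)] = (time.monotonic(), answer)
```

Check `cache.get(question)` before calling `Runner.run`, and `cache.put(question, answer)` after; the short TTL keeps stale answers from surviving a knowledge base sync for long.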
This approach gives you a production-ready knowledge base agent with no external vector database to manage.
Written by
CallSphere Team
Expert insights on AI voice agents and customer communication automation.