---
title: "AI-Powered Search for SaaS Applications: Semantic Search Over Product Data"
description: "Build semantic search for your SaaS product using vector embeddings, enabling users to find records by meaning rather than exact keyword matches."
canonical: https://callsphere.ai/blog/ai-powered-semantic-search-saas-applications
category: "Learn Agentic AI"
tags: ["Semantic Search", "Vector Embeddings", "SaaS", "Search API", "Python", "pgvector"]
author: "CallSphere Team"
published: 2026-03-17T00:00:00.000Z
updated: 2026-06-01T06:16:24.855Z
---

# AI-Powered Search for SaaS Applications: Semantic Search Over Product Data

> Build semantic search for your SaaS product using vector embeddings, enabling users to find records by meaning rather than exact keyword matches.

## Why Keyword Search Falls Short

Traditional keyword search works by matching exact tokens. When a user in your CRM searches for "companies that are struggling financially," keyword search returns nothing — because no record contains those exact words. Semantic search uses vector embeddings to match by meaning, so that query finds records tagged "at risk," "payment overdue," or "churn likelihood: high."

For SaaS products with rich, structured data, semantic search transforms how users discover and interact with their information.
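To make "matching by meaning" concrete, here is a minimal sketch of the cosine similarity that vector search relies on. The three-dimensional vectors are hand-picked, hypothetical stand-ins for real embeddings (which have hundreds or thousands of dimensions), and `cosine_similarity` is an illustrative helper rather than part of any library.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings", chosen by hand for illustration only.
query = [0.9, 0.1, 0.2]  # stands in for "companies that are struggling financially"
records = {
    "payment overdue": [0.8, 0.2, 0.1],
    "renewed annual plan": [0.1, 0.9, 0.3],
}

# Rank records by similarity to the query, most similar first.
ranked = sorted(
    records,
    key=lambda label: cosine_similarity(query, records[label]),
    reverse=True,
)
print(ranked[0])  # payment overdue
```

The query and the top record share no words; they rank together only because their vectors point in nearly the same direction.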

## Architecture: Indexing Pipeline

The indexing pipeline converts your product data into searchable vector embeddings. It runs on data changes (inserts, updates, deletes) and keeps the vector index in sync with your primary database.

```mermaid
flowchart TD
    DOC(["Document"])
    CHUNK["Chunker<br/>recursive plus overlap"]
    EMB["Embedding model"]
    META["Attach metadata<br/>source, page, tenant"]
    INDEX[("HNSW or IVF index<br/>in vector store")]
    Q(["Query"])
    QEMB["Embed query"]
    SEARCH["ANN search<br/>cosine similarity"]
    FILTER["Metadata filter<br/>tenant or date"]
    HITS(["Top-k chunks"])
    DOC --> CHUNK --> EMB --> META --> INDEX
    Q --> QEMB --> SEARCH
    INDEX --> SEARCH --> FILTER --> HITS
    style INDEX fill:#4f46e5,stroke:#4338ca,color:#fff
    style HITS fill:#059669,stroke:#047857,color:#fff
```

```python
# Embedding indexer that processes data changes
from openai import OpenAI
from dataclasses import dataclass

client = OpenAI()

@dataclass
class SearchDocument:
    entity_type: str
    entity_id: str
    tenant_id: str
    text: str
    metadata: dict

def create_embedding(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding

def build_search_text(entity_type: str, record: dict) -> str:
    """Convert a database record into searchable text."""
    builders = {
        "contact": lambda r: (
            f"Contact: {r['name']}. Company: {r.get('company', 'N/A')}. "
            f"Title: {r.get('title', 'N/A')}. Notes: {r.get('notes', '')}. "
            f"Tags: {', '.join(r.get('tags', []))}."
        ),
        "deal": lambda r: (
            f"Deal: {r['name']}. Value: ${r.get('value', 0):,.2f}. "
            f"Stage: {r.get('stage', 'unknown')}. "
            f"Description: {r.get('description', '')}."
        ),
        "ticket": lambda r: (
            f"Support ticket: {r['subject']}. Status: {r.get('status', 'open')}. "
            f"Priority: {r.get('priority', 'normal')}. Body: {r.get('body', '')}."
        ),
    }
    builder = builders.get(entity_type)
    if not builder:
        raise ValueError(f"Unknown entity type: {entity_type}")
    return builder(record)
```
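When backfilling many records, calling the embedding API once per record is slow. The embeddings endpoint accepts a list of inputs, so a batched variant is possible. The sketch below is one way to do it, under assumptions: `batched`, `create_embeddings`, and the batch size of 100 are illustrative choices, and the `client` argument is expected to be an `OpenAI()` instance like the one above.

```python
from typing import Any, Iterator

def batched(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

def create_embeddings(client: Any, texts: list[str],
                      batch_size: int = 100) -> list[list[float]]:
    """Embed many texts with one request per batch, preserving input order.

    `client` is assumed to be an openai.OpenAI instance; batch_size is an
    illustrative default, not a documented limit.
    """
    vectors: list[list[float]] = []
    for batch in batched(texts, batch_size):
        response = client.embeddings.create(
            model="text-embedding-3-small",
            input=batch,
        )
        # Each result carries the index of its input; sort to keep alignment.
        ordered = sorted(response.data, key=lambda item: item.index)
        vectors.extend(item.embedding for item in ordered)
    return vectors
```

Usage would look like `create_embeddings(client, [build_search_text("deal", r) for r in deals])`, where `deals` is a hypothetical list of record dicts.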

## Storing Embeddings with pgvector

Use PostgreSQL with pgvector to keep embeddings alongside your existing data, avoiding the operational overhead of a separate vector database.

```python
# pgvector storage and retrieval
import asyncio
import json

import asyncpg

EMBED_DIM = 1536  # text-embedding-3-small dimension

async def setup_vector_table(pool: asyncpg.Pool):
    async with pool.acquire() as conn:
        await conn.execute("CREATE EXTENSION IF NOT EXISTS vector;")
        await conn.execute(f"""
            CREATE TABLE IF NOT EXISTS search_embeddings (
                id SERIAL PRIMARY KEY,
                tenant_id UUID NOT NULL,
                entity_type VARCHAR(50) NOT NULL,
                entity_id UUID NOT NULL,
                content TEXT NOT NULL,
                embedding vector({EMBED_DIM}) NOT NULL,
                metadata JSONB DEFAULT '{{}}',
                updated_at TIMESTAMPTZ DEFAULT NOW(),
                UNIQUE(entity_type, entity_id)
            );
        """)
        await conn.execute("""
            CREATE INDEX IF NOT EXISTS idx_search_embed_tenant
            ON search_embeddings (tenant_id);
        """)

async def upsert_embedding(pool: asyncpg.Pool, doc: SearchDocument):
    # create_embedding makes a blocking HTTP call, so run it in a worker
    # thread instead of stalling the event loop.
    embedding = await asyncio.to_thread(create_embedding, doc.text)
    # pgvector accepts a text literal of the form "[0.1,0.2,...]".
    embedding_str = "[" + ",".join(str(x) for x in embedding) + "]"
    async with pool.acquire() as conn:
        await conn.execute("""
            INSERT INTO search_embeddings
                (tenant_id, entity_type, entity_id, content, embedding, metadata)
            VALUES ($1, $2, $3, $4, $5::vector, $6)
            ON CONFLICT (entity_type, entity_id)
            DO UPDATE SET content = $4, embedding = $5::vector,
                          metadata = $6, updated_at = NOW();
        """, doc.tenant_id, doc.entity_type, doc.entity_id,
             doc.text, embedding_str,
             json.dumps(doc.metadata))  # asyncpg expects JSONB as a JSON string by default
```
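The table above indexes `tenant_id` but has no index on the `embedding` column, so every similarity query compares the query vector against all of a tenant's rows. The diagram mentions an HNSW index; a minimal sketch of adding one, plus the delete path that sync needs, might look like this. The index name and the `create_ann_index` and `delete_embedding` functions are hypothetical, and the HNSW index type assumes pgvector 0.5.0 or later.

```python
from typing import Any

# Approximate nearest-neighbor index for cosine distance.
# Assumes pgvector 0.5.0+; the index name is illustrative.
CREATE_HNSW_INDEX_SQL = """
    CREATE INDEX IF NOT EXISTS idx_search_embed_hnsw
    ON search_embeddings
    USING hnsw (embedding vector_cosine_ops);
"""

# Scoping the delete by tenant_id guards against cross-tenant removal.
DELETE_EMBEDDING_SQL = """
    DELETE FROM search_embeddings
    WHERE tenant_id = $1 AND entity_type = $2 AND entity_id = $3;
"""

async def create_ann_index(pool: Any) -> None:
    """Create the HNSW index; `pool` is assumed to be an asyncpg.Pool."""
    async with pool.acquire() as conn:
        await conn.execute(CREATE_HNSW_INDEX_SQL)

async def delete_embedding(pool: Any, tenant_id: str,
                           entity_type: str, entity_id: str) -> None:
    """Remove the search row when the source record is deleted."""
    async with pool.acquire() as conn:
        await conn.execute(DELETE_EMBEDDING_SQL, tenant_id, entity_type, entity_id)
```

One caveat worth testing: with an approximate index, pgvector applies the `WHERE tenant_id = ...` filter after the index scan, so a heavily filtered query can return fewer rows than the requested limit. For small tenants, an exact scan over the tenant's rows may be fast enough without the index.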

## Search API

The search endpoint accepts a natural language query, embeds it, and performs a cosine similarity search scoped to the user's tenant.

```python
import asyncio
import json

import asyncpg
from fastapi import FastAPI, Depends, Query
from pydantic import BaseModel

app = FastAPI()

# get_current_tenant and get_db_pool are application-specific FastAPI
# dependencies (auth and connection pool) assumed to be defined elsewhere.

class SearchResult(BaseModel):
    entity_type: str
    entity_id: str
    content: str
    score: float
    metadata: dict

def _load_metadata(value) -> dict:
    """asyncpg returns JSONB as a str unless a JSON codec is registered."""
    if isinstance(value, str):
        value = json.loads(value)
    return value or {}

@app.get("/api/search", response_model=list[SearchResult])
async def semantic_search(
    q: str = Query(..., min_length=2, max_length=500),
    entity_type: str | None = Query(None),
    limit: int = Query(10, ge=1, le=50),
    tenant_id: str = Depends(get_current_tenant),
    pool: asyncpg.Pool = Depends(get_db_pool),
):
    # The embedding request is a blocking HTTP call; keep it off the event loop.
    query_embedding = await asyncio.to_thread(create_embedding, q)
    embedding_str = "[" + ",".join(str(x) for x in query_embedding) + "]"

    async with pool.acquire() as conn:
        # <=> is pgvector's cosine distance operator; 1 - distance = similarity.
        # A NULL $3 disables the entity_type filter.
        rows = await conn.fetch("""
            SELECT entity_type, entity_id, content, metadata,
                   1 - (embedding <=> $2::vector) AS score
            FROM search_embeddings
            WHERE tenant_id = $1
              AND ($3::text IS NULL OR entity_type = $3)
            ORDER BY embedding <=> $2::vector
            LIMIT $4;
        """, tenant_id, embedding_str, entity_type, limit)

    return [
        SearchResult(
            entity_type=r["entity_type"],
            entity_id=str(r["entity_id"]),
            content=r["content"],
            score=round(float(r["score"]), 4),
            metadata=_load_metadata(r["metadata"]),
        )
        for r in rows
    ]
```
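The storage and search code builds the pgvector text literal inline each time. A small shared helper keeps that formatting in one place and rejects values the database would refuse. This is a sketch: `to_pgvector_literal` is a hypothetical name, not part of pgvector or asyncpg.

```python
import math
from typing import Sequence

def to_pgvector_literal(embedding: Sequence[float]) -> str:
    """Format a vector as the text literal pgvector parses, e.g. "[0.1,2.0]"."""
    if len(embedding) == 0:
        raise ValueError("embedding must not be empty")
    values = [float(x) for x in embedding]
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinity")
    return "[" + ",".join(repr(v) for v in values) + "]"

print(to_pgvector_literal([0.1, 2, -3.5]))  # [0.1,2.0,-3.5]
```

The result is passed as a query parameter and cast with `::vector`, exactly like `embedding_str` in the blocks above. An alternative, if you prefer not to format strings at all, is the `pgvector` Python package, which provides a `register_vector` helper for asyncpg connections.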

## Relevance Tuning

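Vector similarity alone can miss exact terms such as names or IDs, and it ignores how fresh a record is. Combining it with keyword matching and a recency boost often ranks results better. The weights below (0.7, 0.2, 0.1) are starting points to tune against your own queries.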

```python
# Hybrid scoring: vector similarity + keyword rank (Postgres ts_rank, not BM25) + recency
async def hybrid_search(pool: asyncpg.Pool, query: str,
                        tenant_id: str, limit: int = 10):
    query_embedding = create_embedding(query)
    embedding_str = "[" + ",".join(str(x) for x in query_embedding) + "]"

    async with pool.acquire() as conn:
        rows = await conn.fetch("""
            SELECT entity_type, entity_id, content, metadata,
                   1 - (embedding <=> $2::vector) AS vector_score,
                   ts_rank(to_tsvector('english', content),
                           plainto_tsquery('english', $3)) AS keyword_score,
                   EXTRACT(EPOCH FROM (NOW() - updated_at)) AS age_seconds
            FROM search_embeddings
            WHERE tenant_id = $1
            ORDER BY (
                0.7 * (1 - (embedding <=> $2::vector)) +
                0.2 * ts_rank(to_tsvector('english', content),
                              plainto_tsquery('english', $3)) +
                0.1 * (1.0 / (1.0 + EXTRACT(EPOCH FROM (NOW() - updated_at)) / 86400))
            ) DESC
            LIMIT $4;
        """, tenant_id, embedding_str, query, limit)
    return rows
```
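To see how the three terms in the `ORDER BY` expression trade off, the same formula can be written in plain Python. This is an illustrative sketch: `recency_boost` and `hybrid_score` are hypothetical helpers that mirror the SQL, and the input scores are made-up values.

```python
SECONDS_PER_DAY = 86400

def recency_boost(age_seconds: float) -> float:
    """1.0 for a record updated just now, 0.5 after one day, decaying toward 0."""
    return 1.0 / (1.0 + age_seconds / SECONDS_PER_DAY)

def hybrid_score(vector_score: float, keyword_score: float, age_seconds: float,
                 w_vector: float = 0.7, w_keyword: float = 0.2,
                 w_recency: float = 0.1) -> float:
    """Weighted sum matching the ORDER BY expression in hybrid_search."""
    return (w_vector * vector_score
            + w_keyword * keyword_score
            + w_recency * recency_boost(age_seconds))

# Two records with identical (made-up) similarity and keyword scores,
# differing only in age.
fresh = hybrid_score(0.80, 0.05, age_seconds=0)                    # about 0.67
stale = hybrid_score(0.80, 0.05, age_seconds=9 * SECONDS_PER_DAY)  # about 0.58
print(round(fresh, 2), round(stale, 2))
```

Two things follow from the arithmetic. Recency can move a score by at most 0.1, so it acts as a tie-breaker rather than a primary signal. And `ts_rank` values are typically much smaller than cosine similarities, so a weight of 0.2 does not mean keywords contribute 20% of the score; normalizing the keyword score or raising its weight may be needed.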

## FAQ

### How do I keep the vector index in sync with my primary data?

Use database triggers or change data capture (CDC) to detect inserts, updates, and deletes. Queue these changes to a background worker that recomputes embeddings and upserts them. For deletes, remove the corresponding row from the `search_embeddings` table. A 30-second indexing delay is acceptable for most SaaS applications.
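The queue-and-worker flow can be sketched as follows. Everything here is an assumption for illustration: `ChangeEvent`, `index_worker`, and the `None` shutdown sentinel are hypothetical, and an in-process `asyncio.Queue` stands in for whatever durable queue or CDC stream you actually use.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional

@dataclass
class ChangeEvent:
    op: str                        # "insert", "update", or "delete"
    entity_type: str
    entity_id: str
    record: Optional[dict] = None  # absent for deletes

Handler = Callable[[ChangeEvent], Awaitable[None]]

async def index_worker(queue: asyncio.Queue, upsert: Handler,
                       delete: Handler) -> None:
    """Apply change events in order until a None sentinel arrives."""
    while True:
        event = await queue.get()
        try:
            if event is None:
                return  # shutdown signal
            if event.op == "delete":
                await delete(event)
            elif event.op in ("insert", "update"):
                # Inserts and updates both recompute and upsert the embedding.
                await upsert(event)
            else:
                raise ValueError(f"Unknown operation: {event.op}")
        finally:
            queue.task_done()
```

In the real pipeline, `upsert` would build a `SearchDocument` from the record and call `upsert_embedding`, and `delete` would remove the matching row. Processing events in order per entity matters: applying an update after its delete would resurrect a stale row.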

### Should I use pgvector or a dedicated vector database?

pgvector is a reasonable default for most SaaS products with up to roughly 10 million records. It keeps your stack simple: one database, one backup strategy, one connection pool. Consider a dedicated vector database like Pinecone or Weaviate if you need sub-10ms latency at scale or filtering that pgvector does not support.

### How do I handle multi-language search?

Use an embedding model with multilingual capability, such as `text-embedding-3-small`. Index all content as-is without translation. A multilingual model maps semantically similar content to nearby vectors regardless of language, so a query in Spanish can find relevant records written in English. Quality varies by language pair, so evaluate with queries and records from your own data.


---

Source: https://callsphere.ai/blog/ai-powered-semantic-search-saas-applications
