Learn Agentic AI

Pinecone Getting Started: Cloud-Native Vector Database for AI Applications

A hands-on guide to setting up Pinecone, creating serverless indexes, upserting embeddings, running similarity queries, and filtering results with metadata for production AI applications.

Pinecone is a fully managed, cloud-native vector database designed specifically for AI applications. Unlike self-hosted solutions, Pinecone handles index scaling, replication, and infrastructure management automatically. You get a simple API for upserting vectors and querying by similarity, without provisioning servers or tuning storage engines.

For teams that want to ship a RAG pipeline or semantic search feature quickly without operating database infrastructure, Pinecone is one of the fastest paths to production.

Account Setup and Installation

Sign up at pinecone.io and grab your API key from the dashboard. Then install the Python client:

pip install pinecone openai  # the package was renamed from pinecone-client to pinecone

Initialize the client:

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key-here")

Creating a Serverless Index

An index in Pinecone is where your vectors live. Create a serverless index specifying the dimension and distance metric:

from pinecone import ServerlessSpec

pc.create_index(
    name="documents",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

The dimension must match your embedding model output. OpenAI text-embedding-3-small uses 1536 dimensions. Serverless indexes scale automatically based on usage — you pay per query rather than for always-on pods.
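Because a dimension mismatch is the most common upsert failure, it can help to fail fast before any API call. A minimal guard, assuming the 1536-dimension index above (this `validate_vector` helper is our own illustration, not part of the Pinecone SDK):

```python
def validate_vector(values: list[float], expected_dim: int = 1536) -> list[float]:
    """Raise early if an embedding's length doesn't match the index dimension."""
    if len(values) != expected_dim:
        raise ValueError(
            f"embedding has {len(values)} dimensions, index expects {expected_dim}"
        )
    return values
```

Calling this on every embedding before upserting turns a confusing server-side error into an immediate, local one.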

Upserting Vectors

Connect to the index and upsert vectors with optional metadata:

from openai import OpenAI

index = pc.Index("documents")
openai_client = OpenAI()

def embed_text(text: str) -> list[float]:
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

# Upsert a single document
index.upsert(vectors=[
    {
        "id": "doc-001",
        "values": embed_text("Pinecone is a vector database for AI."),
        "metadata": {
            "source": "tutorial",
            "category": "databases",
            "word_count": 8
        }
    }
])

For bulk ingestion, batch your upserts to reduce API calls:


def batch_upsert(documents: list[dict], batch_size: int = 100):
    vectors = []
    for doc in documents:
        vectors.append({
            "id": doc["id"],
            "values": embed_text(doc["content"]),
            "metadata": doc["metadata"]
        })

    # Upsert in batches
    for i in range(0, len(vectors), batch_size):
        batch = vectors[i:i + batch_size]
        index.upsert(vectors=batch)
        print(f"Upserted batch {i // batch_size + 1}")
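The slicing loop above can also be factored into a reusable generator, which keeps the batching logic testable on its own. A sketch (`chunked` is our helper, not a Pinecone API):

```python
def chunked(items: list, size: int = 100):
    """Yield consecutive slices of `items`, each no longer than `size`."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Usage against a live index (requires the `index` object from above):
# for batch in chunked(vectors, 100):
#     index.upsert(vectors=batch)
```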

Querying by Similarity

Pass a query vector and get back the most similar results:

def search(query: str, top_k: int = 5):
    query_vector = embed_text(query)
    results = index.query(
        vector=query_vector,
        top_k=top_k,
        include_metadata=True
    )
    for match in results["matches"]:
        print(f"ID: {match['id']}, Score: {match['score']:.4f}")
        print(f"Metadata: {match['metadata']}")
    return results

search("How do vector databases work?")

The score is the similarity under the index metric. For the cosine metric it is cosine similarity, ranging from -1 to 1, where higher means more similar.
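In RAG pipelines it is common to discard weak matches before passing context to a model. A small post-processing sketch (the 0.75 threshold is an arbitrary starting point; tune it per dataset and embedding model):

```python
def filter_matches(matches: list[dict], min_score: float = 0.75) -> list[dict]:
    """Keep only matches at or above the similarity threshold."""
    return [m for m in matches if m["score"] >= min_score]
```

Apply it to `results["matches"]` after a query; an empty result is a useful signal that the index has no relevant content for the question.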

Metadata Filtering

One of Pinecone's strengths is combining vector similarity with metadata filters. Pinecone applies the filter during the ANN search (single-stage filtering) rather than post-filtering results, so filtered queries stay accurate without a large latency penalty:

results = index.query(
    vector=embed_text("database performance tuning"),
    top_k=10,
    include_metadata=True,
    filter={
        "category": {"$eq": "databases"},
        "word_count": {"$gte": 100}
    }
)

Supported filter operators include $eq, $ne, $gt, $gte, $lt, $lte, $in, and $nin. You can combine them with $and and $or:

filter={
    "$and": [
        {"category": {"$in": ["databases", "ai"]}},
        {"source": {"$ne": "deprecated"}}
    ]
}
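When filters are assembled from user input, building them programmatically avoids hand-writing nested dictionaries. A minimal sketch (this `build_filter` helper is our own and covers only equality conditions):

```python
def build_filter(**conditions) -> dict:
    """Combine keyword arguments into a Pinecone $eq / $and filter."""
    clauses = [{field: {"$eq": value}} for field, value in conditions.items()]
    return {"$and": clauses} if len(clauses) > 1 else clauses[0]
```

For example, `build_filter(category="databases", source="tutorial")` produces a `$and` of two `$eq` clauses, ready to pass as the `filter` argument to `index.query`.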

Namespaces for Data Isolation

Namespaces partition an index into separate segments. Each query only searches within one namespace:

# Upsert into a specific namespace
index.upsert(
    vectors=[{"id": "doc-1", "values": embedding, "metadata": meta}],
    namespace="tenant-abc"
)

# Query within that namespace
results = index.query(
    vector=query_vec,
    top_k=5,
    namespace="tenant-abc"
)

This is useful for multi-tenant applications where each customer's data must be isolated without creating separate indexes.
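Deriving the namespace from a tenant identifier in one place keeps tenants from colliding. A hypothetical naming convention (not a Pinecone requirement — namespaces are arbitrary strings):

```python
def tenant_namespace(tenant_id: str) -> str:
    """Derive a stable, sanitized namespace name from a tenant ID."""
    safe = "".join(c if c.isalnum() or c == "-" else "-" for c in tenant_id.lower())
    return f"tenant-{safe}"
```

Route every upsert, query, and delete through this helper so a typo in a raw tenant string cannot leak one customer's data into another's namespace.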

Deleting Vectors

Remove vectors by ID, or clear an entire namespace. (Delete-by-metadata-filter is available only on pod-based indexes; serverless indexes support deletion by ID or by namespace.)

# Delete specific IDs
index.delete(ids=["doc-001", "doc-002"])

# Delete all vectors in a namespace
index.delete(delete_all=True, namespace="tenant-abc")

FAQ

How much does Pinecone serverless cost compared to pod-based indexes?

Serverless indexes charge per query and per GB stored, with no minimum monthly cost. For workloads under a few million queries per month, serverless is significantly cheaper than pod-based indexes. Pod indexes make sense when you need guaranteed low-latency at sustained high throughput.

Can I update the metadata on an existing vector without re-embedding?

Yes. Use the update method with the vector ID and new metadata. You do not need to re-upload the vector values unless the underlying content has changed.
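In the Python client this is `index.update(id=..., set_metadata=...)`; fields you pass overwrite their previous values, while omitted fields are preserved. The merge semantics, shown as pure Python for illustration:

```python
def merged_metadata(existing: dict, updates: dict) -> dict:
    """Mirror Pinecone's set_metadata behavior: passed fields overwrite,
    omitted fields are preserved."""
    return {**existing, **updates}

# Actual API call (requires a live index):
# index.update(id="doc-001", set_metadata={"category": "ai"})
```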

What happens if I upsert a vector with an ID that already exists?

Pinecone overwrites the existing vector with the new values and metadata. This is an upsert (update or insert) operation, so you do not need to check for existence before writing.


#Pinecone #VectorDatabase #Cloud #Embeddings #RAG #AgenticAI #LearnAI #AIEngineering


Written by

CallSphere Team
