
Embedding Dimensions and Distance Metrics: Cosine, Euclidean, and Dot Product

Learn when to use cosine similarity, Euclidean distance, or dot product for vector search, how normalization affects results, and practical guidance on choosing dimensions and metrics.

Why Distance Metrics Matter

When you search a vector database, the engine compares your query embedding against stored embeddings using a distance or similarity function. The choice of metric determines what "similar" means mathematically. Two vectors with the same content can rank differently depending on whether you use cosine similarity, Euclidean distance, or dot product.

Choosing the wrong metric does not produce errors — it produces subtly wrong results. Your search will return items, but they may not be the most semantically relevant ones. Understanding these metrics is essential for building accurate vector search.

Cosine Similarity and Cosine Distance

Cosine similarity measures the angle between two vectors, ignoring their magnitudes:

cosine_similarity(A, B) = (A . B) / (||A|| * ||B||)

The result ranges from -1 (opposite) to 1 (identical direction). Cosine distance is 1 - cosine_similarity, so lower distance means more similar.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - cosine_similarity(a, b)

# Example
vec_a = np.array([1.0, 2.0, 3.0])
vec_b = np.array([2.0, 4.0, 6.0])  # same direction, different magnitude
vec_c = np.array([3.0, 1.0, 0.5])  # different direction

print(cosine_similarity(vec_a, vec_b))  # 1.0 — identical direction
print(cosine_similarity(vec_a, vec_c))  # 0.54 — somewhat similar

When to use: Cosine is the default choice for text embeddings. Most embedding models (OpenAI, Cohere, sentence-transformers) are optimized so that semantic similarity aligns with angular similarity. It naturally ignores vector magnitude, which means document length does not bias results.

Euclidean Distance (L2)

Euclidean distance is the straight-line distance between two points in the vector space:

euclidean(A, B) = sqrt(sum((A_i - B_i)^2))

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    return np.linalg.norm(a - b)

# Same direction but different magnitude
print(euclidean_distance(vec_a, vec_b))  # 3.74 — large distance despite same direction
print(euclidean_distance(vec_a, vec_c))  # 3.35

Notice that vectors A and B point in the same direction but have different magnitudes. Cosine similarity sees them as identical (1.0), but Euclidean distance shows a gap of 3.74. This is the core difference.

When to use: Euclidean distance works well when vector magnitudes carry meaningful information — for example, in recommendation systems where a larger magnitude indicates stronger preference, or in spatial applications. It is less common for text search because document length can skew magnitudes.


Dot Product (Inner Product)

Dot product is the simplest computation — just multiply corresponding elements and sum:

dot_product(A, B) = sum(A_i * B_i)

def dot_product(a: np.ndarray, b: np.ndarray) -> float:
    return np.dot(a, b)

print(dot_product(vec_a, vec_b))  # 28.0
print(dot_product(vec_a, vec_c))  # 6.5

Dot product considers both direction and magnitude. It equals ||A|| * ||B|| * cos(theta). For normalized vectors (unit length), dot product is identical to cosine similarity.
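Because magnitude feeds directly into the score, scaling a vector scales its dot products without changing its cosine similarities. A quick check with the example vectors from above:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

vec_a = np.array([1.0, 2.0, 3.0])
vec_c = np.array([3.0, 1.0, 0.5])

# Doubling a vector doubles its dot product with anything...
print(np.dot(2 * vec_a, vec_c))  # 13.0 (was 6.5)

# ...but leaves its cosine similarity unchanged, since magnitude cancels out.
print(np.isclose(cosine(2 * vec_a, vec_c), cosine(vec_a, vec_c)))  # True
```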

When to use: When your embeddings are already normalized (unit vectors), dot product is the fastest metric because it skips the normalization step that cosine requires. OpenAI's text-embedding-3-small and text-embedding-3-large produce normalized embeddings, so dot product and cosine yield identical rankings with dot product being slightly faster.

Normalization: Making Metrics Equivalent

When vectors are L2-normalized (unit length), all three metrics produce equivalent rankings:

def normalize(v: np.ndarray) -> np.ndarray:
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

a_norm = normalize(vec_a)
b_norm = normalize(vec_b)

print(cosine_similarity(a_norm, b_norm))  # 1.0
print(dot_product(a_norm, b_norm))         # 1.0
print(euclidean_distance(a_norm, b_norm))  # 0.0
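The equivalence is not a coincidence. For unit vectors, squared Euclidean distance and cosine similarity are tied by the identity ||a - b||^2 = 2 * (1 - cos(a, b)), so sorting by either produces the same order:

```python
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

a = normalize(np.array([1.0, 2.0, 3.0]))
b = normalize(np.array([3.0, 1.0, 0.5]))

cos_sim = np.dot(a, b)                  # dot == cosine for unit vectors
euclid_sq = np.linalg.norm(a - b) ** 2

# ||a - b||^2 = ||a||^2 + ||b||^2 - 2 (a . b) = 2 - 2 * cos_sim for unit vectors
print(np.isclose(euclid_sq, 2 * (1 - cos_sim)))  # True
```

Euclidean distance on unit vectors is a monotonically decreasing function of cosine similarity, which is why all three metrics agree on rankings once vectors are normalized.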

Many embedding models already output normalized vectors. Check your model's documentation. If your vectors are not normalized, normalizing them before storage lets you use dot product (the fastest operation) while getting cosine-equivalent results.

# Normalize before inserting into your vector database
import numpy as np

def prepare_embedding(raw_embedding: list[float]) -> list[float]:
    vec = np.array(raw_embedding, dtype=np.float32)
    norm = np.linalg.norm(vec)
    if norm > 0:
        vec = vec / norm
    return vec.tolist()

Choosing Embedding Dimensions

Modern embedding models offer configurable dimensions. OpenAI's text-embedding-3-small produces 1536-dimensional vectors by default, and its dimensions parameter lets you truncate the output to a smaller size such as 512. Larger dimensions preserve more information but consume more memory and slow down search.

from openai import OpenAI

client = OpenAI()

# Full 1536 dimensions
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Vector search tutorial",
    dimensions=1536
)

# Reduced to 512 dimensions — faster search, less memory
response_small = client.embeddings.create(
    model="text-embedding-3-small",
    input="Vector search tutorial",
    dimensions=512
)

The tradeoff is straightforward: lower dimensions mean faster queries and lower storage costs but slightly reduced search quality. For many applications, 512 dimensions perform within 1-2% of 1536 on retrieval benchmarks while using one-third the memory.
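To put numbers on the memory side of that tradeoff, here is the raw float32 storage for a hypothetical corpus of 10 million vectors (index overhead not included):

```python
def storage_gb(num_vectors: int, dims: int, bytes_per_float: int = 4) -> float:
    """Raw float32 embedding storage in GB, ignoring index overhead."""
    return num_vectors * dims * bytes_per_float / 1e9

# 10 million documents at full vs. reduced dimensions
print(storage_gb(10_000_000, 1536))  # 61.44
print(storage_gb(10_000_000, 512))   # 20.48
```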

Practical Recommendations

  1. Start with cosine distance. It is the safest default for text embeddings and what most tutorials assume.
  2. Check if your embeddings are normalized. If they are, switch to dot product for a small speed gain.
  3. Use 1536 dimensions unless memory or latency constraints require reduction. Drop to 512 or 768 if you need to scale to tens of millions of vectors.
  4. Match the metric to your vector database config. Most databases fix the metric at index creation, so an index built for cosine but queried under dot-product assumptions returns valid-looking yet wrongly ranked results.

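Recommendation 4 is easy to verify with toy vectors: when embeddings are not normalized, cosine and dot product can disagree about which result comes first, so an index built for one metric will not reproduce the other's rankings. A small 2-D sketch (the vectors here are made up for illustration):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

query = np.array([1.0, 0.0])
x = np.array([0.9, 0.1])   # nearly the query's direction, small magnitude
y = np.array([5.0, 5.0])   # 45 degrees off, large magnitude

# Cosine ranks x first (direction wins); dot product ranks y first (magnitude wins).
print(cosine(query, x) > cosine(query, y))    # True
print(np.dot(query, x) > np.dot(query, y))    # False
```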
FAQ

Does cosine similarity always produce better search results than Euclidean distance for text?

For text embeddings from models like OpenAI or sentence-transformers, cosine similarity almost always outperforms Euclidean distance because these models are trained to align semantic similarity with angular proximity. The exception is when you have explicitly trained embeddings where magnitude carries meaning, such as popularity or relevance scores baked into the vector.

Can I change the distance metric after creating a vector index?

In most vector databases, the distance metric is set at index creation and cannot be changed without recreating the index. Pinecone, pgvector, Weaviate, and Chroma all require you to specify the metric upfront. Changing it means creating a new index and re-inserting all vectors.

Is there a meaningful performance difference between cosine and dot product?

For normalized vectors, the rankings are identical. Dot product is marginally faster because it skips the normalization division, but the difference is typically under 5% and negligible for most applications. If you are optimizing at billion-scale, dot product on pre-normalized vectors gives you a small but measurable latency improvement.


#Embeddings #DistanceMetrics #CosineSimilarity #VectorSearch #Math #AgenticAI #LearnAI #AIEngineering

Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.

