
Chapter 10.1: Embeddings & Vector Stores

The foundation of any RAG (Retrieval-Augmented Generation) system is the ability to store information and retrieve it by semantic similarity. Embeddings are dense vector representations of text that capture semantic meaning, and vector stores are specialized databases designed to search these high-dimensional vectors efficiently. Together, they let AI agents query large knowledge bases and find contextually relevant information in real time.

What Are Embeddings?

An embedding is a learned vector representation that maps discrete objects (words, sentences, documents) into a continuous vector space. In this space, semantically similar items are positioned close to each other, where closeness is measured by a similarity or distance function, typically cosine similarity or Euclidean distance.
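As a rough illustration of the two measures named above, the sketch below computes cosine similarity and Euclidean distance in plain Python. The three-dimensional vectors `cat`, `kitten`, and `car` are made-up toy values chosen for illustration, not the output of any real embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between a and b (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy embeddings; real models produce hundreds of dimensions.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.2, 0.9]

# Related items score higher on similarity and lower on distance.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))
print(euclidean_distance(cat, kitten) < euclidean_distance(cat, car))
```

Note that the two measures point in opposite directions: a higher cosine similarity means closer, while a lower Euclidean distance means closer.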

Mathematical Foundation

Given a text input T, an embedding model E produces a vector v ∈ ℝᵈ, where d is the dimensionality of the embedding space (typically 384, 768, 1536, or higher).

v = E(T)
where ||v|| = 1 if the output is L2-normalized to a unit vector
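The normalization step can be sketched as follows. Once every vector has unit length, the dot product of two vectors equals their cosine similarity, so a brute-force search only needs dot products. The `store` dictionary, the document ids, and the `top_k` helper are hypothetical names used for illustration; a real vector store would use an index rather than a linear scan.

```python
import math

def normalize(v):
    """Scale v to unit length so that ||v|| = 1."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_k(query, store, k=2):
    """Rank stored unit vectors by dot product with the normalized query."""
    q = normalize(query)
    scored = [(dot(q, vec), doc_id) for doc_id, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:k]]

# Hypothetical in-memory "vector store": id -> unit-length vector.
store = {
    "doc_a": normalize([3.0, 4.0, 0.0]),
    "doc_b": normalize([0.0, 1.0, 1.0]),
    "doc_c": normalize([4.0, 3.0, 0.0]),
}

for doc_id, score in top_k([3.0, 4.0, 0.0], store, k=2):
    print(doc_id, round(score, 3))
```

Storing vectors already normalized is a common design choice because it moves the norm computation out of the query path.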

Vector Similarity & Search

Let Q be the user's query and T be the set of available tools, where each tool t ∈ T has a defined function signature t(args).

The agent's decision process can be modeled as a function Decide(Q, T) that outputs either a direct answer or a function call:

Decide(Q, T) = { Answer(Q)        (direct answer)
               { Call(t, args)    (function call, for some t ∈ T)

If a tool is called, the final answer is generated based on the original query and the tool's output: FinalAnswer = Answer(Q, Result), where Result = t(args).
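The decision process described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `Answer` and `Call` classes, the `add` tool, and the keyword rule inside `decide` are all hypothetical stand-ins, since in a real agent the language model itself makes the decision.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

@dataclass
class Answer:
    text: str

@dataclass
class Call:
    tool: str
    args: dict

# Hypothetical tool set T: name -> function t(args).
TOOLS: Dict[str, Callable[..., float]] = {
    "add": lambda a, b: a + b,
}

def decide(query: str, tools: Dict[str, Callable]) -> Union[Answer, Call]:
    """Toy Decide(Q, T): call a tool for queries shaped like 'add 2 3'."""
    parts = query.split()
    if len(parts) == 3 and parts[0] in tools:
        try:
            args = {"a": float(parts[1]), "b": float(parts[2])}
        except ValueError:
            return Answer(f"Direct answer to: {query}")
        return Call(parts[0], args)
    return Answer(f"Direct answer to: {query}")

def final_answer(query: str, tools: Dict[str, Callable]) -> str:
    """FinalAnswer = Answer(Q, Result), where Result = t(args)."""
    decision = decide(query, tools)
    if isinstance(decision, Call):
        result = tools[decision.tool](**decision.args)
        return f"{query} = {result}"
    return decision.text

print(final_answer("add 2 3", TOOLS))
print(final_answer("hello there", TOOLS))
```

Returning a tagged value (`Answer` or `Call`) keeps the decision separate from tool execution, which mirrors the two-step structure of the formulas.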

Visualization: The Tool Use Workflow

The D3.js visualization below illustrates the step-by-step process of an agent using a tool. It shows the flow from the user's query to the agent's decision, the external tool execution, and the final, informed response.