Retrieval Engineering
Embeddings & cosine similarity
Lesson 1 of 5
What you'll learn
- Treat embeddings as coordinates of meaning, not opaque blobs
- Compute cosine similarity and rank documents by it
- Understand why direction matters more than magnitude
An embedding maps text to a fixed-length vector. The model is trained so that text with similar meaning lands at nearby points in that space, and unrelated text lands far apart. Retrieval is then a geometry problem: embed the query, embed the corpus, and find the corpus vectors closest to the query.
"Closest" almost always means cosine similarity — the cosine of the angle between two vectors. It ranges from -1 (opposite) through 0 (orthogonal / unrelated) to 1 (identical direction).
// cosine(a, b) = dot(a, b) / (|a| * |b|)
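The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function name `cosine_similarity` and the two-dimensional sample vectors are assumptions made for the example, and real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors covering the three reference points of the range.
same = cosine_similarity([1.0, 2.0], [2.0, 4.0])        # same direction, close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # right angle, 0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])  # opposite direction, close to -1
print(same, orthogonal, opposite)
```

The zero-vector check is a design choice for the sketch: the formula divides by the norms, so a vector of all zeros has no defined direction and no defined cosine.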
Why direction, not length
Embedding magnitude tends to track incidental things — document length, token frequency, formatting — not meaning. Cosine throws magnitude away and compares only orientation, which is exactly the signal you want. A short answer and a long answer about the same topic should rank as similar, and cosine makes that happen.
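To see the effect numerically, the sketch below compares a vector with a copy of itself scaled by 10. The vectors are made up, and the scaled copy is only a stand-in for "same topic, larger magnitude"; it is not a claim about how any particular model encodes long documents.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical vectors: long_doc points the same way as short_doc, 10x longer.
query = [0.3, 0.1, 0.5]
short_doc = [0.2, 0.1, 0.4]
long_doc = [10 * x for x in short_doc]

cos_short = cosine_similarity(query, short_doc)
cos_long = cosine_similarity(query, long_doc)
dist_short = euclidean_distance(query, short_doc)
dist_long = euclidean_distance(query, long_doc)

print("cosine:   ", cos_short, cos_long)    # equal up to floating-point rounding
print("euclidean:", dist_short, dist_long)  # the scaled copy is much farther away
```

Cosine scores the two documents the same, while Euclidean distance would push the longer vector far down the ranking even though it points in the same direction.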
The ranking loop in practice
At query time you compute one similarity per candidate, then sort descending and keep the top results. With millions of vectors you wouldn't scan them all (that's the next few lessons), but the scoring math underneath is always this same dot-product-over-norms.
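A brute-force version of that loop might look like the following. It is a sketch under stated assumptions: `rank`, the `top_k` parameter, the document ids, and the three-dimensional vectors are all hypothetical names and values chosen for illustration.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query, candidates, top_k):
    """Score every candidate, sort descending, keep the top_k (id, score) pairs."""
    scored = [
        (doc_id, cosine_similarity(query, vector))
        for doc_id, vector in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical candidate vectors keyed by made-up document ids.
candidates = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.1],
    "doc_d": [-0.8, 0.0, 0.2],
}
query = [1.0, 0.2, 0.0]

top = rank(query, candidates, top_k=2)
for doc_id, score in top:
    print(doc_id, round(score, 3))
```

This scans every candidate, which is fine for a handful of vectors and is exactly the cost that approximate indexes are meant to avoid at larger scale.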
Normalize once, score cheap
If you L2-normalize every document vector at index time, and the query vector once per query, each magnitude becomes 1 and cosine similarity collapses to a plain dot product. That removes the norm computations (two square roots) and the division from the per-candidate hot path of every query.
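The equivalence is easy to check. In the sketch below, `l2_normalize` and `dot` are hypothetical helper names and the two-dimensional vectors are made up; the point is only that the dot product of the unit vectors matches the full cosine formula on the original vectors.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    """Scale v to unit length (L2 norm of 1)."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

doc = [3.0, 4.0]
query = [1.0, 2.0]

unit_doc = l2_normalize(doc)      # would be done once, at index time
unit_query = l2_normalize(query)  # would be done once per query

fast_score = dot(unit_query, unit_doc)     # hot path: multiply and add only
reference = cosine_similarity(query, doc)  # full formula, for comparison

print(fast_score, reference)  # equal up to floating-point rounding
```

Because dividing by the query norm scales every candidate's score by the same positive factor, normalizing the query changes the reported scores but not the order of the results.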
Run it. It scores three candidate document vectors against a query vector and prints them ranked from most to least similar.
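A minimal sketch of such a run, assuming three made-up document vectors and a made-up query vector (the ids, values, and helper names are hypothetical, not the lesson's own data):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

# Hypothetical 3-dimensional "embeddings" for three candidate documents.
documents = {
    "doc_1": [0.9, 0.1, 0.2],
    "doc_2": [0.6, 0.7, 0.1],
    "doc_3": [0.0, 0.2, 0.9],
}

# Index time: normalize every document vector once.
index = {doc_id: l2_normalize(vector) for doc_id, vector in documents.items()}

# Query time: normalize the query, then score with a plain dot product.
query = l2_normalize([1.0, 0.2, 0.1])
ranked = sorted(
    ((dot(query, vector), doc_id) for doc_id, vector in index.items()),
    reverse=True,
)

# Print from most to least similar.
for score, doc_id in ranked:
    print(f"{score:.3f}  {doc_id}")
```

With these values, doc_1 ranks first because it points most nearly in the query's direction, and doc_3 ranks last because it is close to orthogonal to the query.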
Why does cosine similarity ignore vector magnitude when ranking embeddings?