Retrieval Engineering

Hybrid search & reranking

Lesson 4 of 5

What you'll learn

Understand why dense and sparse retrieval are complementary
Fuse keyword and vector scores into one hybrid ranking
Apply a reranking stage over the top candidates

Vector search captures meaning but can fumble exact tokens: error codes, SKUs, function names, rare proper nouns. Keyword search (BM25, a sparse ranking function built on term frequency and inverse document frequency) nails those exact matches but is blind to paraphrase. Hybrid search runs both and fuses their results, so "how do I get my money back" can still surface a chunk that literally says "refund," while a query for an exact error code still hits the chunk containing that string.

Fusing two score scales

Dense and sparse scores live on different scales, so you can't just add them. Two common fixes: normalize each score to [0, 1] and take a weighted sum, or use Reciprocal Rank Fusion (RRF), which combines ranks instead of raw scores and is pleasantly scale-free.

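// weighted fusion: final = alpha * denseNorm + (1 - alpha) * sparseNorm, with alpha in [0, 1]
// RRF: score(doc) = sum over retrievers of 1 / (k + rank of doc in that retriever), ranks starting at 1; k is a smoothing constant (60 is a common default)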
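
As a minimal sketch of both fusion rules, the Python below applies them to a toy result set. The function names (`min_max`, `weighted_fusion`, `rrf`, `ranked`), the document ids, and the scores are all made up for illustration. Min-max scaling is only one way to normalize, and treating a document that one retriever did not return as scoring 0 on that side is an assumption, not a rule.

```python
from collections import defaultdict


def min_max(scores):
    """Rescale a {doc_id: score} dict to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so there is no ordering signal to preserve.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_fusion(dense, sparse, alpha=0.5):
    """final = alpha * denseNorm + (1 - alpha) * sparseNorm."""
    d, s = min_max(dense), min_max(sparse)
    docs = set(d) | set(s)
    # Assumption: a doc missing from one retriever contributes 0 from that side.
    return {doc: alpha * d.get(doc, 0.0) + (1 - alpha) * s.get(doc, 0.0) for doc in docs}


def rrf(rankings, k=60):
    """Sum 1 / (k + rank) over retrievers, with ranks starting at 1."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] += 1.0 / (k + rank)
    return dict(fused)


def ranked(scores):
    """Doc ids ordered by descending score, ties broken by id."""
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


# Toy inputs on very different scales.
dense = {"a": 0.82, "b": 0.79, "c": 0.40}   # e.g. cosine similarities
sparse = {"b": 12.0, "c": 9.0, "d": 3.0}    # e.g. BM25 scores

hybrid = weighted_fusion(dense, sparse, alpha=0.5)
rrf_scores = rrf([ranked(dense), ranked(sparse)])

print(ranked(hybrid))      # order under weighted fusion
print(ranked(rrf_scores))  # order under RRF
```

On this toy data the two methods agree on the top document but order the rest differently. Weighted fusion rewards "a" for its high dense score, while RRF rewards "c" for appearing in both lists.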

Then rerank the shortlist

Hybrid fusion is cheap and runs over the whole index, but it's coarse. So you over-fetch (pull the top ~50) and hand them to a reranker: a cross-encoder that reads the query and each candidate together and scores their relevance as a pair. It's typically far more accurate than embedding similarity, but it needs one model forward pass per query-candidate pair, so it's too expensive to run over millions of docs. That's why it only sees the shortlist. Retrieve wide and cheap, rerank narrow and precise.
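
The shape of the rerank stage can be sketched in a few lines of Python. Everything here is hypothetical: `rerank` and `toy_pair_score` are illustrative names, and the word-overlap scorer is only a stand-in for a real cross-encoder, which would be a model call that takes the query and the candidate text together.

```python
def rerank(query, candidates, score_pair, top_n=5):
    """Score each (query, text) pair and return the top_n doc ids by that score."""
    scored = [(score_pair(query, text), doc_id) for doc_id, text in candidates]
    # Highest score first; ties broken by doc id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_n]]


def toy_pair_score(query, text):
    """Stand-in for a cross-encoder: fraction of query words found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


# Pretend this shortlist came out of the hybrid first stage.
shortlist = [
    ("doc1", "shipping times for international orders"),
    ("doc2", "how to request a refund for an order"),
    ("doc3", "refund policy and how to get a refund"),
]

top = rerank("how to get a refund", shortlist, toy_pair_score, top_n=2)
print(top)
```

The scorer is called once per candidate, so cost grows linearly with the shortlist size. That is acceptable for ~50 candidates and impractical for a whole index.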

Recall first, precision second

The first stage's only job is recall: get every relevant chunk into the candidate pool. The reranker's job is precision: order that pool correctly. If a document never makes the shortlist, no reranker can save it — so tune the first stage to over-fetch.
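
One way to check whether the first stage is fetching widely enough is to measure recall at the shortlist size. The sketch below uses a hypothetical `recall_at_k` helper with made-up document ids and relevance labels. It assumes you have a labeled set of relevant documents for the query.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant docs that appear in the top-k of the ranking."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


# First-stage ranking for one query, plus its (hypothetical) relevance labels.
first_stage = ["d7", "d2", "d9", "d4", "d1", "d5"]
relevant = {"d1", "d2", "d5"}

recall_at_3 = recall_at_k(first_stage, relevant, 3)  # only d2 is in the top 3
recall_at_6 = recall_at_k(first_stage, relevant, 6)  # all three are in the top 6

print(recall_at_3, recall_at_6)
```

With a shortlist of 3, two of the three relevant documents never reach the reranker, so no reordering can recover them. Widening the shortlist to 6 brings all of them into the pool.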

Hybrid fusion + rerank

The exercise for this lesson normalizes a keyword score and a vector score into a hybrid ranking, then reranks the top-k with a combined relevance signal.

Knowledge check

Why is a cross-encoder reranker run only over a small shortlist rather than the whole index?
