Vector Search & RAG Engineering
Retrieval-augmented generation (RAG) is one of the most in-demand production AI patterns of 2026, and retrieval quality is what separates a convincing demo from a system that holds up under real traffic. This course goes past 'embed and pray' into the engineering layer: chunking strategy, nearest-neighbor search, hybrid retrieval, reranking, and rigorous evaluation. You'll model each concept in runnable code so the trade-offs become muscle memory.
5 lessons · ~2 hours
1. Retrieval Engineering
Embeddings & cosine similarity
Embeddings encode meaning as vectors, and cosine similarity ranks how semantically close two pieces of text are.
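To make the ranking idea concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the document names are made-up toy values, not output from a real embedding model, and returning 0.0 for a zero vector is a convention chosen for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)

# Toy "embeddings" (hypothetical values for illustration only).
query = [1.0, 0.0, 1.0]
docs = {
    "doc_a": [1.0, 0.0, 1.0],  # same direction as the query
    "doc_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "doc_c": [1.0, 1.0, 0.0],  # partially aligned
}

# Rank document names from most to least similar to the query.
ranked = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]), reverse=True)
```

Because cosine similarity depends only on direction, scaling a vector by a positive constant does not change its score.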
Chunking & indexing
How you split documents into chunks largely determines retrieval quality before a single embedding is computed.
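One common baseline is a fixed-size sliding window with overlap. The sketch below assumes whitespace-separated words as the unit; `chunk_words`, `chunk_size`, and `overlap` are illustrative names, and real pipelines often split on tokens, sentences, or document structure instead.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks

sample_chunks = chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1)
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of indexing some words twice.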
kNN & approximate nearest neighbors
Brute-force kNN compares the query against every stored vector, so it is exact but its cost grows linearly with corpus size; ANN indexes like HNSW trade a little recall for orders-of-magnitude speed.
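The exact baseline is short enough to write out, and it doubles as the ground truth for measuring what an approximate index gives up. This sketch scores by dot product (equal to cosine similarity when vectors are unit-normalized) and does not implement HNSW; the `approx_ids` list is a hypothetical result standing in for an ANN index's output.

```python
import heapq

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def knn_brute_force(query, vectors, k):
    """Exact top-k indices by dot product; scans every vector once."""
    scored = ((dot(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nlargest(k, scored)]

def recall_vs_exact(approx_ids, exact_ids):
    """Fraction of the exact neighbors that the approximate result also found."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Toy 2-dimensional vectors (made-up values).
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.5, 0.5]]
exact_ids = knn_brute_force([1.0, 0.0], vectors, k=2)

# Hypothetical ANN result that missed one true neighbor.
approx_ids = [0, 3]
ann_recall = recall_vs_exact(approx_ids, exact_ids)
```

Running the brute-force search on a sample of queries and comparing it with the index's answers is a simple way to check how much recall a given ANN configuration costs.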
Hybrid search & reranking
Combine dense vector search with sparse keyword scoring, then rerank the top candidates for a sharper final order.
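One widely used way to merge a dense ranking with a sparse one is reciprocal rank fusion (RRF), which needs only rank positions, not comparable scores. The sketch below is one possible fusion method, not necessarily the one this lesson uses; the document IDs are invented, `k=60` is the constant commonly used with RRF, and `score_fn` is a hypothetical stand-in for a real reranking model such as a cross-encoder.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc IDs: each list contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

def rerank(candidates, score_fn, top_n):
    """Reorder candidates by a (more expensive) relevance score, keep top_n."""
    return sorted(candidates, key=score_fn, reverse=True)[:top_n]

dense_hits = ["d1", "d2", "d3"]   # hypothetical vector-search ranking
sparse_hits = ["d3", "d1", "d4"]  # hypothetical keyword-search ranking
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])

# Placeholder relevance scores; a real system would call a reranking model here.
toy_scores = {"d3": 0.9, "d1": 0.4}
final = rerank(fused, lambda d: toy_scores.get(d, 0.0), top_n=2)
```

Documents that appear in both lists rise to the top of the fused order, and the reranker then only has to score that short candidate list rather than the whole corpus.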
RAG quality & evaluation
Assemble context with citations, measure retrieval rigorously with precision@k and recall@k, and check that generated answers stay faithful to the retrieved context.
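The two rank-based metrics can be sketched in a few lines. This example assumes each retrieved ID appears at most once and that the set of relevant IDs comes from hand labels; the IDs below are invented. Faithfulness is left out because it requires judging a generated answer against its context rather than counting hits.

```python
def precision_at_k(retrieved, relevant, k):
    """Share of the top-k retrieved IDs that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k

def recall_at_k(retrieved, relevant, k):
    """Share of all relevant IDs that appear in the top-k retrieved."""
    if not relevant:
        return 0.0  # convention for this sketch when nothing is labeled relevant
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

retrieved = ["d1", "d4", "d2", "d7"]  # hypothetical ranked results
relevant = {"d1", "d2", "d3"}         # hypothetical ground-truth labels

p_at_2 = precision_at_k(retrieved, relevant, k=2)
r_at_2 = recall_at_k(retrieved, relevant, k=2)
```

Precision@k penalizes irrelevant results in the top k, while recall@k penalizes relevant documents that never made it there, so the two are usually reported together.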