Retrieval-Augmented Generation

Retrieval-augmented generation (RAG) gives language models access to external knowledge that was not in their training data. Instead of relying only on what the model memorised during training, a RAG system retrieves relevant documents at query time and supplies them as context, so responses can be more accurate, more current, and grounded in real sources.

7 lessons · ~25 min total · Intermediate

Lessons

Why RAG Exists: The Knowledge Problem

The core limitation of static training data and how retrieval-augmented generation solves it.

The RAG Pipeline: Index, Retrieve, Generate

Walk through every stage of a working RAG system with clear diagrams and real examples.

Vector Databases: Your AI's Memory

How Pinecone, Weaviate, and ChromaDB store and query embeddings at scale.

Vector-less RAG: BM25 Without a Vector Database

When keyword search beats embeddings: how BM25 works, its tradeoffs against vector databases, and when to skip the vector store entirely.

Chunking Strategies That Actually Work

Fixed-size vs semantic chunking, overlap windows, and when to use each one.

Hybrid Search: Dense and Sparse Together

Combine keyword search (BM25) with semantic search to get the best of both approaches.

Re-ranking: Quality Over Quantity

Use cross-encoders to improve the quality of retrieved context before it reaches the model.
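
As a preview of how the index, retrieve, and generate stages fit together, here is a minimal sketch in plain Python. It is not taken from the lessons: the function names (`chunk_words`, `bm25_scores`, `retrieve`, `build_prompt`), the chunk sizes, and the BM25 parameters `k1=1.5` and `b=0.75` are all illustrative assumptions. It uses fixed-size chunking with overlap and a small hand-written BM25 scorer, so no vector database is involved, and it stops at building the prompt rather than calling any model.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=8, overlap=2):
    """Split text into fixed-size word chunks; neighbours share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def tokenize(text):
    """Lowercase and keep alphanumeric runs (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every doc against the query with the Okapi BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    if n == 0:
        return []
    avg_len = (sum(len(t) for t in tokenized) / n) or 1.0
    # Document frequency: in how many docs each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve(query, docs, top_k=2):
    """Return the top_k docs ranked by BM25 score, best first."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:top_k]]


def build_prompt(query, context):
    """Assemble the text that would be sent to a language model (assumed format)."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"
```

A typical flow would be to run `chunk_words` over each source document, pass the resulting chunks to `retrieve` along with the user's question, and hand the output of `build_prompt` to whichever model is in use. Swapping the BM25 scorer for embedding similarity, or adding a re-ranking step between `retrieve` and `build_prompt`, leaves the overall shape unchanged.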