Retrieval-Augmented Generation
RAG
RAG gives language models access to external knowledge they were never trained on. Instead of relying only on what the model memorised during training, a RAG system retrieves relevant documents at query time and passes them to the model as context, which helps make responses more accurate, current, and grounded in real sources.
Lessons
1
Why RAG Exists: The Knowledge Problem
The core limitation of static training data and how retrieval-augmented generation solves it.
2
The RAG Pipeline: Index, Retrieve, Generate
Walk through every stage of a working RAG system with clear diagrams and real examples.
3
Vector Databases: Your AI's Memory
How Pinecone, Weaviate, and ChromaDB store and query embeddings at scale.
4
Vector-less RAG: BM25 Without a Vector Database
When keyword search beats embeddings: how BM25 works, its tradeoffs against vector databases, and when to skip the vector store entirely.
5
Chunking Strategies That Actually Work
Fixed-size vs semantic chunking, overlap windows, and when to use each one.
6
Hybrid Search: Dense and Sparse Together
Combine keyword search (BM25) with semantic search to get the best of both approaches.
7
Re-ranking: Quality Over Quantity
Use cross-encoders to re-score and reorder retrieved passages so that only the most relevant context reaches the model.