Vector-less RAG: BM25 Without a Vector Database
When keyword search beats embeddings - how BM25 works, its tradeoffs against vector DBs, and when to skip the vector store entirely.
Vector databases and embedding models are powerful, but they add dependencies, indexing latency, and cost. For small, self-contained applications, you can often build an accurate RAG system using pure lexical (keyword) search. This approach is called vector-less RAG, and it typically relies on the BM25 (Best Matching 25) ranking algorithm.
BM25 builds on TF-IDF to score how well a document matches the query keywords. It balances term frequency (rewarding documents where a term appears multiple times, with diminishing returns as the count grows, so the score saturates) and inverse document frequency (rewarding rare, specific terms over common words). It also normalizes for document length, so long documents do not win simply by containing more words.
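To make the scoring concrete, here is a minimal from-scratch sketch of a BM25 scorer. It is an illustration under stated assumptions, not the code of any particular library: the function name `bm25_scores` and the sample documents are hypothetical, `k1=1.5` and `b=0.75` are commonly used defaults rather than values from this lesson, and the IDF uses the "+1 inside the log" variant that keeps weights positive (libraries differ on this detail).

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every tokenised document against the query (illustrative sketch)."""
    if not docs_tokens:
        return []
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs or 1.0

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for doc in docs_tokens:
        df.update(set(doc))

    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if tf[term] == 0:
                continue  # no literal match, no contribution
            # Rare terms get a larger weight (assumed "+1" IDF variant).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Saturating term frequency, normalised by document length.
            length_norm = 1 - b + b * len(doc) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical, already-tokenised documents of equal length.
docs = [
    ["refund", "policy", "for", "orders"],
    ["refund", "refund", "for", "orders"],
    ["shipping", "times", "by", "region"],
]
print(bm25_scores(["refund"], docs))
```

The second document mentions "refund" twice, so it scores higher than the first, but less than twice as high: that is the saturation effect. The third document shares no term with the query and scores zero.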
A vector database is a librarian who understands what your book is *about*. BM25 is the index at the back of a textbook. It doesn't understand meaning at all; it just knows exactly which page every specific word appears on, and how often. Ask it to find 'invoice number INV-2049' or a product SKU, and it will find the exact page instantly. A meaning-based search can struggle with that kind of query, because such strings carry little semantic content to compare. Ask it to find 'a document about billing problems' when the text says 'payment disputes', though, and it comes up empty, because it only ever matches words that are literally printed on the page.
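The "index at the back of a textbook" can be sketched as an inverted index: a mapping from each word to the pages it appears on and how often. The snippet below is a toy illustration, not how any specific BM25 library stores its data; the helper names (`tokenise`, `build_inverted_index`, `lookup`) and the sample pages are invented for the example.

```python
import string
from collections import defaultdict


def tokenise(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, split on whitespace."""
    return text.lower().translate(str.maketrans("", "", string.punctuation)).split()


def build_inverted_index(pages: dict[int, str]) -> dict[str, dict[int, int]]:
    """Map each token to {page_number: occurrences on that page}."""
    index: dict[str, dict[int, int]] = defaultdict(dict)
    for page_no, text in pages.items():
        for token in tokenise(text):
            index[token][page_no] = index[token].get(page_no, 0) + 1
    return dict(index)


def lookup(index: dict[str, dict[int, int]], term: str) -> dict[int, int]:
    """Return the pages (and counts) for a single term, or {} if it never occurs."""
    tokens = tokenise(term)
    return index.get(tokens[0], {}) if tokens else {}


# Hypothetical sample pages, invented for illustration.
pages = {
    1: "Invoice INV-2049 was issued in March. INV-2049 remains unpaid.",
    2: "Payment disputes are handled by the finance team.",
}
index = build_inverted_index(pages)

print(lookup(index, "INV-2049"))  # {1: 2}
print(lookup(index, "billing"))   # {}
```

Note that stripping punctuation turns `INV-2049` into the single token `inv2049` on both the page and the query side, which is why the exact identifier still matches.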
import string

from rank_bm25 import BM25Okapi

# Module-level index state. The indexing step that fills these is not shown
# here; it is assumed that _corpus holds the chunk dicts and that _bm25 was
# built from their tokenised text in the same order.
_corpus: list[dict] = []
_bm25 = None  # a BM25Okapi instance once the index has been built


def _tokenise(text: str) -> list[str]:
    """Lowercase, strip punctuation, split on whitespace: classic BM25 tokens."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return text.split()


def retrieve(question: str, top_k: int = 5) -> list[dict]:
    """Return top-K chunks ranked by BM25 score."""
    if _bm25 is None or not _corpus:
        return []
    tokens = _tokenise(question)
    scores = _bm25.get_scores(tokens)
    # Sort on the score only, so tied scores never fall back to comparing dicts.
    ranked = sorted(zip(scores, _corpus), key=lambda x: x[0], reverse=True)
    return [{**doc, "score": float(score)} for score, doc in ranked[:top_k] if score > 0]

The tradeoff. The weakness of keyword search is synonym blindness. If a user asks 'how do I update my profile?' but your document uses the phrase 'modify account details', BM25 will score it at or near zero because the two share no literal keywords. This is why many search systems combine BM25 with vector search, a pattern called hybrid search, covered a couple of lessons from now.
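Synonym blindness is easy to see by comparing token sets directly. The following sketch assumes the same lowercase, strip-punctuation, split-on-whitespace tokenisation used in this lesson; the function name `shared_terms` and the sample document sentence are made up for illustration.

```python
import string


def tokenise(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, split on whitespace."""
    return text.lower().translate(str.maketrans("", "", string.punctuation)).split()


def shared_terms(query: str, document: str) -> set[str]:
    """Tokens that appear in both the query and the document."""
    return set(tokenise(query)) & set(tokenise(document))


query = "how do I update my profile?"
document = "To modify account details, open the settings page."  # hypothetical text

print(shared_terms(query, document))  # set()
```

With no shared tokens, a purely lexical scorer has nothing to reward, however close the two sentences are in meaning.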