Vector Databases: Your AI's Memory

How Pinecone, Weaviate, and ChromaDB store and query embeddings at scale.

Traditional databases index rows for exact-match queries, such as checking whether a username exists. Vector databases index embeddings: high-dimensional arrays of floats. Rather than comparing a query against every stored vector, they use approximate nearest-neighbour structures such as Hierarchical Navigable Small World (HNSW) graphs, which link each vector to its closest neighbours. This lets you search for semantic concepts rather than literal characters.

The library organised by topic, not by title

A traditional database is a library catalogued strictly alphabetically by title. Looking up 'Machine Learning Basics' is instant. Looking up 'books about teaching computers to recognise patterns' returns nothing, even though that is the exact same book, because you didn't use the exact title. A vector database reorganises the entire library by *meaning* instead of by title. Every book gets physically shelved next to the books it is conceptually closest to, in a space with hundreds of dimensions instead of just one alphabet. Ask for 'teaching computers to recognise patterns' and you land in the same aisle as 'Machine Learning Basics', because the database is searching by proximity in meaning-space, not by matching letters.
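
To make "proximity in meaning-space" concrete, here is a minimal sketch of a similarity search in plain Python. The three-dimensional vectors and the extra book titles are invented for illustration; a real embedding model such as `all-MiniLM-L6-v2` produces 384 dimensions. The function `nearest_titles` is a hypothetical helper that compares the query against every book, which is exactly the brute-force scan that an index like HNSW is designed to avoid.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-picked toy embeddings (assumption: real ones come from a model).
SHELF = {
    "Machine Learning Basics": [0.9, 0.1, 0.0],
    "Sourdough for Beginners": [0.0, 0.2, 0.9],
    "Statistics in Practice": [0.7, 0.6, 0.1],
}


def nearest_titles(query_vector, shelf, k=2):
    """Rank every title by similarity to the query and keep the top k."""
    ranked = sorted(
        shelf,
        key=lambda title: cosine_similarity(query_vector, shelf[title]),
        reverse=True,
    )
    return ranked[:k]


# Pretend this is the embedding of "teaching computers to recognise patterns".
query = [0.85, 0.2, 0.05]
print(nearest_titles(query, SHELF))
# → ['Machine Learning Basics', 'Statistics in Practice']
```

The query shares no words with any title, yet it lands next to 'Machine Learning Basics' because only the vectors are compared.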

The role of payload metadata and multi-tenancy

In a shared company workspace, users must not see documents they are not authorized to access. A raw vector similarity search ranks documents purely by cosine distance, so it can return another tenant's or another user's private data. To prevent this, metadata fields (the payload) are stored alongside each vector, including tenant IDs and user Access Control Lists (ACLs), and every query filters on them.
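
The difference between a raw search and a filtered one can be sketched in a few lines of plain Python. This is a toy model of the idea, not any database's API: the `Point` class, the two search functions, and the sample documents are all hypothetical, and the two-dimensional vectors are made up.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Point:
    vector: list
    text: str
    tenant_id: str
    allowed_users: list = field(default_factory=list)


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def raw_search(points, query_vector, k):
    """Rank by similarity only. Nothing stops cross-tenant results."""
    ranked = sorted(
        points,
        key=lambda p: cosine_similarity(query_vector, p.vector),
        reverse=True,
    )
    return [p.text for p in ranked[:k]]


def filtered_search(points, query_vector, k, tenant_id, user):
    """Drop points the caller may not see, then rank what is left."""
    visible = [
        p for p in points
        if p.tenant_id == tenant_id and user in p.allowed_users
    ]
    return raw_search(visible, query_vector, k)


POINTS = [
    Point([0.9, 0.1], "Acme salary bands 2024", "acme", ["hr"]),
    Point([0.8, 0.3], "Acme expense policy", "acme", ["hr", "employee"]),
    Point([0.85, 0.2], "Globex payroll export", "globex", ["employee"]),
]
query = [1.0, 0.1]

print(raw_search(POINTS, query, k=2))
# → ['Acme salary bands 2024', 'Globex payroll export']
print(filtered_search(POINTS, query, k=2, tenant_id="acme", user="employee"))
# → ['Acme expense policy']
```

The raw search hands an ordinary Acme employee both an HR-only document and another company's data. The filtered search returns fewer results, but only ones the caller is allowed to read.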

Our project implementations: ChromaDB vs. Qdrant
Our sibling projects use two very different vector databases:

* ChromaDB (Semantic Vector Bot): A lightweight, developer-focused database. We initialize it as a persistent local store (`chromadb.PersistentClient`). By default, it embeds text automatically using an ONNX build of the `all-MiniLM-L6-v2` model, which runs entirely on the local CPU at no cost.
* Qdrant (Enterprise Search): A high-performance vector database. We initialize it as an in-memory instance (`QdrantClient(':memory:')`) configured for cosine distance. We chose Qdrant because it combines HNSW graph indexing with metadata payload filters, which we need for multi-tenant isolation.

```python
from qdrant_client.models import PointStruct

# Point creation in enterprise-search/database.py
# `qdrant` is the QdrantClient instance; `cid`, `emb`, `raw_text`, `source`,
# `tenant_id`, and `allowed_users` are defined elsewhere in that module.
qdrant.upsert(
    collection_name="kb",
    points=[
        PointStruct(
            id=cid,
            vector=emb,
            payload={
                "chunk_id": cid,
                "text": raw_text,
                "source": source,
                "tenant_id": tenant_id,
                "allowed_users": allowed_users,  # Allow-list, e.g. ["employee"]
            },
        )
    ],
)
```

For enterprise security, every point inserted into the database must carry three things: the embedding vector, the raw text that is injected into the LLM prompt, and filterable metadata (file source, tenant ID, and ACL tags).
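
One way to enforce that rule is to validate each point before it is written. The sketch below is an assumption on our part, not code from the project: `build_point` is a hypothetical helper that returns a plain dict with the same fields as the upsert call, and rejecting an empty `allowed_users` list is a design choice we made for the example.

```python
def build_point(cid, emb, raw_text, source, tenant_id, allowed_users):
    """Return a dict shaped like the upserted point, or raise if a field is missing."""
    if not emb:
        raise ValueError("vector must not be empty")
    if not tenant_id:
        raise ValueError("tenant_id is required for tenant isolation")
    if not allowed_users:
        # Assumption: a chunk nobody may read is treated as an ingestion mistake.
        raise ValueError("allowed_users must name at least one entry")
    return {
        "id": cid,
        "vector": list(emb),
        "payload": {
            "chunk_id": cid,
            "text": raw_text,
            "source": source,
            "tenant_id": tenant_id,
            "allowed_users": list(allowed_users),
        },
    }


point = build_point(
    cid=1,
    emb=[0.1, 0.2, 0.3],
    raw_text="Expense claims are due monthly.",
    source="policy.pdf",
    tenant_id="acme",
    allowed_users=["employee"],
)
print(sorted(point["payload"]))
# → ['allowed_users', 'chunk_id', 'source', 'tenant_id', 'text']
```

Failing loudly at ingestion is cheaper than discovering later that a chunk was stored without a tenant ID and so could never be filtered correctly.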

What's next
Vector databases are powerful, but they are not free. Even a local one adds a dependency, needs an embedding model, and takes compute to run, and hosted ones cost money on top of that. Is a vector store always necessary? Next up: Vector-less RAG, where we build accurate retrieval using nothing but keyword search.