Why AI Hallucinates (and What to Do About It)

Understand the root cause of hallucinations and the architectural patterns that reduce them significantly.

Last updated July 3, 2026

Hallucination is one of the most misunderstood failure modes in AI. People often assume it means the model is broken, or lying, or confused. It is none of those. Hallucination is the model doing exactly what it was trained to do - predict the most statistically plausible next token - without any mechanism to verify whether what it is saying is actually true. A confident tone tells you nothing reliable about correctness: the model sounds equally sure whether it is right or wrong.

The expert who never says 'I don't know'

Picture a highly fluent, supremely confident expert witness in a courtroom who has never been trained to say 'I don't know.' They have read everything ever written about their domain. When asked about a case they have no specific knowledge of, they do not hesitate - they construct a plausible-sounding narrative from patterns they have seen, delivered with the same confident tone as when they actually know the answer. The jury cannot tell the difference. The testimony sounds authoritative. The citations sound real. The statistics sound precise. But the expert is pattern-matching, not recalling. They have no internal alarm that fires when they cross from knowledge into fabrication. This is an LLM answering a question it does not actually know the answer to.

Why it happens mechanically. The model has no knowledge store it looks things up in. It has weights - billions of numerical parameters that encode statistical patterns from its training data. When it generates a response, it is sampling from a probability distribution at each token position. If you ask it to name the CEO of a small company it has rarely encountered in training, it does not consult a database. It finds the most likely continuation of the sequence 'The CEO of [company] is...' based on patterns. It might guess right. It might generate a plausible but completely fabricated name - and it has no way to tell the difference.
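
The sampling step described above can be sketched in a few lines of Python. This is a toy illustration, not how any particular model is implemented: the candidate names and their scores in `toy_logits` are invented for the example, and a real model scores tens of thousands of tokens rather than four. The point is that nothing in the loop checks a fact. The code only turns scores into probabilities and draws from them.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

def sample_next_token(logits, rng):
    """Pick one token at random, weighted by its probability."""
    probs = softmax(logits)
    tokens = list(probs)
    weights = [probs[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical scores for continuing "The CEO of [company] is ..."
# None of these candidates is checked against any source of truth.
toy_logits = {"Maria": 2.1, "James": 1.9, "Priya": 1.7, "unknown": -1.0}

rng = random.Random(0)  # seeded so the sketch is repeatable
probs = softmax(toy_logits)
token = sample_next_token(toy_logits, rng)
print(token, round(probs[token], 3))
```

Note that the low-scoring 'unknown' option is rarely drawn. A plausible name almost always wins, whether or not it is correct.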

The four conditions that maximise hallucination risk

1. Asking for precise facts it rarely encountered in training - specific statistics, obscure historical dates, niche product details.
2. Asking about events after its training cutoff - it will confidently fill the gap with plausible-sounding fiction.
3. Long-chain reasoning without external verification - small errors compound over many steps, for example across a 15-step calculation.
4. Asking it to cite sources - it will fabricate specific paper titles, journal names, and page numbers that sound real but do not exist. This is called citation hallucination and it is extremely common.
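
The compounding in condition 3 is easy to quantify with a rough model. The sketch below assumes each step is independently correct with the same probability. The 95% figure is an illustrative assumption, not a measured accuracy for any model.

```python
def chain_success_probability(per_step_accuracy, steps):
    """Probability that every step is correct, assuming independent steps."""
    return per_step_accuracy ** steps

# Assumed 95% per-step accuracy over a 15-step calculation.
p = chain_success_probability(0.95, 15)
print(f"{p:.3f}")  # 0.463
```

Under these assumptions, a chain that is right 95% of the time at each step gets all 15 steps right less than half the time, which is why intermediate results need to be checked externally.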

The fix is architectural, not a matter of prompting. You cannot reliably prompt your way out of hallucination. 'Do not make things up' does not work - the model has no awareness that it is making things up. The real solutions are structural:

RAG: Give the model the actual facts in the context. Instruct it to answer only from the provided documents and to say 'I don't know' if the answer is not in them. This dramatically reduces hallucination on factual tasks because the model is now summarising provided text rather than recalling from weights.
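
A minimal sketch of the prompt-assembly half of this pattern is shown below. The function name `build_grounded_prompt`, the instruction wording, and the sample documents are all hypothetical. The retrieval step that selects the documents, and the call to the model itself, are left out.

```python
def build_grounded_prompt(question, documents):
    """Assemble a prompt that restricts the model to the supplied documents."""
    numbered = "\n\n".join(
        f"[{i}] {text}" for i, text in enumerate(documents, start=1)
    )
    return (
        "Answer the question using only the documents below.\n"
        "If the answer is not in the documents, reply exactly: I don't know.\n\n"
        f"Documents:\n{numbered}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Placeholder documents standing in for real retrieval results.
docs = [
    "Acme Widgets was founded in 2011.",
    "Acme Widgets is headquartered in Leeds.",
]
prompt = build_grounded_prompt("Where is Acme Widgets headquartered?", docs)
print(prompt)
```

Numbering the documents is a design choice that makes it easier to ask the model which passage it relied on.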

Tool calling: Give the model tools that can fetch live, verified data - a calculator, a database query, a web search. Instead of guessing a number, it can compute it.
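
One way to picture this is a small dispatcher on the application side. The sketch below assumes the model emits a request shaped like `{"tool": ..., "arguments": {...}}`. That shape, the `TOOLS` registry, and the function names are illustrative assumptions, not any vendor's API. The calculator parses arithmetic with Python's `ast` module rather than calling `eval()`.

```python
import ast
import operator

# Arithmetic operators the calculator is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def calculator(expression):
    """Evaluate basic arithmetic (+, -, *, /) without using eval()."""
    def evaluate(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        raise ValueError("unsupported expression")
    return evaluate(ast.parse(expression, mode="eval").body)

# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"calculator": calculator}

def run_tool_call(call):
    """Dispatch a request of the form {"tool": name, "arguments": {...}}."""
    tool = TOOLS.get(call["tool"])
    if tool is None:
        raise KeyError(f"unknown tool: {call['tool']}")
    return tool(**call["arguments"])

# A request the model might emit instead of guessing the product.
request = {"tool": "calculator", "arguments": {"expression": "1234 * 5678"}}
result = run_tool_call(request)
print(result)  # 7006652
```

The result is computed by ordinary code and handed back to the model, so the number in the final answer does not depend on what the weights happen to predict.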

Structured output with citations: Require the model to quote the specific passage it used to answer. If it cannot quote it, the answer is suspect.
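
The quote requirement can be enforced mechanically. The sketch below assumes the model returns an answer together with the passage it claims to have used. The response shape, the function names, and the sample text are hypothetical.

```python
def normalize(text):
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return " ".join(text.lower().split())

def quote_is_supported(quote, documents):
    """Return True if the quoted passage appears verbatim in a source document."""
    needle = normalize(quote)
    return bool(needle) and any(needle in normalize(doc) for doc in documents)

# Placeholder source text standing in for the retrieved documents.
sources = [
    "The warranty covers parts and labour for 24 months from the purchase date."
]

# Hypothetical structured responses: an answer plus its supporting quote.
grounded = {"answer": "24 months", "quote": "covers parts and labour for 24 months"}
suspect = {"answer": "36 months", "quote": "covers parts and labour for 36 months"}

print(quote_is_supported(grounded["quote"], sources))  # True
print(quote_is_supported(suspect["quote"], sources))   # False
```

This check only confirms that the quote exists in the sources. It does not prove that the answer follows from the quote, so it works best as a filter in front of other checks.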

Confidence thresholds: Build systems that flag low-confidence responses for human review rather than surfacing them directly to users.
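
A routing rule of this kind might look like the sketch below. It assumes the system can obtain a log-probability for each generated token and uses their geometric mean as the confidence score. Both the scoring method and the 0.8 threshold are assumptions for illustration. Token probability is an imperfect proxy for correctness, so a real threshold would need tuning against labelled examples.

```python
import math

def mean_token_probability(token_logprobs):
    """Geometric-mean probability of the generated tokens."""
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))

def route_response(answer, token_logprobs, threshold=0.8):
    """Send low-scoring answers to human review instead of the user."""
    score = mean_token_probability(token_logprobs)
    destination = "user" if score >= threshold else "human_review"
    return {"answer": answer, "score": score, "destination": destination}

# Hypothetical log-probabilities for two generated answers.
confident = route_response("Paris", [-0.01, -0.02])
shaky = route_response("Dr. A. Example", [-0.9, -1.4, -0.7])

print(confident["destination"])  # user
print(shaky["destination"])      # human_review
```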

What's next

Now you understand the core limitation: the model's knowledge is baked in at training time and it has no mechanism to verify its own outputs against reality. The next lesson explores this from a different angle - why that frozen-knowledge constraint matters in production and what the full toolkit looks like for solving it.