A Practical Guide to LLM Hallucinations and Misinformation Detection
This blog is part of a series on vulnerabilities of LLMs.
Hallucination and misinformation in LLMs refer to the generation of fabricated or false content. This can result in misleading information or malicious narratives, with potentially serious consequences for you, your clients and your business. Although most people know roughly what a hallucination is, those of us involved in AI projects should have a deeper understanding of the concept and a broad idea of its causes and prevention.
A Taxonomy of LLM Hallucinations and Misinformation Types
Luckily, researchers from the Harbin Institute of Technology and Huawei published a paper introducing a taxonomy of the different types of hallucinations in LLMs. It distinguishes two main categories, factuality and faithfulness, each containing several sub-categories.
- Factuality is defined as the alignment of generated text to world knowledge.
- Faithfulness is the alignment of generated text to source knowledge, like the instruction and the context.
So, factuality deals with the mismatch between generated content and verifiable real-world facts, typically arising as factual inconsistencies. Models can struggle with factuality in two ways: 1) by contradicting verifiable facts (factual inconsistency), or 2) by fabricating claims that cannot be verified at all (factual fabrication).
Faithfulness, on the other hand, deals with the alignment of the provided user input (prompt+context) with the generated response. Hallucinations can also occur within the generated content, indicating a lack of self-consistency. In practice, models can struggle with faithfulness in three distinct ways: 1) by being inconsistent with the instructions, 2) by being inconsistent with the context or 3) by introducing inconsistent logic during response generation.
Now that you have a better understanding of the main categories of hallucinations and misinformation, we can move on to their potential causes and how to prevent them.
Why do LLMs hallucinate? Causes and Evaluation Techniques
Even the simplest implementations of the most popular LLMs hallucinate, which is a problem. We train models to be as helpful and assertive as possible, but when it comes to hallucinations, that is exactly the behaviour we want to avoid, which makes the model's training objective a double-edged sword.
Enhancing LLMs with Up-to-Date World Knowledge
The data a model sees during pre-training plays a significant role in determining the factual knowledge it acquires. When the same or similar information is repeated and re-introduced often enough, a language model is optimised to regenerate that information correctly. It is therefore important to properly curate the data used for model training, ensuring it is of sufficient quality and diversity. Curating a "better" dataset isn't always as simple as it seems, and often isn't even possible, because model providers only have access to the data that exists today; they have no access to private data or to data that will only be created in the future.
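As a very rough illustration of one curation step, a heuristic quality filter can drop documents that are too short or mostly noise before training; the function name and thresholds below are purely illustrative, and real pipelines combine many such filters with deduplication and language checks:

```python
def passes_quality_filter(doc: str) -> bool:
    """Toy heuristic quality filter: drop documents that are very short
    or consist mostly of non-alphabetic noise. Thresholds are illustrative."""
    words = doc.split()
    if len(words) < 5:
        return False
    alpha_ratio = sum(ch.isalpha() for ch in doc) / max(len(doc), 1)
    return alpha_ratio > 0.6

corpus = [
    "The Eiffel Tower is located in Paris, France.",
    "$$$ !!! 404 @@@",   # noise, filtered out
    "Buy now!!!",        # too short, filtered out
]
print([doc for doc in corpus if passes_quality_filter(doc)])
```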
Effective Prompting Strategies for LLM Hallucinations
Ambiguous or open-ended prompts are another cause of hallucinations, as vague or poorly structured prompts can push LLMs towards speculative responses instead of fact-based ones. For example, adding a simple statement like "be concise" can cause models to start hallucinating, because the model appears to be optimised to satisfy the user's preference for conciseness rather than factuality. On top of that, models often take intentional or unintentional false statements in prompts at face value: as mentioned above, models are not optimised to debunk or question their end-users and creators.
The recent popularity of agents and their tools introduces a new source of hallucinations caused by prompting strategies. Because these tools rely on forwarding a complete set of arguments to a specific function, providing incomplete information can lead the model to hallucinate the rest. For example, when a tool requires a person's first name, surname and age, but the user only provides part of that information, the model may simply invent the missing arguments.
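To make this concrete, here is a minimal sketch with a hypothetical `register_user` tool: the schema forces the model to supply every argument, and a naive check flags argument values that never appeared in the user's message:

```python
# Hypothetical tool schema in the JSON-schema style used by most
# function-calling APIs; names and fields are illustrative only.
register_user_tool = {
    "name": "register_user",
    "parameters": {
        "type": "object",
        "properties": {
            "first_name": {"type": "string"},
            "surname": {"type": "string"},
            "age": {"type": "integer"},
        },
        "required": ["first_name", "surname", "age"],
    },
}

def validate_tool_call(arguments: dict, user_message: str) -> list[str]:
    """Flag arguments whose values never appeared in the user's message (naive check)."""
    return [key for key, value in arguments.items()
            if str(value).lower() not in user_message.lower()]

# The user only provided a first name and surname...
user_message = "Please register Jane Doe."
# ...but the model returned a complete call, hallucinating the age.
model_call = {"first_name": "Jane", "surname": "Doe", "age": 34}

print(validate_tool_call(model_call, user_message))  # ['age']
```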

Do you want to learn more about hallucinations? Our research team built Phare, a multilingual benchmark for evaluating LLMs, with a testing suite dedicated specifically to hallucination.
Understanding Uncertainties in Generative AI for Better LLM Evaluation
Language models don't understand things like humans do. They don't “know” when they're unsure. So when they give you a wrong answer, they might say it with just as much confidence as a right one.
Also, LLMs try to be fast, so they usually pick the next word that seems most likely, one step at a time. But just because each word looks like a good guess doesn't mean the whole sentence ends up being a good answer.
It's like guessing one word at a time in a sentence, always picking the word that fits best right now, without checking if the whole sentence actually makes sense.
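A toy example with made-up token probabilities shows the difference between picking the locally most likely word at each step and picking the most likely sentence overall:

```python
# Made-up next-token probabilities for a two-word continuation of a prompt.
step1 = {"nice": 0.40, "the": 0.35, "a": 0.25}
step2 = {  # second-word probabilities, conditioned on the first word
    "nice": {"day": 0.30, "try": 0.30, "one": 0.40},
    "the": {"best": 0.90, "worst": 0.10},
    "a": {"mistake": 0.50, "win": 0.50},
}

# Greedy decoding: always take the single most likely next word.
w1 = max(step1, key=step1.get)
w2 = max(step2[w1], key=step2[w1].get)
print(f"greedy: {w1} {w2} (p = {step1[w1] * step2[w1][w2]:.3f})")  # nice one (p = 0.160)

# Scoring full sequences instead reveals a more likely sentence overall.
best = max(
    ((a, b, step1[a] * step2[a][b]) for a in step1 for b in step2[a]),
    key=lambda t: t[2],
)
print(f"best  : {best[0]} {best[1]} (p = {best[2]:.3f})")          # the best (p = 0.315)
```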

Remember how we talked about LLMs sometimes sounding confident even when they're wrong? Well, agentic systems take that one step further.
Instead of answering one question, they think in steps, like solving a puzzle piece by piece. But if one step makes a little mistake, the next step builds on that mistake, and the step after that builds on that mistake. So minor errors can stack up, like a wobbly tower getting shakier the higher it goes.
This is known as error compounding: even if each step seems okay on its own, the final result can be way off, because every tiny mistake adds up and throws off the logic by the end.
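A back-of-the-envelope calculation makes the point: assuming each step is independently correct 95% of the time (already an optimistic assumption), a longer chain quickly becomes unreliable:

```python
# Rough illustration of error compounding in a multi-step agent,
# assuming each step is independently correct with probability p.
p = 0.95
for steps in (1, 3, 5, 10, 20):
    print(f"{steps:2d} steps -> {p ** steps:.0%} chance the whole chain is correct")
# 1 step: 95%, 10 steps: ~60%, 20 steps: ~36%
```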

How to reduce Hallucinations with Better LLM Evaluation and Prompting
So, unless we come up with a fundamentally different way for LLMs to function, hallucinations are here to stay. Still, we can take measures to make hallucinations occur less frequently, or at least to keep them from reaching the end-user.
Improving LLM World Knowledge with RAG and External Retrieval
We can improve the world knowledge of LLMs in two ways: 1) by fine-tuning the LLM, or 2) by hooking the LLM up to a Retrieval Augmented Generation (RAG) system. Fine-tuning LLMs correctly is difficult to master, and because we don't fully understand how LLMs add and retain knowledge, it can degrade the model's general knowledge while only slightly improving the intended, more specific knowledge.
Instead, we often implement RAG systems, which strike a nice balance between effort, risk and result. These systems retrieve useful information and forward it as context alongside the prompt, which the LLM then uses to produce a better, more up-to-date response. A drawback of this approach is that it is often unclear which parts of the retrieved context were actually used to generate the response. Still, companies like Anthropic have introduced citation features that attribute responses to the source documents, which helps deal with this.
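Here is a minimal sketch of the idea, with a toy keyword retriever standing in for the embedding model and vector store a real system would use; the prompt also asks the model to cite its sources and to admit when the context doesn't contain the answer:

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the question.
    A real RAG system would use embeddings and a vector database instead."""
    ranked = sorted(documents, key=lambda d: len(tokens(question) & tokens(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, context: list[str]) -> str:
    """Forward the retrieved passages as numbered sources the model must cite."""
    sources = "\n".join(f"[{i + 1}] {passage}" for i, passage in enumerate(context))
    return (
        "Answer the question using ONLY the sources below. Cite the source number "
        "for every claim, and say 'I don't know' if the sources don't contain the answer.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )

docs = [
    "Phare is a multilingual benchmark for evaluating LLMs.",
    "Retrieval Augmented Generation adds retrieved documents to the prompt as context.",
    "Guardrails are lightweight checks applied to model inputs and outputs.",
]
question = "What is Phare?"
print(build_prompt(question, retrieve(question, docs)))  # send this prompt to your LLM
```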

Better Prompting Strategies for LLM Hallucination Reduction
Prompting strategies are closely tied to these uncertainties: they can help a model generate more nuanced and carefully weighed answers. Simply using higher-quality prompts with factual information and precise phrasing can significantly reduce hallucinations. Additionally, adding examples to the prompt (few-shot prompting) helps models pick up on the intended way to interpret instructions and context.
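For instance, a couple of hand-written input/output pairs (entirely made up here) show the model both the expected format and that answering "none" is better than guessing:

```python
# Made-up few-shot prompt: the examples demonstrate both the expected format
# and the fact that "none" is preferable to a fabricated answer.
few_shot_prompt = """Extract the company name from the sentence. If no company is mentioned, answer "none".

Sentence: "Anthropic introduced a citations feature for source attribution."
Company: Anthropic

Sentence: "The weather was lovely in Paris last weekend."
Company: none

Sentence: "Researchers benchmarked several open-source models."
Company:"""

print(few_shot_prompt)
```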
Another method that builds on few-shot prompting is Chain-of-Thought (CoT) prompting. The CoT paper found that by adding reasoning demonstrations to the prompt and prompting the model to "think step by step", they could elicit more nuanced reasoning in model outputs, often improving overall quality. This use of reasoning to enhance performance has been a fundamental inspiration for reasoning models such as OpenAI o1 and its open-source competitor DeepSeek-R1.
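In its simplest zero-shot form, this amounts to appending an instruction like the one below to the question; the question and phrasing here are made up for illustration, and the original paper instead prepends full worked examples before the question:

```python
# Illustrative zero-shot chain-of-thought style prompt.
question = (
    "A library has 3 shelves with 24 books each and lends out 17 books. "
    "How many books remain in the library?"
)

cot_prompt = (
    question
    + "\nLet's think step by step, then give the final answer on its own line."
)

# A typical chain-of-thought response reasons 3 * 24 = 72, then 72 - 17 = 55,
# rather than jumping straight to a (possibly wrong) number.
print(cot_prompt)
```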

LLM Evaluation and Guardrails for Controlling Hallucinations
As introduced in our previous blog, guardrails are a fundamental part of reducing risks in LLMs. They can be implemented specifically to reduce LLM hallucinations, focusing on verifying the LLM's output against the input, the context, and verifiable factual information.
Guardrail techniques can be implemented in many different ways, but they share the characteristic of being lightweight and fast compared to the LLM they are supposed to guard. Guardrails can be models that verify translations, estimate model uncertainty, or simply fact-check the claims made in a response against the provided context.
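As a rough sketch of that last idea, a context-grounding guardrail can score each claim in the response against the provided context; the `entailment_score` function below is a toy word-overlap stand-in for the small NLI model or fact-checking component a real guardrail would use:

```python
def entailment_score(premise: str, claim: str) -> float:
    """Toy stand-in: fraction of the claim's words that appear in the premise.
    A production guardrail would use a small NLI model here instead."""
    claim_words = set(claim.lower().split())
    premise_words = set(premise.lower().split())
    return len(claim_words & premise_words) / max(len(claim_words), 1)

def grounding_guardrail(claims: list[str], context: str, threshold: float = 0.7) -> dict:
    """Flag claims in the LLM response that the retrieved context does not support."""
    unsupported = [c for c in claims if entailment_score(context, c) < threshold]
    return {"allowed": not unsupported, "unsupported_claims": unsupported}

context = "The Phare benchmark evaluates LLMs in multiple languages."
claims = [
    "Phare evaluates LLMs in multiple languages.",
    "Phare was created in 2019.",   # made-up claim, not supported by the context
]
print(grounding_guardrail(claims, context))
```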
Instead of guardrails, you could also opt for a more thorough quality assessment of your models. This allows you to conduct more specific and tailored tests and comparisons without being constrained by response-speed considerations. Our LLM Evaluation Hub also offers a service where we continuously adapt and feed evaluation tests with real-world knowledge, ensuring that tests are exhaustive and proactively adapted to the most recent and relevant risks.
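For example, an offline evaluation loop (sketched below with a placeholder `ask_model` function and a tiny hand-written test set) can run richer checks than a guardrail could afford at inference time:

```python
# Offline evaluation sketch; `ask_model` is a placeholder for your LLM client.
def ask_model(question: str) -> str:
    raise NotImplementedError("plug in the call to your LLM here")

test_set = [
    {"question": "What does RAG stand for?",
     "expected_keywords": ["retrieval", "augmented", "generation"]},
    {"question": "What is the Phare benchmark used for?",
     "expected_keywords": ["evaluating", "llms"]},
]

def evaluate(test_set: list[dict]) -> list[dict]:
    """Check that each answer mentions the expected keywords. Because this runs
    offline, slower and more thorough checks (multiple judges, retries, human
    review) are also possible."""
    results = []
    for case in test_set:
        answer = ask_model(case["question"]).lower()
        passed = all(keyword in answer for keyword in case["expected_keywords"])
        results.append({"question": case["question"], "passed": passed})
    return results
```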

Key Insights on Hallucinations in LLMs and How to Evaluate Them
Hallucinations in LLMs are a real and recurring issue rooted in how models are trained, prompted, and used. Whether they're factual errors or misalignment with instructions, these failures can undermine trust and introduce serious risks to your AI system.
Understanding the causes, from outdated data to ambiguous prompts and agentic uncertainty, is key to reducing them. Techniques like RAG, chain-of-thought prompting, fine-tuning, and strong guardrails can help, but ongoing testing and evaluations remain essential for truly understanding your AI systems.
Want to go deeper? Explore our Phare benchmark and our blog on hallucination testing to see how different models perform under pressure.