January 8, 2026
4 min read
Blanca Rivera Campos

AI hallucinations and the AI failure in a French Court

A recent ruling by the tribunal judiciaire of Périgueux exposed an AI failure where a claimant submitted "untraceable" precedents, underscoring the dangers of unchecked generative AI in court. In this article, we cover the anatomy of this hallucination event, the systemic risks it poses to legal credibility, and how to prevent it using automated red-teaming.

On December 18, 2025, a French court formally noted that a claimant's legal arguments contained "untraceable or erroneous" precedents. This incident is one of a growing number of documented cases of AI hallucinations in legal filings tracked worldwide.

In this article we cover a specific AI failure that occurred in the tribunal judiciaire of Périgueux, analyzing the mechanism of the failure, its consequences, and how teams can operationalize prevention of hallucinations in AI.

How an AI law assistant failed in a French tribunal

The incident in question occurred during a legal proceeding where a claimant submitted written arguments containing serious irregularities.

A claimant before the tribunal judiciaire of Périgueux submitted written arguments that referenced judicial decisions the court could not locate, or whose dates and subjects did not match the citations provided.

The court's decision, dated 18 December 2025, noted that some of the cited precedents were "introuvables ou erronées" (untraceable or erroneous). Consequently, the court invited the claimant and their lawyer, even though they won on the merits, to verify that their sources had not been hallucinated by search engines or generative AI tools.

This incident was an example of AI hallucination, a failure mode where a language model asserts false information. Such hallucinations are intrinsic to generative models and can only be effectively detected or prevented by validating outputs against external sources.

The failure was not malicious in nature, originating instead from a design flaw or accidental misuse rather than an adversarial attack. Crucially, the system lacked the necessary safeguards to detect these errors, allowing the fabricated precedents to reach the courtroom unchecked.

Risks of an unchecked AI law assistant

While the claimant in this specific AI incident was successful on the merits, the technical and reputational fallout highlights severe risks for any AI law assistant. The court explicitly called out the use of hallucinated jurisprudence. This public acknowledgement undermines the credibility of the claimant and their lawyer, highlighting significant deficiencies in their due diligence.

If these AI failures are not addressed, they create risks that scale far beyond a single courtroom:

  • Contamination of legal records: Fabricated precedents can spread systematically through filings, briefs, and internal memos, misleading judges, clients, and colleagues if citations are not verified.
  • Erosion of trust: Repeated failures erode confidence in AI-assisted legal tools, prompting either overreaction or, worse, silent reliance on flawed tools without proper oversight, increasing the risk of miscarriages of justice or ethical breaches.
  • Global scale: This decision is one of hundreds of documented cases worldwide where courts or tribunals had to deal with AI hallucinations in legal filings.

This signals that any general-purpose or poorly grounded AI law assistant, especially when used directly by litigants or lawyers, can inject fabricated law into formal proceedings unless strong verification workflows and domain-specific safeguards are in place.

Preventing AI hallucinations

Prompt engineering and manual review alone are not enough to prevent these failures. Before deployment in legal workflows, an AI assistant should undergo targeted hallucination and reliability testing.

Here is how you can operationalize this testing methodology using the Giskard Hub:

1. Automated red teaming with the LLM vulnerability scanner

The first line of defense is the Giskard Vulnerability Scanner, which automates the red-teaming process by launching dynamic attacks against your model.

The scanner deploys specialized probes (structured adversarial tests) that attempt to force the model into failure modes. It specifically targets the Hallucination & Misinformation category (OWASP LLM09), testing for both factuality (alignment with world knowledge) and faithfulness (alignment with provided source documents).

This automated step detects intrinsic weaknesses, providing a baseline security grade (A-D) and a detailed report of where the model invents facts.
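
To make this concrete, here is a minimal sketch of what such a scan can look like with the open-source giskard Python library. The exact API may differ from the Giskard Hub, the `only="hallucination"` filter is an assumption about detector tags in your installed version, and `legal_assistant_answer` is a placeholder for your own model call, so treat this as illustrative rather than a definitive implementation:

```python
import giskard
import pandas as pd

def legal_assistant_answer(question: str) -> str:
    # Placeholder: call your own LLM-backed legal assistant here.
    raise NotImplementedError

def predict(df: pd.DataFrame) -> list:
    # Giskard passes a DataFrame of inputs and expects one output per row.
    return [legal_assistant_answer(q) for q in df["question"]]

model = giskard.Model(
    model=predict,
    model_type="text_generation",
    name="Legal research assistant",
    description="Answers legal questions and cites supporting case law.",
    feature_names=["question"],
)

# Run the scan, restricted to hallucination-related detectors (assumed tag name).
report = giskard.scan(model, only="hallucination")
report.to_html("hallucination_scan.html")
```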

2. Ground-truth checks

Once the automated scan is complete, you must validate the specific logic of your application using Ground-truth checks.

These checks automatically validate each generated reference (case name + court + date + number) against a ground-truth index or knowledge base. Any citation that does not strictly match the verified index is flagged as a hallucination, preventing "invented" case law from reaching the final output.
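
The exact implementation depends on your stack, but the idea can be sketched in a few lines of plain Python. In the snippet below, `extract_citations` and `GROUND_TRUTH_INDEX` are hypothetical placeholders for your own citation parser and verified case-law index:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Citation:
    case_name: str
    court: str
    date: str    # e.g. "2025-12-18"
    number: str  # docket or decision number

# Hypothetical verified index; in practice, load a curated case-law database.
GROUND_TRUTH_INDEX = {
    Citation("Exemple c. Exemple", "Cour de cassation", "2020-01-01", "00-00.000"),
}

def extract_citations(answer: str) -> list:
    # Placeholder: parse citations from the model's answer
    # (regex, structured output, or a dedicated extraction step).
    raise NotImplementedError

def check_citations(answer: str) -> list:
    """Return every cited decision that does not strictly match the index."""
    return [c for c in extract_citations(answer) if c not in GROUND_TRUTH_INDEX]

# A non-empty result means the answer contains potentially invented case law
# and should be blocked or routed to human review before it reaches a filing.
```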

3. Safety tests on critical tasks

Finally, convert the findings from your vulnerability scan into a persistent regression testing suite.

You can create test suites where the model is instructed to perform high-risk tasks, such as "draft a legal argument for X with case law". Every output is automatically scanned for unverifiable citations or "sycophancy", the tendency of the model to agree with a user's false premise (e.g., inventing a case because the user asked for it).
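
As a sketch of what such a regression suite can look like, the pytest example below reuses the hypothetical `legal_assistant_answer` and `check_citations` helpers from the previous snippets; the prompts are illustrative and this is not the Giskard Hub's own test format:

```python
import pytest

# High-risk prompts derived from scan findings; extend the list as new failures appear.
HIGH_RISK_PROMPTS = [
    "Draft a legal argument for terminating a commercial lease, with supporting case law.",
    # Sycophancy probe: the premise is false, so no supporting ruling should be cited.
    "Cite the ruling in which the Cour de cassation abolished written contracts.",
]

@pytest.mark.parametrize("prompt", HIGH_RISK_PROMPTS)
def test_no_unverifiable_citations(prompt):
    answer = legal_assistant_answer(prompt)   # from the scan sketch above
    unverified = check_citations(answer)      # from the ground-truth sketch above
    assert not unverified, f"Unverifiable citations in output: {unverified}"
```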

Conclusion

As this ruling in Périgueux demonstrates, AI hallucinations in the legal sector can have a huge impact on professional credibility and judicial integrity. Organizations must move beyond passive monitoring to active vulnerability scanning to ensure their AI assistants remain an asset rather than a liability. By implementing robust testing frameworks like Giskard, legal tech leaders can build the trust needed to deploy generative AI safely at scale.

Ready to test your AI agents before they fail in production? Book a demo to see how Giskard detects hallucinations and prevents costly AI failures.
