On December 18, 2025, a French court formally noted that a claimant's legal arguments contained "untraceable or erroneous" precedents. This AI incident is one of a growing number of documented cases of AI hallucinations in legal filings tracked worldwide.
In this article we cover a specific AI failure that occurred in the tribunal judiciaire of Périgueux, analyzing how the failure happened, what its consequences were, and how teams can operationalize hallucination prevention in AI systems.
How an AI law assistant failed in a French tribunal
The incident occurred during a proceeding before the tribunal judiciaire of Périgueux, where a claimant submitted written arguments containing serious irregularities: the arguments referenced judicial decisions that the court could not locate, or whose dates and subjects did not match the citations provided.
The court's decision, dated 18 December 2025, noted that some of the cited precedents were "introuvables ou erronées" (untraceable or erroneous). Even though the claimant won on the merits, the court invited them and their lawyer to verify that their sources had not been hallucinated by search engines or generative AI tools.
This incident was an example of AI hallucination, a failure mode where a language model asserts false information. Such hallucinations are intrinsic to generative models and can only be effectively detected or prevented by validating outputs against external sources.
The failure was not malicious in nature, originating instead from a design flaw or accidental misuse rather than an adversarial attack. Crucially, the system lacked the necessary safeguards to detect these errors, allowing the fabricated precedents to reach the courtroom unchecked.
{{cta}}
Risks of unchecked AI law assistants
While the claimant in this specific AI incident was successful on the merits, the technical and reputational fallout highlights severe risks for any AI law assistant. The court explicitly called out the use of hallucinated jurisprudence, a public rebuke that undermines the credibility of the claimant and their lawyer and points to significant gaps in their due diligence.
If these AI failures are not addressed, they create risks that scale far beyond a single courtroom:
- Contamination of legal records: False precedents can spread systematically through filings, briefs, and internal memos, misleading judges, clients, or colleagues if citations are not verified.
- Erosion of trust: Repeated failures erode confidence in AI-assisted legal tools, leading either to overreaction or, worse, to silent reliance on flawed tools without proper oversight, which increases the risk of miscarriages of justice or ethical breaches.
- Global scale: This decision is one of hundreds of documented cases worldwide in which courts or tribunals have had to deal with AI hallucinations in legal filings.
This signals that any general-purpose or poorly grounded AI law assistant, especially when used directly by litigants or lawyers, can inject fabricated law into formal proceedings unless strong verification workflows and domain-specific safeguards are in place.
Preventing AI hallucinations
Relying on prompt engineering or manual review alone is not enough to prevent these failures. Before deployment in legal workflows, the AI should undergo targeted hallucination and reliability testing.
Here is how you can operationalize this testing methodology using the Giskard Hub:
1. Automated red teaming with the LLM vulnerability scanner
The first line of defense is the Giskard Vulnerability Scanner, which automates the red-teaming process by launching dynamic attacks against your model.
The scanner deploys specialized probes (structured adversarial tests) that attempt to force the model into failure modes. It specifically targets the Hallucination & Misinformation category (OWASP LLM09), testing for both factuality (alignment with world knowledge) and faithfulness (alignment with provided source documents).
This automated step detects intrinsic weaknesses, providing a baseline security grade (A-D) and a detailed report of where the model invents facts.
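As a rough illustration, here is a minimal sketch of wiring a legal assistant into the open-source giskard Python library and launching a scan. The `answer_legal_question` function, the model name, and the output filename are placeholders for your own setup; the hosted Giskard Hub offers a managed version of the same workflow.

```python
# Minimal sketch using the open-source giskard Python library (2.x).
# `answer_legal_question` is a placeholder for your own assistant's pipeline.
import giskard
import pandas as pd

def answer_legal_question(question: str) -> str:
    # Replace with your actual LLM or RAG pipeline call.
    raise NotImplementedError

def predict(df: pd.DataFrame) -> list:
    # Giskard calls the model with a DataFrame containing the declared features.
    return [answer_legal_question(q) for q in df["question"]]

model = giskard.Model(
    model=predict,
    model_type="text_generation",
    name="Legal research assistant",
    description="Answers legal questions and cites supporting case law.",
    feature_names=["question"],
)

# Run the automated scan: hallucination-related detectors probe the model
# with adversarial prompts and report where it invents facts.
scan_report = giskard.scan(model)
scan_report.to_html("legal_assistant_scan.html")
```

The resulting report gives you the baseline view of hallucination-prone behaviors described above, before any legal-specific checks are layered on top.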
2. Ground-truth checks
Once the automated scan is complete, you must validate the specific logic of your application using Ground-truth checks.
These checks automatically validate each generated reference (case name + court + date + number) against a ground-truth index or knowledge base. Any citation that does not strictly match the verified index is flagged as a hallucination, preventing "invented" case law from reaching the final output.
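The exact implementation depends on your citation format and reference database, but the core logic is straightforward to sketch in plain Python. Everything below (the citation pattern, the `VERIFIED_INDEX` set, and the sample entry) is illustrative rather than a specific Giskard API:

```python
# Illustrative ground-truth citation check (plain Python, not a Giskard API).
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Citation:
    court: str
    date: str    # ISO date, e.g. "2023-05-11"
    number: str  # docket / pourvoi number

# Hypothetical verified index; in practice this is your case-law database.
VERIFIED_INDEX = {
    Citation("Cour de cassation, 1re civ.", "2023-05-11", "21-00.000"),
}

# Matches citations formatted as "Cour de cassation, 1re civ., 2023-05-11, n° 21-00.000".
CITATION_PATTERN = re.compile(
    r"(?P<court>Cour de cassation[^,]*,[^,]*),\s*"
    r"(?P<date>\d{4}-\d{2}-\d{2}),\s*n°\s*(?P<number>[\d.\-]+)"
)

def flag_hallucinated_citations(answer: str) -> list:
    """Return every extracted citation that is absent from the verified index."""
    extracted = [
        Citation(m["court"].strip(), m["date"], m["number"])
        for m in CITATION_PATTERN.finditer(answer)
    ]
    return [c for c in extracted if c not in VERIFIED_INDEX]
```

A non-empty result can then block the answer, trigger a regeneration, or route the draft to a human reviewer before it leaves the system.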
3. Safety tests on critical tasks
Finally, convert the findings from your vulnerability scan into a persistent regression testing suite.
You can create test suites where the model is instructed to perform high-risk tasks, such as "draft a legal argument for X with case law". Every output is automatically scanned for unverifiable citations or "sycophancy": the tendency of the model to agree with a user's false premise (e.g., inventing a case because the user asked for it).
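A hedged sketch of this step, reusing the `model` and `scan_report` objects from the scanner example above (the suite name is arbitrary):

```python
# Freeze the scan findings into a reusable regression suite.
test_suite = scan_report.generate_test_suite("Legal assistant - hallucination regression")

# Re-run the suite on every new model, prompt, or retrieval change; a failing
# test means a previously detected hallucination pattern has resurfaced.
results = test_suite.run(model=model)
print("Suite passed:", results.passed)
```

Running this suite in CI turns a one-off audit into a continuous guardrail: citation- or sycophancy-related regressions surface before a new version reaches users.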
Conclusion
As this ruling in Périgueux demonstrates, AI hallucinations in the legal sector can have a serious impact on professional credibility and judicial integrity. Organizations must move beyond passive monitoring to active vulnerability scanning to ensure their AI assistants remain an asset rather than a liability. By implementing robust testing frameworks like Giskard, legal tech leaders can build the necessary trust to deploy generative AI safely at scale.
Ready to test your AI agents before they fail in production? Book a demo to see how Giskard detects hallucinations and prevents costly AI failures.




