The Hallucination Tax: Why We Need to Put AI on Trial
When being wrong has actual consequences, single-model AI is no longer enough.

Imagine you’re sitting at the dinner table with your kids, and they’re telling you about the day they had. They recount a fascinating story about a history lesson, complete with dates, names, and a dramatic conclusion. It sounds perfectly plausible. But then you ask a follow-up question, and they hesitate. They add a detail that contradicts the first part. You realise they aren’t lying maliciously; they just want to give you a good story. They are optimising for your approval, not for the truth.
This is exactly what happens every time you ask a frontier AI model a complex question. When ChatGPT or Claude confidently tells you something wrong, it hasn’t made a mistake. The model is doing exactly what its neurons are optimised to do: produce an answer you will accept.
In the financial services sector, we have a massive problem with this. We are deploying these models to summarise regulatory documents, extract ESG data, and draft client reports. But when being wrong has actual consequences—when a fabricated citation or a hallucinated regulatory requirement makes it into a final document—the cost is immense. We are paying a hidden “hallucination tax” in the form of manual verification, eroded trust, and potential liability.
The Over-Compliance Problem
Recent research from Tsinghua University found that fewer than 0.01% of neurons in a language model are responsible for hallucination. They call them H-Neurons. What these neurons encode isn’t wrong information. It’s the drive to give you an answer, any answer, rather than say “I don’t know.”
The researchers tested these neurons against four failure types: hallucination, sycophancy (agreeing with you when you’re wrong), false premise acceptance, and jailbreak vulnerability. The same neurons drove all four. Amplify them, and all four get worse. Suppress them, and all four improve. At the neuron level, these are not separate failures but one behaviour: over-compliance.
And here is the kicker: safety training doesn’t fix it. The researchers measured what happens to these neurons during alignment—the safety training every AI company performs before release. The H-Neurons showed a parameter stability score of 0.97 out of 1.0 through the entire process. Safety training doesn’t restructure them. The models are fundamentally built to please us, even if it means making things up.
The Post-Hoc Verification Shift
At Wangari, we spend a lot of time building error-checking into our systems. We design causal inference models that force the data to explain the “why” before we trust the “what.” But the broader market is starting to realise that for general-purpose AI, we need a different approach: post-hoc verification.
Instead of trying to build a single, perfect model that never hallucinates, the new paradigm is to assume the model will hallucinate, and build systems to catch it after the fact. We are moving from “trust the AI” to “put the AI on trial.”
I was recently inspired by one of our fantastically creative readers of this very blog, Maarten Rischen, who built a product called Triall AI. (Full disclosure: this isn’t sponsored; I’m just trying the next hot thing in AI built by someone in our community). Maarten kept catching models fabricating sources in his own work, so he started manually cross-referencing outputs between Claude, Grok, and ChatGPT. When they disagreed, he knew he had a problem. So, he automated the process.
Three Models, One Verdict
Triall doesn’t try to fix individual models. It puts them on trial through a nine-stage process. Three different AI models (like Claude Opus, Grok, and GPT) answer your question independently. Because they have different architectures, they have different failure patterns. When they disagree, that is valuable information.
But it goes further. The models then blind peer-review each other. They check for false confidence and unchallenged assumptions. The best answer gets attacked by an adversarial critic. A different model refines what survives. Finally, specific claims are checked against live web sources. Not AI checking AI, but real sources checking AI.
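To make the shape of this concrete, here is a minimal sketch of a multi-model trial in Python. It is not Triall’s actual implementation: the model names, the call_model stub, and the crude overlap-based agreement check are placeholders you would replace with real provider SDK calls and a proper semantic comparison, and the live web-source check is omitted entirely.

```python
from difflib import SequenceMatcher

# Hypothetical stub: wire each name up to its provider's SDK (Anthropic, xAI, OpenAI, ...).
def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError(f"Connect {model} to a real API here.")

MODELS = ["claude-opus", "grok", "gpt"]  # placeholder identifiers

def agreement(a: str, b: str) -> float:
    """Crude textual-overlap score; a stand-in for a proper semantic comparison."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def put_on_trial(question: str, threshold: float = 0.6) -> dict:
    # Stage 1: each model answers independently, without seeing the others.
    answers = {m: call_model(m, question) for m in MODELS}

    # Stage 2: flag model pairs whose answers diverge -- disagreement is the signal.
    disagreements = [
        (a, b)
        for i, a in enumerate(MODELS)
        for b in MODELS[i + 1:]
        if agreement(answers[a], answers[b]) < threshold
    ]

    # Stage 3: an adversarial critic attacks the leading answer.
    best = answers[MODELS[0]]
    critique = call_model(
        MODELS[1],
        f"Attack this answer. List every unsupported or unverifiable claim:\n{best}",
    )

    # Stage 4: a different model refines whatever survives the critique.
    refined = call_model(
        MODELS[2],
        f"Question: {question}\nDraft answer: {best}\nCritique: {critique}\n"
        "Rewrite the draft, removing anything the critique showed to be unsupported.",
    )

    return {"answers": answers, "disagreements": disagreements, "refined": refined}
```

The point of the sketch is that the verification structure lives outside any single model; each stage is cheap, and no stage has to trust the one before it.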
This is the exact workflow lawyers and researchers are starting to adopt manually. As Adam David Long recently argued, AI isn’t “a thing” that gets smarter; it’s a pool of capabilities. You don’t hire “an AI” to write a legal brief. You use one model to draft, a second to punch holes in it, a third to check the citations, and a fourth to flag risky claims. You review the final set of options, not a single raw output.
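That division of labour fits in a few lines of orchestration. A rough sketch, again with a placeholder call_model function and made-up role prompts rather than any particular vendor’s API:

```python
# Hypothetical stub: swap in real SDK calls for whichever providers you use.
def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError(f"Connect {model} to a real API here.")

def reviewed_brief(task: str) -> dict:
    # Each role goes to a different model so their failure patterns don't overlap.
    draft = call_model("model-a", f"Draft a response to the following task:\n{task}")
    objections = call_model("model-b", f"Punch holes in this draft. Be adversarial:\n{draft}")
    citations = call_model("model-c", f"List every citation in this draft and whether it can be verified:\n{draft}")
    risks = call_model("model-d", f"Flag any claim in this draft that would be costly if wrong:\n{draft}")
    # The human reviews a set of options and objections, not a single raw output.
    return {"draft": draft, "objections": objections, "citations": citations, "risks": risks}
```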
Research Shoutout — Scaling Sustainable Digital Platforms
We are conducting academic research on how sustainable digital platforms grow and scale responsibly. If your company embeds environmental or social goals into its core business model, we’d love to speak with you.
The study involves 2–3 short interviews with key employees. Participation is anonymous and confidential, the time commitment is low, and you’ll receive early access to our findings.
Interested? Reach out to us directly:
• Ari Joury, Cofounder & CEO, Wangari Global — ari.joury@wangari.global
• Melanie Gertschen, PhD Candidate, University of Bern — melanie.gertschen@unibe.ch
The Bottom Line
We need to stop treating AI like a junior associate whose work we must painstakingly review, and start treating it like a system of components that can check each other. The lawyers, academics, and financial professionals who thrive in the next five years won’t be the ones who find “trustworthy AI.” They will be the ones who build verification structures that work even when any single component is wrong.
Whether you use a dedicated tool like Triall, enterprise platforms like Maxim AI, or simply build your own multi-model workflows, the era of accepting a single AI’s output at face value is over. If the stakes are high, one model is a liability. Three models, arguing it out, is a strategy.
Reads of the Week
LLM Hallucinations Still Exist, Just on a Higher Level: In this piece for his newsletter, Florin Andrei argues that while current models are mostly hallucination-free at the syntax level, they still fail spectacularly when coordinating complex, multi-component systems. This is highly relevant for financial data pipelines where a single logic error cascades through the entire workflow. It is a sobering reminder that as our systems get more complex, our testing must evolve.
Trust but Verify: How to Get Reliable Work From AI: In this essay for LawSnap, Adam David Long breaks down why professionals need to stop looking for “trustworthy AI” and start designing verification workflows. He argues that AI is a pool of capabilities, not a single entity, and we should use multiple models to draft, attack, and verify work. This is the exact mindset shift required for anyone using AI for high-stakes financial or regulatory analysis.
How to Stop AI From Making Things Up: In this practical guide, Avi Hakhamanesh explores how models are optimised to be “good test-takers” who guess rather than admit uncertainty. She provides concrete prompting strategies to force models to cite sources, flag uncertainty, and acknowledge missing data. If you are building internal AI tools for your team, these prompt additions are mandatory reading.


