Causal Inference Is Finally Here
Why causal inference is finally arriving in industry — thirty years after it was invented

For three decades, a small community of researchers has known something that the rest of the data world is only now beginning to absorb: the most important question you can ask about your data is not “what correlates with what?” It is “what causes what?” The methodology to answer that question rigorously has existed since the early 1990s. The tools to apply it at enterprise scale have not — until now.
This is the story of a thirty-year gap between a scientific breakthrough and its practical arrival. And it is a story that matters enormously for anyone in financial services who is trying to build AI systems they can actually trust.
The problem with correlation
Correlation is the workhorse of modern data analysis. It is fast, scalable, and surprisingly powerful. When you train a model on historical data and ask it to predict future outcomes, you are essentially asking it to find and exploit correlations — patterns that held in the past and might hold in the future. This works well enough in stable conditions. It fails, often silently, when conditions change.
The deeper problem is that correlation cannot answer the question that actually matters in many regulated industries: what happens if we intervene? If you change your underwriting criteria, restructure a portfolio, or shift your ESG policy, you are not observing the world; you are changing it. A correlation-based model has nothing principled to say about what happens next. It can only extrapolate from the past. So when unprecedented conditions arise, it is flummoxed: it was trained on a world where things co-occurred, and it has no mechanism for reasoning about a world you have deliberately altered in previously unseen ways.
This is not a data quality problem. It is not a model size problem. It is a fundamental limitation of correlation as a mode of reasoning. And it is why industry professionals like actuaries, risk officers, and investment analysts have always maintained a healthy scepticism toward purely statistical models — even when they could not always articulate exactly why.
What causal inference actually does
Causal inference, in the technical sense developed by Judea Pearl and colleagues, is a framework for reasoning about interventions. Instead of asking “what tends to happen when X is high?”, it asks “what would happen if we set X to a specific value, overriding whatever would normally determine it?” The two questions sound similar. They have very different answers, and they require very different mathematics.
The key tool is the Structural Causal Model: a formal representation of a system as a set of variables and the directional mechanisms that connect them. Not just correlations, but causes. The model encodes which variables drive which outcomes, through which pathways, and with what structure. Once you have that model, you can answer interventional questions directly — not by extrapolating from historical patterns, but by reasoning through the causal structure of the system.
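To make the distinction concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the three-variable structure, the coefficients, and the variable names. The same simulated system is asked both questions, and the answers diverge.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# A three-variable structural causal model (purely illustrative):
#   U -> X and U -> Y   (U is a hidden common cause)
#   X -> Y              (the mechanism we want to quantify)
def simulate(n, do_x=None):
    u = rng.normal(size=n)
    # Observation: X arises naturally from U.
    # Intervention: we set X ourselves, severing the U -> X arrow.
    # That surgery is Pearl's do(X = x) operator.
    x = 0.8 * u + rng.normal(size=n) if do_x is None else np.full(n, float(do_x))
    y = 0.3 * x + 0.9 * u + rng.normal(size=n)
    return x, y

# "What tends to happen when X is high?": the observational slope
x, y = simulate(n)
obs_slope = np.cov(x, y)[0, 1] / np.var(x)

# "What happens if we set X?": the interventional contrast
_, y1 = simulate(n, do_x=1.0)
_, y0 = simulate(n, do_x=0.0)
do_effect = y1.mean() - y0.mean()

print(f"observational slope: {obs_slope:.2f}")  # ~0.74 (mechanism + confounding)
print(f"effect under do(X):  {do_effect:.2f}")  # ~0.30 (the mechanism alone)
```

The observational slope mixes the true mechanism with the influence of the hidden common cause; the interventional contrast isolates the mechanism, because setting X by hand severs its dependence on U.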
For industry and financial services, this matters in ways that are immediately practical. A model of a manufacturing plant built on causal structure can tell you whether improving sustainability practices will actually improve financial performance — or whether both are driven by a third factor, like management quality or regulatory environment. A risk model built on causal structure can tell you which interventions will actually reduce tail risk — not just which variables happen to be correlated with it. These are the questions that senior decision-makers are actually asking. Correlation-based models cannot answer them.
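As a sketch of that confounding point, again with invented names and numbers: if a management-quality factor drives both sustainability and performance, a naive regression overstates the effect of sustainability, while adjusting for the measured confounder recovers the structural coefficient. This is the backdoor adjustment that a causal graph licenses.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical data-generating process: management quality is a common
# cause of both sustainability practices and financial performance.
management = rng.normal(size=n)
sustainability = 0.8 * management + rng.normal(size=n)
performance = 0.3 * sustainability + 0.9 * management + rng.normal(size=n)

def ols_slope(features, y):
    """Coefficient of the first feature in an ordinary least-squares fit."""
    X = np.column_stack([np.ones(len(y)), *features])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

naive = ols_slope([sustainability], performance)                 # confounded
adjusted = ols_slope([sustainability, management], performance)  # backdoor-adjusted

print(f"naive slope:    {naive:.2f}")     # ~0.74: overstates the effect
print(f"adjusted slope: {adjusted:.2f}")  # ~0.30: the structural coefficient
```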
Why it took thirty years
If the methodology was ready in the 1990s, why are we only now seeing it arrive in enterprise software? The honest answer is that applying causal inference at scale has always required an enormous amount of expert labor.
Building a causal model is not like training a neural network. You cannot simply feed it data and let it find patterns. You need to specify the causal structure of the system — which variables are causes, which are effects, which are confounders. This requires domain expertise, iterative validation, and careful reasoning about the mechanisms at play.
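In code, that structure usually takes the form of a directed acyclic graph whose edges are expert claims rather than learned patterns. A hypothetical sketch, with invented variable names and edges:

```python
import networkx as nx

# A hypothetical credit-risk system, written down as a directed acyclic
# graph. Every edge is a causal claim supplied and defended by a domain
# expert; none of this can be read off a correlation matrix.
g = nx.DiGraph([
    ("macro_conditions", "underwriting_policy"),
    ("macro_conditions", "default_rate"),        # confounding pathway
    ("underwriting_policy", "portfolio_quality"),
    ("portfolio_quality", "default_rate"),
])

assert nx.is_directed_acyclic_graph(g)  # causal graphs must be acyclic

# The structure makes assumptions explicit and checkable, for example
# the direct causes of the outcome we care about:
print(sorted(g.predecessors("default_rate")))
# ['macro_conditions', 'portfolio_quality']
```

The point is not the dozen lines of code; it is the weeks of expert argument each edge represents.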
For a complex system with dozens of interacting variables, this process could take weeks of expert workshops. And that was before you got to the question of how to translate the resulting model into answers to specific business questions.
The bottleneck was never the mathematics. It was the cost of applying the mathematics to real-world problems. Causal inference was tractable in academic settings, where a team of specialists could spend months on a single model. It was not tractable in enterprise settings, where you need answers in days, not months, and where the domain experts who could validate the causal structure are also the people running the business.
What changed: AI agents, of course
The emergence of capable AI agents has changed this equation in a way that is genuinely new. Tasks that previously required weeks of expert time — synthesising domain literature to identify candidate variables, proposing and testing causal graph structures, running systematic validation checks, translating business questions into formal interventional queries — can now be completed in hours. The methodology has not changed. The infrastructure for applying it at scale has.
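For a sense of what “translating business questions into formal interventional queries” looks like mechanically, here is a sketch using the open-source DoWhy library and its standard identify-then-estimate workflow. The data and variable names are invented; only the API calls are real.

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel  # pip install dowhy

# Invented data with the same confounded structure as the earlier sketches.
rng = np.random.default_rng(2)
n = 10_000
management = rng.normal(size=n)
sustainability = 0.8 * management + rng.normal(size=n)
performance = 0.3 * sustainability + 0.9 * management + rng.normal(size=n)
df = pd.DataFrame({"management": management,
                   "sustainability": sustainability,
                   "performance": performance})

# Business question: "does improving sustainability improve performance?"
# Formal version: the average effect of do(sustainability) on performance,
# with management declared as a common cause of treatment and outcome.
model = CausalModel(data=df,
                    treatment="sustainability",
                    outcome="performance",
                    common_causes=["management"])

estimand = model.identify_effect(proceed_when_unidentifiable=True)
estimate = model.estimate_effect(estimand,
                                 method_name="backdoor.linear_regression")
print(round(estimate.value, 2))  # ~0.3, matching the structural coefficient
```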
This is not the same as saying that AI agents can replace domain expertise. They cannot, and they should not. The judgment layer — validating the causal structure against real-world knowledge, deciding which interventions are worth modelling, interpreting results in context — remains human. What agents automate is the process layer: the high-volume, well-defined, error-prone work that was consuming most of the expert’s time without requiring most of the expert’s judgment.
The combination of mature causal methodology and modern agentic AI infrastructure is not a marginal improvement on existing approaches. It is a different class of tool — one that can answer questions that correlation-based systems cannot, at a cost that is now commercially viable for the first time.
The bottom line
The organisations that build causal AI capabilities now are not just getting better analytics. They are building a fundamentally different relationship with their data — one where the question “why?” has a rigorous, auditable answer, not just a plausible-sounding one. In regulated industries, where the cost of a wrong answer is measured in capital requirements, regulatory penalties, and reputational damage, that difference is not academic. It is the whole game.
The methodology has been ready for thirty years. The infrastructure just caught up.
The window for first-mover advantage is open. It will not stay open.
Reads of the Week
The problem with agentic AI in 2025: In this essay, Sangeet Paul Choudary argues that most organisations are treating agentic AI as a faster version of robotic process automation — and missing the point entirely. His central claim is that the real value of agents is not in executing workflows more cheaply, but in eliminating the logic of workflows altogether, and that governance — not execution speed — is the primary performance driver of a well-designed agentic system. Directly relevant to anyone thinking about how AI agents should be deployed in regulated, high-stakes environments.
Correlation vs. Causation: Why It Matters for Investors: Alessio Sancetta’s take makes the core argument with unusual clarity: correlation describes a pattern, but without a causal anchor, even robust-looking relationships can collapse the moment conditions change. The 2022 equity-bond drawdown is the worked example — a correlation that held for two decades, built on a conditional relationship that most practitioners had mistaken for a structural one. A useful complement to this week’s post, written for a portfolio construction audience rather than a technical one.
Causation Does Not Imply Variation: John H. Cochrane offers a useful corrective from the other direction: just because you have identified a causal effect does not mean it explains much of the variation in the outcome you care about. Cochrane’s argument — that the causality revolution in econometrics has produced many well-identified but tiny effects, and that practitioners often jump from “this causes that” to “this explains that” without stopping to think — is an important caveat for anyone building causal models in production. Read it as a reminder that causal inference is a tool for answering specific questions, not a general-purpose explanation of the world.