Causal Discovery in Insurance: A pyagrum Case Study
Learning causal structure when domain expertise is incomplete

There’s a strategic value to challenging your assumptions. To ask: what if the relationships you’ve believed for years are no longer true? What if the data is trying to tell you something different?
Today, we’re going to build that capability. We’ll learn how to discover causal relationships directly from your reporting data using causal discovery algorithms, implement them with pyagrum, and validate the results against domain expertise.
This is different from other forms of causal inference, such as encoding what you already know into a Bayesian network. Today, we’re talking about how to learn what you don’t know.
When You Don’t Know the Structure
There’s a tried-and-tested approach to causal inference: define a DAG based on domain expertise, then fit parameters to data. It works well when you have strong prior beliefs about how your system works. But what happens when you don’t?
Maybe you’re entering a new market and you don’t fully understand the local dynamics. Maybe your market has shifted and your old assumptions are stale. Maybe you’re working with a complex, multi-dimensional dataset and you’re not sure which relationships matter. Maybe you’re seeing patterns in your data that contradict your domain knowledge, and you want to investigate.
In these situations, causal discovery is invaluable. Instead of starting with a DAG, you start with data. You apply algorithms that search for causal relationships, and you let the data guide you toward a plausible causal structure.
This is fundamentally different from correlation analysis. Correlation tells you what’s associated with what. Causal discovery tells you what causes what. The distinction is crucial. Two variables can be highly correlated without one causing the other. But if you can establish a causal relationship, you can make predictions about what happens when you intervene.
How Causal Discovery Works
There are two main families of causal discovery algorithms: constraint-based and score-based. Both have strengths and limitations, and both are useful in different contexts.
Constraint-based algorithms work by testing conditional independence relationships in the data. The logic is: if two variables are independent given a third variable, then that third variable likely blocks the causal path between them. By systematically testing these conditional independencies, constraint-based algorithms can narrow down the set of plausible DAGs.
For example, if you observe that claims frequency and driver age are independent given territory, that suggests territory might be a common cause of both. Constraint-based algorithms build up a causal structure by identifying these blocking relationships.
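To make that logic concrete, here is a minimal, library-agnostic sketch of such a test, pooling per-stratum chi-square statistics; the column names (claims_frequency, driver_age, territory) are illustrative, not from a real dataset.

import pandas as pd
from scipy.stats import chi2, chi2_contingency

def ci_test(data, x, y, z):
    # p-value for "x independent of y given z": test each stratum of z, pool the results
    stat, dof = 0.0, 0
    for _, stratum in data.groupby(z, observed=True):
        table = pd.crosstab(stratum[x], stratum[y])
        if table.shape[0] > 1 and table.shape[1] > 1:  # skip degenerate strata
            s, _, d, _ = chi2_contingency(table)
            stat, dof = stat + s, dof + d
    return chi2.sf(stat, dof)

# e.g. ci_test(data, 'claims_frequency', 'driver_age', 'territory') close to 1
# suggests no direct link once territory is accounted for

Constraint-based algorithms run many tests like this one, systematically, over pairs of variables and candidate conditioning sets.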
The advantage of constraint-based approaches is that they’re theoretically grounded: if the conditional independence assumptions hold, the algorithm recovers the correct causal structure (up to its Markov equivalence class). The disadvantage is that they’re sensitive to violations of those assumptions. If your data is noisy or your sample size is small, the independence tests, and therefore the algorithm, can make mistakes.
Score-based algorithms work differently. They search over the space of possible DAGs and score each one based on how well it explains the data. The algorithm tries to find the DAG that maximizes the score—typically using metrics like the Bayesian Information Criterion (BIC) or Minimum Description Length (MDL).
The advantage of score-based approaches is that they’re more robust to noise and small sample sizes. They don’t rely on perfect conditional independence testing; they just try to find the structure that best explains the data. The disadvantage is that they’re computationally expensive (the space of possible DAGs is enormous) and they can get stuck in local optima.
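To ground the idea of scoring, here is a rough, library-agnostic sketch of how BIC decomposes for discrete data: each node contributes its multinomial log-likelihood given its parents, minus a penalty on its free parameters. The function and column handling are illustrative.

import numpy as np
import pandas as pd

def bic_term(data, node, parents):
    # Multinomial log-likelihood of one node given its parents, minus the BIC penalty
    n = len(data)
    if parents:
        counts = data.groupby(parents + [node], observed=True).size()
        parent_totals = counts.groupby(level=list(range(len(parents)))).transform('sum')
    else:
        counts = data.groupby(node, observed=True).size()
        parent_totals = counts.sum()
    loglik = float((counts * np.log(counts / parent_totals)).sum())
    r = data[node].nunique()                                    # states of the node
    q = len(data[parents].drop_duplicates()) if parents else 1  # parent configurations
    k = (r - 1) * q                                             # free parameters
    return loglik - 0.5 * k * np.log(n)

# The BIC of a whole DAG is the sum of bic_term(...) over its (node, parents) families;
# score-based search proposes candidate DAGs and keeps the best total.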
In practice, many modern causal discovery implementations use hybrid approaches that combine constraint-based and score-based methods. They start with constraint-based testing to narrow down the search space, then use score-based optimization to find the best structure within that space.
Introducing pyagrum
pyagrum is a Python library for probabilistic graphical models. It’s less well-known than bnlearn, but it’s particularly powerful for causal discovery and complex inference tasks. Where bnlearn excels at parameter learning and inference for a known structure, pyagrum excels at structure learning and more sophisticated inference.
pyagrum provides implementations of several causal discovery algorithms, including constraint-based methods (such as MIIC) and score-based methods (such as greedy hill climbing scored with BIC). It also provides tools for visualizing the learned structures and running inference queries.
Here’s the conceptual workflow:
import pyagrum as gum
import pyagrum.lib.notebook as gnb  # notebook helpers for visualization
import pandas as pd
# Load your data (already discretized; see Step 1 below)
data = pd.read_csv('actuarial_data.csv')
# Run causal discovery (constraint-based example using MIIC)
learner = gum.BNLearner(data)
learner.useMIIC()
learned_bn = learner.learnBN()
# Visualize the learned structure (in a Jupyter notebook)
gnb.showBN(learned_bn)
# Run inference on the learned structure
inference = gum.LazyPropagation(learned_bn)
# Evidence must use labels of the learned (discretized) variables;
# the labels below are illustrative
inference.setEvidence({'driver_age': '40-50', 'territory': 'Urban'})
inference.makeInference()
posterior = inference.posterior('claims_frequency')
What’s happening here is that pyagrum is learning the causal structure directly from your data. You don’t specify the DAG; the algorithm discovers it. The result is a Bayesian network that represents the causal relationships the algorithm found most plausible given your data.
Causal Discovery in Insurance
pyagrum is particularly useful for insurance applications for several reasons:
First, it implements multiple discovery algorithms. Different algorithms make different assumptions and work better in different contexts. pyagrum gives you access to several, so you can try different approaches and compare results. This is important because causal discovery is not a solved problem; different algorithms can produce different results on the same data.
Second, it handles complex inference scenarios. Once you’ve discovered a causal structure, you might want to run inference queries that are more sophisticated than simple prediction. You might want to compute causal effects (what happens if we intervene on this variable?). You might want to do counterfactual reasoning (what would have happened if we’d made a different decision?). pyagrum provides tools for these advanced inference tasks; a short sketch of an interventional query follows this list.
Third, it’s designed for transparency. Like bnlearn, pyagrum outputs are interpretable. You can inspect the learned structure, visualize it, and understand what the algorithm discovered. This is essential for validation and communication.
Fourth, it integrates with the broader probabilistic programming ecosystem. pyagrum works well with other Python libraries for data analysis, visualization, and statistical testing. This makes it easy to build a complete discovery-to-validation pipeline.
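To make the second point above concrete, here is a hedged sketch of an interventional query using pyagrum’s causal module, continuing from the learned_bn object in the workflow above; the variable names and labels are illustrative.

import pyagrum.causal as csl
# Wrap the learned network in a causal model
model = csl.CausalModel(learned_bn)
# Interventional query: P(claims_frequency | do(territory = 'Urban')),
# i.e. what happens to claims frequency if we force the territory mix
formula, impact, explanation = csl.causalImpact(
    model,
    on='claims_frequency',
    doing='territory',
    values={'territory': 'Urban'},
)
print(impact)  # posterior of claims_frequency under the intervention

The interventional posterior can differ from the ordinary conditional distribution whenever the intervened variable sits on confounded back-paths; that difference is exactly what the do-calculus machinery accounts for.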
The Discovery Process: A Practical Example
Let’s walk through a realistic scenario. You’re an actuary at a multi-line insurer, and you’re seeing unexpected volatility in your reserves across all lines. You have a hypothesis that it’s driven by claims volatility and external inflation, but the pattern doesn’t quite match what you’d expect.
You decide to use causal discovery to investigate. You gather data on:
Underwriting characteristics (age, territory, coverage type, line of business)
Premium (by line and segment)
Claims frequency and severity
Loss ratios
Reserve levels
External factors (inflation indices, market conditions)
You have 5 years of quarterly data, so 20 time periods, across multiple segments. That’s a rich dataset with many potential causal relationships.
Step 1: Prepare the data
First, you need to discretize continuous variables. Causal discovery algorithms typically work with discrete variables. You bin premium into categories (low, medium, high), loss ratio into categories, inflation into categories, etc. You also need to handle missing data and outliers.
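As a minimal sketch, quantile binning with pandas covers the basics; the column names and bin counts are illustrative, and binning aligned with your rating bands is usually preferable to generic quantiles.

import pandas as pd

data = pd.read_csv('actuarial_data.csv')
# Quantile-based bins give roughly equal-frequency categories
for col in ['premium', 'loss_ratio', 'inflation_index']:
    data[col] = pd.qcut(data[col], q=3, labels=['low', 'medium', 'high'])
# Dropping incomplete rows is the simplest treatment; imputation is less lossy
data = data.dropna()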
Step 2: Run discovery
You run a causal discovery algorithm (let’s say a constraint-based method like MIIC, as in the workflow above). The algorithm tests conditional independence relationships in your data and builds up a causal structure. It might discover, for example, that:
Underwriting characteristics cause premium
Premium and underwriting characteristics cause claims frequency
Claims frequency and severity cause loss ratio
Loss ratio and premium cause reserves
Inflation affects loss ratio
But it might also discover unexpected relationships:
Geographic mix shift causes loss ratio (not just external inflation)
Underwriting tightness in year N predicts claims frequency in year N+1
Reserve levels in year N affect underwriting decisions in year N+1 (a feedback loop; see the sketch after this list)
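Relationships like the last two are only discoverable if time is encoded in the data. Here is a hedged sketch of lag-feature construction with pandas (the segment and quarter columns are illustrative):

# Sort the panel, then create lagged columns so that any arc the algorithm
# finds between them automatically respects time order
panel = data.sort_values(['segment', 'quarter'])
panel['claims_frequency_next'] = panel.groupby('segment')['claims_frequency'].shift(-1)
panel['reserves_prev'] = panel.groupby('segment')['reserves'].shift(1)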
Step 3: Visualize and inspect
You visualize the learned DAG. You look at the structure and ask: does this make sense? Are there relationships that contradict domain knowledge? Are there relationships that surprise you?
Step 4: Validate against domain expertise
This is the critical step. You take the discovered structure to domain experts. You ask: “The algorithm found this relationship. Does it make sense? Is it plausible? Can you think of a mechanism?”
Some discovered relationships will pass this validation. Others won’t. You might refine the structure based on expert feedback. You might add constraints that reflect domain knowledge (e.g., “inflation can’t be caused by underwriting decisions”).
Step 5: Compare to alternatives
You run different discovery algorithms and compare results. Do they converge on the same structure? Or do they disagree? If they disagree, that’s a signal that the structure is uncertain, and you need to be careful about how you use it.
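Here is a hedged sketch of such a comparison in pyagrum, treating arcs that a constraint-based learner and a score-based learner agree on as higher-confidence edges.

def arc_names(bn):
    # Convert pyagrum's (tail-id, head-id) arcs into readable name pairs
    return {(bn.variable(t).name(), bn.variable(h).name()) for t, h in bn.arcs()}

miic_learner = gum.BNLearner(data)
miic_learner.useMIIC()
bn_miic = miic_learner.learnBN()

ghc_learner = gum.BNLearner(data)
ghc_learner.useGreedyHillClimbing()
bn_ghc = ghc_learner.learnBN()

agreed = arc_names(bn_miic) & arc_names(bn_ghc)    # edges to trust more
disputed = arc_names(bn_miic) ^ arc_names(bn_ghc)  # edges to investigate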
Step 6: Test on holdout data
You fit the discovered structure to a subset of your data, then test it on holdout data. Does the model generalize? Or does it overfit? This helps you assess whether the discovered relationships are robust.
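A hedged sketch of that check, assuming a recent pyagrum version (dictionary-of-labels indexing on CPTs, and useSmoothingPrior for Laplace-style smoothing so held-out rows never hit a zero probability):

import numpy as np

split = int(0.8 * len(data))  # chronological split suits time-indexed data
train, test = data.iloc[:split], data.iloc[split:]

learner = gum.BNLearner(train)
learner.useMIIC()
learner.useSmoothingPrior()   # avoid zero probabilities on holdout rows
bn = learner.learnBN()

def row_loglik(bn, row):
    # Sum log P(node | parents) over all nodes, indexing each CPT by label
    ll = 0.0
    for node in bn.nodes():
        names = [bn.variable(node).name()] + [bn.variable(p).name() for p in bn.parents(node)]
        ll += np.log(bn.cpt(node)[{n: str(row[n]) for n in names}])
    return ll

holdout_ll = sum(row_loglik(bn, r) for _, r in test.iterrows())

Comparing holdout_ll across candidate structures (normalized per row) gives a simple generalization measure; a structure that wins on training data but loses on holdout is overfitting.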
The Challenges of Discovery
Causal discovery is powerful, but it’s not magic. There are several challenges you need to be aware of:
Identifiability: Multiple causal structures can produce the same statistical relationships in the data; DAGs in the same Markov equivalence class imply exactly the same conditional independencies. This is called non-identifiability. The data alone can’t tell you which structure within the class is correct. You need domain knowledge to break the tie.
Sample size: Causal discovery algorithms need sufficient data to reliably estimate conditional independence relationships. With small samples, the algorithms can make mistakes. For insurance, this is usually not a problem (you typically have years of data), but it’s worth knowing.
Faithfulness and causal sufficiency: Causal discovery algorithms assume that the observed conditional independencies in the data reflect the true causal structure (faithfulness) and that all relevant common causes have been measured (causal sufficiency). If there are hidden confounders or complex feedback loops, these assumptions are violated, and the algorithm can produce incorrect results.
Computational complexity: The space of possible DAGs grows super-exponentially with the number of variables; there are already more than 4 × 10^18 DAGs on just 10 nodes. For large datasets with many variables, discovery can become computationally expensive. For most actuarial applications (typically 10-50 variables), this is manageable, but it’s something to keep in mind.
Validation burden: The discovered structure is only as good as your validation. If you don’t validate carefully against domain expertise, you can end up with a plausible-looking but incorrect causal structure.
When to Trust the Output
So when should you trust a discovered causal structure? Here are some guidelines:
Strong signals: Relationships that are discovered by multiple algorithms, that are statistically strong, and that are consistent with domain knowledge are high-confidence discoveries.
Weak signals: Relationships that are discovered by only one algorithm, that are statistically marginal, or that contradict domain knowledge are low-confidence. You should be skeptical of these.
Red flags: Relationships that violate domain knowledge or business logic (e.g., future events causing past events) are probably errors. You should investigate.
Feedback loops: A DAG is acyclic by construction, so a literal loop (A causes B, B causes A) can’t appear in the output; feedback instead shows up as cross-lag edges between time-indexed variables (reserves in year N affecting underwriting in year N+1, say). That’s interesting but needs careful interpretation. Feedback is real in insurance systems, but it’s also a sign that your data is capturing dynamic relationships that are hard to interpret from a single static structure.
Practical Considerations
A few practical points for implementing causal discovery in an insurance context:
Start with domain knowledge. Even though you’re doing discovery, you should start with constraints based on domain expertise. For example, you know that future events can’t cause past events. You know that certain variables are exogenous (like external inflation). You can encode these constraints into the discovery algorithm to guide it toward more plausible structures.
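A hedged sketch of how such constraints look in pyagrum’s BNLearner; the variable names are illustrative.

learner = gum.BNLearner(data)
learner.useMIIC()
# Inflation is exogenous: nothing in the portfolio can cause it
for col in ['premium', 'claims_frequency', 'loss_ratio', 'reserves']:
    learner.addForbiddenArc(col, 'inflation_index')
# A relationship you are confident about can be forced into the structure
learner.addMandatoryArc('driver_age', 'claims_frequency')
# With time-indexed variables, setSliceOrder can rule out future-causes-past arcs
learned_bn = learner.learnBN()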
Use multiple algorithms. Don’t rely on a single discovery algorithm. Try constraint-based, score-based, and hybrid approaches. If they converge on the same structure, that’s reassuring. If they disagree, that’s a signal to investigate further.
Validate extensively. Take the discovered structure to domain experts. Ask them to critique it. Ask them to suggest alternative structures. Use their feedback to refine the result.
Test on holdout data. Fit the structure to a subset of your data, then test it on holdout data. This helps you assess generalization and robustness.
Document your assumptions. Causal discovery makes many assumptions (about data quality, stationarity, faithfulness, etc.). Document these assumptions and be clear about when they might be violated.
The Bigger Picture
What we’ve described here is how to move from the known to the unknown. Instead of assuming we understand the causal structure and focusing only on learning its parameters, we learn the causal structure itself from data.
In reality, most actuarial work involves a mix of both. You have some domain knowledge about how your system works, but you’re also uncertain about some relationships. You have data that can teach you, but you know the data might be biased or incomplete.
The best approach is iterative. You start with domain knowledge to define a rough structure. You fit a Bayesian network to your data. You validate the results. If something doesn’t make sense, you use causal discovery to investigate. You discover new relationships, validate them, and refine your model. You repeat.
This is how you build causal models that are both theoretically sound and practically useful. You combine human expertise with algorithmic discovery. You let domain knowledge guide the process, but you also let the data surprise you.
The big question of causal inference is: are you willing to challenge your assumptions? Today, we’ve shown you how. The tools are available. The methodology is sound. The only remaining question is whether you’re willing to use them.


