Giving Companies a Machine-Readable Theory of Themselves
From Knowledge Graphs to Causal DAGs

In recent years, knowledge graphs have become the go-to method for organizing corporate data. They allow us to link entities, attributes, and relationships into a coherent web of meaning. If you’ve ever seen a graph connecting a company to its CEO, industry, ESG score, or carbon emissions, you’ve interacted with one.
But as powerful as these graphs are for describing a company, they fall short in one crucial area:
They don’t tell us why anything happens.
At Wangari, we believe companies need more than a graph of what is.
They need a machine-readable theory of themselves—a model that not only reflects their current state but explains how that state came to be, and how it might evolve under different conditions.
That requires moving beyond knowledge graphs and into the realm of causal DAGs:
Directed Acyclic Graphs that encode not just semantic links, but structural cause-and-effect relationships.
This article is about that shift—and why it matters.
Knowledge Graphs: The Semantic Backbone
A knowledge graph (KG) represents entities (like “Company A” or “CEO B”) as nodes, and their relationships (e.g. “has CEO”) as edges. These structures are ideal for organizing facts, powering search, or enriching AI agents with contextual awareness.
For example, a corporate KG might encode:
Company A → has CEO → Person B
Company A → is in sector → Industrials
Company A → has ESG score → 62
Such graphs form the semantic foundation of many modern LLM applications, recommendation engines, and corporate data lakes. They answer questions like:
Who is the CEO?
What sector is this firm in?
Which suppliers are exposed to which countries?
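To see why these are lookup questions rather than causal ones, here is a minimal sketch of a knowledge graph as a set of subject–predicate–object triples with pattern matching over them. The entity and predicate names are illustrative, not a real schema:

```python
# A minimal in-memory knowledge graph: a set of (subject, predicate, object) triples.
triples = {
    ("Company A", "has CEO", "Person B"),
    ("Company A", "is in sector", "Industrials"),
    ("Company A", "has ESG score", "62"),
    ("Company A", "has supplier", "Supplier X"),
    ("Supplier X", "located in", "Vietnam"),
}

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# "Who is the CEO?"
print(query("Company A", "has CEO"))

# "Which suppliers are exposed to which countries?" -- a two-hop pattern.
for _, _, supplier in query("Company A", "has supplier"):
    for _, _, country in query(supplier, "located in"):
        print(supplier, "->", country)
```

Every answer here is a retrieval over stored facts; nothing in the structure says whether the ESG score influences revenue, or anything else.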
But here’s the problem: none of these relationships imply causality.
They’re connective tissue, not explanatory structure.
Knowing that a company has a low ESG score and declining revenue tells us nothing about whether or how one influences the other.
Causal DAGs: Modeling "Why"
A causal DAG (Directed Acyclic Graph) looks similar to a knowledge graph on the surface. It also has nodes (variables) and directed edges (arrows). But these arrows have a very different meaning:
They encode hypothesized causal relationships.
That is, a causal DAG says: “Variable X causes changes in Variable Y”—not just that X is related to Y.
For example, a simple causal DAG might include:
Leadership Diversity → Innovation Output
Innovation Output → Revenue Growth
Brand Trust → Talent Retention
These aren’t just visualizations—they’re computational models.
We can use them to:
Simulate interventions (“What if leadership diversity increases?”)
Estimate causal effects (“How much does innovation output affect revenue?”)
Predict counterfactuals (“What would have happened if trust hadn’t declined?”)
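The intervention idea above can be sketched with a toy linear structural causal model over the three illustrative edges. The coefficients are invented for demonstration only; in practice they would be estimated from data:

```python
# Toy linear SCM: leadership_diversity -> innovation_output -> revenue_growth,
# brand_trust -> talent_retention. Coefficients are hypothetical.

def simulate(diversity, brand_trust, interventions=None):
    """Evaluate the SCM top-down; `interventions` overrides a variable (do-operator)."""
    do = interventions or {}
    v = {
        "leadership_diversity": do.get("leadership_diversity", diversity),
        "brand_trust": do.get("brand_trust", brand_trust),
    }
    # Structural equations (made-up linear coefficients):
    v["innovation_output"] = do.get("innovation_output",
                                    0.8 * v["leadership_diversity"])
    v["revenue_growth"] = do.get("revenue_growth",
                                 0.5 * v["innovation_output"])
    v["talent_retention"] = do.get("talent_retention",
                                   0.6 * v["brand_trust"])
    return v

baseline = simulate(diversity=1.0, brand_trust=1.0)
# Intervention: "What if leadership diversity doubles?"
intervened = simulate(diversity=1.0, brand_trust=1.0,
                      interventions={"leadership_diversity": 2.0})
print(baseline["revenue_growth"])    # 0.4
print(intervened["revenue_growth"])  # 0.8
```

The key property: setting a variable by intervention propagates only downstream along the arrows, which is exactly what a knowledge-graph edge cannot express.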
Crucially, causal DAGs allow for reasoning under change.
They form the backbone of strategic foresight tools that aim to go beyond observation into simulation.
Why Companies Need Causal Structure
Let’s bring this down to earth.
Imagine two firms—similar in size, industry, and product—but one is bouncing back after a downturn, and the other is stagnating. Their dashboards look similar. Their KPIs are flat. The difference isn’t in what’s measured, but in the underlying causal dynamics.
Maybe one has:
a tighter feedback loop between customer complaints and product adjustments
or a culture where internal knowledge flows freely
or more resilient supplier relationships triggered by early warning systems
None of this shows up in a typical knowledge graph. But a causal DAG can model these mechanisms—and allow us to test which ones actually matter.
That’s what we mean by a machine-readable theory of the company:
A model that encodes not just structure but influence.
It’s not just a map of “what connects to what.”
It’s a map of what drives what.
From Data to DAG: How We Build It
Constructing a causal DAG isn’t trivial. It requires navigating two big challenges:
Variable Selection and Abstraction
In knowledge graphs, nodes often represent entities (e.g. companies, people, products).
In causal DAGs, nodes typically represent variables—quantities that can change and exert influence.
So we first define a rich, multi-level ontology:
Structural variables: team size, reporting lines, funding rounds
Cultural variables: trust, openness, psychological safety
Behavioral variables: decision-making cycle time, turnover rate
External signals: public sentiment, regulation pressure, peer performance
These are derived from structured data, but also from unstructured sources: filings, earnings calls, sustainability reports, Glassdoor reviews, etc.
We use a hybrid RAG (Retrieval-Augmented Generation) pipeline to extract and structure these variables.
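As a rough sketch of what the extracted output of such a pipeline might look like, the variables can be typed against the ontology levels above. Field names, units, and sources here are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Variable:
    """One node candidate for the causal DAG (illustrative schema)."""
    name: str     # e.g. "turnover_rate"
    level: str    # "structural" | "cultural" | "behavioral" | "external"
    unit: str     # e.g. "headcount", "pct_per_year", "score_0_100"
    sources: list = field(default_factory=list)  # where the signal was extracted

ontology = [
    Variable("team_size", "structural", "headcount", ["org_chart"]),
    Variable("psychological_safety", "cultural", "score_0_100", ["glassdoor_reviews"]),
    Variable("turnover_rate", "behavioral", "pct_per_year", ["filings"]),
    Variable("regulation_pressure", "external", "score_0_100", ["news", "filings"]),
]

for v in ontology:
    print(f"{v.level:>10}  {v.name} [{v.unit}]")
```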
Causal Discovery with Priors
We don’t just assume causal links—we discover them, using a blend of:
Expert priors: curated assumptions about what drives what
LLM-generated hypotheses: educated guesses from foundation models trained on vast corpora
Causal discovery algorithms: statistical methods like PC, GES, or NOTEARS to validate directionality
This hybrid approach allows us to construct DAGs that are both interpretable and grounded in data.
And because our data updates in real-time, the DAGs evolve.
Our models don’t just describe a company once—they track and simulate it over time.
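To make the discovery step concrete, here is a minimal version of the conditional-independence check that constraint-based algorithms like PC run repeatedly. The synthetic data and variable names are illustrative; real pipelines use proper statistical tests and search over many conditioning sets:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Ground-truth chain: diversity -> innovation -> revenue (plus noise).
diversity = rng.normal(size=n)
innovation = 0.8 * diversity + rng.normal(scale=0.5, size=n)
revenue = 0.5 * innovation + rng.normal(scale=0.5, size=n)

def partial_corr(x, y, z):
    """Correlation of x and y after regressing out z (a first-order CI test)."""
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    return np.corrcoef(rx, ry)[0, 1]

marginal = np.corrcoef(diversity, revenue)[0, 1]
conditional = partial_corr(diversity, revenue, innovation)

# Diversity and revenue correlate marginally, but are nearly independent
# given innovation -- evidence for a mediated path, not a direct edge.
print(f"corr(diversity, revenue)              = {marginal:.2f}")
print(f"corr(diversity, revenue | innovation) = {conditional:.2f}")
```

This is the kind of evidence that, combined with expert priors and LLM-generated hypotheses, lets the pipeline keep or prune a candidate edge.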
The Endgame: A Causal Digital Twin
When you combine:
a well-scoped ontology
a structured data pipeline
and a dynamic causal DAG,
you don't just get analytics.
You get what we call a causal digital twin.
It’s a system-aware, intervention-ready model of an organization that reflects both its internal structure and external context.
It can be queried like a knowledge graph, but also simulated like a scientific model.
And that unlocks a new frontier:
What drives resilience?
What happens if a key team dissolves?
How does external scrutiny affect internal dynamics?
Which strategic moves actually change long-term outcomes?
These are the questions most executives want to answer but, until now, have lacked a formal way to ask.
The Bottom Line: A Mirror, Not a Prediction
Causal DAGs are not fortune tellers.
They don’t give certainties.
What they offer is something subtler—and, arguably, more powerful:
A structured way to think.
By representing a company’s most meaningful internal and external relationships in a causal form, they give analysts, leaders, and investors a new kind of visibility—one that invites exploration, not just explanation.
At Wangari, we believe the future of strategic intelligence won’t come from bigger dashboards.
It will come from better models—models that let organizations see themselves clearly, simulate with purpose, and act with intention.


