How to read these
The clusters are organized by where in the work they apply. The first two cover what the framework is and when it fits — start here if you are deciding whether to use causal modeling at all. The middle two cover the operational arc of an engagement — how you build a model, and how you query it once it exists. The last four cover advanced operations, methodological pairs, and reference material.
Each section below opens with framing on its sub-hub, then a collapsible list of the pages in it. Expand any block to read the explanation inline, or follow the link at the bottom to the full page.
Foundations (2 sub-hubs)
Causal modeling rests on a small number of structural ideas, and most misuse of “AI for risk” comes from mistaking these ideas for statistical ones. The pages in this cluster establish the floor: what is a causal model, what kinds of questions can it answer, and what reasoning operations does the framework expose?
Read in order, the cluster takes you from “what is this thing” through “how does evidence enter it” to “what categories of question does it answer.” If you have not seen this material, you are reading every other section of the site at a disadvantage.
▸A causal model simulates, it doesn’t predict.
Prediction is a Rung 1 operation: it describes the world as it is. Simulation is a Rung 2 operation: it reasons about the world as it would be under a specific intervention. The distinction is not technical — it is logical, and different kinds of decisions require different kinds of formal structures. A tool that predicts can give you a calibrated forecast and still be useless for a question that requires an intervention.
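The gap between predicting and simulating can be shown on a toy model. The sketch below is illustrative only: the three binary variables (Z confounding X and Y) and every probability in it are hypothetical, chosen so that conditioning on X and intervening on X give visibly different answers.

```python
# Hypothetical model: Z -> X, Z -> Y, X -> Y. All numbers are made up.
P_Z = {0: 0.5, 1: 0.5}                       # P(Z=z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}              # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,   # P(Y=1 | X=x, Z=z), keyed (x, z)
                 (0, 1): 0.5, (1, 1): 0.7}


def p_x_given_z(x, z):
    """P(X=x | Z=z) for binary X."""
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1.0 - p1


def observe(x):
    """Rung 1: P(Y=1 | X=x), conditioning on X as it was observed."""
    num = sum(P_Z[z] * p_x_given_z(x, z) * P_Y1_GIVEN_XZ[(x, z)] for z in P_Z)
    den = sum(P_Z[z] * p_x_given_z(x, z) for z in P_Z)
    return num / den


def intervene(x):
    """Rung 2: P(Y=1 | do(X=x)), with the Z -> X arrow cut (back-door adjustment)."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in P_Z)


observed_effect = observe(1) - observe(0)      # what a predictive tool reports
causal_effect = intervene(1) - intervene(0)    # what changing X would actually do
print(round(observed_effect, 2), round(causal_effect, 2))
```

Under these assumed numbers the observed difference is roughly twice the interventional one, because part of the association between X and Y is carried by Z rather than by X itself.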
▸What a Bayesian network is.
Not a statistical summary of what has happened. A formal representation of the mechanism — which variables cause which others, with what strength, under what conditions. The map can be queried in any direction: forward to predict consequences, backward to diagnose causes. The structure is what carries the causal claim; the conditional probability tables quantify it.
▸What a Structural Causal Model is.
A system of equations — one per variable — that encodes both the causal structure (which variables affect which) and the individual-level variation (what makes each case unique). It is the only formal object proven sufficient to answer interventional and counterfactual questions from observational data. A Bayesian network gets you to Rung 2; an SCM is what makes Rung 3 computable rather than rhetorical.
▸How Bayesian updating incorporates new evidence.
Organizations accumulate evidence continuously — operational data, market signals, audit findings, incident reports. Most of it is used informally, if at all. Bayesian updating is the mechanism for using it formally: starting from a prior belief, incorporating the evidence, and arriving at a posterior belief that is the correct basis for the next decision. The procedure is mechanical once the causal structure is specified; the work is in specifying it.
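As a minimal sketch of the prior-to-posterior step, the example below updates a belief about a hypothetical control twice, once per piece of evidence. The hypothesis names, the prior, and the likelihoods are all invented for illustration, and the sequential form assumes the two pieces of evidence are conditionally independent given the hypothesis.

```python
def update(prior, likelihood):
    """Return the posterior over hypotheses after one piece of evidence.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(evidence | hypothesis)
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}


# Hypothetical numbers: a control believed to be failing with probability 0.10.
prior = {"control_failing": 0.10, "control_ok": 0.90}

# P(evidence | hypothesis) for two hypothetical evidence items.
audit_finding = {"control_failing": 0.70, "control_ok": 0.20}
incident_report = {"control_failing": 0.60, "control_ok": 0.05}

after_audit = update(prior, audit_finding)
after_incident = update(after_audit, incident_report)  # posterior becomes the next prior

print(round(after_audit["control_failing"], 3))
print(round(after_incident["control_failing"], 3))
```

The arithmetic is the mechanical part. Deciding which hypotheses exist and what each piece of evidence is conditionally independent of is the structural work the paragraph above refers to.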
▸Pearl’s ladder of causation.
A logical hierarchy, not a technical one. Three question types — what is correlated, what would happen under intervention, what would have happened differently — are categorically distinct kinds of reasoning, not levels of sophistication. The page covers the formal proof that the gap between rungs cannot be closed by scale, data, or algorithmic improvement; only by a different kind of model.
▸The Rung 3 procedure in practice.
Every population model can tell you what tends to happen. Only a counterfactual model can tell you what would have happened to this specific case under different conditions — because it anchors the individual’s unobserved background from what actually happened before running the hypothetical. The page walks the three-step procedure on a real causal model: abduction, action, prediction.
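The three steps can be sketched on the smallest possible case. The structural equation below (Y = 2X + U_y) is a hypothetical stand-in, not a model from the site; it is chosen because the background term can be recovered exactly from a single observation.

```python
def f_y(x, u_y):
    """Hypothetical structural equation for Y: Y = 2*X + U_y."""
    return 2.0 * x + u_y


def counterfactual_y(x_obs, y_obs, x_new):
    """What Y would have been for this specific case had X been x_new."""
    # Step 1, abduction: recover this case's background term from what happened.
    u_y = y_obs - 2.0 * x_obs
    # Step 2, action: replace the observed X with the hypothetical value.
    x = x_new
    # Step 3, prediction: rerun the equation with the recovered background.
    return f_y(x, u_y)


# This case was observed at X=1, Y=3, so its background term is U_y = 1.
y_counterfactual = counterfactual_y(x_obs=1.0, y_obs=3.0, x_new=2.0)

# A population model that assumes U_y averages zero answers a different question.
y_population = f_y(2.0, 0.0)

print(y_counterfactual, y_population)
```

The two numbers differ because the counterfactual carries this case's recovered background forward, while the population answer averages it away.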
▸Pearl’s ladder is the headline. The rest of the menu is wider.
The three rungs are a useful taxonomy, not the whole framework. Once an SCM exists, an unexpectedly wide set of substantive questions reduce to inferential procedures on the model — mediation, sensitivity to unmeasured confounding, transportability, selection-bias correction, attribution, dynamic treatment regimes, fairness, mechanism interventions, distribution-shift robustness. The page is the menu of what the framework can be asked.
A causal analysis can fail two ways. The first is structural — the model is built on a wrong assumption about how variables relate, and the answer is precisely wrong rather than approximately right. The second is conceptual — a Bayesian network is brought to a problem it was never the right tool for, and the analysis would be better served by something else entirely.
These four pages address both failure modes. The first two cover the structural pitfalls — confounding, mediation, moderation, and the paradox they produce when handled poorly. The other two name when a Bayesian network is the right answer and when it isn’t.
▸Three ways variables relate causally. Adjusting for the wrong one destroys the analysis.
The structural relationship between variables determines which ones to adjust for and which to leave alone. Getting this wrong does not produce an approximate answer; it produces a precisely wrong one. The correct classification is a logical question about the graph, not a statistical question about the data.
▸Simpson’s Paradox: the aggregate says one thing. Every subgroup says the opposite.
Simpson’s Paradox is the clearest possible demonstration that a confounder omitted from an analysis can reverse the sign of an effect — not attenuate it, reverse it. The same data supports opposite conclusions depending on whether you adjust for the right variable.
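A short sketch makes the reversal concrete. The counts below are illustrative, with two treatments ("A", "B") and one subgroup variable (case severity, here "small" or "large"); the names and numbers are assumptions for the example, not data from any case on this site.

```python
# (successes, total) per treatment and subgroup. Illustrative counts only.
counts = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}
SUBGROUPS = ("small", "large")


def subgroup_rate(treatment, subgroup):
    """Success rate within one subgroup."""
    successes, total = counts[(treatment, subgroup)]
    return successes / total


def aggregate_rate(treatment):
    """Success rate with the subgroup variable ignored."""
    successes = sum(counts[(treatment, g)][0] for g in SUBGROUPS)
    total = sum(counts[(treatment, g)][1] for g in SUBGROUPS)
    return successes / total


a_wins_every_subgroup = all(
    subgroup_rate("A", g) > subgroup_rate("B", g) for g in SUBGROUPS
)
b_wins_aggregate = aggregate_rate("B") > aggregate_rate("A")

print(a_wins_every_subgroup, b_wins_aggregate)
```

Both flags come out true: A is better in each subgroup, B looks better overall. The reversal arises because A was given mostly to the harder cases, so the subgroup variable influences both the treatment choice and the outcome.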
▸When to use a Bayesian network.
Four conditions distinguish a Bayesian network from every other modeling tool. When all four are present, nothing else fits the question. Most modeling tools are predictive; the Bayesian network is the one structure built for the question “what would happen if we changed something?”, and that question is only answerable when the causal graph is on the table.
▸When not to use a Bayesian network.
A Bayesian network is the right structure for causal reasoning under uncertainty. It is not the right structure for everything. The case for it is strong, which is exactly why it is worth being precise about where it is the wrong tool — so the recommendation retains credibility when it matters most.
The engagement arc (2 sub-hubs)
Most modeling work the field calls “model building” is parameter fitting on a structure that was assumed. Causal modeling reverses that: the structure is the deliverable, the parameters are the tail end. The work is in the elicitation, the disagreement reconciliation, and the explicit declaration of what the model does not cover.
These three pages cover the building process from three angles — from the artifact that often starts the engagement (a risk register), through the method itself, to the part that turns out to be hardest in practice (elicitation).
▸From risk register to causal graph.
A risk register is an inventory built around the wrong question. It records what might go wrong and how bad it would be. It does not model how risks connect, which failures cause which others, or what happens to the portfolio if you intervene on one variable. Every board question that matters is a causal question; the register cannot answer them.
▸How to build a causal model, step by step.
A step-by-step walkthrough, not a conceptual overview. It covers the three-step counterfactual procedure as a written derivation rather than a live demo. Your data tells you what happened; it cannot tell you what would have happened. Closing that gap is the work the page walks through.
▸Why elicitation is hard.
A causal model encodes mechanism knowledge that data alone cannot recover. That mechanism knowledge lives in expert heads, and getting it out is the work the modeling literature spends the least time on and gets wrong the most. Probability elicitation has been studied formally for fifty years; the findings are unflattering.
A causal model is not a static report. Once built, it accepts queries — formal questions about what would happen under interventions, what evidence would change a decision, where in the system a failure is most likely originating. The pages in this cluster cover the principal query types and what each one answers.
These five pages walk the major queries: the canonical architecture of a Bayesian risk model, the conversion of a risk model into a decision model, the value of gathering more evidence before acting, the diagnostic direction of inference, and how sensitive a conclusion is to the no-unmeasured-confounding assumption it rests on.
▸The canonical architecture of a Bayesian risk model.
Every domain is different. But the structure of a Bayesian risk model is not. Five variable types appear in every well-formed risk model, in a specific causal order, with specific relationships between them. Recognizing this architecture is what lets a domain expert build a model systematically rather than staring at a blank sheet.
▸A Bayesian network models uncertainty. An influence diagram models decisions.
Adding decision nodes and utility nodes to a Bayesian network converts a risk model into a decision model. The graph no longer just describes what might happen — it computes what you should do.
▸Value of Information: should you act now, or gather more evidence first?
Value of Information is the formal answer to the most common executive decision: is it worth investigating further before committing? A causal model computes it precisely — the expected improvement in the decision if a specific uncertainty were resolved, compared to the cost of resolving it.
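One simple instance is the expected value of perfect information: the gain from learning an uncertain state before choosing, which is an upper bound on what any real investigation could be worth. The sketch below assumes a hypothetical two-action, two-state decision; the states, utilities, and investigation cost are invented for illustration.

```python
# Hypothetical decision problem. All numbers are made up.
P_STATE = {"sound": 0.7, "flawed": 0.3}
ACTIONS = ("launch", "hold")
UTILITY = {
    ("launch", "sound"): 100.0, ("launch", "flawed"): -200.0,
    ("hold", "sound"): 0.0, ("hold", "flawed"): 0.0,
}


def expected_utility(action, belief):
    """Expected utility of one action under the current belief over states."""
    return sum(belief[s] * UTILITY[(action, s)] for s in belief)


# Act now: pick the single best action under uncertainty.
eu_act_now = max(expected_utility(a, P_STATE) for a in ACTIONS)

# Learn the state first: pick the best action per state, then average over states.
eu_informed = sum(
    P_STATE[s] * max(UTILITY[(a, s)] for a in ACTIONS) for s in P_STATE
)

evpi = eu_informed - eu_act_now          # expected value of perfect information
investigation_cost = 20.0                # hypothetical cost of resolving the uncertainty
worth_investigating = evpi > investigation_cost

print(eu_act_now, eu_informed, evpi, worth_investigating)
```

With these assumed numbers, acting now is worth about 10 and acting with the state known is worth about 70, so resolving the uncertainty is worth up to about 60. An imperfect test would be worth less, and computing that requires the test's error rates as well.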
▸Diagnostic reasoning in a causal model.
Bayes’ theorem is symmetric. Set evidence on a cause, and the network predicts the effect. Set evidence on the effect, and the same network infers the most probable cause. The graph does not change. The parameters do not change. Only the direction of the query changes — and the diagnostic direction is what every post-mortem needs.
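The two query directions can be sketched on a single cause-effect pair. The cause states, prior, and defect rates below are hypothetical; the point is only that the same two tables serve both the forward and the diagnostic query.

```python
# Hypothetical one-arrow network: cause -> defect. All numbers are made up.
P_CAUSE = {"supplier": 0.5, "process": 0.3, "design": 0.2}             # P(cause)
P_DEFECT_GIVEN_CAUSE = {"supplier": 0.02, "process": 0.10, "design": 0.25}


def predict_defect(cause):
    """Forward query: P(defect | cause)."""
    return P_DEFECT_GIVEN_CAUSE[cause]


def diagnose(defect_observed=True):
    """Diagnostic query: P(cause | defect evidence), from the same tables."""
    def likelihood(cause):
        p = P_DEFECT_GIVEN_CAUSE[cause]
        return p if defect_observed else 1.0 - p

    joint = {c: P_CAUSE[c] * likelihood(c) for c in P_CAUSE}
    total = sum(joint.values())
    return {c: v / total for c, v in joint.items()}


posterior = diagnose(defect_observed=True)
most_probable_cause = max(posterior, key=posterior.get)

print(most_probable_cause, round(posterior[most_probable_cause], 3))
```

In this example the least likely cause a priori becomes the most probable one once a defect is observed, which is the kind of reversal a post-mortem is looking for.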
▸Sensitivity analysis: what if the no-confounding assumption is wrong?
Almost every observational causal claim rests on a load-bearing assumption: no unmeasured confounding. The conclusion is right if there is no hidden variable doing the work being attributed to the treatment. The assumption cannot be tested directly — if it could, you would measure the variable. Sensitivity analysis quantifies how much hidden confounding would have to exist to overturn the conclusion.
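One widely used way to put a number on this is the E-value of VanderWeele and Ding. For an observed risk ratio, it gives the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both treatment and outcome to fully explain the observed effect away. The sketch below is offered as one example of the idea; the source does not say which method the linked page uses, and the input risk ratios are arbitrary.

```python
import math


def e_value(risk_ratio):
    """E-value for an observed risk ratio: RR + sqrt(RR * (RR - 1)).

    Protective effects (RR < 1) are inverted first, so the result is always >= 1.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1.0 else 1.0 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1.0))


# Arbitrary example inputs.
e_strong = e_value(2.0)   # a doubling of risk
e_weak = e_value(1.1)     # a 10% increase in risk

print(round(e_strong, 2), round(e_weak, 2))
```

A larger E-value means the conclusion survives more hidden confounding. A value close to 1 means a weak unmeasured variable would be enough to overturn it.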
Advanced & reference (4 sub-hubs)
Selective abduction picks the best explanation from a fixed list of candidates. Creative abduction proposes a new explanation that was not on the list to begin with — typically by positing a latent variable. Both are core to senior expert reasoning. Both can be made explicit in a causal model.
These two pages take each in turn: the formal mechanics of abductive inference on a built model, and the latent-variable construction that converts a senior expert’s intuition into a structural claim the team can examine.
▸Diagnosis finds the most probable cause. Abduction finds the best complete explanation.
When a known cause almost fits but not quite — when several anomalies arrive simultaneously and no single hypothesis accounts for all of them — diagnosis reaches its limit. Abduction is the reasoning mode that generates explanations, not just selects among them. It is how experts think when they encounter something genuinely new.
▸Creative abduction: what your senior experts are actually doing.
The thirty-year expert glances at a chart and says the three patterns are driven by the same unnamed thing. Everyone in the room has seen this moment. It is not intuition in the vague sense, and it is not a slower version of what a structure-learning algorithm does. It is a specific reasoning operation: introducing a latent variable that was not in the model to begin with.
A static Bayesian network assumes the structure is the same at every moment in time. For many engineering and risk-management problems this is fine — the question is about steady-state behavior. For others — sepsis management, equipment aging, market regime change — the structure itself evolves, and the model must capture that evolution as a first-class property.
These two pages cover the two relevant extensions: dynamic Bayesian networks for continuous evolution, and gated Bayesian networks for discrete regime change.
▸Dynamic Bayesian networks: how a system evolves over time.
A static Bayesian network models a system at one point in time. A dynamic Bayesian network models how that system evolves — today’s state producing tomorrow’s. Stress testing, early warning indicators, and scenario simulation all require this temporal structure, without abandoning causal reasoning.
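The "today's state producing tomorrow's" step can be sketched in its simplest form: one state variable and one transition table applied repeatedly. This is a deliberately reduced case (a single-variable dynamic network is just a Markov chain); the state names and transition probabilities are hypothetical.

```python
# Hypothetical transition table: P(state tomorrow | state today).
STATES = ("normal", "degraded", "failed")
TRANSITION = {
    "normal":   {"normal": 0.90, "degraded": 0.09, "failed": 0.01},
    "degraded": {"normal": 0.10, "degraded": 0.70, "failed": 0.20},
    "failed":   {"normal": 0.00, "degraded": 0.00, "failed": 1.00},
}


def step(belief):
    """Advance the belief over states by one time slice."""
    return {
        nxt: sum(belief[cur] * TRANSITION[cur][nxt] for cur in STATES)
        for nxt in STATES
    }


def roll_forward(belief, n_steps):
    """Return the list of beliefs from the starting slice through n_steps later."""
    trajectory = [belief]
    for _ in range(n_steps):
        belief = step(belief)
        trajectory.append(belief)
    return trajectory


start = {"normal": 1.0, "degraded": 0.0, "failed": 0.0}
trajectory = roll_forward(start, 5)
p_failed_by_step = [b["failed"] for b in trajectory]

print([round(p, 3) for p in p_failed_by_step])
```

A full dynamic Bayesian network would carry several variables per time slice, with evidence entering at each step; the rolling-forward mechanism is the same.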
▸Gated Bayesian networks: one model, multiple regimes.
A standard Bayesian network is calibrated on a single distribution — the one that prevailed in the training data. When the regime changes (market moves from trending to mean-reverting, a system shifts from normal to degraded operation, a counterparty moves from solvent to distressed), the conditional probabilities no longer hold. A gated network encodes the regime boundary explicitly.
Most organizational dysfunction at scale comes from KPIs that were sensibly chosen as observational measures and then catastrophically pressed into service as causal targets. The structural reason is identifiable, and the solution is to design measures with their downstream optimization in mind.
These two pages take the diagnosis first — Goodhart’s Law as a causal identification failure — then the prescription: how to design KPIs that don’t break under optimization pressure.
▸Goodhart’s Law is a causal identification failure.
When a measure becomes a target, it ceases to be a good measure. In causal terms: optimizing a Rung 1 proxy is not the same as intervening on the Rung 2 cause. This applies to every KPI in every organization, and almost no KPI system ever checks whether moving the proxy actually moves the cause.
▸Designing KPIs that don’t break.
Most KPIs are selected by asking what correlates with the desired outcome. That is a Rung 1 question. The moment the KPI becomes a target, the organization begins intervening on it — a Rung 2 operation. The two answers are not the same. A KPI designed without checking whether those two answers agree will break under optimization.
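The disagreement between the two answers can be simulated. In the hypothetical model below, an unobserved driver ("quality") moves both the KPI and the outcome, while the KPI itself has no effect on the outcome. The variable names, the linear form, and the noise levels are all assumptions made for the example.

```python
import random


def simulate(n, kpi_override=None, seed=0):
    """Draw n (kpi, outcome) pairs; kpi_override forces the KPI, as a target would."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        quality = rng.gauss(0.0, 1.0)             # unobserved true driver
        kpi = quality + rng.gauss(0.0, 1.0)       # the KPI reflects quality, noisily
        if kpi_override is not None:
            kpi = kpi_override                    # intervention: do(KPI = value)
        outcome = quality + rng.gauss(0.0, 1.0)   # outcome depends on quality only
        rows.append((kpi, outcome))
    return rows


def slope(rows):
    """Least-squares slope of outcome on KPI."""
    n = len(rows)
    mean_k = sum(k for k, _ in rows) / n
    mean_o = sum(o for _, o in rows) / n
    cov = sum((k - mean_k) * (o - mean_o) for k, o in rows)
    var = sum((k - mean_k) ** 2 for k, _ in rows)
    return cov / var


def mean_outcome(rows):
    return sum(o for _, o in rows) / len(rows)


# Rung 1: how strongly the KPI tracks the outcome when nobody is gaming it.
observational_slope = slope(simulate(20000))

# Rung 2: what happens to the outcome when the KPI is pushed from 0 to 2.
effect_of_pushing_kpi = (
    mean_outcome(simulate(20000, kpi_override=2.0))
    - mean_outcome(simulate(20000, kpi_override=0.0))
)

print(round(observational_slope, 2), round(effect_of_pushing_kpi, 2))
```

The KPI predicts the outcome well in observational data, yet forcing it moves nothing, because the arrow that produced the correlation runs through quality. A KPI for which the two numbers agree would have a direct causal path to the outcome.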
These three resources serve different needs. Four Paradigms compares causal AI to other AI traditions. The Literature page is the annotated reading list for practitioners. Download Models contains the working Bayes Server files for every case study on the site.
▸Four paradigms for combining LLMs and causality.
There are at least four ways to combine LLMs and causality. Three place the LLM in different roles — assistant, reasoner, or generator of causal knowledge. The page lays out the comparison and names the bet this practice is making.
▸Research that moves the field. Annotated for practitioners.
Paper titles and abstracts are rarely readable on their own. Each entry on the page states what the work does, why it matters for applied causal modeling, and what it changes or confirms in how the models on this site are built.
▸The model files: download and run.
Not diagrams — working models with conditional probability tables, causal arrows, and full probability distributions. Download, set evidence on any node, run inference, test interventions. Each model is the working version of a case study on this site.
Next
For the worked cases that show these methods in operation, see Cases — twenty-eight engagements organized by industry. For the same cases organized by risk type instead, see About Risk.
For the argument for why current tools fail in a way these methods address, see Why Current Tools Fail.