The three rungs are a useful taxonomy for the most common causal questions: what is correlated, what works on average, what works for a particular case. They are not the whole framework.

A Structural Causal Model (SCM) is a complete description of how the world generates data. Once you have one, an unexpectedly wide set of substantive questions reduces to inferential procedures on the model.

Q: This sounds like overengineering. Why not just use machine learning?
A: Predictive ML answers what tends to happen. Most decisions hinge on what would happen if you acted differently, a question that prediction from observational data alone cannot answer regardless of dataset size. ML and causal models are complementary: ML for prediction, SCMs for the counterfactual claims that drive consequential decisions.

Most introductions to causal inference present Pearl’s ladder — association, intervention, counterfactual — and stop there. The taxonomy is genuinely useful, especially for distinguishing what observational data can and cannot answer. But the three-rung framing leaves a number of important capabilities of the SCM framework underemphasized or invisible. This page lists them.

Each of the capabilities below is a real piece of the framework, with its own theory and its own algorithmic implementations. Most have been used somewhere in the case studies on this site. The intent here is to make them legible as distinct things you can do with an SCM, rather than burying them inside whichever case study happened to need them.

The most foundational SCM capability is also the simplest to state: a correlation between two variables is consistent with at least six different causal structures, and the data alone does not distinguish them. Identifying which structure is true requires bringing something other than the data — temporal precedence, mechanistic theory, intervention, instrumental variables, or a causal graph.

This is the question that motivates the entire framework. If correlations identified causes, no SCM would be necessary; the structure could be read off the data. They don’t, and it can’t.
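A minimal sketch of the point, using two hypothetical linear SCMs whose coefficients are chosen purely for illustration. Both produce a correlation of about 0.5 between X and Y, yet intervening on X changes Y in only one of them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Model A (hypothetical): X -> Y with coefficient 0.5.
def y_model_a(x, noise):
    return 0.5 * x + noise

# Model B (hypothetical): hidden Z -> X and Z -> Y; X has no effect on Y.
def y_model_b(x, z, noise):
    return np.sqrt(0.5) * z + noise  # x is deliberately unused

# Observational data from model A.
x_a = rng.normal(size=n)
noise_a = rng.normal(scale=np.sqrt(0.75), size=n)
y_a = y_model_a(x_a, noise_a)

# Observational data from model B.
z = rng.normal(size=n)
x_b = np.sqrt(0.5) * z + rng.normal(scale=np.sqrt(0.5), size=n)
noise_b = rng.normal(scale=np.sqrt(0.5), size=n)
y_b = y_model_b(x_b, z, noise_b)

corr_a = np.corrcoef(x_a, y_a)[0, 1]
corr_b = np.corrcoef(x_b, y_b)[0, 1]

# Interventional contrast E[Y | do(X=1)] - E[Y | do(X=0)], computed by
# overriding X in each model while keeping the same exogenous noise.
ate_a = np.mean(y_model_a(1.0, noise_a) - y_model_a(0.0, noise_a))
ate_b = np.mean(y_model_b(1.0, z, noise_b) - y_model_b(0.0, z, noise_b))

print(f"corr: A={corr_a:.3f}, B={corr_b:.3f}")
print(f"effect of do(X): A={ate_a:.3f}, B={ate_b:.3f}")
```

The observational correlations agree while the interventional effects differ, which is why the data alone cannot settle which structure is true.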

Mediation analysis decomposes a total causal effect into the part that flows through a specific mediator and the part that bypasses it. Classic question: “does this drug work because it lowers blood pressure, or for some other reason?” SCMs make the decomposition rigorous through natural direct effects and natural indirect effects, quantities defined by nested counterfactuals (Rung 3+) that have no Rung 1 analog.

Useful for: regulatory submissions where the mechanism matters, fairness audits where you want to separate “effect through legitimate predictors” from “effect through proxies for protected attributes,” policy evaluation where you want to know which lever did the work.
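To make the nested counterfactuals concrete, here is a small sketch on a hypothetical SCM with known structural equations (treatment X, mediator M, outcome Y). The coefficients and the interaction term are assumptions chosen for illustration. Because the equations are known, each counterfactual can be computed directly by reusing the same exogenous noise.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
u_m = rng.normal(size=n)  # exogenous noise for the mediator
u_y = rng.normal(size=n)  # exogenous noise for the outcome

# Hypothetical structural equations.
def mediator(x, u_m):
    return 0.8 * x + u_m

def outcome(x, m, u_y):
    return 0.3 * x + 0.5 * m + 0.2 * x * m + u_y

m0 = mediator(0.0, u_m)  # M(0): mediator value had X been 0
m1 = mediator(1.0, u_m)  # M(1): mediator value had X been 1

# Natural direct effect: switch X from 0 to 1, hold M at its X=0 value.
nde = np.mean(outcome(1.0, m0, u_y) - outcome(0.0, m0, u_y))
# Natural indirect effect: hold X at 1, move M from M(0) to M(1).
nie = np.mean(outcome(1.0, m1, u_y) - outcome(1.0, m0, u_y))
# Total effect.
total = np.mean(outcome(1.0, m1, u_y) - outcome(0.0, m0, u_y))

print(f"NDE={nde:.3f}  NIE={nie:.3f}  total={total:.3f}")
```

With these definitions the two pieces add up to the total effect. The quantity Y(1, M(0)) is the nested counterfactual: it combines one world's treatment with another world's mediator, which no single experiment observes directly.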

Most observational analyses assume “no unmeasured confounding” without quantifying what would happen if there were some. SCMs let you add a hypothetical unmeasured confounder — call it U — with specified strength, and ask: how strong would U need to be to overturn the conclusion? The E-value (VanderWeele & Ding) and omitted-variable bias bounds (Cinelli & Hazlett) operationalize this.

Indispensable for any causal claim from observational data. The Drug Repurposing case study computes an E-value implicitly when it asks how strong an unmeasured selection mechanism would have to be to overturn the recommendation.
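For a risk ratio RR above 1, the E-value for the point estimate is RR + sqrt(RR × (RR − 1)); protective estimates are inverted first. A minimal sketch follows, where the function name `e_value` is an illustrative choice and the input is an assumed example, not a figure from any case study.

```python
import math

def e_value(risk_ratio: float) -> float:
    """E-value for an observed risk ratio (point estimate only)."""
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    # Protective effects (RR < 1) are inverted so the formula applies.
    rr = risk_ratio if risk_ratio >= 1 else 1.0 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1.0))

# Hypothetical observed risk ratio of 2.0.
print(round(e_value(2.0), 3))
```

Read the result as the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both treatment and outcome to fully explain away the observed estimate.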

When a study is done in one population and you want to apply the result to a different one, the selection diagram — due to Bareinboim and Pearl — encodes the assumed differences between source and target as edges from a special node S. Identification theory tells you when transport is possible, and what formula computes it.
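As a sketch of the simplest case, assume the selection node S points only into a measured covariate Z, so that the Z-specific effects are shared by both populations and only the distribution of Z differs. The transport formula is then P*(y | do(x)) = Σ_z P(y | do(x), z) P*(z). The strata, effects, and distributions below are hypothetical.

```python
def transport(stratum_effects: dict, target_dist: dict) -> float:
    """Reweight stratum-specific effects by the target covariate distribution."""
    if abs(sum(target_dist.values()) - 1.0) > 1e-9:
        raise ValueError("target distribution must sum to 1")
    return sum(stratum_effects[z] * target_dist[z] for z in target_dist)

# Hypothetical stratum-specific effects estimated in the source trial.
effects = {"young": 0.10, "old": 0.30}
source_dist = {"young": 0.8, "old": 0.2}  # covariate mix in the trial
target_dist = {"young": 0.3, "old": 0.7}  # covariate mix in the target

source_effect = transport(effects, source_dist)
target_effect = transport(effects, target_dist)
print(f"source: {source_effect:.2f}, target: {target_effect:.2f}")
```

When S points into other nodes, this reweighting may no longer be valid and a different formula, or none, applies.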

Selection bias takes many forms: survivor bias, attrition, missing-not-at-random data, the healthy-worker effect, censoring. SCMs treat selection as a node in the graph (the same S machinery used in transportability) and ask whether the target effect is recoverable from the selected sample. Often it is, with the right adjustment. Often it isn’t, and the SCM tells you which.

This is closely related to transportability; both are special cases of the broader data fusion framework (Bareinboim & Pearl 2016).
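A short simulation sketch of why selection matters. Here X and Y are independent by construction, and a hypothetical selection rule keeps only units whose X + Y exceeds a threshold, so selection is a common effect of both.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000

x = rng.normal(size=n)
y = rng.normal(size=n)          # independent of x by construction
selected = (x + y) > 1.0        # hypothetical selection rule: S depends on both

corr_full = np.corrcoef(x, y)[0, 1]
corr_selected = np.corrcoef(x[selected], y[selected])[0, 1]

print(f"full population:  {corr_full:+.3f}")
print(f"selected sample:  {corr_selected:+.3f}")
```

The selected sample shows a clear negative association that does not exist in the full population. Placing S in the graph is what lets you see that conditioning on it opens this path.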

Given a DAG, the back-door criterion tells you exactly which variables you need to condition on to identify a causal effect — and equally importantly, which ones you should not condition on. The notorious M-bias graph is the example everyone remembers: there are graphs where adjusting for an apparently helpful covariate actively breaks identification.

Algorithmic implementations (e.g. DAGitty) read a graph and return all minimal sufficient adjustment sets. The Causal Identification page covers this in depth.
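The M-bias pattern can be checked with a small simulation. The sketch below assumes a hypothetical linear SCM with unobserved U1 and U2, where U1 → X, U1 → M, U2 → M, U2 → Y, and X → Y with a true effect of 1.0. The empty adjustment set is valid here; adding M is not.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

u1 = rng.normal(size=n)                  # unobserved
u2 = rng.normal(size=n)                  # unobserved
x = u1 + rng.normal(size=n)
m = u1 + u2 + rng.normal(size=n)         # collider on the path X <- U1 -> M <- U2 -> Y
y = 1.0 * x + u2 + rng.normal(size=n)    # true effect of X on Y is 1.0

def coef_on_first(y, *regressors):
    """OLS with intercept; returns the coefficient on the first regressor."""
    design = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

unadjusted = coef_on_first(y, x)      # no adjustment
adjusted = coef_on_first(y, x, m)     # adjusting for the collider M

print(f"unadjusted: {unadjusted:.3f}")
print(f"adjusted for M: {adjusted:.3f}")
```

In this setup the unadjusted regression recovers the true effect, while conditioning on M pulls the estimate away from it, because M opens a path that was blocked.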

Given a person who took a drug and developed an adverse outcome, what’s the probability that the drug caused the outcome (the probability of necessity) rather than that the outcome would have happened anyway? Tian-Pearl bounds give partial identification when full counterfactual identifiability fails.

Critical for tort liability, insurance attribution, and any setting where a defendant or claimant needs an individual-level causal claim, not a population-level one. The Pharmacovigilance case study uses this directly: probability of necessity that an NSAID caused a particular AKI episode.
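As a sketch, the simplest form of these bounds assumes exogeneity (no confounding between exposure and outcome), in which case the probability of necessity PN satisfies max{0, 1 − P(y | x′)/P(y | x)} ≤ PN ≤ min{1, P(y′ | x′)/P(y | x)}. The function name and the input risks below are hypothetical and are not taken from the case study.

```python
def pn_bounds_exogenous(p_y_given_x: float, p_y_given_not_x: float):
    """Bounds on the probability of necessity, assuming exogeneity.

    p_y_given_x:     outcome risk among the exposed
    p_y_given_not_x: outcome risk among the unexposed
    """
    if not 0.0 < p_y_given_x <= 1.0:
        raise ValueError("risk among the exposed must be in (0, 1]")
    if not 0.0 <= p_y_given_not_x <= 1.0:
        raise ValueError("risk among the unexposed must be in [0, 1]")
    lower = max(0.0, (p_y_given_x - p_y_given_not_x) / p_y_given_x)
    upper = min(1.0, (1.0 - p_y_given_not_x) / p_y_given_x)
    return lower, upper

# Hypothetical risks: 30% among the exposed, 10% among the unexposed.
lower, upper = pn_bounds_exogenous(0.30, 0.10)
print(f"{lower:.3f} <= PN <= {upper:.3f}")
```

Under the additional assumption of monotonicity (the exposure never prevents the outcome), PN is point-identified and equals the lower bound. Without exogeneity, the bounds also need experimental data and take a different form.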

Dynamic treatment regimes (DTRs) address sequential decision-making under uncertainty: what to do at each time step given how the patient or system has evolved. Methods like G-computation, G-estimation, and inverse-probability weighting estimate optimal policies from observational sequences. The Q-learning literature in reinforcement learning is doing essentially the same thing under different vocabulary, with SCM theory providing the conditions under which the policy is identifiable from the data we have.

The Sepsis Dynamic Treatment case study walks through a two-time-step DTR end to end.
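For intuition, a two-stage regime can be learned by backward induction. The sketch below is a toy tabular Q-learning example on simulated data, not the case study's model: both actions are randomized (so there is no confounding to adjust for), the intermediate state is binary, and all coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

a1 = rng.integers(0, 2, size=n)              # stage-1 action, randomized
s2 = rng.binomial(1, 0.3 + 0.4 * a1)         # intermediate state, improved by a1
a2 = rng.integers(0, 2, size=n)              # stage-2 action, randomized
# Stage-2 action helps when s2 == 0 and hurts when s2 == 1 (hypothetical).
y = 2.0 * s2 + a2 * (1 - 2 * s2) + rng.normal(size=n)

# Stage 2: estimate Q2(s2, a2) = E[Y | s2, a2] as cell means.
q2 = np.array([[y[(s2 == s) & (a2 == a)].mean() for a in (0, 1)]
               for s in (0, 1)])
policy2 = q2.argmax(axis=1)                  # best stage-2 action per state

# Stage 1: replace each outcome with the value of acting optimally at stage 2,
# then estimate Q1(a1) = E[max_a2 Q2(s2, a2) | a1].
v2 = q2.max(axis=1)[s2]
q1 = np.array([v2[a1 == a].mean() for a in (0, 1)])
policy1 = int(q1.argmax())

print("stage-2 policy by state:", policy2.tolist())
print("stage-1 action:", policy1, "with Q1 =", np.round(q1, 2).tolist())
```

With observational rather than randomized actions, the cell means would need to be replaced by estimators that adjust for time-varying confounding, which is where G-computation and inverse-probability weighting come in.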

Several fairness criteria in machine learning are SCM constructs. Counterfactual fairness, path-specific fairness, and demographic parity through a structural lens all reduce to questions on a graph. The question “would this loan have been denied if the applicant had been a different race, holding everything causally downstream of race fixed?” is a Rung 3 counterfactual.

Standard ML fairness metrics such as demographic parity and equal opportunity are Rung 1 quantities. They cannot distinguish effects flowing through legitimate paths from effects flowing through illegitimate ones. SCMs give the vocabulary to do that.

Causal discovery goes the other direction: from data, infer the graph. Algorithms like PC, GES, NOTEARS, and LiNGAM attempt to recover the structure. None of these is reliable in the way that identification given a graph is reliable: discovery requires strong assumptions (faithfulness, parametric form), often recovers the graph only up to a Markov equivalence class, and gets weaker as variable counts grow.

But discovery algorithms are genuinely useful as a starting point for expert review: hypothesize a structure, then bring it to a domain expert who refines or rejects it. Particularly valuable when you have hundreds of variables and don’t want to start from a blank canvas.

Beyond do(X = x), SCMs support changing the mechanism rather than the value: replacing one structural equation with another. “What if we kept the variable but changed the function determining it?” This is what policy reform looks like (changing how a decision is made, not what it equals); what regulatory intervention looks like (changing the rule that determines emissions, not emissions directly); what protocol improvement looks like (changing the algorithm that triages patients, not the triage outcome of any individual patient).
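A sketch of the difference between a hard intervention and a mechanism change, on a hypothetical triage SCM. Severity drives both the treatment decision and the outcome; the rules, thresholds, and outcome equation are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
severity = rng.uniform(0.0, 1.0, size=n)     # exogenous patient severity
noise = rng.normal(scale=0.1, size=n)

# Hypothetical outcome equation: treatment helps only above severity 1/3.
def outcome(severity, treated, noise):
    return 1.0 - severity + treated * (0.6 * severity - 0.2) + noise

# Three alternative structural equations for the treatment variable.
def current_rule(severity):                  # existing protocol
    return (severity > 0.8).astype(float)

def revised_rule(severity):                  # mechanism change: new threshold
    return (severity > 1.0 / 3.0).astype(float)

def treat_all(severity):                     # hard intervention do(T = 1)
    return np.ones_like(severity)

def mean_outcome(rule):
    """Swap in a treatment mechanism and regenerate outcomes, same noise."""
    return outcome(severity, rule(severity), noise).mean()

value_current = mean_outcome(current_rule)
value_revised = mean_outcome(revised_rule)
value_treat_all = mean_outcome(treat_all)

print(f"current rule: {value_current:.3f}")
print(f"do(T=1):      {value_treat_all:.3f}")
print(f"revised rule: {value_revised:.3f}")
```

In this toy setup the revised rule beats both the current rule and treating everyone, which is the kind of comparison a fixed-value do() alone cannot express.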

The same machinery that handles transportability also handles covariate shift, concept drift, and dataset shift more generally. If you know which mechanisms are stable across deployment contexts and which aren’t, you can train models that rely only on the stable mechanisms.

This is the bridge between SCMs and modern out-of-distribution generalization research: invariant prediction (Peters, Bühlmann, Meinshausen), causal representation learning (Schölkopf and collaborators), and the broader effort to build ML systems that don’t silently fail when the world they were trained on shifts.

What ties these together: an SCM is a machine for generating the world, with structural assumptions exposed as a graph. Once you have that, almost any causal question reduces to an inferential procedure on the model.

The three rungs are a taxonomy for the most common queries. Plenty of important questions live at the boundaries:

  • Mediation needs nested counterfactuals (between Rungs 2 and 3).
  • Sensitivity analysis is a meta-question about Rung 1/2/3 robustness.
  • Discovery is a question about the model itself, not queries on it.
  • Attribution requires partial identification when full identifiability fails.

None of these are exotic extensions. They are core capabilities of the framework, used routinely in practice. Pearl’s ladder is the headline; the rest of this page is the rest of the menu.

In the cases

  • Pharmacovigilance Attribution (Healthcare). Probability of necessity. The patient took the NSAID, then developed AKI. What’s the chance the drug caused it?
  • Sepsis Dynamic Treatment (Healthcare). Two-time-step optimal dynamic regime. When to give pressors, conditional on how the patient is evolving.
  • Drug Repurposing & Transport (Healthcare). Transportability + sensitivity bound. Trial result moved to a target population, with an E-value on the recommendation.
  • Treatment-Resistant Depression (Healthcare). Sequential treatment lines. Latent type inferred by abduction from response history. Counterfactual on first-line choice.
