Confounding, Mediation, and Moderation

Confounders: adjust. Mediators: do not adjust. Colliders: never condition on. Moderators: interact.

The correct choice requires a causal graph. The wrong choice produces a precisely wrong conclusion — not an approximately right one.

Figure: Three causal structures. Confounder (Age causes both Treatment and Outcome), Collider (Treatment and Outcome both cause a third variable), Mediator (Treatment causes a mediator, which causes Outcome). These three structures determine your adjustment set: adjust for the confounder, leave the mediator alone, and never condition on the collider.

Q: What's the difference between a confounder, a mediator, and a moderator?

A: They play different roles in the graph and follow different rules. Confounders: adjust for them. Mediators: don't, because that blocks the effect you're measuring. Moderators: they change the size of the effect, not whether it exists. The structure tells you which is which.

Confounding

A confounder is a variable Z that causally affects both X (the treatment or intervention) and Y (the outcome). The classic structure: Z → X and Z → Y. Because Z causes both, X and Y will be correlated even if X has no causal effect on Y whatsoever.

Example: ice cream sales and drowning deaths are correlated. Neither causes the other. Temperature — the confounder — causes both. Any regression of drowning on ice cream sales that omits temperature will find a spurious positive coefficient.

In a causal graph, a confounder creates a back-door path from X to Y: a path that starts with an arrow pointing into X (through one of X’s parents) rather than out of X toward its descendants. The back-door criterion identifies sets of variables that contain no descendants of X and that, once adjusted for, close all such paths and recover the causal effect.
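As a rough illustration of what counts as a back-door path, the sketch below lists the paths between X and Y in a small hypothetical graph and keeps those whose first edge points into X. The graph, the node names, and the helper `backdoor_paths` are assumptions made for this example only.

```python
def backdoor_paths(edges, x, y):
    """Return simple paths from x to y whose first edge points INTO x."""
    # Build the undirected skeleton so paths can run against arrow direction.
    neighbours = {}
    for parent, child in edges:
        neighbours.setdefault(parent, set()).add(child)
        neighbours.setdefault(child, set()).add(parent)

    def walk(path):
        if path[-1] == y:
            yield path
            return
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in path:
                yield from walk(path + [nxt])

    edge_set = set(edges)
    # A back-door path starts with an edge (something -> x).
    return [p for p in walk([x]) if (p[1], p[0]) in edge_set]


# Hypothetical graph: Z confounds X and Y; M mediates part of X's effect.
EDGES = [("Z", "X"), ("Z", "Y"), ("X", "M"), ("M", "Y"), ("X", "Y")]

if __name__ == "__main__":
    print(backdoor_paths(EDGES, "X", "Y"))
```

In this graph only the path through Z qualifies; the paths X → Y and X → M → Y leave X along outgoing arrows and are causal paths, not back-door paths.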

Adjusting for a confounder: correct

Include Z in a regression, stratify by Z, or use it to reweight the sample. Any of these closes the back-door path through Z and removes the spurious correlation. Provided no other back-door path remains open, the residual association between X and Y after removing Z’s influence is the causal effect.
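A minimal simulation sketch of the ice cream example, using stratification as the adjustment. All probabilities below are invented for illustration, and the variable coding (hot day, high sales, drowning incident) is an assumption; X is given no effect on Y by construction.

```python
import random


def simulate(n=200_000, seed=0):
    """Z = hot day, X = high ice cream sales, Y = drowning incident."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = int(rng.random() < 0.5)
        x = int(rng.random() < (0.8 if z else 0.2))    # X depends on Z only
        y = int(rng.random() < (0.30 if z else 0.10))  # Y depends on Z only
        rows.append((z, x, y))
    return rows


def risk_difference(rows):
    """P(Y=1 | X=1) - P(Y=1 | X=0) in the given rows."""
    treated = [y for _, x, y in rows if x == 1]
    control = [y for _, x, y in rows if x == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def adjusted_risk_difference(rows):
    """Stratify by Z, then average the stratum effects weighted by P(Z)."""
    total = 0.0
    for level in (0, 1):
        stratum = [r for r in rows if r[0] == level]
        total += risk_difference(stratum) * len(stratum) / len(rows)
    return total


if __name__ == "__main__":
    data = simulate()
    print("unadjusted:", round(risk_difference(data), 3))
    print("adjusted for Z:", round(adjusted_risk_difference(data), 3))
```

Under these assumed numbers the unadjusted difference is clearly positive, while the stratified estimate sits close to zero, which is the true effect in this simulation.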

Mediation

A mediator is a variable M that lies on a causal path from X to Y: X → M → Y. The mediator transmits — or partially transmits — the effect of X on Y. If you include M in a regression, you block the indirect pathway and your estimate of X’s effect will be attenuated or eliminated — even though X genuinely causes Y.

Example: a training program (X) improves job performance (Y) partly by increasing employee confidence (M). A regression that controls for confidence will underestimate the training effect, because confidence is the mechanism by which training works. The “total effect” of training flows partly through confidence; controlling for confidence estimates only the “direct effect” that bypasses it.
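The training example can be sketched as a small simulation. The effect sizes are assumptions chosen for illustration: training adds 1.0 to performance directly, raises the chance of high confidence from 0.3 to 0.7, and high confidence adds 2.0, so the total effect is 1.0 + 2.0 × 0.4 = 1.8.

```python
import random


def simulate(n=100_000, seed=1):
    """X = training (randomised), M = high confidence, Y = performance."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        m = int(rng.random() < (0.7 if x else 0.3))   # training raises confidence
        y = 1.0 * x + 2.0 * m + rng.gauss(0.0, 1.0)   # direct path plus path via M
        rows.append((x, m, y))
    return rows


def effect(rows):
    """Difference in mean Y between trained and untrained rows."""
    treated = [y for x, _, y in rows if x == 1]
    control = [y for x, _, y in rows if x == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def effect_controlling_for_mediator(rows):
    """Stratify by M: this blocks the X -> M -> Y path."""
    total = 0.0
    for level in (0, 1):
        stratum = [r for r in rows if r[1] == level]
        total += effect(stratum) * len(stratum) / len(rows)
    return total


if __name__ == "__main__":
    data = simulate()
    print("total effect:", round(effect(data), 2))
    print("with M controlled:", round(effect_controlling_for_mediator(data), 2))
```

Controlling for confidence recovers only the direct component (about 1.0 here), not the total effect of about 1.8. Stratifying on M gives a clean direct effect in this sketch only because nothing else in the simulation affects both M and Y.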

Mediation analysis — decomposing total effects into direct and indirect components — is valuable and legitimate. But it requires an explicit decision to decompose, not an inadvertent inclusion of mediators in a control set.

The practical trap

Most regression modellers include “relevant” variables without distinguishing confounders from mediators. If a variable is caused by X, it is a potential mediator or collider — not a confounder. Including it does not improve the estimate; it distorts it. The distinction between “caused by X” and “causes X” is structural, not statistical, and cannot be inferred from correlation coefficients.

Colliders: the hidden danger

Of the four causal structures, colliders are the most dangerous and the least intuitive. Confounders and mediators produce biased estimates if you get them wrong. A collider, conditioned on, creates an association that does not exist in the population — and every additional control variable added to a regression is a potential collider.

A collider is a variable C that is caused by both X and Y: X → C ← Y. Unlike confounders, colliders do not create an association between X and Y in the full population. But if you condition on C — by including it in a regression, or by selecting a sample where C has a particular value — you open a spurious path between X and Y.

The intuition: if you know that C occurred, and C is caused by both X and Y, then observing that X did not occur raises the probability that Y must have occurred to explain C. This creates a negative correlation between X and Y within the conditioned sample, even if X and Y are independent unconditionally.

Berkson’s Bias — a collider in clinical data

A hospital study finds that among hospitalised patients, patients with disease A are less likely to have disease B — suggesting A protects against B. But hospitalisation is a collider: you are hospitalised if you have A or B (or both). Conditioning on hospitalised patients opens the A → Hospitalisation ← B path and creates a spurious negative association. In the general population, A and B may be independent.
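A minimal simulation sketch of this selection effect. The prevalences and admission probabilities are invented for illustration; A and B are generated independently, so any association among hospitalised patients comes from the selection alone.

```python
import random


def simulate(n=200_000, seed=2):
    """Independent diseases A and B; hospitalisation depends on both."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        a = int(rng.random() < 0.10)
        b = int(rng.random() < 0.10)
        p_hosp = 0.90 if (a or b) else 0.05   # assumed admission probabilities
        h = int(rng.random() < p_hosp)
        people.append((a, b, h))
    return people


def rate_of_b(people, a_value):
    group = [b for a, b, _ in people if a == a_value]
    return sum(group) / len(group)


def association(people):
    """P(B=1 | A=1) - P(B=1 | A=0) in the given sample."""
    return rate_of_b(people, 1) - rate_of_b(people, 0)


if __name__ == "__main__":
    population = simulate()
    hospitalised = [p for p in population if p[2] == 1]
    print("population:", round(association(population), 3))
    print("hospitalised only:", round(association(hospitalised), 3))
```

In the full simulated population the difference is near zero; restricted to hospitalised patients it becomes strongly negative, which would look like A protecting against B.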

The rule

Never condition on a collider. Never include a variable in a regression or a filter if it is caused by both the treatment and the outcome, or by a cause of the treatment and a cause of the outcome. Whether a variable is a collider cannot be determined from data: it requires the causal graph.

Figure: Bayesian network showing collider bias in police stops data. Race and Contraband both cause Stopped by Police (the collider). Conditioning on Stopped=True creates a spurious association between Race and Use of Force: within the conditioned sample, Race appears to shift from 50/50 to 60% Black, even though the unconditional distribution is 50/50. The bias is introduced by the sample selection, not the underlying population. Source: Fenton, N. & Neil, M., *Risk Assessment and Decision Analysis with Bayesian Networks*, 2nd ed., CRC Press, 2018.

Moderation

A moderator (also called an effect modifier) is a variable W that changes the magnitude of X’s effect on Y. The causal structure is: W modifies the X → Y relationship. This is represented in a regression as an interaction term: Y = β₀ + β₁X + β₂W + β₃XW + ε.

Example: a marketing campaign (X) increases sales (Y), but the effect is larger among customers under 35 (W=young) than among customers over 55. Age moderates the treatment effect. Unlike confounding, where omitting the confounder biases the average estimate, omitting a moderator averages over heterogeneous effects: the average estimate may be correct, but it misleads you about whom to target.
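To make the interaction model Y = β₀ + β₁X + β₂W + β₃XW + ε concrete, here is a small sketch with binary X and W. With two binary predictors the saturated model's least-squares coefficients equal simple contrasts of the four cell means, so no regression library is needed. The coefficient values (10, 1, 0.5, 2) are assumptions chosen for illustration.

```python
import random


def simulate(n=100_000, seed=3):
    """X = campaign exposure (randomised), W = 1 for younger customers."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        w = int(rng.random() < 0.5)
        y = 10.0 + 1.0 * x + 0.5 * w + 2.0 * x * w + rng.gauss(0.0, 1.0)
        rows.append((x, w, y))
    return rows


def cell_mean(rows, x, w):
    ys = [y for xi, wi, y in rows if xi == x and wi == w]
    return sum(ys) / len(ys)


def interaction_fit(rows):
    """Coefficients (b0, b1, b2, b3) of the saturated interaction model."""
    m00, m10 = cell_mean(rows, 0, 0), cell_mean(rows, 1, 0)
    m01, m11 = cell_mean(rows, 0, 1), cell_mean(rows, 1, 1)
    return m00, m10 - m00, m01 - m00, m11 - m10 - m01 + m00


def pooled_effect(rows):
    """Effect of X ignoring W: an average over both age groups."""
    treated = [y for x, _, y in rows if x == 1]
    control = [y for x, _, y in rows if x == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    data = simulate()
    b0, b1, b2, b3 = interaction_fit(data)
    print("effect when W=0:", round(b1, 2))
    print("effect when W=1:", round(b1 + b3, 2))
    print("pooled effect:", round(pooled_effect(data), 2))
```

The pooled estimate (about 2 under these assumptions) is a valid average, yet it hides that the effect is roughly 1 in one group and roughly 3 in the other; β₃ is what carries the moderation.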

In a Bayesian network, moderation is encoded directly in the conditional probability table: P(Y | X, W) will show different conditional distributions for different values of W. The BN naturally represents effect heterogeneity that a regression model requires explicit interaction terms to capture.

The practical rule

The practical rule is disarmingly simple: draw the causal graph before choosing your control set. The graph makes the distinction between confounders, mediators, and colliders unambiguous. Without it, the choice of what to adjust for is a guess — and a wrong guess produces conclusions that are precisely wrong rather than approximately right.

The formal tool is d-separation: given a causal DAG, two variables are d-separated by a set Z if and only if every path between them is blocked by Z, and d-separated variables are independent conditional on Z. A path is blocked if it contains a non-collider that is in Z, or a collider that is not in Z and has no descendants in Z. The back-door criterion uses d-separation to identify valid adjustment sets.
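The blocking rule can be sketched directly for small graphs. The code below enumerates every path in the undirected skeleton and applies the rule to each interior node; this brute-force approach is an assumption of convenience and does not scale to large graphs. The function names and the three toy graphs are hypothetical.

```python
def descendants(dag, node):
    """All nodes reachable from `node` along directed edges."""
    seen, stack = set(), [node]
    while stack:
        for child in dag.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(dag, start, end):
    """Every simple path between start and end, ignoring edge direction."""
    nbrs = {}
    for parent, children in dag.items():
        for child in children:
            nbrs.setdefault(parent, set()).add(child)
            nbrs.setdefault(child, set()).add(parent)

    def walk(path):
        if path[-1] == end:
            yield path
            return
        for nxt in sorted(nbrs.get(path[-1], ())):
            if nxt not in path:
                yield from walk(path + [nxt])

    yield from walk([start])


def path_blocked(dag, path, given):
    for a, b, c in zip(path, path[1:], path[2:]):
        is_collider = b in dag.get(a, ()) and b in dag.get(c, ())
        if is_collider:
            # A collider blocks unless it or one of its descendants is in Z.
            if b not in given and not (descendants(dag, b) & given):
                return True
        elif b in given:
            # A non-collider blocks when it is in Z.
            return True
    return False


def d_separated(dag, x, y, given=()):
    given = set(given)
    return all(path_blocked(dag, p, given) for p in simple_paths(dag, x, y))


# Toy graphs, written as {parent: {children}}.
FORK = {"Z": {"X", "Y"}}                          # confounder
CHAIN = {"X": {"M"}, "M": {"Y"}}                  # mediator
COLLIDER = {"X": {"C"}, "Y": {"C"}, "C": {"D"}}   # collider with descendant D

if __name__ == "__main__":
    print(d_separated(FORK, "X", "Y"), d_separated(FORK, "X", "Y", {"Z"}))
    print(d_separated(COLLIDER, "X", "Y"), d_separated(COLLIDER, "X", "Y", {"D"}))
```

Conditioning on Z separates X and Y in the fork, while conditioning on the collider C, or on its descendant D, connects two variables that were separated to begin with.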

For practitioners who cannot build a full causal graph, the minimum viable version is a list of variables and two questions for each: does this variable cause the treatment, the outcome, or both, and is it caused by the treatment, the outcome, or both? The answers distinguish confounders (cause both), mediators (caused by the treatment, cause the outcome), and colliders (caused by both) well enough to avoid the most damaging errors.
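That checklist can be sketched as a small classifier over a list of assumed cause-and-effect pairs. The variable names, the edge list, and the `classify` helper are hypothetical, and the labels are a rough heuristic rather than a substitute for the back-door criterion.

```python
def descendants(edges, node):
    """All variables reachable from `node` along the listed arrows."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def classify(edges, treatment, outcome):
    """Label every other variable as confounder, mediator, collider or other."""
    no_t = [e for e in edges if treatment not in e]   # graph without treatment
    no_y = [e for e in edges if outcome not in e]     # graph without outcome
    after_t = descendants(edges, treatment)
    after_t_not_via_y = descendants(no_y, treatment)
    after_y = descendants(edges, outcome)
    roles = {}
    for v in sorted({n for e in edges for n in e} - {treatment, outcome}):
        causes_t = treatment in descendants(edges, v)
        causes_y_not_via_t = outcome in descendants(no_t, v)
        if v in after_t_not_via_y and v in after_y:
            roles[v] = "collider"        # caused by both
        elif v in after_t and outcome in descendants(edges, v):
            roles[v] = "mediator"        # caused by treatment, causes outcome
        elif causes_t and causes_y_not_via_t:
            roles[v] = "confounder"      # causes both
        else:
            roles[v] = "other"
    return roles


# Hypothetical causal claims elicited from a domain expert.
EDGES = [
    ("Age", "Training"), ("Age", "Performance"),
    ("Training", "Confidence"), ("Confidence", "Performance"),
    ("Training", "Performance"),
    ("Training", "Promotion"), ("Performance", "Promotion"),
    ("Voucher", "Training"),
]

if __name__ == "__main__":
    for name, role in classify(EDGES, "Training", "Performance").items():
        print(name, "->", role)
```

Here Age would go into the adjustment set, while Confidence and Promotion would stay out; Voucher affects the outcome only through the treatment, so it is not a confounder.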

In the cases

Healthcare

Statins & Hospitalisation

Severity confounds the statin–hospitalisation relationship. Controlling for it without a causal graph reverses the apparent treatment effect.

Healthcare

Iatrogenic Medications

Medication mediates the relationship between BMI and LDL: the path must not be adjusted out, or the drug-induced contribution becomes invisible.

Financial

Bank Churn

Customers who would have churned anyway confound the campaign's measured effect. Controlling for the wrong variable reports success where there is none.

Policy

Elective Sequencing & Mastery

Academic preparation confounds elective track and mastery rate. The causal model separates the effect of the curriculum from the effect of the intake.
