Adverse Event Attribution — Individual Counterfactual Cause

Why a Causal Model

Pharmacoepidemiology has a well-developed Rung 2 toolkit. Cohort studies, case-control studies, self-controlled case series, target-trial emulation — these all answer the population-level question: across many patients, how does drug exposure shift the distribution of an adverse outcome? That is sufficient for regulatory approval and for population-level risk-benefit analysis.

It is not sufficient for the individual question that drug-injury cases, M&M conferences, and causality-assessment frameworks (WHO-UMC, Naranjo, RUCAM) actually ask: in this specific patient, with this specific exposure, did the drug cause the injury? That is a Rung 3 counterfactual claim — and answering it requires structural commitments that go beyond what a Bayesian network can offer on its own.

The structural problem

A standard Bayesian network can compute P(AKI | NSAID, covariates) at Rung 1 and P(AKI | do(NSAID), covariates) at Rung 2. It cannot, by itself, compute P(AKI = No | do(NSAID = No), this patient's observed factual outcome and covariates) — the probability of necessity, the formal version of "but-for cause." That computation requires committing to a structural-equation interpretation of the graph, with explicit exogenous variables that absorb the residual variation. Those exogenous variables — the U-nodes — are abducted from the factual observation, the intervention is applied, and the counterfactual outcome is read.
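In symbols, the three quantities can be written side by side. This is a notational sketch: X stands for NSAID_Exposure, Y for AKI_outcome, and c for the observed covariates, and the subscript form Y_{X=0} for the counterfactual outcome is an assumption of this sketch (the text above writes the same query with do()).

```latex
\begin{align*}
\text{Rung 1:}\quad & P(Y = 1 \mid X = 1,\ C = c) \\
\text{Rung 2:}\quad & P(Y = 1 \mid \mathrm{do}(X = 1),\ C = c) \\
\text{Rung 3 (PN):}\quad & P(Y_{X=0} = 0 \mid X = 1,\ Y = 1,\ C = c)
\end{align*}
```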

The Causal Structure

The model is a Structural Causal Model in the strict sense: every endogenous variable has its structural parents and a corresponding exogenous U-node. The U-nodes are not cosmetic additions; they are what makes the counterfactual computation well-defined.

Observable Node	States	Role
BaselineRenalFunction	Normal · CKD_stage_2_3 · CKD_stage_4_plus	Pre-existing risk
SurgeryType	Minor · Major · None	Acute insult
PeriOpDehydration	Low · Moderate · High	Modulator
ConcomitantNephrotoxin	None · Vancomycin · IV_Contrast · Multiple	Co-cause
NSAID_Exposure	Yes · No	Exposure of interest
IntraopHemodynamics	Stable · Hypotensive	Mediating variable
AKI_outcome	Yes · No	Adverse event of interest

Each observable also has a U-node — U_NSAID, U_AKI, etc. — modeled as a 2-state exogenous variable with a 50/50 prior. These U-nodes are the residual variation that the structural parents do not explain. They are what an SCM has and a vanilla BN does not.
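A minimal sketch of what a U-node adds, assuming a deliberately reduced structural equation with a single parent. The function `f_aki` and its two response types are hypothetical and are not taken from the Bayes Server file; only the 2-state, 50/50 prior comes from the description above.

```python
# Hypothetical structural equation for AKI_outcome with one parent only.
# u_aki is the 2-state exogenous U-node.
def f_aki(nsaid: bool, u_aki: int) -> bool:
    # u_aki == 1: susceptible patient, AKI occurs only if exposed
    # u_aki == 0: resilient patient, no AKI either way
    return nsaid and u_aki == 1

# 50/50 prior on the U-node, as described in the text.
P_U = {0: 0.5, 1: 0.5}

def cpt_from_scm(nsaid: bool) -> float:
    """P(AKI = Yes | NSAID), obtained by summing the U prior over states where f fires."""
    return sum(p for u, p in P_U.items() if f_aki(nsaid, u))
```

Marginalizing over U recovers an ordinary conditional probability table, which is all a vanilla BN stores. The structural equation keeps the extra information about which patients respond, and that is what a counterfactual query needs.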

Edges among observables: BaselineRenalFunction, SurgeryType, PeriOpDehydration → IntraopHemodynamics. IntraopHemodynamics, NSAID_Exposure, ConcomitantNephrotoxin, BaselineRenalFunction → AKI_outcome. (Optionally) BaselineRenalFunction → NSAID_Exposure (CKD reduces NSAID prescribing — confounding by indication in the protective direction).
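The edge list above can be transcribed into a parent map and checked for acyclicity. This is a sketch using only the Python standard library; the `U_<node name>` naming is an assumption made for illustration (the model itself uses shorter names such as U_NSAID and U_AKI), and the optional edge into NSAID_Exposure is included.

```python
from graphlib import TopologicalSorter

# Parent lists transcribed from the edge description above.
PARENTS = {
    "BaselineRenalFunction": [],
    "SurgeryType": [],
    "PeriOpDehydration": [],
    "ConcomitantNephrotoxin": [],
    "NSAID_Exposure": ["BaselineRenalFunction"],  # the optional edge
    "IntraopHemodynamics": [
        "BaselineRenalFunction", "SurgeryType", "PeriOpDehydration",
    ],
    "AKI_outcome": [
        "IntraopHemodynamics", "NSAID_Exposure",
        "ConcomitantNephrotoxin", "BaselineRenalFunction",
    ],
}

def with_u_nodes(parents: dict) -> dict:
    """Add one exogenous U-node as an extra parent of every observable."""
    scm = {f"U_{name}": [] for name in parents}  # hypothetical naming scheme
    scm.update({name: ps + [f"U_{name}"] for name, ps in parents.items()})
    return scm

SCM = with_u_nodes(PARENTS)
# static_order() lists parents before children and raises CycleError on a cycle.
ORDER = list(TopologicalSorter(SCM).static_order())
```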

Identifiability

Rung 2 is identifiable under standard back-door adjustment. Rung 3 — the probability of necessity — is identifiable given the SCM structural assumptions: that the U-nodes are independent across observables, that the structural equations are deterministic given (parents, U), and that the parameterization of those equations is correct. These are testable in some cases (cross-validation against held-out cohorts, comparison to instrumental-variable estimates) and untestable in others. The page-level message: Rung 3 always requires assumptions that go beyond what data alone can support, and naming those assumptions is a precondition for using the inference responsibly in a regulatory or legal context.

The Three Queries

The three queries here are particularly important to distinguish, because they are routinely conflated in pharmacovigilance practice. Each panel below is a screenshot of the actual Bayes Server model.

How to read the diagrams. An arrow shows the causal direction. An arrow from A to B means A causes an effect — a change — in B.

Two operators appear repeatedly below. obs(X = value) means we learned that X had this value — like filtering the chart-review down to only patients where X was that value. do(X = value) means we imposed this value — like a randomization in a trial, where we control X regardless of what the patient would naturally have. The difference matters: filtering down to "patients who got the drug" tells you something about which patients tend to receive it; imposing the drug tells you only what the drug does.
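The difference between the two operators can be made concrete on a three-variable toy model. All probabilities below are hypothetical numbers chosen for illustration, not the CPTs of the Bayes Server file, and CKD is reduced to a yes/no variable.

```python
# Hypothetical toy numbers: CKD lowers the chance of receiving an NSAID
# and raises the AKI risk.
P_CKD = {False: 0.7, True: 0.3}
P_NSAID_GIVEN_CKD = {False: 0.6, True: 0.2}
P_AKI = {  # keyed by (nsaid, ckd)
    (False, False): 0.05, (True, False): 0.10,
    (False, True): 0.30, (True, True): 0.45,
}

def p_aki_obs(nsaid: bool) -> float:
    """obs(): filter to patients who naturally had this exposure."""
    def p_exposure(ckd: bool) -> float:
        p = P_NSAID_GIVEN_CKD[ckd]
        return p if nsaid else 1.0 - p
    joint = {ckd: P_CKD[ckd] * p_exposure(ckd) for ckd in P_CKD}
    total = sum(joint.values())
    return sum(P_AKI[(nsaid, ckd)] * w / total for ckd, w in joint.items())

def p_aki_do(nsaid: bool) -> float:
    """do(): impose the exposure, so CKD keeps its population distribution."""
    return sum(P_AKI[(nsaid, ckd)] * P_CKD[ckd] for ckd in P_CKD)
```

With these toy numbers, `p_aki_obs(True)` is lower than `p_aki_do(True)`: filtering on exposure over-represents the healthier patients who tend to receive the drug, so the observed rate understates what the drug does.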

Rung 1 — Association

Among patients with this covariate profile who took the NSAID, what fraction developed AKI?

Read directly from the data as a conditional probability. As covariates accumulate, the AKI rate climbs, but watch P(NSAID_Exposure = Yes) drop in the first slide: clinicians prescribe NSAIDs less often to patients with CKD stage 2-3, precisely because they fear AKI. The covariate is informative about both the outcome and the prescribing decision, and the conditional probability alone cannot separate the two.

In plain language: Looking at chart-review data alone, this patient profile shows a 56% AKI rate after NSAID exposure. That number is misleading because clinicians already steer NSAID use away from CKD patients — so the apparent association mixes drug effect with patient selection.

Prior — no evidence set

Population baseline before any patient data is entered. AKI, NSAID exposure, and the covariates all sit at their marginal priors.

Rung 2 — Population Effect

If we forced NSAID exposure on a random patient — without conditioning on the prescribing rule — what would the AKI rate be?

The do-operator severs the BaselineRenalFunction → NSAID_Exposure edge, which removes the back-door path from exposure to outcome that runs through renal function. What remains is the unconfounded population-level effect of the NSAID on AKI_outcome. This is the standard regulatory question, sufficient for label warnings and prescribing guidelines.
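For the graph as described, the same quantity can be written as a back-door adjustment over renal function. This sketch assumes BaselineRenalFunction is the only common cause of exposure and outcome (the optional edge is present and the U-nodes are independent); Renal abbreviates BaselineRenalFunction.

```latex
P(\text{AKI} \mid \mathrm{do}(\text{NSAID} = x))
  \;=\; \sum_{b} P(\text{AKI} \mid \text{NSAID} = x,\ \text{Renal} = b)\; P(\text{Renal} = b)
```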

In plain language: The actual causal effect of NSAID exposure across the population is +9.3 percentage points in AKI risk. That's real, and it justifies the label warning — but it does not tell us whether the NSAID caused this specific patient's AKI.

Prior — no intervention

Population baseline. NSAID is at its prior — some patients exposed, most not, driven by the natural prescribing rule that depends on renal status.

Rung 3 — Individual Counterfactual (the central question)

This patient took the NSAID and developed AKI. Would the AKI have occurred if they had not taken the NSAID?

Three operations on the same graph: (1) Set the patient's full evidence: BaselineRenalFunction = CKD_stage_2_3, SurgeryType = Major, PeriOpDehydration = Moderate, ConcomitantNephrotoxin = None, IntraopHemodynamics = Hypotensive, NSAID_Exposure = Yes, AKI_outcome = Yes. The U_AKI posterior shifts away from 50/50; that is the abduction step, encoding this patient's idiosyncratic AKI susceptibility. (2) Carry the abducted U_AKI forward as soft evidence. (3) Apply do(NSAID_Exposure = No), with the factual AKI_outcome finding retracted so that the node is free to vary, and read the counterfactual outcome.
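The three steps can be sketched on a toy structural equation. Everything here is hypothetical: `f_aki` uses two binary noise terms rather than the single 2-state U_AKI of the model, purely so that the abducted posterior is non-trivial, and the covariates are omitted. It illustrates abduction, action, and prediction; it does not reproduce the Bayes Server computation.

```python
from itertools import product

# Hypothetical structural equation: AKI occurs if a background insult fires
# (u_base) or if the patient is NSAID-susceptible (u_sus) and was exposed.
def f_aki(nsaid: bool, u_base: int, u_sus: int) -> bool:
    return bool(u_base) or (nsaid and bool(u_sus))

# Independent 50/50 priors on each noise term.
PRIOR = {(b, s): 0.25 for b, s in product((0, 1), repeat=2)}

def abduct(nsaid: bool, aki: bool) -> dict:
    """Step 1: posterior over noise states consistent with the factual record."""
    consistent = {u: p for u, p in PRIOR.items() if f_aki(nsaid, *u) == aki}
    z = sum(consistent.values())
    return {u: p / z for u, p in consistent.items()}

def counterfactual_aki(posterior: dict, nsaid_cf: bool) -> float:
    """Steps 2-3: keep the abducted noise, impose do(NSAID), re-evaluate."""
    return sum(p for u, p in posterior.items() if f_aki(nsaid_cf, *u))

posterior = abduct(nsaid=True, aki=True)
p_aki_without = counterfactual_aki(posterior, nsaid_cf=False)
pn = 1.0 - p_aki_without  # probability of necessity
```

With these toy equations, three noise states are consistent with the factual record and each receives posterior weight 1/3, so the counterfactual AKI probability is 2/3 and the probability of necessity is 1/3. The 50% figure reported for the actual model depends on its own CPTs.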

In plain language: For this specific patient, the model estimates a 50% chance that the AKI would have occurred even without the NSAID, given their CKD, major surgery, and intraoperative hypotension. The probability of necessity is the complement of that figure, the chance that the AKI would not have occurred without the NSAID, which here is also 50%. That is a patient-specific attribution, and it is defensible to the extent that the structural assumptions listed under Identifiability hold.
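The arithmetic behind that figure, written out with e standing for the patient's full factual evidence (a notational shorthand introduced here):

```latex
\mathrm{PN}
  = P(\text{AKI}_{\text{NSAID}=\text{No}} = \text{No} \mid e)
  = 1 - P(\text{AKI}_{\text{NSAID}=\text{No}} = \text{Yes} \mid e)
  = 1 - 0.50
  = 0.50
```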

Prior — no evidence set

Population baseline before any patient data is entered. All nodes at their marginal priors.

Download the Model

The Bayes Server file below encodes the DAG and the conditional probability tables described above. Each observable node has a corresponding U-node, the exogenous noise variable that absorbs residual variation, which is what makes Rung 3 counterfactual abduction possible. The CPTs are populated with clinically defensible illustrative priors; the qualitative behavior they encode is what makes the gap between the Rung 1 and Rung 2 answers visible when both queries are run on the same model.


PharmacovigilanceAttribution.bayes

Structural Causal Model with explicit U-nodes (background-noise variables) for each observable. The U-nodes are essential here — they are what makes counterfactual abduction possible: from the factual observation (this patient, this exposure, this outcome) the U-values are inferred, then a do-intervention is applied, and the counterfactual outcome is read. The probability of necessity — the formal version of 'but-for cause' — is what the model computes.

The NSAID raises AKI risk in the population.
Did it cause the injury in this patient?
