Bayesian Updating

The Question

You weigh your cat on three different scales. The first reads 10 pounds and you trust it: N(10, 1). The second is older and noisier: N(15, 2). The third is the oldest and the noisiest: N(17, 3). Each reading is a Gaussian belief about the cat’s true weight — a mean, and a variance that says how sure you are.

What number do you report? And how sure are you of it?

Most people, asked quickly, give one of two answers: the average, which is 14, or the most recent reading, which is 17. Both are defensible intuitions. Both are wrong in a specific, recoverable way. The correct answer is 12.64, with variance 0.545 — and the gap between that answer and the intuitive ones is where Bayesian updating lives.

Four Ways to Combine

There are several reasonable-sounding ways to fuse three measurements. Each corresponds to a different implicit assumption about what the numbers mean.

Method	Mean	Variance	What it assumes
Simple average	14.00	—	All three measurements are equally trustworthy.
Keep only the most precise	10.00	1.00	Discard the other two entirely.
Keep only the most recent	17.00	3.00	The latest reading supersedes earlier ones.
Precision-weighted (Bayesian)	12.64	0.545	Each measurement’s influence is proportional to its precision.

Each of these is what someone would reach for. An engineer under time pressure takes the simple average. Someone who just calibrated a scale trusts only that reading. Someone with a “fresh data wins” bias keeps the last one. These are not stupid heuristics — they are defaults people fall back on when they don’t have a principled rule for combining beliefs.

Bayesian updating is the principled rule. The answer it produces, 12.64, sits closer to 10 than to 17, and below the simple average of 14, because the reading of 10 is the most precise of the three. That is not a quirk. It is the rule working.

The Rule Precision Obeys

The formula for combining Gaussian beliefs has two parts, and both are short.

Precision is the reciprocal of variance: τ = 1/σ². A reading with variance 1 has precision 1. A reading with variance 3 has precision 1/3.

Posterior precision is the sum of individual precisions: τ₁ + τ₂ + τ₃. Posterior mean is the precision-weighted average of the individual means: (μ₁τ₁ + μ₂τ₂ + μ₃τ₃) / (τ₁ + τ₂ + τ₃). Both formulas assume the errors in the three readings are independent of one another.
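A brief sketch of where the two formulas come from, assuming the readings are independent and that nothing is known about the weight beyond the readings themselves (a flat prior). Neither assumption is spelled out in the text, so treat this as one standard route to the result rather than the only one. Multiplying the three Gaussian densities adds their exponents, and completing the square in x gives a single Gaussian:

```latex
\begin{aligned}
p(x \mid \text{readings})
  &\propto \prod_{i=1}^{3} \exp\!\left(-\tfrac{1}{2}\,\tau_i\,(x-\mu_i)^2\right) \\
  &= \exp\!\left(-\tfrac{1}{2}\sum_{i=1}^{3} \tau_i\,(x-\mu_i)^2\right) \\
  &\propto \exp\!\left(-\tfrac{1}{2}\Big(\sum_{i=1}^{3}\tau_i\Big)
      \left(x - \frac{\sum_{i=1}^{3}\mu_i\tau_i}{\sum_{i=1}^{3}\tau_i}\right)^{2}\right)
\end{aligned}
```

The coefficient in front of the squared term is the posterior precision τ₁ + τ₂ + τ₃, and the value subtracted from x is the posterior mean.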

Worked out for the cat:

Posterior precision = 1 + ½ + ⅓ = 11/6 ≈ 1.833
Posterior variance = 6/11 ≈ 0.545
Posterior mean = (10·1 + 15·½ + 17·⅓) / (11/6) = 23.17 / 1.833 ≈ 12.64
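The arithmetic above can be checked with a few lines of Python. This is a minimal sketch, assuming independent readings; the function name `fuse` and the (mean, variance) tuple layout are illustrative choices, not something the text prescribes.

```python
def fuse(beliefs):
    """Fuse independent Gaussian beliefs given as (mean, variance) pairs.

    Returns the posterior (mean, variance) under precision weighting.
    """
    # Precision is the reciprocal of variance.
    precisions = [1.0 / var for _, var in beliefs]
    total_precision = sum(precisions)
    # Posterior mean is the precision-weighted average of the means.
    weighted_sum = sum(mu * tau for (mu, _), tau in zip(beliefs, precisions))
    return weighted_sum / total_precision, 1.0 / total_precision


# The three scale readings from the cat example: (mean, variance).
readings = [(10.0, 1.0), (15.0, 2.0), (17.0, 3.0)]

mean, variance = fuse(readings)
print(f"mean = {mean:.2f}, variance = {variance:.3f}")
# prints: mean = 12.64, variance = 0.545
```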

Two things in that arithmetic deserve to be noticed.

The mean is not the center of the data; it is pulled toward the more precise readings. The reading with variance 1 counts three times as much as the reading with variance 3. Normalized, the three weights are 6/11, 3/11, and 2/11. That is why 12.64 sits closer to 10 than to 17. Less noisy measurements count more, by an exact amount set by their variances.

The posterior variance is smaller than any individual variance. The best single reading had variance 1. The fused answer has variance 0.545. Three uncertain measurements produced a belief tighter than any one of them on its own. Most people’s intuition, pushed, is that combining noisy readings gives you something about as noisy as the cleanest one. That intuition is wrong. Precisions add. Evidence accumulates.
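One way to watch evidence accumulate is to fold the readings in one at a time. The sketch below assumes independent readings; the helper name `update` is hypothetical. It prints the belief after each reading and then repeats the process in reverse order to check that the final answer does not depend on which scale was read first.

```python
def update(belief, reading):
    """Combine the current Gaussian belief with one new independent reading.

    Both arguments and the result are (mean, variance) pairs.
    """
    (mu_b, var_b), (mu_r, var_r) = belief, reading
    precision = 1.0 / var_b + 1.0 / var_r           # precisions add
    mean = (mu_b / var_b + mu_r / var_r) / precision
    return mean, 1.0 / precision


readings = [(10.0, 1.0), (15.0, 2.0), (17.0, 3.0)]

# Start from the first reading and fold in the rest one by one.
belief = readings[0]
history = [belief]
for reading in readings[1:]:
    belief = update(belief, reading)
    history.append(belief)

for step, (mu, var) in enumerate(history, start=1):
    print(f"after {step} reading(s): mean = {mu:.2f}, variance = {var:.3f}")

# Same readings, opposite order.
reverse = readings[-1]
for reading in reversed(readings[:-1]):
    reverse = update(reverse, reading)
print(f"reverse order: mean = {reverse[0]:.2f}, variance = {reverse[1]:.3f}")
```

The variance shrinks at every step (1.000, then 0.667, then 0.545), and both orders end at the same mean of 12.64.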

Why It Matters

This is the operation that runs silently at every node in a Bayesian network, every time evidence enters. When a diagnostic inspector shows the posterior distribution on a latent variable after some observations are entered, precision-weighted fusion is what produced the answer, at least when the beliefs involved are Gaussian. When an abductive inspector infers the unobserved state of a U-variable from the factual evidence, it is the same updating, run from observed effects back to their unobserved causes.

The cat example is deliberately small. One latent variable. Three direct observations. No causal graph to navigate. That minimal setting is enough to show the principle that makes the rest of the machinery work: combining uncertain information is not averaging — it is adding precisions. Once that is clear, a Bayesian network is what happens when the same operation is run across a graph of related variables, with each node doing the same fusion on whatever evidence propagates to it from its neighbors.

For practitioners, the operational consequence is direct. When your model combines an expert prior with a data-driven likelihood, the expert’s influence depends on how certain the expert was, not just on what the expert said. When three risk indicators all point in slightly different directions, the fused assessment is not the average of their estimates — it is weighted by how reliable each indicator is. If a reviewer asks why your model weights one input more than another, the answer is in the variance, not in the mean. That is the vocabulary Bayesian updating gives you.
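To make the expert-prior point concrete, here is a small sketch with two beliefs: an expert estimate and a data-driven estimate. All numbers are hypothetical and chosen only for illustration, as are the function names. The expert says the same thing in both cases (a mean of 10); only the stated certainty changes.

```python
def expert_weight(expert_var, data_var):
    """Share of the posterior mean contributed by the expert's estimate."""
    tau_expert = 1.0 / expert_var
    tau_data = 1.0 / data_var
    return tau_expert / (tau_expert + tau_data)


def posterior_mean(expert_mean, expert_var, data_mean, data_var):
    """Precision-weighted mean of two independent Gaussian beliefs."""
    w = expert_weight(expert_var, data_var)
    return w * expert_mean + (1.0 - w) * data_mean


# Hypothetical inputs: the data estimate is N(15, 2) in both cases.
confident = posterior_mean(10.0, 0.25, 15.0, 2.0)  # expert variance 0.25
hesitant = posterior_mean(10.0, 4.0, 15.0, 2.0)    # expert variance 4

print(f"confident expert: weight = {expert_weight(0.25, 2.0):.2f}, mean = {confident:.2f}")
print(f"hesitant expert:  weight = {expert_weight(4.0, 2.0):.2f}, mean = {hesitant:.2f}")
# prints: confident expert: weight = 0.89, mean = 10.56
# prints: hesitant expert:  weight = 0.33, mean = 13.33
```

The expert's stated mean never changes, yet the fused answer moves from about 10.6 to about 13.3 purely because the variance attached to it changed.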

The Engagement

If your process for combining expert estimates, prior evidence, and new data does not weight each input by its precision, the fused answer is already wrong, regardless of how defensible each input was on its own.

info@rung3.ai

