
Bayes from Counting

Partial quotients, GCD decomposition, and the symmetric learning function on E = I × O

E = I × O: the joint event space and its factorization
GCD: separates common from differential evidence
rI / rO: the GCD bridge forces Bayes' theorem
Q = λ: luck = marginal luck × conditional luck

1. The E → N → Q Chain

Every joint event follows a three-step map: identify it, count it, compute the quotient (luck).

E: Joint Event, (i, o) ∈ I × O
N: Count, c(i, o) ∈ ℕ
Q: Quotient (Luck), λ = N / c(i, o)

At the N level: Counting

Step   Object            Meaning
E      (i, o)            What happened
N      c(i, o)           How often
Q      N / c(i,o) = λ    How surprising

At the Q level: Luck Decomposition

Component     Formula
Qjoint(i,o)   N / c(i,o)
QI(i)         N / c(i)
Q(o|i)        RI(i) / rI(i,o)
gI(i)         gcd_o c(i,o)
rI(i,o)       c(i,o) / gI(i)
RI(i)         c(i) / gI(i)
Key idea: The partial quotient divides E by a single atomic event i0 (not all of I), yielding a conditional distribution over O scaled by an equivalence class of size c(i0). The reconstruction identity: c(i0, o) = P(o|i0) · c(i0).
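The reconstruction identity can be checked directly. A minimal sketch in Python, using hypothetical counts chosen only for illustration (the variable names are my own, not from the text):

```python
from fractions import Fraction

# Hypothetical joint counts c(i, o) for i in {a, b}, o in {x, y}
c = {('a', 'x'): 6, ('a', 'y'): 10, ('b', 'x'): 9, ('b', 'y'): 3}

i0 = 'a'
c_i0 = sum(v for (i, o), v in c.items() if i == i0)  # c(i0) = 6 + 10 = 16

for o in ('x', 'y'):
    p = Fraction(c[(i0, o)], c_i0)   # P(o | i0), exact rational
    # Reconstruction identity: c(i0, o) = P(o | i0) * c(i0)
    assert p * c_i0 == c[(i0, o)]
print("reconstruction identity holds for i0 =", i0)
```

Dividing by the single atomic event i0 leaves a distribution over O whose denominators all carry the same equivalence-class size c(i0), which is why the identity is exact in integers.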

2. Interactive Joint Count Table

Edit the counts below and watch the GCD decomposition, partial quotients, conditionals, and Bayes verification update in real time.

Joint Count Table c(i, o)

Click any count to change it. All derived quantities update automatically.

[Interactive table: rows i = a, b; columns o = x, y; each cell holds c(i, o). Row margins show c(i), gI(i) = gcd_o c(i,o), and RI(i) = c(i)/gI(i); column margins show c(o), gO(o) = gcd_i c(i,o), and SO(o) = c(o)/gO(o).]

Row-Reduced Counts rI(i, o) = c(i,o) / gI(i)

[Interactive table: rI(i,o) for each cell; the gcd check column verifies gcd_o rI(i,o) = 1 for each row i.]

Column-Reduced Counts rO(i, o) = c(i,o) / gO(o)

[Interactive table: rO(i,o) for each cell; the gcd check row verifies gcd_i rO(i,o) = 1 for each column o.]
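Every quantity in the tables above follows mechanically from the raw counts. A sketch with hypothetical counts (any positive integers work; all names below are my own):

```python
from math import gcd

# Hypothetical joint counts c(i, o)
c = {('a', 'x'): 6, ('a', 'y'): 10, ('b', 'x'): 9, ('b', 'y'): 3}
I, O = ('a', 'b'), ('x', 'y')

c_i = {i: sum(c[(i, o)] for o in O) for i in I}            # row sums c(i)
c_o = {o: sum(c[(i, o)] for i in I) for o in O}            # column sums c(o)
g_I = {i: gcd(*(c[(i, o)] for o in O)) for i in I}         # row GCDs gI(i)
g_O = {o: gcd(*(c[(i, o)] for i in I)) for o in O}         # column GCDs gO(o)
R_I = {i: c_i[i] // g_I[i] for i in I}                     # reduced row sums RI(i)
S_O = {o: c_o[o] // g_O[o] for o in O}                     # reduced column sums SO(o)
r_I = {(i, o): c[(i, o)] // g_I[i] for i in I for o in O}  # row-reduced counts
r_O = {(i, o): c[(i, o)] // g_O[o] for i in I for o in O}  # column-reduced counts

# gcd checks: every reduced row and reduced column has GCD 1
for i in I:
    assert gcd(*(r_I[(i, o)] for o in O)) == 1
for o in O:
    assert gcd(*(r_O[(i, o)] for i in I)) == 1
print(g_I, R_I, g_O, S_O)
```

The gcd checks are guaranteed by construction: dividing a row by its own GCD leaves numbers with no common factor.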

3. GCD Decomposition

Each joint count c(i,o) decomposes two ways: by row GCD and by column GCD. The common evidence (GCD) separates from the differential evidence (reduced counts).

Row Decomposition: c(i,o) = gI(i) × rI(i,o)

Column Decomposition: c(i,o) = gO(o) × rO(i,o)

In log-support: s(i,o) = log2 gI(i) + log2 rI(i,o) = log2 gO(o) + log2 rO(i,o)
Common vs differential evidence: The GCD gI(i) is the part of the log-support shared by ALL outputs given input i -- it tells us about the prevalence of i without distinguishing which output accompanied it. The reduced count rI(i,o) is the irreducible evidence for this specific joint event.

4. Conditionals (GCD-Free)

The conditional probability depends ONLY on reduced counts. The GCD cancels completely:

Proposition: GCD cancels from the conditional
P(o|i) = c(i,o) / c(i) = [gI(i) · rI(i,o)] / [gI(i) · RI(i)] = rI(i,o) / RI(i)
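The proposition is easy to verify numerically. A sketch for a single fixed input, with hypothetical counts chosen so the row GCD is nontrivial:

```python
from fractions import Fraction
from math import gcd

# Hypothetical counts c(a, o) for a fixed input i = a; gI(a) = 2
c = {'x': 6, 'y': 10}
c_i = sum(c.values())                  # c(a) = 16
g = gcd(*c.values())                   # gI(a) = 2
r = {o: v // g for o, v in c.items()}  # rI(a, o): row-reduced counts
R = c_i // g                           # RI(a) = 8

for o in c:
    # P(o|a) from raw counts equals P(o|a) from GCD-free reduced counts
    assert Fraction(c[o], c_i) == Fraction(r[o], R)
print({o: Fraction(r[o], R) for o in c})
```

Both numerator and denominator carry the same factor gI(a), so it cancels exactly; the conditional never sees the common evidence.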

Conditional Probabilities (computed from your counts)

I-side: P(o|i)

[Interactive table: P(x|i) and P(y|i) for each input i = a, b.]

O-side: P(i|o)

[Interactive table: P(a|o) and P(b|o) for each output o = x, y.]

Marginals

[Interactive table: the marginals P(a), P(b), P(x), P(y).]

5. Bayes as GCD Consistency

The same joint count admits two GCD decompositions. Their consistency IS Bayes' theorem.

GCD Bridge Equation
rI(i,o) / rO(i,o) = gO(o) / gI(i)

GCD Bridge Verification (for each cell)

Bayes' Theorem falls out
P(o|i) / P(i|o) = [rI(i,o)/RI(i)] / [rO(i,o)/SO(o)] = [gO(o)/gI(i)] · [SO(o)/RI(i)] = c(o)/c(i) = P(o)/P(i)

Bayes Verification (for each cell)

What the proof says: Bayes' theorem is the consistency condition between the two partial quotients of E = I × O. Dividing by I = i and dividing by O = o must give compatible conditionals, because they decompose the same joint count. The GCD mediates: the ratio gO(o)/gI(i) converts between the two reduced-count representations.
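Both the bridge equation and the resulting Bayes ratio can be confirmed cell by cell. A sketch with hypothetical counts and exact rational arithmetic:

```python
from fractions import Fraction
from math import gcd

# Hypothetical joint counts
c = {('a', 'x'): 6, ('a', 'y'): 10, ('b', 'x'): 9, ('b', 'y'): 3}
I, O = ('a', 'b'), ('x', 'y')
c_i = {i: sum(c[(i, o)] for o in O) for i in I}
c_o = {o: sum(c[(i, o)] for i in I) for o in O}
g_I = {i: gcd(*(c[(i, o)] for o in O)) for i in I}
g_O = {o: gcd(*(c[(i, o)] for i in I)) for o in O}

for i in I:
    for o in O:
        rI = c[(i, o)] // g_I[i]
        rO = c[(i, o)] // g_O[o]
        # GCD bridge: rI(i,o) / rO(i,o) = gO(o) / gI(i)
        assert Fraction(rI, rO) == Fraction(g_O[o], g_I[i])
        # Bayes: P(o|i) / P(i|o) = c(o) / c(i) = P(o) / P(i)
        p_o_given_i = Fraction(c[(i, o)], c_i[i])
        p_i_given_o = Fraction(c[(i, o)], c_o[o])
        assert p_o_given_i / p_i_given_o == Fraction(c_o[o], c_i[i])
print("bridge and Bayes verified for all cells")
```

Both checks reduce to the same fact: gI(i)·rI(i,o) and gO(o)·rO(i,o) are two factorizations of the one joint count.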

6. Log-Support Decomposition

The log-support form reveals the symmetry: s(i,o) = log2 gI + log2 rI = log2 gO + log2 rO.

[Interactive table: for each cell (i,o), the values s(i,o), log2 gI, log2 rI, log2 gO, log2 rO, and a bridge check.]
Log-Bayes Corollary: The asymmetry of the conditionals equals the asymmetry of the priors
s(o|i) - s(i|o) = sO(o) - sI(i)
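The log-support bridge s(i,o) = log2 gI + log2 rI = log2 gO + log2 rO can be checked in floating point. A sketch with hypothetical counts (names are my own):

```python
from math import gcd, log2

# Hypothetical joint counts
c = {('a', 'x'): 6, ('a', 'y'): 10, ('b', 'x'): 9, ('b', 'y'): 3}
I, O = ('a', 'b'), ('x', 'y')
g_I = {i: gcd(*(c[(i, o)] for o in O)) for i in I}
g_O = {o: gcd(*(c[(i, o)] for i in I)) for o in O}

for (i, o), v in c.items():
    s = log2(v)                               # s(i,o) = log2 c(i,o)
    row = log2(g_I[i]) + log2(v // g_I[i])    # log2 gI + log2 rI
    col = log2(g_O[o]) + log2(v // g_O[o])    # log2 gO + log2 rO
    assert abs(s - row) < 1e-9 and abs(s - col) < 1e-9
print("log-support bridge checks pass")
```

Multiplicative factorizations of the count become additive splits of the log-support, which is what the bar chart below visualizes.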

Log-Support Bar Chart

Each bar shows s(i,o) = log2 c(i,o). The GCD decomposition splits each into common (gold) and differential (green) evidence.

7. Luck Decomposition: Q = Qmarginal × Qconditional

The joint luck λ(i,o) = 1/P(i,o) decomposes into marginal luck and conditional luck. The conditional luck depends only on the reduced counts -- the GCD cancels.

Luck Table

[Interactive table: for each cell (i,o), Qjoint = N/c(i,o), QI(i) = N/c(i), Q(o|i) = RI/rI, QO(o) = N/c(o), Q(i|o) = SO/rO.]

Luck Ratio Check: Q(o|i)/Q(i|o) = P(i)/P(o) = c(i)/c(o)

Luck is relative: The same event has different luck from different conditioning directions, balanced by the priors. Q(o|i)/Q(i|o) = P(i)/P(o). An event that is "lucky" from one direction is "ordinary" from the other.
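The factorization Q = Qmarginal × Qconditional and the luck ratio check both follow from count ratios alone. A sketch with hypothetical counts, using exact rationals:

```python
from fractions import Fraction

# Hypothetical joint counts
c = {('a', 'x'): 6, ('a', 'y'): 10, ('b', 'x'): 9, ('b', 'y'): 3}
I, O = ('a', 'b'), ('x', 'y')
N = sum(c.values())                               # total count N = 28
c_i = {i: sum(c[(i, o)] for o in O) for i in I}
c_o = {o: sum(c[(i, o)] for i in I) for o in O}

for (i, o), v in c.items():
    Q_joint = Fraction(N, v)              # joint luck λ(i,o) = 1/P(i,o)
    Q_I = Fraction(N, c_i[i])             # marginal luck QI(i)
    Q_o_given_i = Fraction(c_i[i], v)     # conditional luck = RI/rI
    Q_i_given_o = Fraction(c_o[o], v)     # conditional luck, other direction
    assert Q_joint == Q_I * Q_o_given_i   # Q = Qmarginal * Qconditional
    # Luck ratio check: Q(o|i)/Q(i|o) = c(i)/c(o) = P(i)/P(o)
    assert Q_o_given_i / Q_i_given_o == Fraction(c_i[i], c_o[o])
print("luck decomposition verified")
```

Writing the conditional luck as c(i)/c(i,o) makes the GCD cancellation visible: it equals RI(i)/rI(i,o) after dividing both counts by gI(i).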

8. Why Correlation Captures Causation

The log contingency table Mio = log2 c(i,o) is a single matrix that encodes BOTH directions. The same counts encode the causal direction (I → O) and the evidential direction (O → I). No additional data is needed to reverse the direction.

Pointwise Mutual Information (PMI)

PMI(i,o) = log2[c(i,o)·N / (c(i)·c(o))]. Positive means more association than independence predicts. PMI is symmetric in i and o, so the single quantity measures the correlation that underwrites both the causal and the evidential direction.

The Symmetric Matrix

The log contingency table encodes:

Causal: I → O

Fix row i, read off log supports for each o.

P(o|i) = 2^s(i,o) / Σo' 2^s(i,o')

Evidential: O → I

Fix column o, read off log supports for each i.

P(i|o) = 2^s(i,o) / Σi' 2^s(i',o)

CMP's claim justified: "Learning takes the directed causal arrow from rain to wet ground into a bidirectional one; we may infer that it has rained from the ground being wet just as well." The GCD structure guarantees that the causal and evidential readings are consistent via Bayes.
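A single log contingency table suffices for both readings. The sketch below (hypothetical counts, my own function names) normalizes the same matrix by row for the causal direction and by column for the evidential one, and computes PMI from the same counts:

```python
from math import log2

# Hypothetical joint counts; s is the log contingency table M_io = log2 c(i,o)
c = {('a', 'x'): 6, ('a', 'y'): 10, ('b', 'x'): 9, ('b', 'y'): 3}
I, O = ('a', 'b'), ('x', 'y')
N = sum(c.values())
s = {k: log2(v) for k, v in c.items()}

def p_o_given_i(o, i):
    # Causal read: fix row i, normalize exponentiated log-supports over O
    return 2 ** s[(i, o)] / sum(2 ** s[(i, o2)] for o2 in O)

def p_i_given_o(i, o):
    # Evidential read: fix column o, normalize over I
    return 2 ** s[(i, o)] / sum(2 ** s[(i2, o)] for i2 in I)

pmi = {(i, o): log2(c[(i, o)] * N
                    / (sum(c[(i, o2)] for o2 in O)
                       * sum(c[(i2, o)] for i2 in I)))
       for i in I for o in O}

# One matrix, two directions; no extra data needed to reverse the arrow
assert abs(p_o_given_i('x', 'a') - 6 / 16) < 1e-9
assert abs(p_i_given_o('a', 'x') - 6 / 15) < 1e-9
print(pmi)
```

Reversing the direction is just a change of normalization axis, which is the computational content of the bidirectional arrow.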

9. The Tropical Approximation

The UM's standard update uses the (max, min) tropical semiring. The tropical GCD (= min) upper bounds the integer GCD. When min divides all entries, the UM is exact.

Integer GCD vs Tropical GCD (= min)

[Interactive table: for each row and column, the integer GCD, the tropical GCD (min), the gap log2(min/gcd), and whether the UM's update is exact.]
When the gap is zero, the UM's tropical computation is exact Bayesian inference.
Tropical GCD approximation: min(ti, pij) in the UM is the tropical version of the integer GCD. For integer counts where the smallest count divides all others (common in generative processes), the gap is zero and the UM is exact.
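The gap between the two GCDs is easy to exhibit. A sketch with two hypothetical rows of counts, one where the minimum equals the integer GCD (the exact case) and one where it does not:

```python
from math import gcd, log2

# Hypothetical rows of counts: in 'exact' the min divides all entries,
# so min == gcd; in 'gap' it does not, so min > gcd
rows = {'exact': (3, 6, 12), 'gap': (4, 6, 10)}

for name, row in rows.items():
    g = gcd(*row)        # integer GCD of the row
    m = min(row)         # tropical GCD of the row
    gap = log2(m / g)    # gap in log-support, always >= 0 since gcd <= min
    print(name, g, m, gap, gap == 0)
```

Since the GCD divides every entry, gcd <= min always holds; the gap measures how much log-support the tropical approximation overcounts as common evidence.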
