Commentary on the February 11 Archive. S_k acts on Z_0^k by permuting coordinates.
S_128: symmetry group with 128! ≈ 10^{215} elements
d ≈ 20: effective dimension, d ≪ k = 128
60–128: prime factors in a typical macrostate N
20: papers traced through the Feb 11 archive
129: orbits under S_128 (w = 0..128)
1. The Equal-Dimension Factor Permutation
When the macrostate space factors as k copies of a single factor, Z ≅ Z_0^k, the symmetric group S_k acts by permuting coordinates:
Core construction: P_π(z_1, ..., z_k) = (z_{π^{-1}(1)}, ..., z_{π^{-1}(k)}). A factor map φ: E → S_k × Z_0^k must carry an explicit alignment π ∈ S_k.
In the E → N encoding, this is a permutation of primes:
P_π: ∏ p_i^{x_i} ↦ ∏ p_{π(i)}^{x_i}. An alignment is a bijection on prime indices.
For the sat-rnn: Z_0 = {0,1}, k = 128, and |S_128| = 128! ≈ 10^{215}.
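The two forms of the action can be checked against each other on a small example. The sketch below is illustrative only: the helper names (`first_primes`, `permute_state`, `encode`, `permute_encoded`) are hypothetical, indices are 0-based, and the i-th coordinate is assumed to be paired with the i-th prime.

```python
def first_primes(n):
    """Return the first n primes by trial division (adequate for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def permute_state(z, pi):
    """Coordinate action P_pi: the value at index i moves to index pi[i],
    so out[j] = z[pi^{-1}(j)]."""
    out = [None] * len(z)
    for i, target in enumerate(pi):
        out[target] = z[i]
    return out


def encode(x, primes):
    """Macrostate integer: product of p_i ** x_i."""
    n = 1
    for p, e in zip(primes, x):
        n *= p ** e
    return n


def permute_encoded(x, pi, primes):
    """Prime-side action: product of p_{pi(i)} ** x_i."""
    n = 1
    for i, e in enumerate(x):
        n *= primes[pi[i]] ** e
    return n


z = [1, 0, 1, 1, 0]
pi = [2, 0, 1, 4, 3]          # 0-based permutation: i -> pi[i]
primes = first_primes(len(z))  # [2, 3, 5, 7, 11]

permuted = permute_state(z, pi)
# Encoding the permuted state agrees with permuting the primes directly.
assert encode(permuted, primes) == permute_encoded(z, pi, primes)
print(permuted, encode(permuted, primes))
```

Moving the value at index i to index π(i) is the same convention as reading index π^{-1}(j) at position j, which is why the coordinate formula and the prime formula agree.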
2. Interactive Permutation Action
Watch how a permutation π ∈ S_k acts on binary coordinates. Each cell represents a neuron's binary state. The permutation relabels which neuron holds which value.
[Interactive figure: Coordinate Permutation on Z_0^k. The original state (z_1, ..., z_k) is mapped by P_π to the permuted state (z_{π^{-1}(1)}, ..., z_{π^{-1}(k)}).]
The binary state has the same Hamming weight before and after permutation. The content (which bits are on) is the alignment information π ∈ S_k.
3. Most Problems Have Low-Dimensional Inner Product Structure
The prediction depends on W_y h, where W_y is a linear map from R^{128} to R^{256}. If W_y has effective rank d ≪ 128, only d linear combinations of the 128 coordinates matter.
d ≈ 20 neurons suffice
Neuron h_28 alone accounts for 99.7% of compression, and the top 15 neurons exceed 100%. Redux with 20 neurons + 36% of W_h reaches 4.81 bpc (0.15 better than the full 128).
Alignment search collapses
Not S_128 (10^{215} elements) but C(128,20) · 20! = 128!/108! ≈ 10^{41}. The remaining 108 dimensions are gauge freedom.
Effective Dimensionality: d vs k
For k equal-dimension features with effective rank d, the alignment has C(k,d) · d! relevant configurations rather than k!. For d/k ≪ 1, this is an enormous reduction.
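The size of the two search spaces can be compared on a log scale. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and it relies on the identity C(k,d) · d! = k!/(k-d)!.

```python
import math


def log10_factorial(n):
    """log10(n!) via the log-gamma function, avoiding huge integers."""
    return math.lgamma(n + 1) / math.log(10)


def log10_full_search(k):
    """log10 of |S_k| = k!."""
    return log10_factorial(k)


def log10_low_dim_search(k, d):
    """log10 of C(k, d) * d!, which equals k! / (k - d)!."""
    return log10_factorial(k) - log10_factorial(k - d)


k, d = 128, 20
full = log10_full_search(k)
low = log10_low_dim_search(k, d)

# Cross-check the identity with exact integer arithmetic.
assert math.comb(k, d) * math.factorial(d) == math.factorial(k) // math.factorial(k - d)

print(f"log10 k!           = {full:.1f}")
print(f"log10 C(k,d) * d!  = {low:.1f}")
print(f"log10 reduction    = {full - low:.1f}")
```

The reduction factor is exactly (k-d)!, the number of ways to permute the coordinates that do not affect the prediction.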
[Interactive figure: 128-Bit Binary State, Signal vs Gauge. Blue cells are signal dimensions (d ≈ 20) and gold cells are gauge dimensions (108). The permutation acts on all 128 bits, but only the blue ones affect the prediction.]
4. Most Numbers Are Not Prime
In the E → N encoding, the macrostate integer is N(σ) = ∏ p_i^{b_i} where b_i ∈ {0,1}. With mean margin 60.5, typical macrostates have ~60–128 prime factors. They are maximally composite.
Factorization IS interpretation. The unique prime factorization of N(σ) recovers (b_1, ..., b_128) exactly. Composite numbers always decompose.
Orbits indexed by Hamming weight. S_128 permutes which primes appear. Two macrostates in the same orbit have the same number of prime factors. 129 orbits total.
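That orbits are indexed by Hamming weight can be made concrete: two binary states lie in the same orbit exactly when some permutation carries one to the other, and such a permutation exists whenever the weights match. The sketch below builds one such permutation; `find_alignment` and `apply_permutation` are hypothetical names, and the permutation it returns is one of many valid choices.

```python
def apply_permutation(z, pi):
    """Move the value at index i to index pi[i]."""
    out = [None] * len(z)
    for i, target in enumerate(pi):
        out[target] = z[i]
    return out


def find_alignment(a, b):
    """Return a permutation pi with apply_permutation(a, pi) == b,
    or None if a and b have different lengths or Hamming weights."""
    if len(a) != len(b) or sum(a) != sum(b):
        return None
    ones_a = [i for i, v in enumerate(a) if v == 1]
    zeros_a = [i for i, v in enumerate(a) if v == 0]
    ones_b = [j for j, v in enumerate(b) if v == 1]
    zeros_b = [j for j, v in enumerate(b) if v == 0]
    pi = [None] * len(a)
    # Send active positions to active positions, inactive to inactive.
    for i, j in zip(ones_a, ones_b):
        pi[i] = j
    for i, j in zip(zeros_a, zeros_b):
        pi[i] = j
    return pi


a = [1, 1, 0, 0, 1]
b = [0, 1, 1, 1, 0]
pi = find_alignment(a, b)
print(pi, apply_permutation(a, pi))

# Different Hamming weights: no permutation exists.
print(find_alignment([1, 0, 0], [1, 1, 0]))
```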
[Interactive figure: Macrostate Integer Factorization. Choose the number of active neurons (Hamming weight w) to see the macrostate integer N = ∏ p_i over the first w primes. Example setting: w = 5 active bits, using the first 20 primes for display, gives N = 2 · 3 · 5 · 7 · 11 = 2310, orbit size C(20, 5) = 15504, and log2(N) ≈ 11.2 bits.]
Each active bit contributes one prime factor to N. The prime factorization is unique (fundamental theorem of arithmetic, FTA), so the factorization IS the interpretation. At k = 128, the full product ∏ p_i, taken over the first 128 primes, is about 10^{298}.
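The encode and decode round trip can be sketched in a few lines. This is an illustration under the assumption that bit i is paired with the i-th prime; the helper names are hypothetical. Because every exponent is 0 or 1, decoding needs only a divisibility test per prime.

```python
import math


def first_primes(n):
    """Return the first n primes by trial division (adequate for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def encode_bits(bits, primes):
    """N = product of the primes whose bit is on."""
    n = 1
    for b, p in zip(bits, primes):
        if b:
            n *= p
    return n


def decode_bits(n, primes):
    """Recover the bits: b_i = 1 exactly when p_i divides N."""
    return [1 if n % p == 0 else 0 for p in primes]


# Small case matching the w = 5, 20-prime display setting.
primes20 = first_primes(20)
bits = [1] * 5 + [0] * 15
n = encode_bits(bits, primes20)
assert decode_bits(n, primes20) == bits
print(n, round(math.log2(n), 2))

# Full-size case: all 128 bits on, using the first 128 primes.
primes128 = first_primes(128)
full_product = encode_bits([1] * 128, primes128)
print(len(str(full_product)), "decimal digits")
```

Python integers are arbitrary precision, so the full 128-prime product is handled exactly without any special library.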
Orbit Size Distribution: C(128, w) under S_128
129 orbits indexed by Hamming weight w = 0..128. Maximum at w = 64 with C(128,64) ≈ 10^{37}. Mean margin 60.5 places typical states near the peak.
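The orbit counts above can be reproduced directly with the standard library. This is a small sketch; the variable names are illustrative.

```python
import math

k = 128
# One orbit per Hamming weight w; its size is the number of states of weight w.
orbit_sizes = [math.comb(k, w) for w in range(k + 1)]

num_orbits = len(orbit_sizes)
peak_weight = max(range(k + 1), key=lambda w: orbit_sizes[w])
peak_log10 = math.log10(orbit_sizes[peak_weight])

# The orbits partition all 2^k binary states.
assert sum(orbit_sizes) == 2 ** k

print("orbits:", num_orbits)
print("largest orbit at w =", peak_weight)
print(f"log10 of largest orbit = {peak_log10:.2f}")
```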
5. The Alignment Collapse
Alignment Search Space: k! vs C(k,d) · d!
Full S_k search (red) vs low-dimensional alignment (green) for k = 128. At d = 20, the reduction is a factor of 108! ≈ 10^{174}. This is why total interpretation is possible.
The overarching insight: The alignment problem is low-dimensional (d ≈ 20 ≪ k = 128), and the state is always decomposable (composite N factors uniquely). Together: total interpretation = resolving the factor permutation in a low-dimensional subspace of a highly composite macrostate space.
6. Trace Through the Twenty Papers
Click each paper to see how the factor-permutation structure manifests. Grouped by category: Foundation, Structure, Empirical, Synthesis.
7. The Three Pieces
Equal-dimension factors + low-dimensional inner product + highly composite N = total interpretation
1. Equal-dimension factors
Z_0 = {0,1}, k = 128 creates S_128 symmetry. Boolean automaton validates this is exact.
2. Low-dimensional inner product
d ≈ 20 collapses alignment from S_128 to a tractable subproblem. Redux, factor map, and single-neuron dominance confirm d ≪ k.
3. Composite macrostate integers
Typical Hamming weight ~60–128. FTA guarantees every composite number factors uniquely. Interpretability is structural, not contingent.
The CMP thesis:
"Interpretability and efficiency are the same problem, both resolved by recovering the correct factorization." Efficiency = bypass S_k search. Interpretability = compositeness of N.
Cost Advantage: Analytic Construction vs SGD
SGD searches S_k implicitly. Analytic construction bypasses alignment entirely. The gap widens from 10^{4.6} to 10^{7.2} at H = 4096.