Q2–Q4: Offsets, Neurons, and Saturation

Experiment: q234_results — 2026-02-11
"One neuron captures 99.7% of the compression. All 128 are volatile."

Three Questions, One Automaton

This experiment answers three of the seven questions from the total-interpretation program. Together they paint a picture of an RNN that uses deep memory, concentrates prediction in a single neuron, and maintains fully volatile Boolean dynamics.

Headline numbers:
  • Q2: dominant perturbation depth — d = 25
  • Q3: compression captured by 1 neuron (h28) — 99.7%
  • Q4: mean dwell time — 3.3 steps
  • Q4: volatile neurons — 128/128

Q2: The RNN Uses Deep Offsets

Method: For each test position and each depth d = 1…30, flip the input byte at t−d (XOR with 128), re-run the RNN forward, and measure the number of hidden-state sign changes and the output KL at position t. Average over 13 test positions.
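As a sketch, the Q2 probe looks like the following. Everything here is a stand-in: the weights Wh, Wx, Wy are random rather than the trained sat-rnn, and the 16-unit size is a toy (the real model has 128 units). Only the probe logic — XOR-flip at t−d, re-run, compare signs and output KL at t — mirrors the method above.

```python
import numpy as np

rng = np.random.default_rng(0)
N, V = 16, 256                        # toy sizes; the real model has N=128
Wh = rng.normal(0.0, 1.5, (N, N))     # strong recurrence -> saturated tanh units
Wx = rng.normal(0.0, 1.0, (N, V))
Wy = rng.normal(0.0, 0.5, (V, N))

def run(xs):
    """Run the toy RNN over a byte sequence; return all hidden states."""
    h, hs = np.zeros(N), []
    for x in xs:
        e = np.zeros(V); e[x] = 1.0
        h = np.tanh(Wh @ h + Wx @ e)
        hs.append(h.copy())
    return np.array(hs)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def probe(xs, t, d):
    """Flip the input byte at t-d (XOR 128), re-run forward, and measure
    sign changes and output KL (in bits) at position t."""
    xs_p = list(xs)
    xs_p[t - d] ^= 128
    h, hp = run(xs)[t], run(xs_p)[t]
    sign_changes = int(np.sum(np.sign(h) != np.sign(hp)))
    p, q = softmax(Wy @ h), softmax(Wy @ hp)
    kl_bits = float(np.sum(p * np.log2(p / q)))
    return sign_changes, kl_bits

xs = list(rng.integers(0, 256, 60))
print(probe(xs, t=50, d=25))
```

Averaging `probe(xs, t, d)` over test positions for each d gives the depth profile below.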

[Figure: Depth Perturbation Profile — mean sign changes and mean output KL (bits) as a function of perturbation depth d.]

Deep memory: Input perturbations from 25 steps back cause more sign changes (49.5) and larger prediction shifts (0.809 bits KL) than from 1 step back (8.1 changes, 0.296 bits). The MI-greedy offsets [1, 3, 8, 20] capture only 9.4% of the total sign-change signal.
"The RNN maintains information about inputs 20–30 steps in the past, consistent with BPTT-50 training."
— q234-results.tex
Readout ≠ dynamics: The factor-map found 52/128 neurons dominated by offsets (1,7). But offsets 1 and 7 together account for only 3.3% of the sign-change signal. The factor-map measures readout sensitivity; this experiment measures dynamical sensitivity. The readout depends on recent inputs; the dynamics integrates over long history.

Q3: Which Neurons Carry the Signal?

Method: For each neuron j, zero out the j-th column of Wy and measure the bpc change. This is a "readout knockout" — the neuron still participates in dynamics but cannot contribute to prediction.
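A minimal sketch of the readout knockout, again with random stand-ins for the hidden-state trajectories H and the readout matrix Wy (the real model has N=128 and a 256-way byte softmax):

```python
import numpy as np

rng = np.random.default_rng(1)
N, V, T = 8, 64, 200
Wy = rng.normal(0.0, 0.5, (V, N))              # stand-in readout matrix
H = np.tanh(rng.normal(0.0, 2.0, (T, N)))      # stand-in hidden trajectories
targets = rng.integers(0, V, T)                # stand-in next-byte targets

def bpc(Wy_):
    """Mean bits-per-character of the softmax readout H @ Wy_.T."""
    logits = H @ Wy_.T
    m = logits.max(axis=1, keepdims=True)
    logZ = (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).ravel()
    nll = logZ - logits[np.arange(T), targets]
    return float(nll.mean() / np.log(2))

base = bpc(Wy)
for j in range(N):
    ko = Wy.copy()
    ko[:, j] = 0.0    # readout knockout: neuron j still runs in the
                      # dynamics, but its column can't reach the output
    print(f"h{j}: dbpc = {bpc(ko) - base:+.4f}")
```

A positive Δbpc means the model predicts worse without that neuron's readout column, which is the importance measure used in the table below.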

Top Neurons by Knockout Importance

Neuron   Δbpc     ||Wy|| norm   Mean |h_j|
h28      +0.030   84.8          0.987
h105     +0.025   30.1          0.985
h54      +0.023   66.6          0.990
h17      +0.021   49.2          0.995
h49      +0.020   55.2          0.991
h10      +0.019   51.6          0.987
h97      +0.018   81.1          0.983
h3       +0.018    2.5          0.991
h3 is important despite tiny Wy: h3 has ||Wy|| = 2.5 — the smallest in the top 8 by a factor of 10. Its importance comes entirely from dynamics: through Wh, it influences other neurons' signs.

Minimal Subset Analysis

Keep only the top-k neurons by knockout importance and zero the rest of Wy:

Compression vs Number of Neurons

Neurons kept k   bpc      % of compression gap   Note
1                4.974     99.7%                 h28 alone
6                4.966    100.0%
10               4.948    100.5%
15               4.903    102.0%                 Best region
20               4.882    102.7%                 Peak performance
30               4.857    103.6%                 Still improving
128 (full)       4.965    100.0%                 113 neurons add noise
The full model is suboptimal: keeping 15–30 neurons achieves better bpc than keeping all 128. The remaining 98–113 neurons contribute net-negative signal through Wy — they add noise to the prediction, and no individual neuron among them contributes more than 0.002 bpc.
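The subset sweep itself is mechanical: rank neurons by knockout Δbpc, keep the top-k readout columns, zero the rest, and re-measure bpc. A sketch with the same random stand-ins as above (not the trained model):

```python
import numpy as np

rng = np.random.default_rng(3)
N, V, T = 12, 64, 300
Wy = rng.normal(0.0, 0.5, (V, N))
H = np.tanh(rng.normal(0.0, 2.0, (T, N)))
targets = rng.integers(0, V, T)

def bpc(Wy_):
    logits = H @ Wy_.T
    m = logits.max(axis=1, keepdims=True)
    logZ = (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).ravel()
    return float((logZ - logits[np.arange(T), targets]).mean() / np.log(2))

base = bpc(Wy)
importance = []
for j in range(N):
    ko = Wy.copy(); ko[:, j] = 0.0
    importance.append(bpc(ko) - base)        # dbpc when neuron j is knocked out
order = np.argsort(importance)[::-1]         # most important first

for k in (1, 3, 6, N):
    keep = np.zeros((V, N))
    keep[:, order[:k]] = Wy[:, order[:k]]    # top-k readout columns only
    print(f"k={k:3d}: bpc = {bpc(keep):.3f}")
```

With the real model this sweep produces the table above; with random weights the pattern (most columns adding noise) will not necessarily reproduce.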

What Do Top Neurons Predict?

  • h28: promotes q, h, f, r; demotes *, #, /, [. Letters vs symbols.
  • h54: promotes , , ], e, h; demotes f, 9, n. Punctuation context.
  • h97: promotes 6, q, 9, e; demotes *, #, {. Digits/letters vs symbols.

Q4: All Neurons Are Volatile

Method: Track each neuron's sign across all 520 positions. Count sign flips, measure dwell times, identify co-flip pairs.
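The bookkeeping here reduces to counting transitions in a (T, N) array of hidden-state signs. A self-contained sketch — `signs` below is random, whereas in the experiment it comes from running the real RNN over the 520 test positions:

```python
import numpy as np

def flip_stats(signs):
    """Per-neuron sign-flip counts and pooled dwell times (steps between
    consecutive flips of the same neuron). signs: (T, N) array of +1/-1."""
    flips = signs[1:] != signs[:-1]          # (T-1, N) flip indicators
    counts = flips.sum(axis=0)
    dwells = []
    for j in range(signs.shape[1]):
        pos = np.flatnonzero(flips[:, j])
        dwells.extend(np.diff(pos))          # gaps between flips of neuron j
    return counts, np.array(dwells)

rng = np.random.default_rng(4)
signs = np.sign(rng.normal(size=(520, 6)))   # stand-in sign trajectories
counts, dwells = flip_stats(signs)
print("flips per neuron:", counts)
print("mean dwell:", dwells.mean())
```

The same flip-indicator array also feeds the co-flip analysis later in this section.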

Most Volatile Neurons

Neuron   Flips   Mean |h|   Min |h|   % sat    Dwell mode
h54      234     0.990      0.049     92.3%    1
h37      206     0.986      0.089     91.0%    1
h47      201     0.982      0.035     90.2%    1
h110     197     0.990      0.191     93.1%    1
h3       161     0.991      0.004     95.8%    1
h75      153     0.996      0.508     95.6%    4
h40      150     0.996      0.542     97.3%    4
All 128 neurons are volatile: Every neuron flips sign more than 50 times in 520 positions. Zero frozen neurons, zero settled neurons. The least volatile still flips ~100 times. This contradicts the static picture of "112 settled + 16 active" neurons.
"At any single time step, ~123 neurons are saturated (|h| > 0.999) and ~5 are unsaturated. But the identity of the unsaturated neurons changes every step. Over 520 positions, every neuron passes through the unsaturated regime many times."
— q234-results.tex

Dwell Time Distribution

[Figure: histogram of steps between consecutive sign flips.]

Mean dwell time: 3.3 steps. 95% of dwells ≤ 10 steps. The Boolean state is rapidly mixing.

Co-Flip Structure

Neurons that flip at the same position form a co-flip graph. High Jaccard similarity (> 0.5) means the pair flips together more often than apart:

Pair        Co-flips   Jaccard   Individual flips
h17, h109   100        0.510     148, 148
h37, h54    100        0.294     206, 234
h30, h31     98        0.508     144, 147
h40, h46     98        0.476     150, 154
h46, h116    98        0.490     154, 144
h1, h58      96        0.508     141, 144
h30, h34     95        0.487     144, 146
h36, h86     93        0.449     145, 155
Neuron clusters encode shared features:
  • h30, h31, h34 — a triple with pairwise Jaccard ~0.5
  • h40, h46, h116 — a triple with pairwise co-flips ~98
  • h17, h109 — the tightest pair (Jaccard 0.51)
  • h37, h54, h47, h110, h52 — a loose cluster around h54 (most volatile)

These co-flip groups likely correspond to feature detectors: a context change (e.g., entering/leaving an XML tag) causes a coordinated sign flip across a group of neurons that encode the same feature.
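The co-flip measure is just set overlap: treat each neuron's flip positions as a set and take the Jaccard similarity. A sketch using random stand-in sign trajectories (the third neuron below is made to mirror the first, so its pair gets Jaccard 1.0):

```python
import numpy as np

def coflip_jaccard(flips, i, j):
    """Jaccard similarity of the flip-position sets of neurons i and j.
    flips: (T-1, N) boolean flip indicators."""
    inter = np.sum(flips[:, i] & flips[:, j])   # co-flips
    union = np.sum(flips[:, i] | flips[:, j])
    return inter / union if union else 0.0

rng = np.random.default_rng(5)
signs = np.sign(rng.normal(size=(520, 3)))
signs[:, 1] = signs[:, 0]                       # neuron 1 mirrors neuron 0
flips = signs[1:] != signs[:-1]
print(coflip_jaccard(flips, 0, 1))              # identical flip sets -> 1.0
print(coflip_jaccard(flips, 0, 2))              # independent neurons: low overlap
```

Thresholding this at Jaccard > 0.5 and linking the surviving pairs yields the co-flip graph whose clusters are listed above.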

Synthesis

"The sat-rnn is a 128-bit Boolean automaton where: every neuron participates in every prediction (through Wh), only ~15 neurons matter for readout (through Wy), all neurons flip frequently (dwell time ~3 steps), the dynamics propagates information over 20–30 steps, and co-flip groups encode shared contextual features."
— q234-results.tex