Archive 2026-02-06
Synthesis: Eight Days of RNN Interpretability
Summary: A comprehensive review of all findings from 20260131 through 20260204, collecting validated discoveries, refuted hypotheses, the theoretical framework, and next steps.
Validated Discoveries
ES1: Word Boundary (h2)
Δh2 > 0.95 marks word-start with 99.6% accuracy. Space → h2- (strength 8), letters → h2+ (strength 1).
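The threshold rule can be sketched in a few lines. This is a minimal illustration, assuming h2 is available as a per-character list of activations; the function name `word_starts` and the sample trace are hypothetical and not taken from the original experiments.

```python
def word_starts(h2, threshold=0.95):
    """Return indices t where the jump h2[t] - h2[t-1] exceeds the threshold."""
    return [t for t in range(1, len(h2)) if h2[t] - h2[t - 1] > threshold]


# Hypothetical h2 trace for the text "ab cd": the space (index 2) pushes h2
# strongly negative, and the following letter pulls it back up.
trace = [0.6, 0.7, -0.9, 0.5, 0.7]
print(word_starts(trace))  # [3]
```

Because the rule looks at a difference, the first character of the sequence has no Δh2 and is never flagged; a real detector would need to treat position 0 separately.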
ES2: Syllable Momentum (h35)
Tracks consonant-vowel (CV) syllable structure. Activation decays over word length via a negative self-connection (-0.184).
Pattern Injection via SVD
1 bit/char head start (5.46 → 4.47 bpc) without any training. Effective dimension ~64.
18 Natural Event Spaces
128 neurons cluster into ~18 groups with r > 0.9. Massive redundancy in doubled-E representation.
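One way to obtain such groups is to link any two neurons whose activation traces correlate above the threshold and take the connected components. The sketch below assumes single-linkage grouping on Pearson r; the original clustering procedure is not stated, so `pearson`, `correlation_groups`, and the toy traces are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlation_groups(traces, r_min=0.9):
    """Group neurons into connected components of the 'r > r_min' graph (union-find)."""
    parent = list(range(len(traces)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(traces)):
        for j in range(i + 1, len(traces)):
            if pearson(traces[i], traces[j]) > r_min:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(traces)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy traces: neuron 1 is a scaled copy of neuron 0; neuron 2 is anti-correlated.
traces = [
    [0.1, 0.5, 0.9, 0.3],
    [0.2, 1.0, 1.8, 0.6],
    [0.9, 0.1, 0.4, 0.8],
]
print(correlation_groups(traces))  # [[0, 1], [2]]
```

Single linkage can chain loosely related neurons into one group, so a stricter criterion (every pair in a group above the threshold) may give a different group count.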
Refuted Hypotheses
53% Compression Claim
The theoretical maximum is actually ≤3.7%. Event spaces (ES) capture only 15.9% of the Markov mutual information (MI).
Spectral Radius ~ 1
Measured |λmax| = 2.52. Stability via tanh saturation, not eigenvalue tuning.
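Why an expanding linear map can still give bounded dynamics is easiest to see in one dimension. The following is a toy scalar sketch, assuming a single self-connected unit with weight 2.52; it is not the 128-neuron model, and `iterate` is a hypothetical helper.

```python
import math


def iterate(w, h0, steps, squash):
    """Iterate h <- tanh(w * h) if squash is true, else the linear map h <- w * h."""
    h = h0
    history = []
    for _ in range(steps):
        pre = w * h
        h = math.tanh(pre) if squash else pre
        history.append(h)
    return history


w = 2.52  # same magnitude as the measured |λmax|
linear = iterate(w, 0.1, 20, squash=False)
bounded = iterate(w, 0.1, 20, squash=True)

print(abs(linear[-1]))               # grows like 0.1 * 2.52**20
print(max(abs(h) for h in bounded))  # stays within [-1, 1]
```

The linear recurrence diverges because |w| > 1, while the tanh version settles at a nonzero fixed point of h = tanh(w·h). Saturation, rather than the size of the weight, is what keeps the state bounded.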
Word Identity Encoding
Hidden states achieve only 4.9-6.4% word recognition accuracy. Words are emergent, not explicit.
Coverage
- 302 significant patterns (49 input→hidden, 253 hidden→output)
- 4,183 recurrent patterns (largely unexplored)
- ~31% of events touched, <5% of bpc explained
- Model: 5.69 bpc (SOTA: 1.1 bpc)
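For reference, bits per character (bpc) is the mean negative log2 probability the model assigns to each character that actually occurs. A minimal sketch, assuming the per-character probabilities are already available as a list; `bits_per_char` is an illustrative name.

```python
import math


def bits_per_char(probs):
    """Mean of -log2(p) over the probabilities given to the observed characters."""
    return sum(-math.log2(p) for p in probs) / len(probs)


# Three characters predicted with probability 1/2, 1/4 and 1/4: (1 + 2 + 2) / 3 bits.
print(bits_per_char([0.5, 0.25, 0.25]))
```

As a scale check, a uniform guess over 256 symbols costs exactly 8 bpc under this definition.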
Theoretical Framework
- Q = λ: Quotient equals luck, unifying Bayes/Thermo/AC/RNN
- Tick-Tock: RNN training ↔ UM interpretation cycle
- Isomorphic UM: Exact match (0.00% bpc diff) via doubled-E