
Archive 2026-02-06

Synthesis: Eight Days of RNN Interpretability

Papers

Summary: A comprehensive review of all findings from 20260131 through 20260204, covering validated discoveries, refuted hypotheses, the theoretical framework, and next steps.

Validated Discoveries

ES1: Word Boundary (h2)
Δh2 > 0.95 marks word-start with 99.6% accuracy. Space → h2- (strength 8), letters → h2+ (strength 1).
ES2: Syllable Momentum (h35)
Tracks consonant-vowel (CV) syllable structure. Decays over word length via a negative self-connection (-0.184).
Pattern Injection via SVD
Roughly 1 bit/char head start (5.46 → 4.47 bpc, a drop of 0.99 bpc) without any training. Effective dimension ~64.
18 Natural Event Spaces
The 128 neurons cluster into ~18 groups with correlation r > 0.9, which indicates massive redundancy in the doubled-E representation.
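
Two of the findings above lend themselves to a short sketch: the Δh2 threshold used as a word-start detector, and the grouping of neurons by correlation. The code below is an illustration under stated assumptions, not the analysis code behind these results. The function names (`word_start_mask`, `correlation_groups`), the use of signed Pearson r, and the choice of connected components as the grouping rule are all hypothetical; the source only gives the thresholds (0.95 and 0.9).

```python
import numpy as np


def word_start_mask(h2, threshold=0.95):
    """Flag time steps where h2 rises by more than `threshold` in one step.

    `h2` is assumed to be a 1-D trace of a single hidden unit over time.
    """
    h2 = np.asarray(h2, dtype=float)
    # Prepending h2[0] makes the first difference 0, so step 0 is never flagged.
    delta = np.diff(h2, prepend=h2[0])
    return delta > threshold


def correlation_groups(H, r_min=0.9):
    """Group neurons (columns of H) linked by Pearson r > r_min.

    Assumption: a "group" is a connected component of the graph whose
    edges are pairs with r > r_min. H has shape (time_steps, neurons).
    """
    H = np.asarray(H, dtype=float)
    corr = np.corrcoef(H, rowvar=False)
    n = corr.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if corr[i, j] > r_min:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Applied to a recorded hidden-state matrix of shape (time_steps, 128), `len(correlation_groups(H))` would give the number of groups under this particular grouping rule; a different rule (for example hierarchical clustering, or |r| instead of signed r) could give a different count.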

Refuted Hypotheses

53% Compression Claim
The theoretical maximum is actually ≤3.7%. ES captures only 15.9% of the Markov mutual information (MI).
Spectral Radius ~ 1
Measured |λmax| = 2.52. Stability comes from tanh saturation, not eigenvalue tuning.
Word Identity Encoding
Hidden states achieve only 4.9-6.4% word recognition accuracy. Words are emergent, not explicit.
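
The spectral-radius refutation above can be illustrated with a toy experiment: a recurrent matrix with |λmax| well above 1 diverges under a linear update but stays bounded once tanh is applied. This is a minimal sketch, assuming a random Gaussian matrix as a stand-in for the trained weights (which are not reproduced here). Only the value 2.52 and the hidden size 128 come from the text; `spectral_radius` and `run_autonomous` are hypothetical helper names.

```python
import numpy as np


def spectral_radius(W):
    """Largest eigenvalue magnitude |λmax| of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(W))))


def run_autonomous(W, h0, steps=50, saturate=True):
    """Iterate h <- tanh(W h) (or h <- W h) with no input; return the norms."""
    h = np.asarray(h0, dtype=float)
    norms = []
    for _ in range(steps):
        pre = W @ h
        h = np.tanh(pre) if saturate else pre
        norms.append(float(np.linalg.norm(h)))
    return norms


rng = np.random.default_rng(0)
n = 128  # hidden size from the text
W = rng.standard_normal((n, n))  # stand-in weights, NOT the trained model
W *= 2.52 / spectral_radius(W)   # rescale so that |λmax| = 2.52
h0 = 0.1 * rng.standard_normal(n)

bounded = run_autonomous(W, h0, saturate=True)
linear = run_autonomous(W, h0, saturate=False)

# With tanh every unit lies in [-1, 1], so the norm cannot exceed sqrt(n).
print("tanh max norm:", max(bounded), "bound:", np.sqrt(n))
print("linear final norm:", linear[-1])
```

The tanh bound holds for any weight matrix, which is why a spectral radius above 1 does not by itself imply instability in a saturating network.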

Coverage

Theoretical Framework