Archive 2026-02-10

The Research Journey: A Comprehensive Narrative

Summary: This entry chronicles ten days of intensive research into RNN interpretability through the Universal Model framework. It explains what happened at each stage, which insights were picked up, which hypotheses were dropped, and how the understanding evolved from initial confusion to a coherent framework.

Key Themes

Phase 1: The Doubled-E Isomorphism (31 Jan)
RNNs are already Universal Models. The tanh↔softmax equivalence gives an exact translation (0.000% bpc difference); see the worked identity after this list.
Phase 2: The Tock Methodology (31 Jan)
Tick-tock cycle: train RNN, extract patterns. Refutation of the 53% claim. Coverage ≠ explanatory power.
Phase 3: Pattern Injection and Q=λ (31 Jan)
SVD-based UM→RNN gives 1 bit/char head start. Quotient equals luck: Bayesian interpretation of compression.
Phase 4: Lexicon and Word-Level Structure (1-4 Feb)
Word identity refuted (4.9-6.4% accuracy). RNN encodes character transitions, not words. ES1 (word boundary), ES2 (syllable momentum).
Phase 5: Synthesis and Saturation (6 Feb)
Eight days in review. Saturation experiment: 0.079 bpc on 1024 bytes. Memorization simplifies interpretation.
Phase 6: The Export Gap (7 Feb)
Quantization chaos (0.09-2.1 bpc). W_h bottleneck. BPTT-50 artifact. The gap is a signal about missing skip patterns.
Phase 7: Pattern Chains (7-8 Feb)
Direct UM built from data surpasses the RNN (0.067 vs 0.079 bpc). Backward trie = ground-truth attention map (sketched after this list).
Phase 8: Skip-Patterns (8 Feb)
Greedy offset selection (sketched after this list). 4 non-contiguous bytes match 12 contiguous ones with 9× fewer patterns. DSS doubling for artifact detection.
Phase 9: Weight Construction (8 Feb)
RNN weights from data, no BPTT. Better generalization (5.43-5.59 vs 8.22 bpc test). Readout loss: 0.147 bpc.
Phase 10: The Factor Map (9 Feb)
Every neuron is a 2-offset conjunction detector (see the R² sketch after this list). Mean R² = 0.837. 92.5% of the RNN's gain captured.
Phase 11: Sparse Diff (9 Feb)
Sensitivity profiles for neuron matching (sketched after this list). Solves the neuron permutation problem.
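
Worked identity for Phase 1 (a reconstruction: I take "doubled-E" to mean doubling each logit x into the pair ±x; the entry's own derivation may differ). A 2-class softmax over (x, −x) reproduces tanh exactly:

\[
\operatorname{softmax}(x, -x) = \left( \frac{e^{x}}{e^{x} + e^{-x}},\ \frac{e^{-x}}{e^{x} + e^{-x}} \right),
\qquad
p_{+} - p_{-} = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = \tanh(x).
\]

Because the map is exact rather than approximate, a translated model assigns identical probabilities, consistent with the 0.000% bpc difference.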
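
Sketch for Phase 7, assuming "backward trie" means a trie over contexts read backwards from the prediction point, with next-byte counts at every node; the depth of the deepest match then says exactly how far back the prediction looks, which is the sense in which it is a ground-truth attention map. Class and method names here are illustrative, not the entry's actual code:

```python
from collections import defaultdict

class BackwardTrie:
    """Trie over reversed contexts: the root is the empty context and each
    level adds the byte one step further back from the prediction point."""
    def __init__(self):
        self.children = {}
        self.counts = defaultdict(int)  # next byte -> count at this node

    def insert(self, context: bytes, nxt: int, max_depth: int = 8):
        node = self
        node.counts[nxt] += 1
        for b in reversed(context[-max_depth:]):  # walk backwards, deepening
            node = node.children.setdefault(b, BackwardTrie())
            node.counts[nxt] += 1

    def predict(self, context: bytes, max_depth: int = 8):
        """Next-byte counts at the deepest matching node, plus the match
        depth: how many context bytes the prediction actually attends to."""
        node, depth = self, 0
        for b in reversed(context[-max_depth:]):
            if b not in node.children:
                break
            node, depth = node.children[b], depth + 1
        return node.counts, depth

data = b"abracadabra abracadabra"
trie = BackwardTrie()
for i in range(1, len(data)):
    trie.insert(data[:i], data[i])
print(trie.predict(b"abracad"))  # counts at deepest match, and its depth
```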
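
A minimal sketch of Phase 8's greedy offset selection, under the assumption that "greedy" means repeatedly adding the backward offset that most reduces the empirical conditional entropy of the next byte; the scoring criterion and candidate range are my assumptions, since the entry only names the greedy strategy:

```python
from collections import Counter
from math import log2

def cond_entropy(data: bytes, offsets: tuple) -> float:
    """Empirical H(next byte | bytes at the given backward offsets)."""
    joint, ctx = Counter(), Counter()
    start = max(offsets)
    for i in range(start, len(data)):
        key = tuple(data[i - o] for o in offsets)
        joint[key, data[i]] += 1
        ctx[key] += 1
    n = len(data) - start
    return -sum(c / n * log2(c / ctx[key]) for (key, _), c in joint.items())

def greedy_offsets(data: bytes, candidates: range, k: int) -> list:
    """Pick k backward offsets one at a time, minimizing conditional entropy."""
    chosen = []
    for _ in range(k):
        best = min((o for o in candidates if o not in chosen),
                   key=lambda o: cond_entropy(data, tuple(chosen) + (o,)))
        chosen.append(best)
    return chosen

data = b"the quick brown fox jumps over the lazy dog " * 50
print(greedy_offsets(data, range(1, 17), k=4))  # 4 possibly non-contiguous offsets
```

Nothing here guarantees the entry's exact numbers; it only illustrates why a few well-chosen non-contiguous offsets can match many contiguous ones.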
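
For Phase 10, a hedged sketch of what testing "every neuron is a 2-offset conjunction detector" could look like: regress a neuron's activation on the indicator that the bytes at two fixed backward offsets take two fixed values, and report R². The feature encoding and regression setup are my assumptions:

```python
import numpy as np

def conjunction_r2(data: bytes, acts: np.ndarray,
                   o1: int, o2: int, b1: int, b2: int) -> float:
    """R^2 of regressing one neuron's activations on the indicator
    [data[i-o1] == b1 AND data[i-o2] == b2], over all positions i."""
    start = max(o1, o2)
    y = acts[start:]                      # acts[i] = activation after byte i
    x = np.array([float(data[i - o1] == b1 and data[i - o2] == b2)
                  for i in range(start, len(data))])
    X = np.stack([x, np.ones_like(x)], axis=1)  # slope + intercept
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return 1.0 - ((y - X @ coef) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

Searching over (o1, o2, b1, b2) for each neuron and keeping the best fit would yield the kind of per-neuron scores behind the reported mean R² = 0.837.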
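
Phase 11's sensitivity profiles, as I read them (the perturbation scheme and cosine matching are illustrative assumptions): perturb the input byte at each backward offset, record how much every neuron's activation moves, and match neurons across two networks by comparing the resulting profile vectors:

```python
import numpy as np

def sensitivity_profile(run, context: np.ndarray, n_offsets: int) -> np.ndarray:
    """run(context) -> hidden-state vector. Returns an (n_offsets, n_neurons)
    matrix of |activation change| when the byte at each offset is altered."""
    base = run(context)
    rows = []
    for o in range(1, n_offsets + 1):
        perturbed = context.copy()
        perturbed[-o] = (perturbed[-o] + 1) % 256  # alter one input byte
        rows.append(np.abs(run(perturbed) - base))
    return np.stack(rows)

def match_neurons(prof_a: np.ndarray, prof_b: np.ndarray) -> np.ndarray:
    """Pair up neurons (columns) across two networks by cosine similarity of
    their sensitivity profiles, resolving the permutation ambiguity."""
    a = prof_a / np.linalg.norm(prof_a, axis=0, keepdims=True)
    b = prof_b / np.linalg.norm(prof_b, axis=0, keepdims=True)
    return (a.T @ b).argmax(axis=1)  # greedy; Hungarian assignment would be exact
```

Averaging profiles over many contexts would stabilize the match; a single context is shown only to keep the sketch short.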

What Was Picked Up

What Was Dropped

Navigation

Next: 20260211 →
Toward total interpretation: backward attribution chains, sat-rnn-redux, six formal questions.
← Previous: 20260209
Sparse diff, factor map, neuron sensitivity profiles.