Archive: 2026-01-31_2

Tock methodology: ES extraction, joint event spaces, and critical analysis

Focus
Formalizing the "tock" phase: extracting interpretable structure from a trained RNN.
Key Results
ES captures 15.9% of Markov MI. Max entropy loses 2.09 bits/char within-ES.
Builds On
2026-01-31

Figures

fig-01: Joint Event Space
Byte alphabet factorization: E = ES × within-ES. Shows 65K joint space reducing to 25 ES-pairs.
fig-02: Product Pattern
How patterns lift from factor spaces to products. Concrete "th"→"e" decomposition.
fig-03: E → N Bijection
Events as natural numbers. Mixed-radix encoding, row-major for products. (See the sketch after the figure list.)
fig-04: Training Loss Curves
Baseline vs Augmented early training. (See critique for caveats.)
fig-05: Pattern Rings (ES Transitions)
ES→ES transitions as weighted arcs. Interactive controls.
fig-06: Augmented Model Probe
ES weight analysis: h43 is a Vowel/Other discriminator (+1.66 vs -1.63).
fig-07: ES-Conditional Predictions
Model predictions: "." → 99% whitespace, "a" vs "e" shows position effects.
fig-08: UM Pattern Rings (Byte→Byte)
Data-based patterns. Top: 'e'→' ' (255), ' '→'t' (248), 't'→'h' (247).
fig-09: Pattern Matrix (Simple)
ES×ES matrix with top patterns per transition. Clean table view.
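
A minimal sketch of the factorization and encoding behind fig-01 and fig-03, in Python. The five-way ES partition (vowel/consonant/whitespace/digit/other) and all function names are illustrative assumptions; the archive's actual classes live in patterns.json and the tock paper.

```python
# Hypothetical ES partition for illustration; the archive's real classes
# come from patterns.json / tock.tex.
ES_CLASSES = ["vowel", "consonant", "whitespace", "digit", "other"]

def es_of(byte: int) -> int:
    """Map a byte to an ES index under the illustrative partition."""
    c = chr(byte)
    if c.lower() in "aeiou":
        return 0  # vowel
    if c.isalpha():
        return 1  # consonant
    if c.isspace():
        return 2  # whitespace
    if c.isdigit():
        return 3  # digit
    return 4      # other

def encode_product(indices, radices) -> int:
    """Row-major (mixed-radix) encoding of a product event:
    n = (...(i0 * r1 + i1) * r2 + i2)..."""
    n = 0
    for i, r in zip(indices, radices):
        assert 0 <= i < r
        n = n * r + i
    return n

def decode_product(n: int, radices) -> list:
    """Inverse of encode_product."""
    digits = []
    for r in reversed(radices):
        digits.append(n % r)
        n //= r
    return digits[::-1]

# A byte bigram is one point in the 256 x 256 = 65,536 joint space;
# its ES projection lives in the 5 x 5 = 25 ES-pair space.
bigram = (ord("t"), ord("h"))
joint_n = encode_product(bigram, (256, 256))      # index in 0..65535
es_pair = tuple(es_of(b) for b in bigram)         # (consonant, consonant)
es_n = encode_product(es_pair, (5, 5))            # index in 0..24
assert decode_product(joint_n, (256, 256)) == list(bigram)
```

Row-major encoding is what lines up the 256×256 = 65,536 joint bigram space and the 5×5 = 25 ES-pair space as nested mixed-radix indices.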

Papers

Quotient Spaces as Bias
Equivalence classes as bias terms. Marginalization in architecture.
Tick: Training with Factored Event Spaces
Augmented input experiments. (Results superseded by critique.)
Tock: Extracting Interpretable Structure
ES-Markov interpretation, Bayesian granularity criterion.
Tock-2 Predictions
Second-order ESs: ES-pairs, positional ESs, XML structure.
CRITIQUE: The 53% Illusion
Call and response. 53% → 3.7% max. ES explains 15.9%, not 53%.
Quotient Renormalization and Maximum Entropy
Working in N→E. The 2.09-bit gap. Why max entropy fails. (See the sketch after this list.)
OPEN QUESTIONS for 20260131_3
8 research questions. Better ESs, hierarchical coding, tock from hidden states.
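
A hedged sketch of the within-ES entropy accounting behind the 2.09-bit figure, assuming the gap is measured per character as (uniform, max-entropy within-ES code length) minus (empirical within-ES code length). The quotient-maxent paper may condition on more context (e.g. the previous ES); the es_of partition below is the same illustrative one as in the earlier sketch.

```python
import math
from collections import Counter

def es_of(b: int) -> int:
    """Illustrative 5-way ES partition; the papers' actual event spaces may differ."""
    c = chr(b)
    if c.lower() in "aeiou":
        return 0  # vowel
    if c.isalpha():
        return 1  # consonant
    if c.isspace():
        return 2  # whitespace
    if c.isdigit():
        return 3  # digit
    return 4      # other

# Class sizes over the full byte alphabet (for the uniform / max-entropy code).
ES_SIZE = Counter(es_of(x) for x in range(256))

def within_es_maxent_gap(data: bytes) -> float:
    """Bits/char lost by coding the within-ES symbol uniformly (max entropy)
    instead of with its empirical within-ES distribution."""
    byte_counts = Counter(data)
    es_counts = Counter()
    for b, c in byte_counts.items():
        es_counts[es_of(b)] += c
    n = len(data)
    gap = 0.0
    for b, c in byte_counts.items():
        p = c / n                                    # P(byte)
        p_within = c / es_counts[es_of(b)]           # P(byte | its ES)
        uniform_bits = math.log2(ES_SIZE[es_of(b)])  # max-entropy code length
        empirical_bits = -math.log2(p_within)        # empirical code length
        gap += p * (uniform_bits - empirical_bits)
    return gap

# e.g. within_es_maxent_gap(open("enwik9", "rb").read(10_000_000))
```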

Data

patterns.json
Extracted bigram patterns from 10M bytes. ES pairs and top 100 patterns. (Extraction sketch after this list.)
experience-report.md
Lessons learned: Bayesian bounds first, validate patterns against data, wait for convergence.
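
A sketch of how a file like patterns.json could be regenerated, under assumptions: the illustrative es_of partition from the sketches above, and "top 100 patterns" read as the 100 most frequent byte bigrams per ES-pair. The actual schema of patterns.json may differ.

```python
import json
from collections import Counter, defaultdict

def es_of(b: int) -> int:
    """Same illustrative 5-way ES partition as in the earlier sketches."""
    c = chr(b)
    if c.lower() in "aeiou":
        return 0  # vowel
    if c.isalpha():
        return 1  # consonant
    if c.isspace():
        return 2  # whitespace
    if c.isdigit():
        return 3  # digit
    return 4      # other

def extract_patterns(data: bytes, top_k: int = 100) -> dict:
    """Count byte bigrams, group them by ES-pair, keep the top_k per pair."""
    bigrams = Counter(zip(data, data[1:]))
    by_es_pair = defaultdict(Counter)
    for (a, b), count in bigrams.items():
        by_es_pair[(es_of(a), es_of(b))][(a, b)] += count
    return {
        f"{ea}->{eb}": [
            {"bigram": bytes([a, b]).decode("latin-1"), "count": c}
            for (a, b), c in pair_counts.most_common(top_k)
        ]
        for (ea, eb), pair_counts in by_es_pair.items()
    }

# e.g.:
# data = open("enwik9", "rb").read(10_000_000)
# with open("patterns.json", "w") as f:
#     json.dump(extract_patterns(data), f, ensure_ascii=False, indent=2)
```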

Source (.tex)

quotient.tex tick.tex tock.tex tock2-predictions.tex critique.tex quotient-maxent.tex open-questions.tex

Live Training

Training Logs
Augmented model on full enwik9 (1B chars, 3 epochs)

Navigation

← Previous: 20260131
Initial ES experiments, activation probing, RNN-UM mapping.
Next: 20260131_3 →
Memory traces, time, integration, and explanatory sufficiency.