fig-20260131_2-07 | Model predictions reveal learned ES transition structure
The augmented model (1M chars of training data) has learned strong ES transition patterns that match our tock2 predictions (see the sketch after this list for measuring the same transitions empirically):
- Whitespace → expecting a word start
- Digit → expecting a number end
- Period → whitespace almost certain
- Vowel (in-word) → consonants
- Vowel (word-final, e.g. 'e') → mixed
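As a sanity check on these patterns, the empirical ES→ES transition matrix can be counted directly from the training text and compared against what the model predicts. Below is a minimal sketch; the `char_to_es` classifier and the file path are assumptions standing in for the actual ES definitions used in the hutter code.

```python
from collections import Counter, defaultdict

# Hypothetical character-class mapping; the real ES definitions live in the
# hutter code and may differ from this sketch.
def char_to_es(c):
    if c.isspace():
        return "whitespace"
    if c.isdigit():
        return "digit"
    if c in ".!?":
        return "period"
    if c.lower() in "aeiou":
        return "vowel"
    if c.isalpha():
        return "consonant"
    return "other"

def es_transition_matrix(text):
    """Empirical P(next ES | prev ES) counted from raw text, for comparison
    against the transition structure the model has learned."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[char_to_es(prev)][char_to_es(nxt)] += 1
    return {es: {k: v / sum(c.values()) for k, v in c.items()}
            for es, c in counts.items()}

if __name__ == "__main__":
    with open("data/train_1M.txt", encoding="utf-8") as f:  # illustrative path
        rows = es_transition_matrix(f.read())
    print(rows["period"])  # should be dominated by "whitespace"
```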
Within-ES distinctions matter
| next ES | after 'a' | after 'e' | Δ (pp) |
|---|---|---|---|
| Other | 73.5% | 53.6% | -19.9 |
| Whitespace | 20.7% | 40.9% | +20.2 |
'e' is often word-final ("the", "be"), so whitespace gets higher probability after it; 'a' rarely ends a word, so the model favors consonant continuation.
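To reproduce this kind of table, the model's per-character next-char distribution (e.g. from the predict command below) can be collapsed into ES buckets. The helper below is a minimal aggregation sketch; the toy classifier and the numbers in the usage example are illustrative only, not actual model output.

```python
def next_es_mass(char_probs, char_to_es):
    """Collapse a per-character next-char distribution {char: prob}
    into per-ES probability mass, as in the 'a' vs 'e' table above."""
    buckets = {}
    for ch, p in char_probs.items():
        es = char_to_es(ch)
        buckets[es] = buckets.get(es, 0.0) + p
    return buckets

# Toy classifier and made-up probabilities purely for illustration:
# after 'e' the distribution puts far more mass on whitespace.
toy_es = lambda c: "whitespace" if c.isspace() else "other"
after_a = next_es_mass({"n": 0.45, "t": 0.29, " ": 0.21, "\n": 0.05}, toy_es)
after_e = next_es_mass({"r": 0.30, "s": 0.24, " ": 0.36, "\n": 0.10}, toy_es)
delta = after_e["whitespace"] - after_a["whitespace"]  # positive, as in the table
```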
These results confirm that ES-pairs (prev_ES, next_ES) carry more predictive structure than individual ESs alone, and tock2 should take advantage of that.
The 53% improvement comes partly from making these ES-level transitions explicit, freeing capacity for within-ES distinctions.
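One way to make these transitions explicit, as a sketch rather than a committed tock2 design, is a factored next-char distribution: P(next char) ≈ P(next ES | prev ES) · P(next char | next ES). The function below assumes hypothetical lookup tables for both factors (`es_transitions`, `within_es`, and `char_to_es` are illustrative names, not part of the hutter code); a real implementation would also condition the within-ES factor on the RNN context.

```python
def factored_char_probs(prev_char, es_transitions, within_es, char_to_es):
    """Sketch of a factored prediction:
    P(next char) = P(next ES | prev ES) * P(next char | next ES).
    es_transitions: {prev_es: {next_es: prob}}  (learned or counted)
    within_es:      {es: {char: prob}}          (e.g. from the RNN head)
    """
    prev_es = char_to_es(prev_char)
    probs = {}
    for next_es, p_es in es_transitions.get(prev_es, {}).items():
        for ch, p_ch in within_es.get(next_es, {}).items():
            probs[ch] = probs.get(ch, 0.0) + p_es * p_ch
    return probs
```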
Command: `./hutter predict-aug "context" models/aug_epoch1.bin`
Model: Augmented RNN, 128 hidden, 1M chars training