Augmented Model Probe: ES Weight Analysis

fig-20260131_2-06 | Trained on 1M chars, revealing ES-specific representations

Key Finding: ES Weights >> Byte Mean Weights

The ES weight columns (positions 256-260 in Wx) learn values 10-100x larger in magnitude than the mean of the byte weights within each class. The ES inputs are therefore not simply marginalizing over their member bytes; they amplify the class signal.

This suggests the model uses ES features as strong class-level signals that complement rather than replace byte-level patterns.
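One way to make the amplification claim concrete is sketched below. It assumes Wx is stored as one row per hidden unit with 261 columns, that the Digit class covers bytes 48-57, and that its ES column sits at index 256; the function name, the ratio definition, and the toy weights are illustrative assumptions, not the probe's actual code.

```python
def amplification_ratio(wx, class_bytes, es_col):
    """Mean |ES weight| divided by mean |class-averaged byte weight|,
    both taken across hidden units (one possible definition)."""
    es_mag = 0.0
    byte_mag = 0.0
    for row in wx:  # one row per hidden unit
        byte_mean = sum(row[b] for b in class_bytes) / len(class_bytes)
        es_mag += abs(row[es_col])
        byte_mag += abs(byte_mean)
    return es_mag / byte_mag


# Toy weights (not the trained model): 2 hidden units, 261 input columns.
wx = [[0.0] * 261 for _ in range(2)]
for b in range(48, 58):  # ASCII digits '0'..'9'
    wx[0][b] = 0.02
    wx[1][b] = -0.04
wx[0][256] = 0.6   # assumed Digit ES column
wx[1][256] = -1.2

ratio = amplification_ratio(wx, range(48, 58), 256)
print(f"Digit ES amplification: {ratio:.1f}x")
```

If the ES column only marginalized over its bytes, the ratio would stay near 1; the report's 10-100x range corresponds to values well above that.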

ES Weight Statistics (Input Layer)

ES           Mean     Std     Min      Max
Digit        -0.020   0.439   -1.315   +1.121
Punct        +0.008   0.318   -0.995   +1.322
Vowel        +0.054   0.518   -1.625   +1.662
Whitespace   +0.032   0.607   -1.513   +1.353
Other        -0.060   0.472   -1.628   +1.272

ES Weight Correlation Matrix

Eight of the ten off-diagonal pairs are negatively correlated, which suggests the model is learning to discriminate between classes.

         Digit   Punct   Vowel   WS      Other
Digit     1.00    0.11   -0.20   -0.37   -0.03
Punct     0.11    1.00   -0.45    0.09   -0.05
Vowel    -0.20   -0.45    1.00   -0.36   -0.34
WS       -0.37    0.09   -0.36    1.00   -0.37
Other    -0.03   -0.05   -0.34   -0.37    1.00

Strongest negative: Vowel-Punct (-0.45). Green = positive, Red = negative.
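A correlation matrix like this one could be produced as sketched below, assuming each entry is the Pearson correlation between two ES weight columns taken across the hidden units. The choice of Pearson, the helper names, and the toy columns are assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def es_correlation_matrix(wx, es_cols):
    """Correlate every pair of ES columns across hidden units (rows of wx)."""
    cols = [[row[c] for row in wx] for c in es_cols]
    return [[pearson(a, b) for b in cols] for a in cols]


# Toy weights: 4 hidden units, three ES columns filled with made-up values.
wx = [[0.0] * 261 for _ in range(4)]
toy = {
    256: [1.0, 2.0, 3.0, 4.0],
    257: [4.0, 3.0, 2.0, 1.0],
    258: [1.0, 3.0, 2.0, 4.0],
}
for col, vals in toy.items():
    for h, v in enumerate(vals):
        wx[h][col] = v

corr = es_correlation_matrix(wx, [256, 257, 258])
for row in corr:
    print(" ".join(f"{v:+.2f}" for v in row))
```

A strongly negative entry means hidden units that weight one class positively tend to weight the other negatively, which is the sense in which the columns discriminate.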

Top Hidden Units per ES

ES           Top 5 Units (weight)
Digit        h39 (-1.32)    h91 (-1.16)    h35 (+1.12)   h94 (-1.08)    h3 (-1.00)
Punct        h95 (+1.32)    h102 (-0.99)   h87 (-0.95)   h120 (+0.83)   h121 (-0.78)
Vowel        h43 (+1.66)    h79 (+1.56)    h68 (+1.54)   h102 (+1.50)   h120 (-1.62)
Whitespace   h124 (-1.51)   h102 (-1.48)   h79 (-1.46)   h1 (+1.35)     h19 (+1.33)
Other        h43 (-1.63)    h29 (+1.27)    h51 (-1.25)   h88 (+1.11)    h100 (-1.10)

h43 is the Vowel/Other discriminator: +1.66 for Vowel, -1.63 for Other.

h102 appears in the top five for Punct (-0.99), Vowel (+1.50), and Whitespace (-1.48), which marks it as a broadly ES-sensitive unit.
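The sketch below shows one way to pick top units and to spot units shared between classes. Ranking by absolute weight is an assumption (the Vowel row above is not strictly in that order, so the probe may rank differently), and both helper names are hypothetical. The shared-unit check uses the unit indices and weights from the table above.

```python
from collections import Counter


def top_units(wx, es_col, k=5):
    """Return the k (unit, weight) pairs with the largest |weight| in one ES column."""
    ranked = sorted(range(len(wx)), key=lambda h: abs(wx[h][es_col]), reverse=True)
    return [(h, wx[h][es_col]) for h in ranked[:k]]


def shared_units(top_by_class, min_classes=2):
    """Units that appear in the top list of at least min_classes classes."""
    counts = Counter(h for units in top_by_class.values() for h, _ in units)
    return sorted(h for h, n in counts.items() if n >= min_classes)


# Toy column to exercise top_units (made-up weights, assumed Vowel column 258).
wx = [[0.0] * 261 for _ in range(4)]
for h, w in enumerate([0.2, -1.5, 0.9, -0.1]):
    wx[h][258] = w
print(top_units(wx, 258, k=2))

# Top-5 lists copied from the table above.
table = {
    "Digit": [(39, -1.32), (91, -1.16), (35, 1.12), (94, -1.08), (3, -1.00)],
    "Punct": [(95, 1.32), (102, -0.99), (87, -0.95), (120, 0.83), (121, -0.78)],
    "Vowel": [(43, 1.66), (79, 1.56), (68, 1.54), (102, 1.50), (120, -1.62)],
    "Whitespace": [(124, -1.51), (102, -1.48), (79, -1.46), (1, 1.35), (19, 1.33)],
    "Other": [(43, -1.63), (29, 1.27), (51, -1.25), (88, 1.11), (100, -1.10)],
}
print(shared_units(table, min_classes=3))
print(shared_units(table, min_classes=2))
```

On the table's values, h102 is the only unit shared by three classes, while h43, h79, and h120 are each shared by two.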

Mean Hidden Activation by ES (Sample)

Unit   Digit   Punct   Vowel   Whitespace   Other
h0      0.57    0.35    0.62    0.37        -0.02
h1      0.81    0.93    0.68    0.98         0.49
h2      0.93    0.63   -0.81    0.90         0.11
h3     -0.85    0.16    0.71    0.94         0.26

h3 shows dramatic separation: Digit=-0.85, Whitespace=+0.94 (1.79 spread).
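Per-class mean activations and the spread figure can be sketched as follows. This assumes each hidden state is grouped by the ES class of the byte fed in at that step, which the report does not state; the function names are hypothetical. The spread check uses the sample values from the table above.

```python
from collections import defaultdict


def mean_activation_by_class(labels, hidden_states):
    """Average hidden vectors grouped by the ES label of each time step."""
    sums = {}
    counts = defaultdict(int)
    for label, h in zip(labels, hidden_states):
        if label not in sums:
            sums[label] = [0.0] * len(h)
        for i, v in enumerate(h):
            sums[label][i] += v
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def spread(means, unit):
    """Max minus min of one unit's mean activation across classes."""
    vals = [vec[unit] for vec in means.values()]
    return max(vals) - min(vals)


# Toy run: three time steps, two hidden units (made-up activations).
toy_means = mean_activation_by_class(
    ["Digit", "Vowel", "Digit"],
    [[1.0, 0.0], [0.5, -0.5], [0.0, 1.0]],
)

# Sample means for h0..h3, copied from the table above.
sample_means = {
    "Digit": [0.57, 0.81, 0.93, -0.85],
    "Punct": [0.35, 0.93, 0.63, 0.16],
    "Vowel": [0.62, 0.68, -0.81, 0.71],
    "Whitespace": [0.37, 0.98, 0.90, 0.94],
    "Other": [-0.02, 0.49, 0.11, 0.26],
}
for unit in range(4):
    print(f"h{unit}: spread {spread(sample_means, unit):.2f}")
```

On the four sampled units this gives spreads of 0.64, 0.49, 1.74, and 1.79, so h2 (Digit=+0.93, Vowel=-0.81) separates classes almost as strongly as h3.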

Output Layer: Top Predictors per ES

ES           Best Unit   Mean Wy
Digit        h34         +0.212
Punct        h46         +0.254
Vowel        h87         +0.218
Whitespace   h95         +0.307
Other        h33         +0.018

Other has the weakest predictor (+0.018, versus +0.21 to +0.31 for the other four classes), consistent with it being the residual class.
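One plausible reading of "Mean Wy" is the output weight from a hidden unit averaged over all output bytes in a class, with the best unit being the one that maximizes that mean. The sketch below follows that reading; the 256-row layout of Wy, the function name, and the toy weights are assumptions.

```python
def best_predictor(wy, class_bytes):
    """Return (unit, mean weight) for the hidden unit whose output weights,
    averaged over the bytes of one class, are largest."""
    n_hidden = len(wy[0])
    means = [
        sum(wy[b][h] for b in class_bytes) / len(class_bytes)
        for h in range(n_hidden)
    ]
    best = max(range(n_hidden), key=lambda h: means[h])
    return best, means[best]


# Toy output layer: 256 output bytes x 3 hidden units (made-up weights).
wy = [[0.0, 0.0, 0.0] for _ in range(256)]
for b in range(48, 58):  # ASCII digits '0'..'9'
    wy[b] = [0.1, 0.3, -0.2]

unit, weight = best_predictor(wy, range(48, 58))
print(f"best Digit predictor: h{unit} ({weight:+.3f})")
```

Under this definition, a class with a large and heterogeneous membership, as Other presumably has, would tend toward a small mean, since a single unit's weights are averaged over many dissimilar bytes.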

Reproduction

Command: ./hutter probe-aug enwik9 models/aug_epoch1.bin

Model: Augmented RNN, 128 hidden units, trained 1 epoch on 1M chars

Input size: 261 (256 bytes + 5 ES one-hot features)
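The 261-wide input can be sketched as a byte one-hot followed by an ES one-hot. The class order (Digit, Punct, Vowel, Whitespace, Other at positions 256-260) follows the tables in this report, but the membership rules below are assumptions chosen for illustration; the actual model may define the classes differently.

```python
import string

ES_CLASSES = ["Digit", "Punct", "Vowel", "Whitespace", "Other"]


def es_class(byte):
    """Map a byte value (0-255) to an ES class index (assumed rules)."""
    ch = chr(byte)
    if ch in string.digits:
        return 0
    if ch in string.punctuation:
        return 1
    if ch in "aeiouAEIOU":
        return 2
    if ch in " \t\n\r":
        return 3
    return 4  # consonants, control bytes, and all non-ASCII bytes


def encode(byte):
    """Build the 261-dim input: one-hot byte plus one-hot ES class."""
    x = [0.0] * 261
    x[byte] = 1.0
    x[256 + es_class(byte)] = 1.0
    return x


x = encode(ord("e"))
print(len(x), ES_CLASSES[es_class(ord("e"))], sum(x))
```

Each input vector has exactly two active entries, so the ES column gives the network a direct class-level handle that is shared by every byte in that class.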