Product Pattern: Lifting Patterns to Joint Spaces

fig-20260131_2-02 | How patterns in factor spaces combine

Patterns in Factor Spaces

ES-level pattern p: Vowel → Other, strength 0.73 ("after a vowel, predict Other"; consonants follow vowels). Within-ES pattern q: e → a, strength 0.15 ("within vowels, e → a", as in the "ea" bigram). The patterns are independent: p operates at the ES level, q operates within an ES, and no interaction between them is needed.

When the event space E factors as E = E₁ × E₂, patterns can operate independently on each factor:

The Product Pattern p ⊗ q

Factor patterns:

  • p: Vowel → Other, strength 0.73 (ES level)
  • q: 'h' → 't', strength 0.31 (within Other × Other)

Their product p ⊗ q activates on the Vowel → Other block of the joint space, with 'h' → 't' inside it:

  strength(p ⊗ q) = strength(p) × strength(q) = 0.73 × 0.31 ≈ 0.23

Product pattern formula. Given a factored space E = E₁ × E₂ with p: e₁ → e₁' in P₁ and q: e₂ → e₂' in P₂, the product pattern is

  p ⊗ q: (e₁, e₂) → (e₁', e₂'),   s(p ⊗ q) = s(p) · s(q)

(in log space the strengths add).

The product pattern p ⊗ q operates on joint events (e₁, e₂) → (e₁', e₂')
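
A minimal sketch of the construction in Python (the Pattern class and its field names are illustrative assumptions, not the hutter tool's internal representation):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Pattern:
        src: object        # source event (an ES label, or a byte within an ES)
        dst: object        # predicted event
        strength: float    # pattern strength in [0, 1]

    def product(p: Pattern, q: Pattern) -> Pattern:
        """Lift an ES-level pattern p and a within-ES pattern q to the joint
        space E = E1 x E2: the joint event is the pair, the strengths multiply."""
        return Pattern(src=(p.src, q.src),
                       dst=(p.dst, q.dst),
                       strength=p.strength * q.strength)

    # Values from the figure: p: Vowel -> Other (0.73), q: 'h' -> 't' (0.31)
    p = Pattern("Vowel", "Other", 0.73)
    q = Pattern("h", "t", 0.31)
    print(product(p, q).strength)   # 0.2263, i.e. ~0.23

Working in log space turns the multiplication into addition, which is what the formula's "log-space: add strengths" note refers to.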

Why this matters for compression:

Instead of learning 65,536 separate byte→byte patterns, we learn:

  • ~25 ES→ES patterns (coarse structure)
  • ~200 within-ES patterns (fine detail)
  • Products give all combinations: 25 × 200 = 5,000 effective patterns

This is a 13× compression of the pattern space itself!
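
A quick sanity check of the counting above (the 25 and 200 are the approximate figures quoted in the list, not measured values):

    es_count = 5
    es_level = es_count * es_count        # ~25 ES -> ES patterns
    within_es = 200                       # ~200 within-ES patterns (approximate)
    learned = es_level + within_es        # 225 patterns stored explicitly
    effective = es_level * within_es      # 5,000 byte-level combinations via products
    naive = 256 * 256                     # 65,536 raw byte -> byte patterns
    print(learned, effective, naive / effective)   # 225 5000 ~13.1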

Concrete Example: "the" Prediction

Context: "th" → predict next t Other h Other ? Vowel Pattern decomposition: p₁: Other → Vowel "th" strongly predicts vowel (0.95) p₂: within Vowel, prefer 'e' "the" >> "tha", "thi", "tho", "thu" (0.82) Combined prediction: p₁ ⊗ p₂: "th" → "e" P("e" | "th") = 0.95 × 0.82 = 0.78 -log₂(0.78) = 0.36 bits vs naive: -log₂(1/256) = 8 bits

Product patterns decompose "th"→"e" prediction into ES-level and within-ES components
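
The same arithmetic as a short sketch; the strengths 0.95 and 0.82 are the figure's values, and treating them directly as conditional probabilities is a simplification:

    import math

    p_vowel_given_th = 0.95                 # p1: "th" -> Vowel (ES level)
    p_e_given_vowel = 0.82                  # p2: within Vowel, prefer 'e'
    p_e_given_th = p_vowel_given_th * p_e_given_vowel   # ~0.78

    bits = -math.log2(p_e_given_th)         # ~0.36 bits to encode 'e'
    naive_bits = -math.log2(1 / 256)        # 8 bits under a uniform byte model
    print(round(p_e_given_th, 2), round(bits, 2), naive_bits)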

The product pattern framework shows how the RNN's predictions decompose:

P(next | context) = P(ES_next | context) × P(byte | ES_next, context)

Our 5 ESs capture the first factor. The remaining 41% of compression comes from the second factor (within-ES prediction conditioned on context).
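
Read as code, the decomposition looks roughly like the sketch below; the distributions are hypothetical placeholders standing in for the RNN's actual softmax outputs, not values produced by ./hutter:

    def predict_byte(p_es_given_ctx, p_byte_given_es_ctx):
        """Combine P(ES | context) with P(byte | ES, context) into P(byte | context)."""
        p_byte = {}
        for es, p_es in p_es_given_ctx.items():
            for byte, p_within in p_byte_given_es_ctx[es].items():
                p_byte[byte] = p_byte.get(byte, 0.0) + p_es * p_within
        return p_byte

    # Hypothetical numbers for context "th":
    p_es = {"Vowel": 0.95, "Other": 0.05}
    p_within = {
        "Vowel": {"e": 0.82, "a": 0.07, "i": 0.06, "o": 0.04, "u": 0.01},
        "Other": {"r": 0.6, "w": 0.4},
    }
    print(predict_byte(p_es, p_within)["e"])   # 0.95 * 0.82 ~ 0.78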

Reproduction

Model: Elman RNN 256-128-256

Data: enwik9, "th" bigram analysis

Commands: ./hutter predict, ./hutter es

Theory: CMP paper §3.2 (factored event spaces)