Two papers: (1) The trigram embedding, a P-program for the orthographic bijection, with a three-layer architecture (bytes → bigrams → words) built via threshold creation on LPPs. (2) Consolidation v2: how the organism discovers structure from memory traces. It is grounded in GCD decomposition (Bayes from Counting), the ring tower (KN-quotient v2), and experimental findings (English context results), and it addresses the L∞ vs L1 gap, OBSERVE LPP RNG coupling, log-stochastic quantization, and the two-level trace.
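Empirical analysis of the 8,048 reified words from Exp L1: how many words share a k-character suffix? The answer determines the trie depth needed for unique word identification. In the table, trie depth equals suffix length minus 2, so depth 1 corresponds to a 3-character suffix.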
| Suffix length | Trie depth | Unique ID rate | Colliding types |
|---|---|---|---|
| 3 | 1 | 12.6% | 7,099 (88%) |
| 4 | 2 | 37.2% | 5,141 (64%) |
| 5 | 3 | 63.2% | 3,042 (38%) |
| 6 | 4 | 81.3% | 1,568 (20%) |
| 7 | 5 | 91.6% | 699 (9%) |
| 8 | 6 | 96.3% | 309 (4%) |
| 10 | 8 | 99.6% | 37 (0.5%) |
| 15 | 13 | 100% | 0 |
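Key finding: depth 1 (3-character suffix) uniquely identifies only 12.6% of tokens, far below the predicted >80%. English morphology dominates: common endings such as “-ing” (321 words), “-tion” (215 words), and “-ted” (160 words) are each shared by many words. Exceeding 90% takes depth 5 (7-character suffix, 91.6%). Full disambiguation is reached at suffix length 15 (depth 13), the first listed row with no collisions; lengths 11 to 14 are not listed.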
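
The suffix-collision count behind the table can be sketched as follows. This is a minimal illustration, not the code used in Exp L1. It assumes that the unique ID rate is the fraction of distinct word types whose k-character suffix is shared by no other type, and that a word shorter than k contributes its whole spelling as its suffix. The source does not spell out either definition. The function names `suffix_collisions` and `min_suffix_length` and the toy word list are hypothetical.

```python
from collections import Counter


def suffix_collisions(words, k):
    """Return (unique_rate, colliding_types) for k-character suffixes.

    Assumption: rates are computed over distinct word types, and a word
    shorter than k uses its full spelling as the suffix.
    """
    types = set(words)
    if not types:
        raise ValueError("word list is empty")
    # Count how many distinct types end in each k-character suffix.
    suffix_counts = Counter(w[-k:] for w in types)
    # A type collides when at least one other type shares its suffix.
    colliding = sum(1 for w in types if suffix_counts[w[-k:]] > 1)
    unique_rate = (len(types) - colliding) / len(types)
    return unique_rate, colliding


def min_suffix_length(words, target_rate, max_k=32):
    """Smallest k whose unique rate reaches target_rate, or None."""
    for k in range(1, max_k + 1):
        rate, _ = suffix_collisions(words, k)
        if rate >= target_rate:
            return k
    return None


# Toy vocabulary (hypothetical, not the Exp L1 data).
words = ["running", "jumping", "nation", "station", "wanted", "cat"]
for k in (3, 4, 6):
    rate, colliding = suffix_collisions(words, k)
    # Trie depth follows the table's convention: suffix length minus 2.
    print(f"k={k} depth={k - 2} unique={rate:.1%} colliding={colliding}")
```

On the toy list, "-ing" and "-ion" collide at k = 3, and "nation" and "station" stay ambiguous until k = 6. This is the same effect the table shows at scale: shared morphological endings push the required depth well past a single trigram.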