2026-03-06: Trigram Embedding & Consolidation

Two papers: (1) The trigram embedding, a P-program for the orthographic bijection, with a three-layer architecture (bytes → bigrams → words) built via threshold creation on LPPs. (2) Consolidation v2: how the organism discovers structure from memory traces. It is grounded in the GCD decomposition (Bayes from Counting), the ring tower (KN-quotient v2), and experimental findings (English context results), and it addresses the L∞ vs L1 gap, OBSERVE LPP RNG coupling, log-stochastic quantization, and the two-level trace.

Suffix Collision Analysis

Empirical analysis of the 8,048 reified words from Exp L1: how many words share a k-character suffix? The answer determines the trie depth needed to identify each word uniquely.

Suffix length (chars)	Trie depth	Unique ID rate (tokens)	Colliding types (count, %)
3	1	12.6%	7,099 (88%)
4	2	37.2%	5,141 (64%)
5	3	63.2%	3,042 (38%)
6	4	81.3%	1,568 (20%)
7	5	91.6%	699 (9%)
8	6	96.3%	309 (4%)
10	8	99.6%	37 (0.5%)
15	13	100%	0

Key finding: depth 1 (3-char suffix) uniquely identifies only 12.6% of tokens, far below the predicted >80%. English morphology dominates the collisions: “-ing” (321 words), “-tion” (215 words), “-ted” (160 words). Depth 5 (7-char suffix) is needed to exceed 90%, and full disambiguation is reached by suffix length 15 (depth 13).
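The collision counts above could be reproduced along the following lines. This is a minimal sketch, not the script used for Exp L1: the function names, the toy vocabulary, and the choice to key words shorter than k by the whole word are all assumptions made for illustration. The depth mapping (suffix length minus 2) is read off the table rows, not taken from the papers.

```python
from collections import Counter


def trie_depth(k):
    """Trie depth paired with suffix length k in the table (3 -> 1, 15 -> 13)."""
    return k - 2


def suffix_collisions(words, k):
    """Type-level counts: (unique_types, colliding_types) for k-char suffixes.

    `words` must be distinct types. A word shorter than k is keyed by the
    whole word (assumption), since w[-k:] then returns w unchanged.
    """
    suffix_counts = Counter(w[-k:] for w in words)
    unique = sum(1 for w in words if suffix_counts[w[-k:]] == 1)
    return unique, len(words) - unique


def token_weighted_unique_rate(freqs, k):
    """Share of tokens whose word type has a suffix shared with no other type.

    `freqs` maps each word type to its token count.
    """
    suffix_counts = Counter(w[-k:] for w in freqs)
    total = sum(freqs.values())
    unique_tokens = sum(c for w, c in freqs.items()
                        if suffix_counts[w[-k:]] == 1)
    return unique_tokens / total


# Hypothetical toy vocabulary (word type -> token count), not Exp L1 data.
TOY_FREQS = {"running": 5, "singing": 3, "nation": 4, "station": 2, "cat": 6}

if __name__ == "__main__":
    for k in (3, 4, 6):
        unique, colliding = suffix_collisions(list(TOY_FREQS), k)
        rate = token_weighted_unique_rate(TOY_FREQS, k)
        print(k, trie_depth(k), unique, colliding, round(rate, 3))
```

Keeping the type-level and token-weighted rates as separate functions reflects that the two can differ: a few frequent words with rare suffixes raise the token rate without changing the number of colliding types.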

Papers

consolidation.pdf

Consolidation in the Universal Model, v2: The GCD, the Differential ω, and the Discovery of Structure. Grounded in GCD decomposition, ring tower, experimental findings. OBSERVE LPP RNG coupling, log-stochastic quantization, L∞ vs L1 gap, two-level trace, CRT word extension. 11pp.

trigram-embedding.pdf

The Trigram Embedding: A P-Program for the Orthographic Bijection. Design document for Exp L2. 3-layer architecture, trie structure, bidirectional bijection, model size analysis, scaling experiments. 7pp.

Data

suffix-collision.txt

Full suffix collision analysis: type-level and token-weighted rates at suffix lengths 1–15. Top collision groups. Word length distribution.

Navigation

← Previous: 20260304

Exp L1: Word Discovery. 73K unique words, 8K reified, 88% token coverage.

Next: 20260312 →

Quotient Explainer: MCP support paper for graf 13.