2026-03-06: Trigram Embedding & Consolidation

Two papers. (1) The trigram embedding: a P-program for the orthographic bijection, with a three-layer architecture (bytes → bigrams → words) built via threshold creation on LPPs. (2) Consolidation v2: how the organism discovers structure from memory traces. It is grounded in the GCD decomposition (Bayes from Counting), the ring tower (KN-quotient v2), and experimental findings (English context results), and it addresses the L∞ vs. L1 gap, OBSERVE LPP RNG coupling, log-stochastic quantization, and the two-level trace.

Suffix Collision Analysis

Empirical analysis on 8,048 reified words from Exp L1. How many words share a k-character suffix? This determines the trie depth needed for unique word identification.

Suffix length   Trie depth   Unique ID rate   Colliding types
3               1            12.6%            7,099 (88%)
4               2            37.2%            5,141 (64%)
5               3            63.2%            3,042 (38%)
6               4            81.3%            1,568 (20%)
7               5            91.6%            699 (9%)
8               6            96.3%            309 (4%)
10              8            99.6%            37 (0.5%)
15              13           100%             0

Key finding: depth 1 (3-character suffix) uniquely identifies only 12.6% of tokens, far below the predicted >80%. English morphology dominates the collisions: “-ing” (321 words), “-tion” (215 words), “-ted” (160 words). Depth 5 (7-character suffix) is needed to exceed 90%. Full disambiguation requires suffix length 15 (depth 13).
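
The kind of count behind these figures can be sketched in a few lines of Python. This is an illustrative sketch, not the script that produced suffix-collision.txt: the function names, the toy vocabulary, and the choice to key words shorter than k by the whole word are all assumptions made here. It reports both a type-level rate (each word counted once) and a token-weighted rate (each word counted by its frequency).

```python
from collections import Counter


def suffix_collision_stats(word_counts, k):
    """Type-level and token-weighted unique-ID rates for k-character suffixes.

    word_counts maps each word type to its token frequency.
    Assumption of this sketch: a word shorter than k is keyed by the
    whole word, which is what word[-k:] returns in Python.
    """
    # How many word types end in each k-character suffix.
    suffix_sizes = Counter(word[-k:] for word in word_counts)
    # A type is uniquely identified if no other type shares its suffix.
    unique_types = [w for w in word_counts if suffix_sizes[w[-k:]] == 1]

    n_types = len(word_counts)
    n_tokens = sum(word_counts.values())
    unique_tokens = sum(word_counts[w] for w in unique_types)
    return {
        "suffix_length": k,
        "unique_types": len(unique_types),
        "colliding_types": n_types - len(unique_types),
        "type_unique_rate": len(unique_types) / n_types,
        "token_unique_rate": unique_tokens / n_tokens,
    }


def top_collision_groups(word_counts, k, n=3):
    """The n most-shared k-character suffixes, restricted to real collisions."""
    suffix_sizes = Counter(word[-k:] for word in word_counts)
    return [(s, c) for s, c in suffix_sizes.most_common(n) if c > 1]


# Hypothetical toy vocabulary (word -> token count), for illustration only.
toy = {"running": 5, "singing": 3, "nation": 4, "station": 2, "cat": 10, "wanted": 1}

if __name__ == "__main__":
    for k in (3, 5, 6):
        print(suffix_collision_stats(toy, k))
    print(top_collision_groups(toy, 3))
```

Run over the 8,048 reified words with their token counts, a loop of this shape would yield one row per suffix length, which is the form of the table above.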

Papers

consolidation.pdf
Consolidation in the Universal Model, v2: The GCD, the Differential ω, and the Discovery of Structure. Grounded in GCD decomposition, ring tower, experimental findings. OBSERVE LPP RNG coupling, log-stochastic quantization, L∞ vs L1 gap, two-level trace, CRT word extension. 11pp.
trigram-embedding.pdf
The Trigram Embedding: A P-Program for the Orthographic Bijection. Design document for Exp L2. 3-layer architecture, trie structure, bidirectional bijection, model size analysis, scaling experiments. 7pp.

Data

suffix-collision.txt
Full suffix collision analysis: type-level and token-weighted rates at suffix lengths 1–15. Top collision groups. Word length distribution.
