Order Scaling: The KN–H3 Crossover

A six-policy ablation at n-gram orders 2, 3, 4, 6, and 8 on 1M bytes of enwik9. The best combination rule changes with order.

Scaling Curves

Each line shows how a policy's bits per character (bpc) change as n-gram order increases. The crossover between KN (blue) and H3 (coral, the 2^(g/H) policy) occurs between orders 4 and 6.

Legend: max-min, sharpest, KN-interp, ent-blend, gap-blend, 2^(g/H)

The crossover: from order 6 to 8, KN gets worse (+0.147 bpc) because its recursive backoff cascades put 99.9% of the weight on the unigram distribution. Over the same step, H3 improves (−0.050 bpc) because the 2^(gap/H) weighting concentrates on the rare, confident high-order contexts. The KN–H3 gap widens at every order after 4: 0.179 bpc at order 6 and 0.376 bpc at order 8.
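
The cascade effect can be illustrated with a minimal sketch. It assumes the standard interpolated form, in which each order keeps a share of the probability mass and passes the remaining backoff mass down to the next lower order. The function name `level_weights` and the example masses are hypothetical, chosen for illustration; they are not values measured in this ablation.

```python
def level_weights(backoff_masses):
    """Share of total probability mass contributed by each level.

    backoff_masses[i] is the mass handed from level i (highest order
    first) down to the next lower order. The last entry of the result
    is the mass that reaches the lowest level (the unigram).
    """
    weights = []
    carried = 1.0  # mass still travelling down the chain
    for gamma in backoff_masses:
        weights.append(carried * (1.0 - gamma))  # mass kept at this level
        carried *= gamma                         # mass passed further down
    weights.append(carried)
    return weights


# Illustrative (assumed) masses for a long context that was rarely or
# never seen: the high orders pass almost everything down.
example = level_weights([1.0, 1.0, 1.0, 0.99, 0.95])
print("per-level weights:", example)
print("weight reaching the lowest level:", example[-1])
```

Because the masses multiply, a few levels with backoff mass close to 1 are enough to push nearly all of the weight to the bottom of the chain, which is the behaviour described above for KN at high orders.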

Marginal Gain per Order Step

How much does each step up in order change bpc (negative = better)? KN's change turns positive after order 4 (+0.027 bpc for 4→6, +0.147 bpc for 6→8), so higher orders hurt. H3's change stays negative throughout, so every step still helps.
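
The per-step changes can be recomputed from the Full Table values. The sketch below copies the KN and 2^(g/H) columns from that table and assumes that H3 corresponds to the 2^(g/H) column; `marginal_changes` is a hypothetical helper name.

```python
ORDERS = [2, 3, 4, 6, 8]

# bpc values copied from the Full Table (KN and 2^(g/H) columns).
BPC = {
    "KN": [3.232, 2.655, 2.391, 2.418, 2.565],
    "2^(g/H)": [3.506, 2.890, 2.484, 2.239, 2.189],
}


def marginal_changes(bpc):
    """Change in bpc between consecutive measured orders (negative = better)."""
    return [round(b - a, 3) for a, b in zip(bpc, bpc[1:])]


for policy, values in BPC.items():
    steps = zip(ORDERS, ORDERS[1:], marginal_changes(values))
    line = ", ".join(f"{lo}->{hi}: {delta:+.3f}" for lo, hi, delta in steps)
    print(f"{policy:8s} {line}")
```

Note that the measured orders are unevenly spaced: the last two steps (4→6 and 6→8) each span two orders, so their changes are not directly comparable to the single-order steps.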

Full Table

Order  max-min  sharpest  KN     ent    gap    2^(g/H)
2      5.288    4.597     3.232  3.332  3.514  3.506
3      5.247    4.555     2.655  2.744  2.914  2.890
4      5.285    4.806     2.391  2.427  2.512  2.484
6      5.347    5.100     2.418  2.337  2.244  2.239
8      5.543    4.701     2.565  2.379  2.294  2.189
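
As a quick check on where the crossover falls, the following sketch computes the KN minus 2^(g/H) difference at each measured order and reports the interval in which its sign flips. The values are copied from the table; the helper names `kn_minus_h3` and `crossover_interval` are hypothetical, and the sketch assumes H3 is the 2^(g/H) column.

```python
ORDERS = [2, 3, 4, 6, 8]
KN = [3.232, 2.655, 2.391, 2.418, 2.565]  # KN column, bpc
H3 = [3.506, 2.890, 2.484, 2.239, 2.189]  # 2^(g/H) column, bpc


def kn_minus_h3():
    """bpc difference per order; positive means 2^(g/H) is better."""
    return [round(k - h, 3) for k, h in zip(KN, H3)]


def crossover_interval():
    """Pair of consecutive measured orders between which the sign flips."""
    gaps = kn_minus_h3()
    for i in range(len(gaps) - 1):
        if gaps[i] < 0 <= gaps[i + 1]:
            return ORDERS[i], ORDERS[i + 1]
    return None  # no sign change in the measured range


print("KN - 2^(g/H) by order:", dict(zip(ORDERS, kn_minus_h3())))
print("crossover between orders:", crossover_interval())
```

Order 5 was not measured, so the table only brackets the crossover; it does not say on which side of order 5 it falls.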