How far back does the RNN use context? Dependency is measured as the variance of the prediction across different values of the character at distance k; higher variance means more dependency.
Axes: x = distance k (characters back); y = variance of the prediction across different values at distance k.
| Distance k (characters back) | Variance | Normalized variance |
|---|---|---|
| 1 | 0.000610 | 0.763 |
| 2 | 0.000767 | 0.960 |
| 3 | 0.000713 | 0.892 |
| 4 | 0.000799 | 1.000 |
| 5 | 0.000667 | 0.836 |
| 6 | 0.000667 | 0.835 |
| 7 | 0.000537 | 0.673 |
| 8 | 0.000684 | 0.856 |
| 9 | 0.000546 | 0.683 |
| 10 | 0.000797 | 0.998 |
| 11 | 0.000443 | 0.555 |
| 12 | 0.000561 | 0.703 |
| 13 | 0.000596 | 0.747 |
| 14 | 0.000452 | 0.565 |
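
The measurement behind the table can be sketched roughly as follows. This is a minimal illustration, not the code that produced the numbers: `toy_predict` is a hypothetical stand-in for the RNN that returns a single scalar per context, and `variance_at_distance` and `normalize` are names invented here. The sketch also assumes that the normalized column is the variance divided by the largest variance, which is consistent with the 1.000 at distance 4. Because the tabulated variances are rounded to six decimals, recomputed ratios can differ from the table in the last digit.

```python
import math
from statistics import pvariance


def toy_predict(context):
    """Hypothetical stand-in for the RNN: maps a context string to one scalar
    prediction. Older characters get exponentially smaller weight."""
    n = len(context)
    score = sum(ord(c) * 0.5 ** (n - i) for i, c in enumerate(context))
    return 1.0 / (1.0 + math.exp(-(score - 100.0) / 50.0))


def variance_at_distance(predict, context, k, alphabet):
    """Variance of the prediction as the character k positions back is varied
    over the alphabet while the rest of the context stays fixed."""
    pos = len(context) - k  # k = 1 is the most recent character
    preds = [predict(context[:pos] + ch + context[pos + 1:]) for ch in alphabet]
    return pvariance(preds)


def normalize(variances):
    """Scale variances so that the largest one becomes 1.0 (assumed convention)."""
    peak = max(variances)
    return [v / peak for v in variances]


# Toy run: the stand-in model depends less on characters that are further back.
alphabet = "abcdefghijklmnopqrstuvwxyz"
context = "thequickbrownfox"
toy_variances = [
    variance_at_distance(toy_predict, context, k, alphabet) for k in range(1, 6)
]

# Variance column of the table above, distances 1 through 14.
table_variances = [
    0.000610, 0.000767, 0.000713, 0.000799, 0.000667, 0.000667, 0.000537,
    0.000684, 0.000546, 0.000797, 0.000443, 0.000561, 0.000596, 0.000452,
]
normalized = normalize(table_variances)
for k, value in enumerate(normalized, start=1):
    print(f"{k:2d}  {value:.3f}")
```

A real model usually predicts a distribution over next characters rather than a scalar, so an actual measurement would need some reduction over that distribution (for example, the variance of one character's probability, or an average over characters); the source does not say which was used.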