The shipping backtest in documentation/methodology.md reports Brier score, log-loss, and ECE on a single-fold 5-year holdout (2021-05-23 → 2026-05-22). That holdout is dominated by friendlies: only ~15% of its matches are major-tournament games. A reader who skims the numbers sees a Brier score near 0.510 and may assume the model is that accurate on the matches it actually has to predict in June 2026.
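To make the concern concrete, here is a minimal sketch of how a pooled Brier score can sit well below the score on a minority slice. It assumes the multiclass (summed over outcome classes) form of the Brier score, which may not match the definition used in documentation/methodology.md. The function name `multiclass_brier`, the `is_major` flag, and all forecasts are hypothetical toy values, not output from the shipping model.

```python
from typing import Sequence


def multiclass_brier(probs: Sequence[Sequence[float]], outcomes: Sequence[int]) -> float:
    """Mean over matches of the squared error summed across outcome classes.

    Assumed definition (hypothetical helper): each row of `probs` is a forecast
    over (home win, draw, away win); each entry of `outcomes` is the index of
    the class that actually occurred.
    """
    total = 0.0
    for p, y in zip(probs, outcomes):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(outcomes)


# Toy, hand-written forecasts; NOT real model output.
probs = [
    [0.7, 0.2, 0.1],  # friendly
    [0.6, 0.3, 0.1],  # friendly
    [0.2, 0.3, 0.5],  # friendly
    [0.4, 0.3, 0.3],  # major-tournament game
]
outcomes = [0, 0, 2, 1]
is_major = [False, False, False, True]  # hypothetical slice flag

# Score on the whole holdout.
pooled = multiclass_brier(probs, outcomes)

# Score restricted to the major-tournament slice.
major_rows = [(p, y) for p, y, m in zip(probs, outcomes, is_major) if m]
major_only = multiclass_brier([p for p, _ in major_rows], [y for _, y in major_rows])

print(f"pooled Brier:     {pooled:.3f}")
print(f"major-only Brier: {major_only:.3f}")
```

With these toy numbers the pooled score is about 0.38 while the major-only score is about 0.74, so the headline figure is driven mostly by the three friendlies.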
This note answers: **what do those same metrics look like when we restrict the evaluation slice to major-t…