The model's --use-xg path fits round(xG) as the per-match Poisson response in Dixon-Coles and Hierarchical Poisson (falling back to realised goals where xG is absent). It was structurally bounded by a tiny corpus: the JaseZiv FBref mirror that scripts/pull_intl_xg.py reads carries Opta xG for only 143 tournament matches — WC 2018 (64), Euro 2020 (51), Copa 2021 (28). WC 2022 is present-but-null there; Euro 2024 / Copa 2024 / AFCON 2023 are absent. With 143 xG-bearing matches out of ~49…
Research note
Back-filling international xG from StatsBomb open data
Status: Production xG path WIRED (2026-05-29). `auto-refit.yml` fits DC with `--use-xg` and refits the calibrator on the xG-enabled ensemble; HP is excluded because it fails the ECE half of the gate. The xG-enabled artefacts (`dixon_coles.json` / `ensemble_calibrator.json` / `data.json`) regenerate on the first auto-refit after merge (they are not hand-committed; see "Production wiring" for why). Single-provider corpus (314 StatsBomb + 28 residual Opta Copa-2021 = 342 rows). The gate clears for DC + Ensemble on both evaluation slices.
Backtest date: 29 May 2026
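The `--use-xg` response described in the summary (round(xG) as the per-match Poisson count, falling back to realised goals where xG is absent) can be illustrated with a minimal sketch. The function name `poisson_response` and the example rows are hypothetical and do not come from the repository. Treating NaN as "absent" and using Python's round-half-to-even behaviour are both assumptions about how the real pipeline handles these cases.

```python
import math
from typing import Optional


def poisson_response(xg: Optional[float], goals: int) -> int:
    """Pick the integer count used as the Poisson response for one team in one match.

    Uses round(xG) when xG is available, otherwise the realised goal count.
    (Hypothetical helper, not the project's actual code.)
    """
    # Assumption: a missing xG value may arrive as None or as NaN.
    if xg is None or math.isnan(xg):
        return goals
    # Assumption: Python's built-in round() is used, which rounds halves to
    # the nearest even integer (round(0.5) == 0, round(1.5) == 2).
    return int(round(xg))


# Hypothetical rows: (xG or None, realised goals)
rows = [(1.62, 2), (0.31, 1), (None, 3), (float("nan"), 0)]
responses = [poisson_response(xg, goals) for xg, goals in rows]
print(responses)  # [2, 0, 3, 0]
```

The second row shows why the substitution matters: a team that scored once from 0.31 xG contributes a response of 0 rather than 1. Rows without xG pass through unchanged.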