Research note

Calibrating predictions differently for friendlies vs tournaments

Status: Shipped (Variant 4: per-tier Platt temperature scaling). The production calibrator uses a hybrid strategy: Platt scaling for the tournament tier, where isotonic regression collapses to the identity at n ≈ 70, and isotonic regression for friendlies and qualifiers, where it is more expressive at n ≈ 400+. Gate passed. See results below.

Topline

The shipping ensemble calibrator (scripts/fit_ensemble_calibrator.py) fits per-class isotonic regression curves on the uniform-averaged output of three components (Elo bracket MC + Dixon-Coles + Hierarchical Poisson MAP). The first cut reduced holdout ECE on the 365-day common-subset training pool from 4.62pp uncalibrated to 2.70pp under the pooled-across-tiers fit (5-fold CV, n_train = 939; current artefact at data/wc2026/ensemble_calibrator.json).
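To make the pipeline described above concrete, here is a minimal pure-Python sketch of its three steps: uniform averaging of the component outputs, a per-class isotonic fit, and an ECE measurement. It is not the contents of scripts/fit_ensemble_calibrator.py. All function names are hypothetical, and several choices are assumptions rather than facts from the note: three outcome classes, a step-function isotonic fit via pool-adjacent-violators (library implementations typically interpolate between knots), renormalisation of the calibrated values so they sum to one, and a top-label ECE with equal-width bins.

```python
from bisect import bisect_left


def uniform_average(*components):
    """Equal-weight average of per-match probability vectors, one list per model."""
    n_models = len(components)
    return [
        [sum(vals) / n_models for vals in zip(*rows)]
        for rows in zip(*components)
    ]


def fit_isotonic(scores, labels):
    """Pool-adjacent-violators fit of a non-decreasing step function.

    Returns (right_edges, values): scores up to right_edges[i] map to values[i].
    """
    # Aggregate tied scores first so that each distinct score forms one block.
    agg = {}
    for s, y in zip(scores, labels):
        entry = agg.setdefault(s, [0.0, 0])
        entry[0] += y
        entry[1] += 1
    blocks = []  # each block: [label_sum, count, right_edge]
    for s in sorted(agg):
        blocks.append([agg[s][0], agg[s][1], s])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, edge = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = edge
    return [b[2] for b in blocks], [b[0] / b[1] for b in blocks]


def predict_isotonic(model, score):
    """Evaluate the fitted step function, clamping beyond the last block."""
    edges, values = model
    return values[min(bisect_left(edges, score), len(values) - 1)]


def calibrate(models, probs):
    """Apply one isotonic curve per class, then renormalise (an assumption)."""
    raw = [predict_isotonic(m, p) for m, p in zip(models, probs)]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else list(probs)


def top_label_ece(prob_rows, outcomes, n_bins=10):
    """Top-label expected calibration error with equal-width confidence bins."""
    bins = [[0.0, 0.0, 0] for _ in range(n_bins)]  # conf sum, hit sum, count
    for probs, y in zip(prob_rows, outcomes):
        conf = max(probs)
        b = min(int(conf * n_bins), n_bins - 1)
        bins[b][0] += conf
        bins[b][1] += probs.index(conf) == y
        bins[b][2] += 1
    n = len(outcomes)
    # Sum over non-empty bins of (count / n) * |mean confidence - accuracy|.
    return sum(abs(c - h) / n for c, h, k in bins if k)
```

In use, one curve would be fitted per class on the training folds (the averaged probability of that class against a 0/1 indicator of whether it occurred), and `top_label_ece` would be evaluated on the held-out fold before and after `calibrate`. Multiplying the result by 100 gives a figure in percentage points.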

A subsequent tier-aware refit (three sets…
