Calibrating predictions differently for friendlies vs tournaments

Status: Shipped (Variant 4: per-tier Platt temperature scaling). The production calibrator uses a hybrid strategy: Platt for the tournament tier (where isotonic collapses to identity at n~70) and isotonic for friendlies/qualifiers (where it is more expressive at n~400+). Gate passed. See results below.
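The fitting code for the per-tier step is not shown in this excerpt. As a rough illustration only, the sketch below fits one temperature per tier by minimising negative log-likelihood over three-way outcome probabilities. Every name in it (`apply_temperature`, `fit_temperature`, `fit_per_tier`), the search bounds, and the golden-section optimiser are assumptions made for the example, not the contents of the production script.

```python
import math

def apply_temperature(probs, T):
    """Rescale a probability vector by temperature T: softmax(log(p) / T)."""
    logits = [math.log(max(p, 1e-12)) / T for p in probs]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def nll(rows, labels, T):
    """Mean negative log-likelihood of the observed outcomes after scaling by T."""
    total = 0.0
    for probs, y in zip(rows, labels):
        total -= math.log(max(apply_temperature(probs, T)[y], 1e-12))
    return total / len(rows)

def fit_temperature(rows, labels, lo=0.25, hi=4.0, iters=60):
    """Golden-section search for the T in [lo, hi] that minimises NLL.

    The bounds are hypothetical; NLL is unimodal in T, so the search converges.
    """
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    for _ in range(iters):
        if nll(rows, labels, c) < nll(rows, labels, d):
            b = d
        else:
            a = c
        c, d = b - g * (b - a), a + g * (b - a)
    return (a + b) / 2

def fit_per_tier(samples):
    """samples: iterable of (tier, probs, label). Returns {tier: fitted T}."""
    by_tier = {}
    for tier, probs, y in samples:
        rows, labels = by_tier.setdefault(tier, ([], []))
        rows.append(probs)
        labels.append(y)
    return {tier: fit_temperature(r, l) for tier, (r, l) in by_tier.items()}
```

A single scalar per tier is hard to overfit, which is one plausible reason a parametric fit would be preferred where only about 70 matches are available; a fitted T above 1 softens overconfident forecasts and T below 1 sharpens them.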

Summary

The shipping ensemble calibrator (scripts/fit_ensemble_calibrator.py) fits per-class isotonic regression curves on the uniform-averaged three-component output (Elo bracket MC + Dixon-Coles + Hierarchical Poisson MAP). The first cut reduced holdout ECE on the 365-day common-subset training pool from 4.62pp (uncalibrated) to 2.70pp under the pooled-across-tiers fit (5-fold CV, n_train = 939, current artefact at data/wc2026/ensemble_calibrator.json).
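To make the two quantities in this paragraph concrete, here is a minimal sketch of uniform averaging across component forecasts and of an ECE computation. The excerpt does not say which ECE variant is used, so the top-label, equal-width-bin definition below is an assumption, as are the function names and the choice of 10 bins.

```python
def uniform_average(components):
    """Average several probability vectors (one per model) with equal weights."""
    n = len(components)
    return [sum(c[k] for c in components) / n for k in range(len(components[0]))]

def expected_calibration_error(rows, labels, n_bins=10):
    """Top-label ECE with equal-width confidence bins (assumed definition).

    Returns a fraction; multiply by 100 to express it in percentage points.
    """
    # Per bin: [match count, summed confidence, summed hits]
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]
    for probs, y in zip(rows, labels):
        conf = max(probs)
        pred = probs.index(conf)
        b = min(int(conf * n_bins), n_bins - 1)
        bins[b][0] += 1
        bins[b][1] += conf
        bins[b][2] += 1.0 if pred == y else 0.0
    n = len(rows)
    # sum over bins of (count / n) * |mean confidence - accuracy|
    return sum(abs(s_conf - s_hit) / n for cnt, s_conf, s_hit in bins if cnt)
```

For example, passing three component vectors for one match to `uniform_average` gives the ensemble forecast, and running `expected_calibration_error` on held-out folds before and after calibration would produce a comparison of the kind quoted above.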

A subsequent tier-aware refit (three sets…
