Research

How the forecasts are built, and why you can trust them

Every probability is backtested walk-forward against an 8×90-day gate before it ships, then scored live against real results as the tournament plays. Methodology, backtests, and limits are all in the open, including the experiments that failed.

50 short-form posts, 22 methodology docs, 25 research & backtest notes.

4 independent models, averaged
8×90-day walk-forward gate
Failed experiments published
Built from public data only
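
For readers unfamiliar with walk-forward testing, here is a minimal sketch of how an 8×90-day schedule could be laid out. It assumes "8×90-day" means eight consecutive, non-overlapping 90-day test windows, each scored by a model trained only on matches played before that window opens. The function names and the half-open window convention are illustrative, not taken from the production code, and the pass criterion of the gate itself is not shown because this page does not state it.

```python
from datetime import date, timedelta


def walk_forward_windows(end, n_windows=8, window_days=90):
    """Return n_windows consecutive (test_start, test_end) pairs, oldest first.

    Assumption: windows are half-open [test_start, test_end) and the last
    one finishes at `end`.
    """
    windows = []
    for k in range(n_windows, 0, -1):
        test_end = end - timedelta(days=(k - 1) * window_days)
        test_start = test_end - timedelta(days=window_days)
        windows.append((test_start, test_end))
    return windows


def split_matches(match_dates, test_start, test_end):
    """Split dates into training (strictly before the window) and test (inside it)."""
    train = [d for d in match_dates if d < test_start]
    test = [d for d in match_dates if test_start <= d < test_end]
    return train, test


if __name__ == "__main__":
    for start, stop in walk_forward_windows(date(2026, 6, 1)):
        print(start, "to", stop)
```

The point of the layout is that no window is ever scored by a model that has seen matches from inside or after that window.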

Can you trust the numbers?

Built to be checked

What determines whether the published probabilities are worth taking seriously: how they hold up against real results, the failures published alongside the wins, and the versioned record behind every number.

The full argument · free

Why trust these numbers

A probability publication is a credibility game. Anyone can publish numbers; the question is whether those numbers track outcomes once the matches finish. This page collects the d…

Live calibration tracker

Do the numbers track outcomes?

Brier score and calibration by tier, scored against real results and updated through the tournament. A 70%-rated outcome should happen about 70% of the time. This is where you check.
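
As an illustration of what the tracker measures, the sketch below computes a multiclass Brier score and a simple calibration table. It is a generic textbook version, not the tracker's own code: the function names, the ten-bin layout, and the choice of the multiclass (rather than binary) Brier form are assumptions made for the example.

```python
def brier_score(probs, outcomes):
    """Multiclass Brier score: mean over forecasts of sum_k (p_k - o_k)^2.

    probs    -- list of probability vectors, e.g. [home, draw, away]
    outcomes -- list of indices of the class that actually happened
    Lower is better; 0.0 is a perfect forecast.
    """
    total = 0.0
    for p, y in zip(probs, outcomes):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(probs)


def calibration_table(pred, happened, n_bins=10):
    """Group binary forecasts into equal-width probability bins.

    Returns rows of (bin_lower_edge, mean_predicted, observed_rate, count).
    A calibrated model has mean_predicted close to observed_rate in every row.
    """
    bins = [[] for _ in range(n_bins)]
    for p, h in zip(pred, happened):
        i = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[i].append((p, h))
    rows = []
    for i, b in enumerate(bins):
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            rate = sum(h for _, h in b) / len(b)
            rows.append((i / n_bins, mean_p, rate, len(b)))
    return rows


if __name__ == "__main__":
    # Ten outcomes each rated 70%, of which seven happened.
    for row in calibration_table([0.7] * 10, [1] * 7 + [0] * 3):
        print(row)
```

In the usage example, ten outcomes rated 70% with seven hits land in a single row whose predicted and observed rates agree, which is the pattern a calibrated model should show across every bin.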

Negative results

The experiments that failed

Every model variant that did not clear the gate, published in full with its verdict. The no-ships are as visible as the wins.

Model changelog

Every version, on the record

The versioned history of the model: every retrain and architecture change, each stamped with its Brier-at-release and linked to its full notes. The number on every page traces back to a dated row here.

Methodology essentials

Start here

The three documents to read first if you want to know how the model works. Free to read in full.

How we make predictions

How our 2026 World Cup prediction model works

Our 2026 FIFA World Cup forecasts come from a statistical prediction model that blends three approaches — an Elo rating system, a Dixon-Coles Poisson goals model, and a hierarchic…

How we make predictions

What we predict and how

For every prediction target — match outcomes, goal totals, scorelines, individual player events — there's a standard modelling approach and a set of input variables. This page cat…

Behind the scenes

Where our data comes from

The quality of any prediction depends on the data behind it. This page maps every data source we use — from free public archives to commercial feeds — and explains what each one p…

All 22 docs →

What we tried

Research notes

Decision logs from the model build: hypothesis, backtest, result, ship-or-no-ship verdict. The failed experiments are kept on the record alongside the wins.

Shipped · 29 June 2026

Neural Poisson: a nonlinear extension of Dixon-Coles

The ensemble's three existing models share a structural constraint:

Not shipped · 3 June 2026

A within-match chase layer "passes" the headline gate — and the placebo proves it shouldn't

The feasibility probe found that, after controlling for team strength, only…

Shipped · 31 May 2026

Testing our approach on the Champions League final

The `/test/live/<slug>/` route renders the live-tracker pipeline

All 25 research notes →

What changed recently

Short-form notes from the most recent model runs and findings.

1 July 2026 · edwin-chan