MP — Matches played
Number of matches the player appeared in.
Any appearance (start or substitute) counts as one match. Doesn't distinguish minutes played — a 90-minute outing and a 5-minute cameo each add 1 to MP.
- Source
- FBref
Reference
Definitions for every metric, abbreviation, and model output surfaced on the site. Linked from every column header — tap any abbreviation in a data table to jump straight to the relevant entry.
Counting and rate stats sourced from FBref's Big-5 European league coverage. Same definitions as fbref.com and StatsBomb.
Total minutes on the pitch in the season.
Includes all match minutes (regular time + stoppage). A more reliable signal of workload and starter status than matches played.
International or club goals scored, depending on context.
Penalty goals are included. On player pages this is the career total across the FBref Big-5-league coverage window (2009–2022) plus current-season Wikipedia top-scorer rows where available.
Passes that directly led to a goal.
FBref counts the last touch before a goal as the assist. Doesn't include shots that rebounded off a teammate. Career total in the player-browse table; per-season on the player detail page.
The model's per-shot quality estimate, summed. Each shot is scored 0–1 based on distance, angle, body part, defender pressure, and assist type.
If a player has 12 goals from 10.2 xG, they've outperformed the model's quality expectation. If they have 4 goals from 8.1 xG, they've underperformed (and the model expects regression toward the xG over time). xG over 90 minutes is the comparable rate stat across players. The FBref values are derived from StatsBomb's open-data model.
The xG value of shots created by this player's passes — i.e. the chance-quality the player generated for teammates.
When player A passes to player B and B shoots, the xG of B's shot is credited to A as xAG. Sum across the season for playmaker quality independent of whether the receiving teammate actually finished. xAG > assists usually means the player created good chances that weren't converted; xAG < assists means luck or finishing quality bailed them out.
xG with penalties stripped out.
Penalties are scored at ~0.76 xG each and inflate a player's xG total even though penalty-taking is largely orthogonal to open-play scoring ability. npxG isolates the latter. The tournament-scorer model uses npxG per 90 as its core rate.
Primary playing position (GK / DF / MF / FW).
FBref's broad-bucket positions, not specific roles. A wide midfielder and a #10 both register as MF; a striker and a winger both as FW. The squad-prediction model uses these broad buckets when filling per-position quotas.
FIFA 3-letter country code for the player's nationality.
BIH = Bosnia and Herzegovina, NED = Netherlands, etc. Players with multiple eligibilities show the country they have actually represented internationally; uncapped dual-nationals follow FBref's default attribution.
Senior international appearances.
Used in the predicted-squad model as a recency-weighted signal: how often has the manager actually called this player up. Doesn't distinguish friendly vs competitive, but recent caps weight more than caps from years ago.
Senior international assists, summed across every competition the player has appeared in.
Scraped from each player's Transfermarkt national-team page. TM credits the last pass before a goal as the assist, consistent with the FBref definition used elsewhere on the site. Goalkeepers don't have this stat — TM omits the Assists column on keeper pages and we render an em-dash rather than fabricate a zero.
Tackles attempted in the player's most recent Big-5 league season, summed across any clubs they played for that season.
FBref's Tkl total: every challenge for the ball, won or lost. Big-5 leagues only (Premier League, La Liga, Serie A, Bundesliga, Ligue 1); defenders at non-Big-5 clubs (e.g. Saudi Pro League, MLS, Liga MX, African leagues) show an em-dash rather than a synthesised value. The shown season is the latest one for which FBref has complete data, currently 2022-23.
Shots on target faced that the keeper saved, expressed as a 0-100 number for the most recent Big-5 league season.
FBref's Save_percent — Saves / Shots on Target Against. Penalties saved are included in the numerator (standard FBref definition). Same Big-5 coverage caveat as Tackles: keepers at non-Big-5 clubs show an em-dash. The shown season is the latest one for which FBref has complete data — currently 2022-23.
Composite metrics that aggregate several underlying signals into a single per-player or per-team number. Used to rank teams for the bracket, players for the predicted XI.
A single per-player number combining recent international caps + goals, recent club-level xG and xAG per 90, position-relative quality, and call-up priors. Higher = more likely to start.
Used by the predicted-XI selector to rank players for each of the 26 squad slots. A CB and a #9 are scored on the same scale but the weights differ — a 0.4 xG/90 from a CB is exceptional, from a #9 it's below average. The model doesn't directly observe form; the rating is a recency-weighted snapshot of the last 24 months of available data. Full formula in the methodology doc.
FIFA's world-ranking position at the latest published refresh.
Calculated by FIFA with its SUM (sum-of-points) algorithm, in use since 2018: points are added or subtracted after every international according to match importance and the result relative to what opponent strength predicted, with no fixed look-back window. The bracket model uses FIFA rank as one input but not the only one; its limitations (anchored toward UEFA, slow to adjust to recent form) are why the model leans more on Elo for match-outcome probabilities.
The component models and statistical techniques behind the probabilities. Every per-match number on the site is produced by some combination of these.
A team-strength rating updated after every match. Higher rating beats lower rating more often, by a calibrated amount.
Same family as the chess Elo. A team's rating moves up after wins (more if the opponent was strong) and down after losses. World Cup, Euros, and Copa results move Elo more aggressively than friendlies (the K-factor — K=60 for WC, K=50 for continental finals, K=20 for friendlies). Used as the baseline strength signal in the match-outcome model.
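The update rule can be sketched in a few lines. This is the textbook Elo form, not necessarily the exact variant used on the site: the 400-point logistic scale is an assumption, and any goal-difference or home-advantage adjustment is left out. Only the K values come from the entry above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected result for team A (1 = win, 0.5 = draw, 0 = loss) on the classic Elo curve."""
    # Assumption: the standard 400-point logistic scale.
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, result_a: float, k: float):
    """Return both teams' new ratings. result_a is 1.0 (win), 0.5 (draw) or 0.0 (loss)."""
    delta = k * (result_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


K_WORLD_CUP, K_CONTINENTAL, K_FRIENDLY = 60, 50, 20  # K-factors quoted in this entry

# An 1800-rated side beats a 2000-rated side: same upset, different stakes.
print(update(1800, 2000, 1.0, K_WORLD_CUP))
print(update(1800, 2000, 1.0, K_FRIENDLY))
```

The same upset moves the ratings three times as far at a World Cup as in a friendly, and whatever one team gains the other loses.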
A maximum-likelihood model from a 1997 paper that fits each team's expected goals scored and conceded, with a small correction for low scorelines.
Independent Poisson on goals under-predicts low-scoring draws (0-0 and 1-1) and over-predicts 1-0 and 0-1, because the two teams' goal counts in a match aren't independent. Dixon-Coles applies a low-score correction to fix this. The fit yields a joint goal distribution (e.g. 0-0, 1-0, 2-1, …); we sum over it to get win/draw/loss probabilities, exact-score grids, totals (over/under), and first-goal-window outputs.
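A minimal sketch of the low-score correction and of reducing the score grid to win/draw/loss. The correction factor follows the published Dixon-Coles form; the goal rates and the value of rho in the usage lines are made-up illustration values, and the 10-goal truncation is an assumption.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    return math.exp(-mean) * mean ** k / math.factorial(k)


def tau(x: int, y: int, lam: float, mu: float, rho: float) -> float:
    """Dixon-Coles correction factor for the scoreline x-y (home-away)."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0


def score_grid(lam: float, mu: float, rho: float, max_goals: int = 10):
    """Joint probability of every scoreline up to max_goals each, renormalised."""
    grid = [[tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)
             for y in range(max_goals + 1)]
            for x in range(max_goals + 1)]
    total = sum(map(sum, grid))
    return [[p / total for p in row] for row in grid]


def outcome_probs(grid):
    """Collapse the grid to (home win, draw, away win)."""
    win = sum(p for x, row in enumerate(grid) for y, p in enumerate(row) if x > y)
    draw = sum(row[x] for x, row in enumerate(grid))
    return win, draw, 1.0 - win - draw


# Hypothetical inputs: home rate 1.4, away rate 1.1, rho = -0.1.
print(outcome_probs(score_grid(1.4, 1.1, rho=-0.1)))
print(outcome_probs(score_grid(1.4, 1.1, rho=0.0)))  # plain independent Poisson
```

With a negative rho the draw probability comes out higher than under independent Poisson, which is the effect the correction exists to capture.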
A Bayesian model that treats each team's attack and defence as draws from a shared population distribution, so data-sparse teams are pulled toward the average.
Same prediction interface as Dixon-Coles, different fit. Uses partial pooling: teams with lots of matches mostly trust their own data; teams with few matches shrink toward a global mean. Reduces overfitting on small samples. Fit via PyMC NUTS (Bayesian sampling) and reduced to a maximum-a-posteriori (MAP) point estimate for serving.
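Partial pooling is easiest to see in a toy version. The sketch below is not the site's PyMC model: it uses a simple Gamma-Poisson shrinkage estimator, and `prior_matches` (how many "pseudo-matches" of the global average each team starts with) is a hypothetical tuning value.

```python
def shrunk_goal_rate(goals: int, matches: int, global_rate: float,
                     prior_matches: float = 10.0) -> float:
    """Posterior-mean goals per match under a Gamma prior centred on global_rate.

    The prior acts like `prior_matches` extra matches played at the global average,
    so small samples are pulled strongly toward it and large samples barely move.
    """
    return (goals + prior_matches * global_rate) / (matches + prior_matches)


GLOBAL_RATE = 1.3  # hypothetical population average, goals per match

# Both teams have scored 3.0 per match, on very different sample sizes.
print(shrunk_goal_rate(goals=9, matches=3, global_rate=GLOBAL_RATE))
print(shrunk_goal_rate(goals=300, matches=100, global_rate=GLOBAL_RATE))
```

The three-match team ends up much closer to the global average than the hundred-match team, which is the shrinkage behaviour described above.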
The published per-match probability: a uniform average of Elo, Dixon-Coles, and Hierarchical Poisson, then post-processed for calibration.
Each component model produces a (win, draw, loss) probability triple. The ensemble averages them with equal weights, then runs the average through isotonic calibration and a small extremization step. Averaging forecasts whose errors are only partly correlated usually beats any single model on Brier score and log-loss; the backtest at /posts/2026-05-18-when-fivethirtyeight-stopped confirms that here.
Running the entire tournament 50,000 times with randomised match outcomes drawn from the ensemble, then counting how often each team reaches each round.
For each simulated tournament: every group match is decided by sampling from the ensemble's (win, draw, loss) probabilities, group standings are computed, the bracket is drawn, and every knockout is sampled the same way. Repeating 50,000 times turns a per-match probability into stage-reach probabilities and a tournament-winner probability for each of the 48 teams.
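The counting logic can be sketched on a toy bracket. This is a deliberately reduced version: four teams, knockout only, and a hypothetical `win_prob` function built from made-up strengths standing in for the ensemble. The real simulator also plays the group stage and tracks every round.

```python
import random
from collections import Counter


def simulate_knockout(teams, win_prob, rng):
    """Play one single-elimination bracket; teams are paired in list order.

    Assumes the number of teams is a power of two.
    """
    alive = list(teams)
    while len(alive) > 1:
        alive = [a if rng.random() < win_prob(a, b) else b
                 for a, b in zip(alive[0::2], alive[1::2])]
    return alive[0]


def title_probabilities(teams, win_prob, n_sims=50_000, seed=0):
    """Share of simulated tournaments won by each team."""
    rng = random.Random(seed)
    titles = Counter(simulate_knockout(teams, win_prob, rng) for _ in range(n_sims))
    return {team: titles[team] / n_sims for team in teams}


# Hypothetical strengths; the site would plug in the ensemble here.
STRENGTH = {"A": 4.0, "B": 2.0, "C": 2.0, "D": 1.0}


def win_prob(a, b):
    return STRENGTH[a] / (STRENGTH[a] + STRENGTH[b])


print(title_probabilities(list(STRENGTH), win_prob))
```

Because every simulated tournament has exactly one winner, the resulting probabilities sum to 1 across teams.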
A post-processing step that maps the model's raw probabilities to observed frequencies — so a stated 70% actually wins about 70% of the time.
Models can be systematically over- or under-confident. Isotonic regression fits a non-decreasing curve from raw predicted probabilities to observed outcome frequencies on held-out historical data. The ensemble's raw probabilities are passed through that curve before publication, which improves Brier score and ECE without changing the ordering of teams.
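Isotonic regression is usually fitted with the pool-adjacent-violators algorithm. A dependency-free sketch, assuming binary outcomes (1 = the predicted event happened) and ignoring refinements such as pooling tied raw probabilities or interpolating between fitted points:

```python
def isotonic_fit(raw_probs, outcomes):
    """Pool-adjacent-violators fit.

    Returns the raw probabilities in sorted order and, for each, a calibrated
    value such that the calibrated sequence never decreases.
    """
    pairs = sorted(zip(raw_probs, outcomes))
    blocks = []  # each block is [sum of outcomes, number of points]
    for _, outcome in pairs:
        blocks.append([float(outcome), 1])
        # Merge backwards while an earlier block has a higher mean than the latest one.
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [p for p, _ in pairs], fitted


# Made-up held-out data: raw model probability and whether the event occurred.
raw = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
happened = [0, 1, 0, 0, 1, 1]
print(isotonic_fit(raw, happened))
```

The out-of-order results at 0.2 to 0.4 are pooled into one flat step, so the fitted curve stays non-decreasing and the ranking of predictions is never reversed.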
A small adjustment that pushes the ensemble's averaged probabilities slightly away from the middle when the component models agree.
Averaging three forecasts pulls probabilities toward 1/3 even when each component is confident. Extremization (parameter d = 1.15) raises each averaged probability to the power d and renormalises — so 60% might become 64%, and 20% might become 18%. Calibrated alongside the draw factor so the published probabilities still match observed draw frequency.
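The power-and-renormalise step is small enough to show in full. A sketch using d = 1.15 from the entry above; the draw-factor adjustment is not specified here, so it is left out.

```python
def extremize(probs, d=1.15):
    """Raise each probability to the power d, then rescale so they sum to 1."""
    powered = [p ** d for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


# An averaged (win, draw, loss) triple of 60% / 20% / 20%.
print([round(p, 3) for p in extremize([0.60, 0.20, 0.20])])
```

For this triple the favourite moves from 60% to roughly 64% and each of the other outcomes from 20% to roughly 18%, matching the example in the text. With d = 1 the input is returned unchanged.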
Outputs of the Monte Carlo bracket simulator (50,000 sims). All values are the probability the team REACHES that round — so p_advance ≥ p_r16 ≥ p_qf ≥ p_sf ≥ p_final ≥ tournament_prob.
Probability the team finishes 1st, 2nd, or qualifying 3rd in their group.
12 group winners + 12 runners-up + 8 best 3rd-place teams advance to the Round of 32. P(advance) sums those three outcomes from the per-team finish distribution.
Probability the team makes the R16, after the R32 cut.
Requires advancing from the group and then winning the R32 fixture (against whatever opponent the bracket produces). Never higher than p_advance, because the group has to be survived and the R32 fixture then has to be won.
Probability the team makes the QF.
Same chain as p_r16, one round deeper.
Probability the team makes the SF.
Same chain, one round deeper.
Probability the team makes the Final.
Same chain, one round deeper.
Probability the team lifts the trophy. The final step in the round-by-round progression above.
The 48 tournament-prob values sum to 1.0. Top teams are typically 5–15%; outside the top 10 the value drops rapidly toward the long tail. Bookmakers and prediction markets publish equivalent numbers; this site doesn't compare to those (per the compliance posture) — only to the model's own previous estimates.
Per-player probability of scoring at least one goal across the matches their country plays at the tournament.
Derived from each player's npxG/90, an expected-minutes figure based on international caps, a position-weighted share of the team's xG, an opponent-defence multiplier, and the team's expected number of WC matches. Doesn't include penalties or set-piece-taker bonuses in v0. Full formula in the methodology doc.
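The full formula lives in the methodology doc. As a rough illustration of how a per-90 rate turns into an "at least one goal" probability, the sketch below assumes goals follow a Poisson distribution and uses only some of the inputs listed above; the function name, the inputs chosen, and the example numbers are all hypothetical.

```python
import math


def p_scores_at_least_once(npxg_per_90: float, minutes_per_match: float,
                           expected_matches: float,
                           opponent_multiplier: float = 1.0) -> float:
    """P(at least one goal) under a Poisson assumption on the player's tournament goals."""
    expected_goals = (npxg_per_90 * (minutes_per_match / 90.0)
                      * expected_matches * opponent_multiplier)
    # For a Poisson count, P(0 goals) = exp(-mean), so P(>= 1) is the complement.
    return 1.0 - math.exp(-expected_goals)


# Hypothetical striker: 0.45 npxG/90, 70 minutes a match, 4.5 expected matches.
print(round(p_scores_at_least_once(0.45, 70, 4.5), 3))
```

Note how the probability saturates: doubling expected matches does not double the chance of scoring, because the player only needs one goal.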
Research-quality measurements of how well the model's stated probabilities match observed outcomes. Lower is better for all three. The tracker on /docs/calibration/ updates per scored match.
Squared difference between predicted probability and observed outcome (1 for the realised result, 0 for the others), summed over the three outcomes and averaged across matches. Range 0–2 for 3-class outcomes.
The standard calibration metric for probabilistic forecasts. A model that always says "33% / 33% / 33%" scores ~0.667 on every match. A perfect model scores 0. The benchmark figures here read on the per-class-averaged scale (the score above divided by 3, where the uniform forecast scores ~0.222): 0.18 is typical for well-calibrated football models, 0.22 is mediocre, 0.26+ suggests systematic miscalibration.
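A short sketch of the score on the 0–2 scale defined above. Outcomes are encoded as an index into the (win, draw, loss) triple; the function names are illustrative.

```python
def brier_score(probs, outcome_index):
    """Sum over classes of (predicted - observed)^2 for one match."""
    return sum((p - (1.0 if i == outcome_index else 0.0)) ** 2
               for i, p in enumerate(probs))


def mean_brier(forecasts, outcome_indices):
    """Average the per-match score across matches."""
    scores = [brier_score(f, o) for f, o in zip(forecasts, outcome_indices)]
    return sum(scores) / len(scores)


uniform = [1 / 3, 1 / 3, 1 / 3]
print(round(brier_score(uniform, 0), 3))        # the uniform baseline
print(brier_score([1.0, 0.0, 0.0], 0))          # perfect forecast
print(brier_score([1.0, 0.0, 0.0], 2))          # certain and wrong
```

The three prints reproduce the landmarks of the scale: about 0.667 for the uniform forecast, 0 for a perfect one, and 2 for a forecast that was certain and wrong.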
Mean of −log(probability the model assigned to the realised outcome).
Penalises confident wrong predictions heavily: the penalty grows without bound as the probability assigned to the realised outcome approaches zero. A model that says 99% home-win and the home side loses gets log-loss = −log(0.01) ≈ 4.6, which is devastating to the average. The complementary metric to Brier: Brier rewards calibrated probabilities; log-loss punishes overconfidence.
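The definition translates directly into code. In this sketch the `eps` floor is an added safeguard (an assumption, not something the site documents) so that a forecast of exactly 0 for the realised outcome does not produce an infinite score.

```python
import math


def log_loss(forecasts, outcome_indices, eps=1e-15):
    """Mean of -log(probability assigned to the realised outcome)."""
    losses = [-math.log(max(probs[outcome], eps))
              for probs, outcome in zip(forecasts, outcome_indices)]
    return sum(losses) / len(losses)


# 99% on the home win (index 0), but the away side wins (index 2).
print(round(log_loss([[0.99, 0.005, 0.005]], [0]), 3))   # confident and right
print(round(log_loss([[0.99, 0.0, 0.01]], [2]), 3))      # confident and wrong
```

The confident miss costs about 4.6, several hundred times the loss of the confident hit, which is why a handful of such matches dominates the average.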
Mean absolute gap between predicted-probability bucket and observed frequency in that bucket.
Bin predictions into deciles (or however many) and check what fraction of "50% home win" predictions actually ended in a home win. If 50% predictions resolve 50% of the time, the gap for that bucket is 0. ECE is the average of those gaps across buckets, weighted by how many predictions fall in each, so it reads as the model's average reliability gap.
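A sketch of the bucket-and-compare computation for a single outcome type (say, home wins). It assumes ten equal-width probability buckets and weights each bucket by its share of predictions; the site's exact binning is not specified here.

```python
def expected_calibration_error(pred_probs, hits, n_bins=10):
    """Weighted mean of |average predicted probability - observed frequency| per bucket.

    pred_probs: predicted probability of the event for each match.
    hits: 1 if the event happened in that match, else 0.
    """
    buckets = [[] for _ in range(n_bins)]
    for p, hit in zip(pred_probs, hits):
        index = min(int(p * n_bins), n_bins - 1)  # p = 1.0 goes in the top bucket
        buckets[index].append((p, hit))

    n_total = len(pred_probs)
    ece = 0.0
    for bucket in buckets:
        if not bucket:
            continue  # empty buckets contribute nothing
        mean_pred = sum(p for p, _ in bucket) / len(bucket)
        observed = sum(hit for _, hit in bucket) / len(bucket)
        ece += (len(bucket) / n_total) * abs(mean_pred - observed)
    return ece


print(expected_calibration_error([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))  # well calibrated
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 0, 0]))  # overconfident
```

The first call has no gap at all; the second says 90% for events that happen half the time, so its gap is about 0.4.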
Shorthand that shows up next to numbers throughout the site.
The arithmetic difference between two percentages. "5pp" means 5 percentage points, not 5 percent.
If a team's tournament probability moves from 12% to 17%, that's a +5pp move (or +42% in relative terms — five extra out of twelve). The site uses pp throughout because gaps and shifts between probabilities are most readable in absolute terms.
The range the model thinks the true probability is likely to fall into. The site uses two variants — a bootstrap interval on the ensemble and a credible interval on the Hierarchical Poisson posterior — depending on which method generated the band.
The bootstrap interval (used in posts and on most pages) is computed by resampling historical match data, refitting the ensemble on each resample, and re-running the bracket Monte Carlo; the 2.5th and 97.5th percentiles give the 95% interval. The credible interval (surfaced via the prediction-inputs popover on fixture pages) is computed from PyMC NUTS posterior draws of the Hierarchical Poisson fit, propagated through the same bracket Monte Carlo. Wide interval = the model isn't sure; narrow = the answer is robust to the data we happened to see.
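The percentile step of the bootstrap can be shown on a much simpler statistic than a full ensemble refit. In the sketch below the "model" is just a win rate over made-up match results; the resample count and seed are arbitrary choices.

```python
import random
from statistics import fmean


def bootstrap_interval(data, statistic, n_resamples=2000, seed=0):
    """95% percentile-bootstrap interval for statistic(data)."""
    rng = random.Random(seed)
    estimates = sorted(
        statistic([rng.choice(data) for _ in data])  # resample with replacement
        for _ in range(n_resamples)
    )
    lower = estimates[int(0.025 * (n_resamples - 1))]
    upper = estimates[int(0.975 * (n_resamples - 1))]
    return lower, upper


# Made-up history: 60 wins in 100 matches.
results = [1] * 60 + [0] * 40
print(bootstrap_interval(results, fmean))
```

On the site the statistic is far more expensive (refit the ensemble, re-run the bracket simulation), but the interval is read off the sorted resample results in the same way.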
Stage labels and shorthand for the 2026 FIFA World Cup format.
12 groups of 4. Each team plays the other three; top 2 plus 8 best 3rd-place finishers advance.
First knockout round. Single-leg, extra time + penalties if needed.
First edition of the World Cup with an R32. A 32-team knockout takes log₂(32) = 5 rounds including the final (R32, R16, QF, SF, Final).
Second knockout round.
Final 8 round.
Final 4 round.
Championship match.
The black star next to a country name marks one of the three host nations (USA, Canada, Mexico).
Hosts get a small probability boost in the bracket model to reflect home-advantage effects observed in prior tournaments. Not a moral statement.
The date of the model run whose output is being displayed. Shown as a pill near the H1 on every prediction page.
The site is statically prerendered — every probability you see was computed at the timestamp shown. The model refreshes on every push to main; in practice that's once every few days, more often as the tournament approaches.