About

Sports data, models, and AI should belong to everyone

Professional clubs and national federations have spent the last decade buying their way into analytics — proprietary tracking feeds, internal model stacks, dedicated data-science staff, and tools the public never sees. OnThePitch democratises that — the data, the models, and the AI behind them — starting with the 2026 FIFA World Cup.

OnThePitch — by Pelagia Private Limited

What we publish

A calibrated statistical model for the World Cup — win probabilities for every fixture, projected starting XIs for every nation, tournament-winner probabilities, per-player scoring probabilities, expected goals, and the methodology behind all of it. Built from publicly available data sources; refreshed on every run; documented openly.

See /docs/methodology/ for the full write-up, /posts/ for short-form research notes, and /data/ for the export endpoints.

How to read a probability

Every number on the site is a probability: a long-run frequency, not a forecast of what will definitely happen. When the model gives a team an 18% chance of winning, it means that across many comparable matches a result like this lands roughly 18 times in 100, a little less than once in every five. Low-probability results happen all the time; that's exactly what an 18% means. The right way to read a single number is as one row in a long table of similar situations.

Because the numbers are frequencies, we check them the same way — by comparing what the model said against what actually happened, match after match. The methodology walks through how the probabilities are built and calibrated, and the glossary defines every term and column you'll meet along the way.
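For readers who want to see what "comparing what the model said against what actually happened" can look like in practice, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the site's own code: the function names, the bin count, and the sample data are all made up for the example. It computes a Brier score (mean squared gap between forecast and outcome) and a simple reliability table that groups forecasts by probability and compares each group's average forecast with how often the event actually happened.

```python
from collections import defaultdict


def brier_score(forecasts):
    """Mean squared difference between forecast probability and 0/1 outcome.

    `forecasts` is a list of (probability, outcome) pairs; lower is better.
    """
    return sum((p - y) ** 2 for p, y in forecasts) / len(forecasts)


def reliability_table(forecasts, n_bins=5):
    """Bin forecasts by probability and compare forecast vs observed frequency.

    Returns rows of (bin_low, bin_high, count, mean_forecast, observed_rate).
    Empty bins are omitted.
    """
    bins = defaultdict(list)
    for p, y in forecasts:
        # A forecast of exactly 1.0 is placed in the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for idx in sorted(bins):
        members = bins[idx]
        mean_forecast = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        rows.append(
            (idx / n_bins, (idx + 1) / n_bins, len(members), mean_forecast, observed_rate)
        )
    return rows


# Hypothetical data for illustration only: (forecast probability, 1 if it happened).
sample = [
    (0.18, 0), (0.18, 0), (0.18, 1), (0.18, 0), (0.18, 0),
    (0.62, 1), (0.62, 1), (0.62, 0),
    (0.55, 1),
    (0.90, 1),
]

print(f"Brier score: {brier_score(sample):.3f}")
for low, high, count, mean_forecast, observed_rate in reliability_table(sample):
    print(f"{low:.1f}-{high:.1f}  n={count}  forecast={mean_forecast:.2f}  observed={observed_rate:.2f}")
```

In a well-calibrated model the forecast and observed columns track each other closely once each bin holds enough matches. With only a handful of matches per bin, as in this toy sample, gaps between the two are expected and say little on their own.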

Who we're for

Fantasy players who want a probability-backed view of who'll actually start. Football journalists looking for a citable model output instead of a vibes-based take. Sports-analytics enthusiasts and developers who want clean per-fixture data via API. Academics studying tournament dynamics. Anyone who watches the game and wants the same statistical lens the professionals already use.

Why we're doing this

A modern Premier League club runs five-figure-per-month data contracts, employs analysts who fit Dixon-Coles variants for a living, and ships tactical reports its fans will never read. Most of what these clubs produce isn't secret (an Elo refit here, a Poisson goals model there, a Bayesian hierarchical something), but it's expensive to assemble, and the assembly is where the cost lies.

The argument we'd make is that the assembly should be a public good. If anyone with a browser can read a calibrated tournament forecast, the conversation about a fixture stops being "what does the bloke on TV think" and starts being "here's the model's view, here's where it's confident, here's where it isn't". That's a better conversation, and it's the one this project is trying to enable.

How we work

Open sources

Everything the model reads is publicly available — FIFA fixtures, Wikipedia squad pages, FBref via the worldfootballR ecosystem, public club-season stats. The full source list is in the methodology page, and any reader can reproduce the inputs from the same public archives.

Methodology in the open

Every probability on the site comes from a model whose architecture, features, training procedure, and limitations are written up at /docs/. If a number on the page surprises you, the methodology should answer where it came from.

Free baseline access

The headline forecast — tournament winner, group standings, the knockout cascade — is free for every reader, along with projected XIs, squads, per-player composite ratings, and the full methodology. Deeper layers (per-fixture probabilities, scoreline distributions, the four-model comparison, full research-note bodies, and the data export API) sit behind a one-time Pass that funds the work.

Editorial scope

OnThePitch is structurally a statistical publication — in the same tradition as FiveThirtyEight or FBref. The product is the model's calibrated probabilities, the methodology behind them, and the per-team and per-player breakdowns derived from them. Revenue funds the research.

Get in touch

Found a bug, want to suggest a feature, or have a research question? The fastest path is the feedback form. For long-form discussion of the model, our research notes live on Substack.

Who's behind it

OnThePitch is built and operated by Pelagia Private Limited.