Pro Data

Get the model's numbers into your code in minutes

157,119

data points

13,605 records across 13 datasets and 148 fields in total (up to 25 per dataset), every value model-generated or curated. Updated after every match day through the World Cup, so results, bracket movement, and revised probabilities flow into the API within hours. Free to browse here; pull it all via the API with a free key.

48 nations · 1,215 players · 8,964 player-seasons · 958 match results

Every probability for the 48 nations and 104 matches, projected squads, season-by-season player careers, recent results and the historical record that feeds the model: as JSON or CSV, or as an MCP tool your assistant can call. The model refits after every match day, so results and revised probabilities land in the API within hours. Built for researchers, fantasy players, journalists, and analytics teams who want the model output in a pipeline rather than a page. Try a live endpoint below with one command: no signup.

Live now, updating through the tournament · Results and revised probabilities after every match day · Bulk zip + historical slicing live

Quickstart

Try it now: no signup

Run this. It returns the five most-likely tournament winners straight from the live model: no account, no API key. Add ?format=csv for the tabular variant.

curl "https://onthepitch.now/api/v1/sample/"

Want the full picture: all 48 teams, every fixture, players, squads? Sign in, generate an API key (10 free calls to start), and pass it as a Bearer token on any endpoint:

curl -H "Authorization: Bearer otp_live_…" \
  "https://onthepitch.now/api/v1/forecast/"

Try it free: 10 calls

Sign in and generate an API key to get 10 free programmatic calls, shared across these REST endpoints and the hosted MCP server: enough to wire the data into a notebook or an assistant and see it working. The MCP handshake doesn't count; only the data calls do. The Pro Pass is unmetered.
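
The two curl calls above can be mirrored in Python with nothing but the standard library. This is a minimal sketch: the helper names build_request and fetch_json are illustrative, not part of any published client, and the only things taken from this page are the base URL and the Bearer header format.

```python
import json
import urllib.request

BASE_URL = "https://onthepitch.now/api/v1"


def build_request(path, api_key=None):
    """Build a GET request for an endpoint such as "sample" or "forecast".

    The Authorization header is added only when a key is supplied, so the
    same helper covers the public sample and the gated endpoints.
    """
    url = f"{BASE_URL}/{path.strip('/')}/"
    req = urllib.request.Request(url)
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    return req


def fetch_json(path, api_key=None, timeout=30):
    """Send the request and decode the JSON body (this performs a network call)."""
    with urllib.request.urlopen(build_request(path, api_key), timeout=timeout) as resp:
        return json.load(resp)
```

fetch_json("sample") needs no key. For a gated endpoint, pass the key you generated, for example fetch_json("forecast", api_key=os.environ["OTP_KEY"]) after importing os; each such call counts against the 10 free calls unless you hold the Pro Pass.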

What's in the data

View interactive API reference →

13 datasets, each reachable under /api/v1/ as JSON or — with ?format=csv — a flat table. /api/v1/sample/ is public; the rest are gated (a Bearer API key or a Pro session cookie). Below is exactly what each one returns. Full schemas live in the interactive reference (OpenAPI 3.0 spec at /api/v1/openapi.json).

Tournament forecast
48 nations
/api/v1/forecast/
Each nation's probability of lifting the trophy and of reaching every stage.
Returns · 17 fields Tournament-win probability · advance from group · reach R16 / QF / SF / final · win-group probability · P(finish 1st–4th in group) · confederation · FIFA rank · group
Team forecast trajectory
576 snapshots
/api/v1/team-history/
How each team's forecast has moved over time — one row per daily build.
Returns · 7 fields Team · build timestamp · Elo rating · tournament-win probability · 90% CI band (low & high)
Match forecast
104 matches
/api/v1/fixtures/
Home win / draw / away win plus match-event probabilities for every fixture in the bracket.
Returns · 25 fields P(home / draw / away) · expected goals each side · over 0.5 / 1.5 / 2.5 / 3.5 goals · both teams to score · clean-sheet probability each side · kickoff, venue & city · stage & group
Predicted squads
1,248 players
/api/v1/predicted-squads/
Projected 26-man squad for every nation, one row per player.
Returns · 13 fields Starting-XI vs bench slot · position · jersey number · captain flag · international caps & goals · composite rating · rank within position
Player index
1,215 players
/api/v1/players/
Career profile for every player in the modelling pool.
Returns · 9 fields Name · nation · position · current club & league · birth year · career goals & assists
Player season history
8,964 player-seasons
/api/v1/player-history/
Season-by-season career record for every player — one row per player-season.
Returns · 12 fields Player · season · team & league · position · matches & minutes · goals & assists · xG & xAG
Anytime scorer
50 ranked scorers
/api/v1/anytime-scorer/
Tournament scoring probabilities for the most likely goalscorers.
Returns · 15 fields P(scores ≥1 in the tournament) · expected matches played · per-stage scoring probability (group → final) · rank
Recent match results
958 matches
/api/v1/results/
Every team's recent senior-international results — last 24 months, their POV.
Returns · 10 fields Date · opponent · home / away / neutral · goals for & against · result · competition
Head-to-head records
144 pairings
/api/v1/h2h/
All-time record between each nation and its group-stage opponents.
Returns · 12 fields Meetings · wins / draws / losses · last-meeting date, score, tournament & winner
Major-tournament history
106 tournament rows
/api/v1/majors/
Each nation's most-recent run at every major, with squad and coach continuity.
Returns · 15 fields Tournament & year · matches · W-D-L · goals for / against · bracket finish · coach (name, year appointed, still in charge) · squad-pool continuity · ?tournament=&year= slicing
Group schedule
144 team-matchdays
/api/v1/schedule/
Every nation's three group-stage fixtures.
Returns · 7 fields Matchday · date · opponent · host city & country
Team list
48 nations
/api/v1/teams/
The tournament field — the lookup table to join everything else on team_id.
Returns · 6 fields team_id · name · confederation · FIFA rank · host flag · group
Bulk snapshot
One zip · ~3–5 MB
/api/v1/bulk/snapshot/
Every source file the site renders from, in a single download, with checksums.
Returns 11 source JSONs + manifest (size, sha256 & built_at per file) · includes the deep fixture-enrichment blobs the per-endpoint variants flatten away
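
Since the team list is described as the lookup table for joining on team_id, a join might look like the sketch below. It assumes each dataset has already been loaded as a list of dicts and that rows carry a team_id key. The sample rows and every field other than team_id are made up for illustration; check the interactive reference for the real schemas.

```python
def join_on_team_id(teams, rows):
    """Left-join `rows` against the team list, attaching the team record as "team".

    Rows whose team_id is missing from `teams` get "team": None instead of
    being dropped, so gaps in the lookup are easy to spot.
    """
    by_id = {team["team_id"]: team for team in teams}
    return [{**row, "team": by_id.get(row["team_id"])} for row in rows]


# Hypothetical rows, only to show the shape of the result.
teams = [
    {"team_id": "AAA", "name": "Team A", "group": "A"},
    {"team_id": "BBB", "name": "Team B", "group": "B"},
]
forecast_rows = [
    {"team_id": "AAA", "p_win": 0.12},
    {"team_id": "ZZZ", "p_win": 0.01},  # not in the lookup table
]

joined = join_on_team_id(teams, forecast_rows)
print(joined[0]["team"]["name"])  # Team A
print(joined[1]["team"])          # None
```

With pandas, DataFrame.merge(..., on="team_id") does the same job.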

How you access it

Every dataset above is gated except the public /api/v1/sample/ teaser. Authenticate with a Bearer API key or a Pro session cookie; everything else about getting the data out is below.

CSV variant on every endpoint
Live
Pass ?format=csv on any endpoint for a tabular response, RFC-4180 quoted, suitable for spreadsheet or notebook import. All row types are flat — no nested JSON in CSV cells.
Match-day refresh + provenance metadata
Live
The model refits after every match day through the tournament: results, bracket movement, and revised probabilities flow into the API within hours. Every response carries the upstream snapshot's built_at timestamp in the envelope (JSON) or X-OnThePitch-Generated-At header (CSV). Responses are privately cacheable for one hour.
Bulk snapshot download (zip)
Live
GET /api/v1/bulk/snapshot/ returns one DEFLATE-compressed archive with the 11 source JSONs + a manifest (per-file size + sha256 + built_at). Includes the deep fixture enrichment blobs the per-endpoint variants flatten away. ~3–5 MB on the wire.
Historical tournament slicing
Live
Pass ?tournament=&year= on /api/v1/majors/ to extract a single tournament-year — e.g. every WC2026 team's WC2022 row in one call. The dataset stores each team's most-recent appearance per tournament, so WC2022 covers most qualifiers; WC2018 only covers teams that haven't been at a WC since.
Long-lived API keys
Live
Generate a Bearer key at /account/api-key/ and pass it as Authorization: Bearer otp_live_… on any endpoint — no cookie juggling for headless / CI / scheduled jobs. One key per account; reset it any time if it leaks.
MCP server — connect Claude / Cursor
Live
A hosted Model Context Protocol server at /api/mcp exposes every endpoint as an MCP tool. Add it to Claude Code, Claude Desktop, or Cursor with one command and ask questions against the live model — no install, no glue code.
Backfilled model outputs — WC2018, WC2022
Coming soon
Pre-tournament model probabilities for the last two World Cups, replayed against frozen inputs so you can backtest the model end-to-end against known outcomes. Requires re-pulling historical FBref / Wikipedia snapshots; tracked as a separate research initiative.
Reproducible notebooks
Coming soon
Jupyter notebooks that fit the model end-to-end from the public data files and reproduce the website's outputs. Run locally with the published datasets.
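
The provenance metadata described under "Match-day refresh" lends itself to a cheap "has anything changed?" check in a scheduled job. The sketch below assumes built_at sits at the top level of the JSON envelope; that placement is a guess, as is the timestamp format, so the timestamps are compared as opaque strings. The function names are illustrative.

```python
def generated_at(headers=None, envelope=None):
    """Return the snapshot timestamp from a JSON envelope or CSV response headers.

    ASSUMPTION: `built_at` is a top-level key of the JSON envelope.
    Header names are matched case-insensitively.
    """
    if envelope and "built_at" in envelope:
        return envelope["built_at"]
    for key, value in (headers or {}).items():
        if key.lower() == "x-onthepitch-generated-at":
            return value
    return None


def has_new_build(current, last_seen):
    """True when a timestamp is present and differs from the one processed last."""
    return current is not None and current != last_seen
```

Store the last value you processed and skip downstream work when has_new_build returns False. With the one-hour cache lifetime mentioned above, polling more often than hourly is unlikely to gain anything.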

Examples

Three things you can do today

Set OTP_KEY to the API key you generate here (export OTP_KEY=otp_live_…) — one token, no cookie juggling, works the same from a shell, a notebook, or a scheduled job. (Signed in already? A Pro session cookie works too, but the key is the friendlier path.)

Full forecast as CSV

curl -H "Authorization: Bearer $OTP_KEY" \
  "https://onthepitch.now/api/v1/forecast/?format=csv" \
  -o forecast.csv
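
Once forecast.csv is on disk, the standard csv module can read it. The sketch below relies only on the RFC-4180 quoting stated above. The column names and the row in the inline sample are hypothetical, there to show that quoted commas survive parsing.

```python
import csv
import io


def parse_csv(text):
    """Parse a CSV payload into a list of dicts keyed by the header row."""
    # newline="" leaves line endings untouched so the csv module, not StringIO,
    # decides how to treat line breaks inside quoted fields.
    return list(csv.DictReader(io.StringIO(text, newline="")))


def load_csv(path):
    """Read a saved response, e.g. load_csv("forecast.csv")."""
    with open(path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


# Hypothetical two-column sample with a quoted comma in the second field.
sample = 'team_id,name\r\nKOR,"Korea, Republic of"\r\n'
rows = parse_csv(sample)
print(rows[0]["name"])  # Korea, Republic of
```

Every value comes back as a string, so convert probability columns with float() before doing arithmetic on them.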

WC2022 historical slice

curl -H "Authorization: Bearer $OTP_KEY" \
  "https://onthepitch.now/api/v1/majors/?tournament=FIFA%20World%20Cup&year=2022" \
  -o wc2022.json
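
If you build slice URLs in code, let the standard library do the percent-encoding instead of hand-writing %20. This is a small sketch: the helper name is illustrative, and combining the slice with format=csv is inferred from the statement that the CSV variant works on any endpoint.

```python
from urllib.parse import quote, urlencode

MAJORS_URL = "https://onthepitch.now/api/v1/majors/"


def majors_slice_url(tournament, year, fmt=None):
    """Return the /majors/ URL for one tournament-year, optionally as CSV."""
    params = {"tournament": tournament, "year": year}
    if fmt:
        params["format"] = fmt
    # quote_via=quote encodes spaces as %20; urlencode's default would emit "+".
    return f"{MAJORS_URL}?{urlencode(params, quote_via=quote)}"


print(majors_slice_url("FIFA World Cup", 2022))
# https://onthepitch.now/api/v1/majors/?tournament=FIFA%20World%20Cup&year=2022
```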

Bulk snapshot as a single zip

curl -H "Authorization: Bearer $OTP_KEY" \
  "https://onthepitch.now/api/v1/bulk/snapshot/" \
  -o onthepitch-snapshot-v1.zip
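
Because the manifest records a size and sha256 per file, a download can be checked before you use it. The sketch below makes two loud assumptions that this page does not confirm: the manifest is stored in the archive as manifest.json, and it is a JSON object mapping each file name to an object with "size" and "sha256" keys. Adjust both to match the real archive.

```python
import hashlib
import io
import json
import zipfile


def verify_snapshot(zip_bytes, manifest_name="manifest.json"):
    """Return {file name: bool}, True when size and sha256 match the manifest.

    ASSUMPTION: the manifest maps file name -> {"size": int, "sha256": hex str}.
    """
    results = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        manifest = json.loads(zf.read(manifest_name))
        for name, meta in manifest.items():
            data = zf.read(name)
            results[name] = (
                len(data) == meta["size"]
                and hashlib.sha256(data).hexdigest() == meta["sha256"]
            )
    return results
```

For the file saved by the curl command, read it in binary mode and pass the bytes: verify_snapshot(open("onthepitch-snapshot-v1.zip", "rb").read()).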

Build on the data

Pull the snapshot, fit your own model

The bulk zip ships everything the public site renders from: forecast, fixtures, predicted squads, head-to-heads, intl majors history. Three starter recipes for getting it into a notebook and running your own analysis against it.

1 · Load the snapshot into pandas

Pull the zip into memory, then read each file straight out of the archive as a DataFrame. The forecast envelope nests one row per team: flatten with pd.json_normalize.

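import io
import json
import os
import zipfile

import pandas as pd
import requests

# Download the snapshot; raise on 401/403 instead of failing later on a bad zip.
resp = requests.get(
    "https://onthepitch.now/api/v1/bulk/snapshot/",
    headers={"Authorization": f"Bearer {os.environ['OTP_KEY']}"},
    timeout=60,
)
resp.raise_for_status()
z = zipfile.ZipFile(io.BytesIO(resp.content))

# One row per team.
forecast = pd.json_normalize(
    json.loads(z.read("data.json"))["teams"],
)
# One row per fixture.
fixtures = pd.json_normalize(
    json.loads(z.read("fixtures.json"))["fixtures"],
)
# One row per player, with the parent squad's team_id carried onto each row.
squads = pd.json_normalize(
    json.loads(z.read("predicted_squads.json"))["squads"],
    record_path="players",
    meta=["team_id"],
)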

2 · Fit a baseline win-probability model

Roll your own H/D/A model from the predicted squads + the team forecast. Composite ratings + a squad-strength sum is a reasonable starting feature set; compare against the published probabilities to see where your model agrees and disagrees.

from sklearn.linear_model import LogisticRegression

# Aggregate predicted-XI composite to a team strength score.
xi = squads.query("slot == 'xi'")
strength = xi.groupby("team_id")["composite"].sum().rename("strength")

# Join strength onto the majors history and fit a logistic on prior
# tournament results (intl_majors.json has W/D/L from previous WCs + qualifiers).
intl = pd.json_normalize(
    json.loads(z.read("intl_majors.json"))["rows"],
)
features = intl.merge(strength, left_on="team_id", right_index=True)
X = features[["strength", "fifa_rank"]]
# Label: 1 = quarter-final or better. This threshold treats larger `finish`
# values as deeper runs; check how `finish` is encoded in your snapshot first.
y = (features["finish"] >= 9).astype(int)

clf = LogisticRegression().fit(X, y)
print(clf.coef_)  # one coefficient each for strength and fifa_rank

3 · Calibrate your model against the published forecast

Once your model produces team-level tournament-win probabilities, compare against ours via Brier score and a reliability plot: the same metric we use in the methodology docs.

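from sklearn.metrics import brier_score_loss

# my_probs: your model's output, a dict or Series mapping team_id -> P(win tournament).
merged = forecast.merge(
    pd.Series(my_probs, name="my_p_win"),
    left_on="team_id", right_index=True,
)
# "won" must hold the realised outcome (1 for the champion, 0 otherwise);
# add the column yourself if your snapshot does not include it.
print("OnThePitch Brier:", brier_score_loss(merged["won"], merged["p_win"]))
print("Yours      Brier:", brier_score_loss(merged["won"], merged["my_p_win"]))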
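
The reliability plot mentioned above is not shown, so here is a dependency-free sketch of the numbers behind one, alongside the Brier score written out by hand. The equal-width binning is an illustrative choice and not necessarily what the methodology docs use; the inputs in the demo are made up.

```python
def brier_score(outcomes, probs):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - o) ** 2 for o, p in zip(outcomes, probs)) / len(outcomes)


def reliability_table(outcomes, probs, n_bins=5):
    """Group forecasts into equal-width probability bins.

    Returns one (mean predicted probability, observed frequency, count) tuple
    per non-empty bin. A well-calibrated model has the first two values close.
    """
    bins = [[] for _ in range(n_bins)]
    for outcome, p in zip(outcomes, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, outcome))
    table = []
    for members in bins:
        if members:
            n = len(members)
            mean_p = sum(p for p, _ in members) / n
            observed = sum(o for _, o in members) / n
            table.append((mean_p, observed, n))
    return table


# Made-up forecasts and outcomes, purely to exercise the functions.
demo_outcomes = [1, 0, 0, 1]
demo_probs = [0.9, 0.1, 0.1, 0.9]
print(brier_score(demo_outcomes, demo_probs))
print(reliability_table(demo_outcomes, demo_probs, n_bins=2))
```

Plotting the first element of each tuple against the second gives the reliability curve; points on the diagonal mean the forecast probabilities match observed frequencies. With only one champion per tournament, expect the bins to be sparse and the curve to be noisy.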

The Pass licence is research-use: fit, backtest, publish your findings. Attribution back to the methodology doc is appreciated but not required.

Agents

Connect Claude, Cursor, or any MCP client

A hosted Model Context Protocol server at /api/mcp exposes every endpoint as an MCP tool, so you can ask questions against the live model right inside your assistant: no install, no glue code. Authenticate with an API key (any signed-in account can generate one); your first 10 tool calls are free.

Add it to Claude Code

claude mcp add --transport http onthepitch \
  https://onthepitch.now/api/mcp \
  --header "Authorization: Bearer otp_live_…"

Then prompt naturally — “which team is most likely to win the group?”, “compare the projected XIs for France and England”, “how did Brazil do at WC2022?” — and the assistant calls the matching tool (get_forecast, get_predicted_squads, get_majors) and answers from the live data.

Pro Pass

Get the Pro Pass for unmetered access

Every account gets 10 free calls to try the API + MCP. The Pro Pass is a one-time $35 purchase for unmetered access: everything in the Standard Pass plus the data endpoints, bulk zip download, and historical tournament slicing. Every endpoint updates after each match day through the tournament. Already hold the Standard Pass? Upgrade for $20.

Get the Pro Pass: $35 →

24h money-back, no questions asked · No subscription, no auto-renewal · Access through 31 Dec 2026. See refund policy.