Data Sources Matrix

Summary + sample · full document is 1,708 words

Summary

Source × coverage × cost × access difficulty. Use this to decide what to integrate first and where the binding constraints are. Companion to data-sources.md (which has narrative context).

Costs are approximate and as of mid-2026; treat as orders of magnitude. Access difficulty is a 1–5 score: 1 = "git clone, you're done"; 5 = "vendor partnership required."

Sample

Results and odds

| Source | What it covers | Cost | Access difficulty (1–5) | Notes |
| --- | --- | --- | --- | --- |
| Football-Data.co.uk | Top European leagues 1993–present, results + pre-match odds, partial closing odds | Free | 1 | The starter dataset; CSV per league per season. |
| Betfair Historic | Top-flight English football back to 2008+, sub-second exchange order books and prices | Free | 2 | Registration + download of large compressed JSONs. Best public source of fillable closing prices. |
| Pinnacle archives (3rd-party) | Pinnacle pre-match + closing prices, partial coverage via OddsAPI archive | Paid (mid-three to low-four figures/year) or free with self-scrape | 3 | First-party export not offered; self-scrape and accumulate is the cheapest long-run path. |
| Smarkets API | Top-flight English football, exchange order books in real-time | Free | 2 | Lower liquidity than Betfair; useful cross-check. |
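
One way to turn the rows above into an integration order is to sort by access difficulty and break ties on cost. The sketch below is illustrative only: the `Source` record, the `has_free_route` flag, and the tie-break rule are assumptions introduced here, not something the note prescribes. The names and scores are copied from the sample rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    difficulty: int        # access difficulty, 1 (easy) to 5 (vendor partnership)
    has_free_route: bool   # assumption: True if any free path is listed in the Cost column


# Values copied from the sample table; the full matrix has more rows.
SOURCES = [
    Source("Football-Data.co.uk", 1, True),
    Source("Betfair Historic", 2, True),
    Source("Pinnacle archives (3rd-party)", 3, True),  # free only via self-scrape
    Source("Smarkets API", 2, True),
]


def integration_order(sources):
    """Easiest access first; free routes before paid; name as a stable tie-break."""
    return sorted(
        sources,
        key=lambda s: (s.difficulty, not s.has_free_route, s.name),
    )


for rank, src in enumerate(integration_order(SOURCES), start=1):
    print(f"{rank}. {src.name} (difficulty {src.difficulty})")
```

Sorting on difficulty alone ignores coverage, so treat the result as a first pass and reorder by hand where a harder source is the only one covering a needed market.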
