Tennis ELO Model in Python: Beat the Closing Line with Free Historical Odds

OddsPapi API Blog · How To Guides · June 1, 2026

You Don’t Need a Black-Box Tennis Model. You Need a Sharper Benchmark.

Search “tennis prediction model” and you get two kinds of results: academic ELO papers with no code, and paid “AI tipster” services that won’t show you their math. Neither is useful if you actually want to build and validate your own model.

Here’s the thing nobody tells you: the model is the easy part. ELO ratings for tennis are about 40 lines of Python. The hard part is answering one question — is my model actually better than the market? And to answer that you need the one number that’s almost impossible to get cheaply: the sharp closing line, plus its full price history.

This guide builds a surface-weighted ELO model from scratch, then benchmarks it against Pinnacle’s de-vigged probability and 23 other bookmakers — all from a single free API call. We’ll use a real match: Lorenzo Musetti vs Holger Rune, French Open Round 4, which had 24 books pricing it live, including Pinnacle, Kalshi, Polymarket and Betfair Exchange.

The Old Way vs OddsPapi

Step | The Old Way | OddsPapi
Match data & results | Scrape Tennis Abstract / Flashscore, parse HTML | One /fixtures call (sportId 12)
The sharp benchmark | Pinnacle API is closed to the public | /odds returns Pinnacle + 23 books
Closing-line history (for CLV) | Paid add-on, or scrape every 5 min yourself | /historical-odds (free tier)
De-vig to “true” probability | Manual, per book | Same nested JSON across every book
Cost | $50–300/mo + scraper maintenance | Free tier

That third row is the killer. A predictive model is only as good as the yardstick you measure it against, and the only honest yardstick in betting is the closing line — specifically a sharp book’s closing line, de-vigged. Pinnacle moves their price as sharp money comes in; by the time the match starts, their no-vig probability is the most accurate public forecast on the planet. OddsPapi gives you that line and its full history for free, so you can compute Closing Line Value (CLV) and actually prove whether your edge is real.

Step 1: Authenticate

OddsPapi uses an apiKey query parameter — not a header. Every call needs it.

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.oddspapi.io/v4"

# Smoke test
r = requests.get(f"{BASE_URL}/sports", params={"apiKey": API_KEY})
print(r.status_code)  # 200

Step 2: Find Tennis Fixtures

Tennis is sportId=12. Pull a date range (the from and to dates can be at most 10 days apart) and keep only the fixtures that actually have odds.

import time

def tennis_fixtures(date_from, date_to):
    r = requests.get(f"{BASE_URL}/fixtures", params={
        "apiKey": API_KEY,
        "sportId": 12,
        "from": date_from,
        "to": date_to,
    })
    return [f for f in r.json() if f.get("hasOdds")]

fixtures = tennis_fixtures("2026-06-01", "2026-06-07")
for f in fixtures[:5]:
    print(f["fixtureId"], f["participant1Name"], "v", f["participant2Name"],
          "|", f["tournamentName"])

During Roland Garros this returned 100+ singles fixtures. Our target — Musetti v Rune — carried 24 bookmakers on the match-winner market.
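If you want to backfill more than the 10-day window allows, one option is to split a long range into consecutive windows and call /fixtures once per window. The sketch below is only the date arithmetic; date_windows is a hypothetical helper, and it assumes "10 days apart" means the to date may be up to 10 days after the from date.

```python
from datetime import date, timedelta

def date_windows(start, end, max_days=10):
    """Yield (from, to) ISO date strings whose endpoints are at most max_days apart."""
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=max_days), end)
        yield cursor.isoformat(), window_end.isoformat()
        cursor = window_end + timedelta(days=1)  # next window starts the following day

windows = list(date_windows(date(2026, 5, 24), date(2026, 6, 7)))
for date_from, date_to in windows:
    print(date_from, "->", date_to)
# 2026-05-24 -> 2026-06-03
# 2026-06-04 -> 2026-06-07
```

Each pair can then be passed as the from and to parameters of the fixtures call, with a short pause between requests.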

Step 3: Fetch the Winner Odds

The tennis match-winner market is marketId 171. Outcomes are 171 (player 1) and 172 (player 2). The live /odds response is deeply nested — players["0"] is a dict holding the current price (the historical endpoint uses a list; don’t mix them up).
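Because the two endpoints disagree on the type of players["0"], a small guard that accepts either shape can save a confusing TypeError later. This is a minimal sketch: latest_price is a hypothetical helper, the payload fragments are hand-written in the shapes described above, and the createdAt values are placeholders rather than real timestamps.

```python
def latest_price(player_entry):
    """Return the most recent price from a live dict or a historical snapshot list."""
    if isinstance(player_entry, dict):                    # live /odds shape
        return player_entry.get("price")
    if isinstance(player_entry, list) and player_entry:   # historical shape
        return player_entry[-1].get("price")              # assumes oldest-first ordering
    return None

live_entry = {"price": 1.473}
historical_entry = [
    {"createdAt": "t0", "price": 1.485},  # placeholder timestamps
    {"createdAt": "t1", "price": 1.473},
]
print(latest_price(live_entry), latest_price(historical_entry))
```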

WINNER = "171"

def winner_prices(fixture_id):
    r = requests.get(f"{BASE_URL}/odds", params={
        "apiKey": API_KEY,
        "fixtureId": fixture_id,
    })
    books = r.json().get("bookmakerOdds", {})
    out = {}
    for slug, data in books.items():
        market = (data.get("markets") or {}).get(WINNER)
        if not market:
            continue
        oc = market["outcomes"]
        try:
            p1 = oc["171"]["players"]["0"]["price"]
            p2 = oc["172"]["players"]["0"]["price"]
        except (KeyError, TypeError):
            continue
        if p1 and p2:
            out[slug] = (p1, p2)
    return out

prices = winner_prices("id1003846272365390")  # Musetti v Rune
print(len(prices), "books")
print("Pinnacle:", prices["pinnacle"])  # (1.473, 2.71)

Real output for this match (sorted by margin, tightest first):

Book | Musetti | Rune | Margin (vig)
Kalshi | 1.587 | 2.632 | 1.01%
Polymarket | 1.55 | 2.70 | 1.49%
Betfair Exchange | 1.55 | 2.66 | 2.14%
Parimatch / Betsson / GGBet | 1.50 | 2.66 | 3.95%
Pinnacle | 1.473 | 2.71 | 4.79%
Bet365 / Unibet | 1.47 | 2.70 | 5.05%
DraftKings / BoyleSports | 1.45 | 2.70 | 5.94%
Coral / Ladbrokes | 1.47 | 2.625 | 6.18%

Note the pattern OddsPapi exposes that single-book APIs can’t: the prediction markets (Kalshi, Polymarket) and the exchange (Betfair) price tighter than Pinnacle itself — 1–2% margin vs Pinnacle’s 4.8%. That matters for your benchmark, and we’ll come back to it.
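The margin column is just the overround: the sum of the two implied probabilities minus one. Here is a small sketch that computes it and ranks books tightest first, using the Kalshi and Pinnacle prices quoted above. The margin function is a hypothetical helper, and recomputing from the rounded prices shown may not reproduce every listed margin exactly.

```python
def margin(odds_a, odds_b):
    """Two-way overround: implied probabilities summed, minus 1."""
    return 1 / odds_a + 1 / odds_b - 1

sample_prices = {
    "pinnacle": (1.473, 2.71),
    "kalshi": (1.587, 2.632),
}

# Sort books by margin, tightest first
ranked = sorted(sample_prices.items(), key=lambda kv: margin(*kv[1]))
for book, (p1, p2) in ranked:
    print(f"{book:10s} {margin(p1, p2):.2%}")
# kalshi     1.01%
# pinnacle   4.79%
```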

Step 4: Build the ELO Engine

ELO is dead simple. Every player has a rating. Before a match, the expected win probability of player A is a logistic function of the rating gap. After the match, both ratings move toward the actual result — winner gains, loser loses, scaled by K and by how surprising the result was.

def expected_score(rating_a, rating_b):
    """Win probability of A given the rating gap."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating_a, rating_b, a_won, k=32):
    """Return updated (rating_a, rating_b) after a single match."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    rating_a += k * (score_a - exp_a)
    rating_b += k * ((1 - score_a) - (1 - exp_a))
    return rating_a, rating_b
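To see what "scaled by how surprising the result was" means in numbers, here is a short worked example. The two functions are restated in compact form so the block runs on its own; the ratings are made up for illustration.

```python
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating_a, rating_b, a_won, k=32):
    exp_a = expected_score(rating_a, rating_b)
    delta = k * ((1.0 if a_won else 0.0) - exp_a)
    return rating_a + delta, rating_b - delta   # zero-sum: A gains what B loses

even = update(1500, 1500, True)    # equal players: winner gains exactly k/2
upset = update(1500, 1700, True)   # underdog wins: large swing
chalk = update(1700, 1500, True)   # favourite wins: small swing

print(even)                               # (1516.0, 1484.0)
print(round(upset[0] - 1500, 1))          # about 24.3 points gained
print(round(chalk[0] - 1700, 1))          # about 7.7 points gained
```

The 200-point underdog gains roughly three times as much for winning as the favourite does, which is the whole self-correcting mechanism.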

Tennis has one wrinkle that makes a generic ELO useless: surface. A clay specialist like Musetti is a different player on clay than on hard court. The fix the research community settled on (Kovalchik, 2016) is to keep a separate rating per surface and blend it with the overall rating. Here’s a compact surface-aware engine that ingests a match history and spits out ratings:

from collections import defaultdict

class TennisElo:
    def __init__(self, k=32, base=1500, surface_weight=0.6):
        self.overall = defaultdict(lambda: base)
        self.surface = defaultdict(lambda: defaultdict(lambda: base))
        self.k = k
        self.sw = surface_weight  # how much surface form counts

    def rating(self, player, surface):
        # Blend surface-specific and overall rating
        return self.sw * self.surface[surface][player] + (1 - self.sw) * self.overall[player]

    def predict(self, p1, p2, surface):
        return expected_score(self.rating(p1, surface), self.rating(p2, surface))

    def add_match(self, winner, loser, surface):
        # Update the overall ratings
        ro_w, ro_l = self.overall[winner], self.overall[loser]
        self.overall[winner], self.overall[loser] = update(ro_w, ro_l, True, self.k)
        # Update the surface-specific ratings
        rs_w, rs_l = self.surface[surface][winner], self.surface[surface][loser]
        self.surface[surface][winner], self.surface[surface][loser] = update(rs_w, rs_l, True, self.k)

# Feed it your match history (winner, loser, surface), oldest first:
model = TennisElo()
for m in match_history:          # list of (winner, loser, surface) tuples
    model.add_match(*m)

p_musetti = model.predict("Lorenzo Musetti", "Holger Rune", "clay")
print(f"ELO win prob Musetti: {p_musetti:.1%}")
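To get a feel for what surface_weight does, it helps to hold two players fixed and sweep the weight. The ratings below are hypothetical (a clay specialist against an all-rounder), and blended simply restates the formula used in TennisElo.rating.

```python
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def blended(surface_rating, overall_rating, surface_weight):
    """Weighted mix of surface-specific and overall rating."""
    return surface_weight * surface_rating + (1 - surface_weight) * overall_rating

# Hypothetical (clay rating, overall rating) pairs
specialist = (1900, 1800)
all_rounder = (1800, 1850)

weights = (0.0, 0.6, 1.0)
probs = [
    expected_score(blended(*specialist, w), blended(*all_rounder, w))
    for w in weights
]
for w, p in zip(weights, probs):
    print(f"surface_weight={w}: specialist wins {p:.1%}")
```

With no surface weighting the specialist is the underdog; at 0.6 the blend makes them a modest favourite, and at 1.0 a clear one. That sensitivity is why the weight is worth tuning rather than guessing.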

Where does match_history come from? Process a season of completed results — ATP/WTA match logs are widely available as CSV, and you backfill ratings by replaying every match chronologically. The surface_weight and k are the knobs you tune in the backtest (Step 6).
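As a rough sketch of that replay step, the loader below turns a results CSV into the (winner, loser, surface) tuples the engine expects, sorted oldest first. The column names date, winner, loser and surface are assumptions; real datasets name their columns differently, so adjust accordingly. The sample rows are invented.

```python
import csv
import io

def load_match_history(csv_file):
    """Read results and return (winner, loser, surface) tuples, oldest first."""
    rows = list(csv.DictReader(csv_file))
    rows.sort(key=lambda row: row["date"])   # ISO dates sort chronologically as text
    return [(row["winner"], row["loser"], row["surface"].lower()) for row in rows]

# In practice pass open("results.csv", newline=""); an in-memory sample is used here
sample = io.StringIO(
    "date,winner,loser,surface\n"
    "2026-04-20,Player B,Player A,Clay\n"
    "2026-01-15,Player A,Player B,Hard\n"
)
match_history = load_match_history(sample)
print(match_history)
# [('Player A', 'Player B', 'hard'), ('Player B', 'Player A', 'clay')]
```

Lower-casing the surface keeps "Clay" and "clay" from ending up as two separate rating tables.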

Step 5: De-vig the Market — Your Ground Truth

You can’t compare your ELO probability to a raw bookmaker price, because that price includes the bookmaker’s margin (the “vig”). You have to strip it out first. For a two-way market the simplest method is multiplicative normalisation:

def devig_two_way(odds_a, odds_b):
    """Return (fair_prob_a, fair_prob_b) with the margin removed."""
    imp_a, imp_b = 1 / odds_a, 1 / odds_b
    overround = imp_a + imp_b
    return imp_a / overround, imp_b / overround

# Pinnacle: Musetti 1.473, Rune 2.71
fair_m, fair_r = devig_two_way(1.473, 2.71)
print(f"Pinnacle fair: Musetti {fair_m:.1%}, Rune {fair_r:.1%}")
# -> Pinnacle fair: Musetti 64.8%, Rune 35.2%

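So the sharpest book on earth, with the margin removed, makes Musetti a 64.8% favourite. That’s your benchmark. (Multiplicative normalisation spreads the margin proportionally across both players. The “power” method, which corrects for the favourite–longshot bias, shifts more of the margin onto the underdog: at these prices it puts Musetti about a percentage point higher, near 65.9%. Multiplicative is the more conservative first pass when you are backing the favourite, and the gap between the two methods widens for heavier favourites.)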
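For readers who want to try the power method, here is a minimal sketch under the usual definition: raise each implied probability to a common exponent k and pick the k that makes them sum to 1. The function name devig_power and the bisection search are illustrative choices, not something the API provides.

```python
def devig_power(odds_a, odds_b, tol=1e-12):
    """Power-method de-vig: find k so that imp_a**k + imp_b**k == 1."""
    imp_a, imp_b = 1 / odds_a, 1 / odds_b
    total = imp_a + imp_b
    if total <= 1:                       # no overround to remove: just normalise
        return imp_a / total, imp_b / total
    lo, hi = 1.0, 2.0
    while imp_a ** hi + imp_b ** hi > 1:  # widen until the root is bracketed
        hi *= 2
    while hi - lo > tol:                  # bisection on the exponent
        mid = (lo + hi) / 2
        if imp_a ** mid + imp_b ** mid > 1:
            lo = mid
        else:
            hi = mid
    k = (lo + hi) / 2
    return imp_a ** k, imp_b ** k

power_m, power_r = devig_power(1.473, 2.71)
print(f"Power method: Musetti {power_m:.1%}, Rune {power_r:.1%}")
```

Because the exponent shrinks small probabilities proportionally more than large ones, the favourite ends up with a slightly higher fair probability than under multiplicative normalisation.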

Step 6: Compare, and Find the Edge

Now put the two numbers side by side. Suppose your clay-weighted ELO — after replaying the season — outputs ratings of roughly 2003 (Musetti) vs 1897 (Rune). That’s a 106-point gap, which the logistic turns into:

p = expected_score(2003, 1897)
print(f"ELO: {p:.1%}")   # 64.8%

Your model and the sharp market land in the same place: ~64.8%. That’s actually the most common outcome when you benchmark against Pinnacle — and it’s a feature, not a disappointment. It tells you the favourite is fairly priced and your model isn’t fooling itself. The edge isn’t in disagreeing with the market on probability. It’s in the price.

Your fair probability says Musetti should be 1 / 0.648 = 1.543. Now scan all 24 books for the best available price on Musetti:

def best_price(prices, side):
    # side: 0 for player1, 1 for player2
    return max(prices.items(), key=lambda kv: kv[1][side])

book, (m, r) = best_price(prices, 0)
print(f"Best Musetti: {book} @ {m}")   # Best Musetti: kalshi @ 1.587

fair_odds = 1 / fair_m            # 1.544 unrounded (1 / 0.648 rounds to 1.543)
ev = fair_m * m - 1               # expected value at the best price (m = 1.587)
print(f"Fair odds {fair_odds:.3f} | EV at {m}: {ev:+.2%}")
# -> Fair odds 1.544 | EV at 1.587: +2.82%

Kalshi was offering Musetti at 1.587 while the sharp fair price was 1.543. Backing the favourite there is +2.8% EV versus Pinnacle’s de-vigged line — a clean, model-confirmed value bet that exists purely because a prediction market priced the favourite more generously than the sharp book. Check the other side too: the best Rune price was 1xBet at 2.74, but fair Rune odds are 1 / 0.352 = 2.84, so even the best dog price is −3.5% EV. Pass.
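Checking both sides by hand gets tedious, so here is a sketch that scans every book for the best price on each player and reports the EV against the fair probabilities. scan_ev is a hypothetical helper; the "1xbet" slug and its Musetti price are assumptions added so the example has a second side to find (the text above only gives its 2.74 on Rune).

```python
def scan_ev(prices, fair_probs):
    """For each side, find the best price across books and its EV vs the fair probability.

    prices: {book: (odds_player1, odds_player2)}; fair_probs: (p1, p2).
    """
    rows = []
    for side, prob in enumerate(fair_probs):
        book, odds = max(
            ((b, o[side]) for b, o in prices.items()), key=lambda pair: pair[1]
        )
        rows.append((side, book, odds, prob * odds - 1))
    return rows

sample_prices = {
    "kalshi": (1.587, 2.632),
    "pinnacle": (1.473, 2.71),
    "1xbet": (1.47, 2.74),      # slug and Musetti price are assumed
}
rows = scan_ev(sample_prices, (0.6479, 0.3521))
for side, book, odds, ev in rows:
    print(f"side {side}: {book} @ {odds} -> EV {ev:+.1%}")
```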

This is the whole workflow in one screen: ELO gives you an independent prior, the de-vigged sharp line tells you the true probability, and line-shopping across 350+ books turns “I think Musetti wins” into a quantified +EV bet. The model’s real job is to stop you taking the −EV side when the price looks tempting.

Step 7: Prove It With Closing Line Value

One worked example proves nothing; a model is only validated over hundreds of matches. The professional standard is Closing Line Value: did you consistently bet at prices better than the closing line, the market’s final and best-informed price? Beat the close over a large sample and you are very likely betting at positive expected value. OddsPapi’s free /historical-odds endpoint makes this measurable. The shape differs from live odds. Here players["0"] is a list of snapshots:

def closing_line(fixture_id, book="pinnacle"):
    r = requests.get(f"{BASE_URL}/historical-odds", params={
        "apiKey": API_KEY,
        "fixtureId": fixture_id,
        "bookmakers": book,          # max 3 per call
    })
    oc = r.json()["bookmakers"][book]["markets"]["171"]["outcomes"]
    history = oc["171"]["players"]["0"]   # list of snapshots
    return [(s["createdAt"], s["price"]) for s in history]

snaps = closing_line("id1003846272365390")
print("Open:", snaps[0], "| Close:", snaps[-1])
# Open: ('2026-05-31T08:58...', 1.485) | Close: (..., 1.473)
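Taking snaps[-1] as the close assumes the list is sorted and contains no in-play updates. If either assumption might not hold, a more defensive sketch is to sort by timestamp and keep only snapshots from before the scheduled start. The function names, the sample timestamps and the start time are all hypothetical, and the code assumes createdAt is an ISO 8601 string with a timezone offset.

```python
from datetime import datetime

def parse_ts(ts):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def open_and_close(snapshots, start_time):
    """Return (opening_price, closing_price) using only pre-match snapshots."""
    start = parse_ts(start_time)
    pre_match = sorted(
        (parse_ts(ts), price) for ts, price in snapshots if parse_ts(ts) < start
    )
    if not pre_match:
        return None
    return pre_match[0][1], pre_match[-1][1]

sample_snaps = [                      # invented timestamps, deliberately unsorted
    ("2026-06-01T12:30:00+00:00", 1.30),   # in-play update, should be ignored
    ("2026-06-01T06:00:00+00:00", 1.485),
    ("2026-06-01T10:45:00Z", 1.473),
]
print(open_and_close(sample_snaps, "2026-06-01T11:00:00+00:00"))
# (1.485, 1.473)
```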

Pinnacle shortened Musetti from 1.485 to 1.473 in the hours before the match, a sign that sharp money landed on the Italian. If you had taken Kalshi’s 1.587 the morning of the match, you would have beaten the close by a wide margin: textbook positive CLV. A real backtest loops this over your whole bet log:

def clv(bet_odds, closing_odds):
    """Positive = you beat the closing line."""
    return bet_odds / closing_odds - 1

# Your 1.587 vs Pinnacle close 1.473
print(f"CLV: {clv(1.587, 1.473):+.1%}")   # CLV: +7.7%
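One caveat: comparing against the raw closing price flatters you, because that price still contains Pinnacle’s margin. A stricter variant measures against the de-vigged close, and the sketch below loops it over a small bet log. The first row reuses the Musetti numbers and assumes Rune closed at 2.71 (only Musetti’s close is quoted above); the other two rows are invented.

```python
def devig_two_way(odds_a, odds_b):
    imp_a, imp_b = 1 / odds_a, 1 / odds_b
    total = imp_a + imp_b
    return imp_a / total, imp_b / total

def clv_no_vig(bet_odds, close_side, close_other):
    """CLV against the de-vigged closing probability of the side you backed."""
    fair_prob, _ = devig_two_way(close_side, close_other)
    return bet_odds * fair_prob - 1

# (bet odds, closing odds on your side, closing odds on the other side)
bet_log = [
    (1.587, 1.473, 2.71),   # Musetti example; Rune's close is assumed
    (2.10, 2.05, 1.83),     # hypothetical
    (1.80, 1.90, 1.98),     # hypothetical
]
values = [clv_no_vig(*bet) for bet in bet_log]
average_clv = sum(values) / len(values)
beat_rate = sum(v > 0 for v in values) / len(values)
print(f"Average no-vig CLV: {average_clv:+.2%} | beat the close: {beat_rate:.0%}")
```

On the Musetti bet this measure comes out near +2.8% rather than +7.7%: still positive, but a more honest estimate of the edge.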

Run your ELO picks through this over a season. If your average CLV is positive across a few hundred bets, that is strong evidence of a real edge and you can size up with more confidence. If it’s negative, no staking plan will save you, and you’ve learned that without paying for data. To turn a validated edge into bet sizes, pipe the de-vigged probability into a Kelly criterion staking calculator.
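For reference, the Kelly fraction for a single bet at decimal odds is (p × odds − 1) / (odds − 1). Here is a minimal sketch; the scale parameter for fractional Kelly is an illustrative addition, since many bettors stake a half or a quarter of the full fraction to reduce variance.

```python
def kelly_fraction(prob, odds, scale=1.0):
    """Fraction of bankroll to stake at decimal odds; 0 when the bet is not +EV."""
    edge = prob * odds - 1            # expected profit per unit staked
    full_kelly = edge / (odds - 1)
    return max(0.0, full_kelly) * scale

musetti = kelly_fraction(0.648, 1.587)          # fair prob vs Kalshi price
rune = kelly_fraction(0.352, 2.74)              # negative EV, so no bet
print(f"Musetti full Kelly: {musetti:.1%}")     # Musetti full Kelly: 4.8%
print(f"Musetti half Kelly: {kelly_fraction(0.648, 1.587, scale=0.5):.1%}")
print(f"Rune: {rune:.1%}")                      # Rune: 0.0%
```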

Why This Matters: The Three Kill Shots

  • Free historical odds. CLV is the only honest way to validate a model. Competitors charge for closing-line history; OddsPapi gives it away on the free tier.
  • 350+ bookmakers, sharps included. One match returned 24 books — Pinnacle for the benchmark, Kalshi/Polymarket/Betfair for the tightest prices, and soft books where the value usually hides.
  • The sharp line, no enterprise contract. Pinnacle’s API is closed to the public. OddsPapi hands you their price (and history) through one query parameter.

Frequently Asked Questions

What’s the best K-factor and surface weight for tennis ELO?

There’s no universal answer — that’s what the CLV backtest is for. Common starting points are K=32 and a surface weight around 0.5–0.7. Tune them to maximise your average CLV against Pinnacle’s closing line over a full season, not to maximise in-sample accuracy.

Where do I get the historical match results to train ELO?

ATP/WTA match logs (winner, loser, surface, date) are published as free CSVs by several community datasets. You replay them oldest-first through the engine in Step 4 to backfill ratings. OddsPapi supplies the odds side — the closing lines you validate against.

Why de-vig Pinnacle instead of just using its raw price?

Raw odds bake in the bookmaker’s margin, so they overstate both players’ probabilities. De-vigging normalises them back to 100% so you can compare apples to apples with your model. Pinnacle is the book to de-vig because its low margin and sharp customer base make its closing number the most accurate public forecast.

Can I do this for other sports?

Yes. ELO works for any head-to-head sport — swap sportId and the match-winner market ID. The de-vig and CLV logic is identical. See the Tennis Odds API guide for the full market catalogue.

Is the OddsPapi free tier really enough for this?

Yes. Fixtures, live odds across 350+ books, and historical odds are all on the free tier. The only limits are a short cooldown between calls (use time.sleep(0.2) in loops) and a max of 3 bookmakers per historical-odds call.
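Both limits are easy to respect with a small batching helper. This is a sketch only: fetch stands in for whatever function you write to call /historical-odds, the comma-separated bookmakers format is an assumption, and the "book-c" and "book-d" slugs are placeholders.

```python
import time

def chunked(items, size=3):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def fetch_in_batches(books, fetch, cooldown=0.2):
    """Call fetch() once per group of up to 3 books, pausing between calls."""
    results = {}
    for index, batch in enumerate(chunked(books)):
        if index:                       # no need to sleep before the first call
            time.sleep(cooldown)
        results.update(fetch(",".join(batch)))
    return results

# Stand-in fetcher that records what it was asked for instead of calling the API
calls = []
def fake_fetch(bookmakers_param):
    calls.append(bookmakers_param)
    return {slug: [] for slug in bookmakers_param.split(",")}

history = fetch_in_batches(["pinnacle", "kalshi", "book-c", "book-d"], fake_fetch)
print(calls)   # ['pinnacle,kalshi,book-c', 'book-d']
```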

Stop Guessing. Benchmark Against the Sharps.

A tennis model you can’t measure is just a hunch with extra steps. Build the ELO engine, de-vig the sharp line, and prove your edge with closing line value — all on free data. Grab your free OddsPapi API key and start validating.

Next steps: turn your fair probabilities into a full expected value & CLV scanner, calculate consensus fair odds from 350+ books, and read the guide to backtesting a betting model with free historical odds.