rGAX: Rethinking goals above expectation (GAX)

Deriving a statistically informed alternative to goals above expectation (GAX).

Authors

Robert Bajons

Joint work with Lucas Kook

Published

November 1, 2024

| Resource | Date | Link |
|---|---|---|
| Presentation Mathsport 2025 | 2025-06-05 | Presentation (Mathsport 25) |
| Presentation Brown Bag Seminar 2025 | 2025-03-07 | Presentation (BBS 25) |

Note

This project is a special application of (and one of the main motivations for) the project: Machine learning based statistical inference in sports analytics.

Overview

Expected goals (xG) is one of the most popular metrics in modern football (soccer) analytics. xG models assign a probability of success to each shot by relating it to shot-specific covariates. Popular xG models are typically based on flexible machine learning methods that account for non-linear and interaction effects of these covariates. As a measure of a shot's value, xG is commonly used to evaluate the shooting skills of players via goals above expectation (GAX), i.e. the difference between the actual and the expected goal for each shot, summed over a player's shots. However, GAX is often criticized for being unstable across seasons and for not providing a (direct) means of uncertainty quantification.

In this work, we address these issues by showing how the player-specific GAX relates to a score test when the xG model is a logistic regression, and by proposing a natural nonparametric extension that enjoys double robustness properties and can be used with any sufficiently powerful machine learning algorithm for xG. In this way, we are able to leverage commonly used black-box xG models while still obtaining valid statistical inference on a player's influence on the odds or probability of scoring a goal from a shot. Moreover, to make the results more accessible to practitioners, we show how the proposed procedure yields player-specific effect estimates in a partially linear logistic regression model, which are interpretable as additive effects on the log-odds of scoring a goal from a shot. Finally, we apply our framework to the 2015/16 season of the top 5 European leagues and determine the best shooters using our novel methodology.
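The connection between player-specific GAX and a score test in the logistic regression case can be illustrated with a small simulation. The following is a minimal sketch, not the project's implementation: the data are simulated, the xG model uses a single hypothetical covariate (`dist`, a scaled shot distance), and all names (`fit_xg`, `gax_score_test`, `simulate_shots`) and effect sizes are assumptions made for illustration. It covers only the parametric case; the nonparametric extension is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_xg(dist, goal, iters=25):
    """Toy xG model: logistic regression of goal on distance via Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(iters):
        p = [sigmoid(b0 + b1 * x) for x in dist]
        w = [pi * (1.0 - pi) for pi in p]
        # Gradient of the log-likelihood (intercept, slope)
        g0 = sum(y - pi for y, pi in zip(goal, p))
        g1 = sum((y - pi) * x for y, pi, x in zip(goal, p, dist))
        # 2x2 Fisher information X'WX
        h00 = sum(w)
        h01 = sum(wi * x for wi, x in zip(w, dist))
        h11 = sum(wi * x * x for wi, x in zip(w, dist))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def gax_score_test(dist, goal, player, target):
    """Return (GAX, z, two-sided p) for the score test of one player's indicator."""
    b0, b1 = fit_xg(dist, goal)  # null model: no player effect
    xg = [sigmoid(b0 + b1 * x) for x in dist]
    d = [1.0 if p == target else 0.0 for p in player]
    # Score for the player indicator under the null: sum of (goal - xG)
    # over the player's shots, which is the player's GAX.
    gax = sum(di * (y - pi) for di, y, pi in zip(d, goal, xg))
    w = [pi * (1.0 - pi) for pi in xg]
    h00 = sum(w)
    h01 = sum(wi * x for wi, x in zip(w, dist))
    h11 = sum(wi * x * x for wi, x in zip(w, dist))
    det = h00 * h11 - h01 * h01
    a0 = sum(wi * di for wi, di in zip(w, d))                   # D'W1
    a1 = sum(wi * di * x for wi, di, x in zip(w, d, dist))      # D'Wx
    dwd = sum(wi * di * di for wi, di in zip(w, d))             # D'WD
    # Variance of the score, adjusted for the estimated xG coefficients:
    # D'WD - D'WX (X'WX)^{-1} X'WD
    c0 = (h11 * a0 - h01 * a1) / det
    c1 = (h00 * a1 - h01 * a0) / det
    var = dwd - (a0 * c0 + a1 * c1)
    z = gax / math.sqrt(var)
    pval = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return gax, z, pval


def simulate_shots(n=3000, n_players=20, star=0, star_effect=0.8, seed=1):
    """Simulated shots; player `star` gets a hypothetical log-odds bonus."""
    rng = random.Random(seed)
    dist, goal, player = [], [], []
    for _ in range(n):
        pid = rng.randrange(n_players)
        x = rng.uniform(0.5, 3.0)
        logit = 0.5 - 1.5 * x + (star_effect if pid == star else 0.0)
        dist.append(x)
        goal.append(1.0 if rng.random() < sigmoid(logit) else 0.0)
        player.append(pid)
    return dist, goal, player


dist, goal, player = simulate_shots()
gax, z, pval = gax_score_test(dist, goal, player, target=0)
print(f"GAX = {gax:.2f}, z = {z:.2f}, p = {pval:.3f}")
```

In this sketch the numerator of the test statistic is the player's raw GAX, so standardizing it by the score variance is what turns the familiar metric into a quantity with an approximate standard normal reference distribution under the null of no player effect.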