Vienna University of Economics and Business
Nov 25, 2025
A common scheme for player evaluation in sports analytics:
Expected value metrics \(\rightarrow\) compare observed and expected outcome:
Examples:
Problems:
Goals above expectation (GAX):
Over a time frame (e.g. a season), sum the differences between goal outcomes and xG over all shots of player \(i\):
\[ \operatorname{GAX}_i = \sum_{j=1}^{N_i} (Y_j - \hat h(Z_j))\]
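A minimal sketch of this computation in R (the `shots` data frame and its columns `player`, `goal`, `xg` are hypothetical names, not from the talk):

```r
# Sketch: GAX per player as the sum of (goal - xG) over their shots.
# `shots` has one row per shot: player (id), goal (0/1 outcome Y_j),
# xg (fitted h_hat(Z_j) from an xG model).
shots$resid <- shots$goal - shots$xg
gax <- tapply(shots$resid, shots$player, sum)
head(sort(gax, decreasing = TRUE))  # players ranked by GAX
```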
Recent criticism of GAX (Baron et al. 2024; Davis and Robberechts 2024):
Instability and limited replicability (over seasons)
Low (effective) sample size \(\rightarrow\) high uncertainty, lack of uncertainty quantification
Biases arising from data:
How can we identify outstanding shooters?
Setup:
Logistic regression model (Pollard and Reep 1997):
\(Y \mid X,Z \sim \operatorname{Ber}(\pi(X,Z)), \quad \pi(X,Z) = P(Y=1 \mid X,Z)\) and
\[ \begin{aligned} \log\left(\frac{\pi(X,Z)}{1-\pi(X,Z)}\right) = X\beta + Z^{\top}\gamma. \end{aligned} \]
Goal: Inference on \(\beta\)
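In this parametric setup, estimation and testing are routine; a minimal sketch with hypothetical column names (`goal` for \(Y\), `player_x` for the binary \(X\), `z1`, `z2` for \(Z\)):

```r
# Sketch: Wald and score tests for beta in the logistic regression.
fit0 <- glm(goal ~ z1 + z2, family = binomial(), data = shots)
fit1 <- glm(goal ~ z1 + z2 + player_x, family = binomial(), data = shots)
summary(fit1)                    # Wald test for beta (player_x coefficient)
anova(fit0, fit1, test = "Rao")  # score (Rao) test of H0: beta = 0
```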
Given i.i.d. data \((Y_i,X_i,Z_i)_{i = 1}^N\) from the logistic regression model
Score of \(\beta\) (target of score tests):
\[ \sum_{i = 1}^N\frac{\partial\log L(\beta,\gamma \mid Y_i,X_i,Z_i)}{\partial \beta}\]
Score test on \(\beta\) uses score under \(H_0: \beta = 0\):
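Spelling out the intermediate step for the logistic model above:
\[ \frac{\partial \log L}{\partial \beta} = \sum_{i=1}^N X_i\bigl(Y_i - \pi(X_i,Z_i)\bigr), \]
and under \(H_0 : \beta = 0\) we have \(\pi(X_i,Z_i) = h(Z_i)\), so the score evaluated at the restricted fit is \(\sum_{i=1}^N X_i\bigl(Y_i - \hat h(Z_i)\bigr)\).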
Recall:
\[ \operatorname{GAX}_i = \sum_{j=1}^{N_i} (Y_j - \hat h(Z_j))\]
Since \(X_j\) is binary (the indicator that player \(i\) took shot \(j\)) \(\rightarrow\) the score under \(H_0\) is exactly \(\operatorname{GAX}_i\) for that player
Conclusion: GAX relates to a classical score test in logistic regression model
Problems:
Linear model assumptions unrealistic
Traditional GAX via machine learning model:
How can we identify outstanding shooters?
Problem reformulation (\(Y\), \(X\), and \(Z\) as before): partially linear logistic regression model (PLLM), where
\[ \log\left(\frac{\pi(X,Z)}{1-\pi(X,Z)}\right) = X\beta + g(Z) \]
Under PLLM: Test for \(Y\) conditionally independent of \(X\) given \(Z\) (\(Y \perp\!\!\!\perp X \mid Z\)) \(\Leftrightarrow\) Test for \(H_0 : \beta = 0\)
Generalised Covariance Measure:
\[ \begin{aligned} \operatorname{GCM} &= \mathbb{E}[\operatorname{Cov}(Y,X \mid Z)] \\ &= \mathbb{E}[(Y - \mathbb{E}[Y \mid Z])(X - \mathbb{E}[X \mid Z])] \end{aligned} \]
Basis for GCM test:
\[Y \perp\!\!\!\perp X \mid Z \Rightarrow \mathbb{E}[\operatorname{Cov}(Y,X \mid Z)] = 0\]
GCM test in practice: obtain empirical version of GCM
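A minimal sketch of the empirical test (the standardised statistic in the style of Shah and Peters; the two logistic fits are placeholders for arbitrary ML regressions, chosen here because both \(Y\) and \(X\) are binary):

```r
# Sketch of the (univariate) GCM test: residualize Y on Z and X on Z,
# then test whether the mean of the residual products is zero; the
# standardized statistic is approximately N(0, 1) under H0.
gcm_test <- function(y, x, z) {
  eps <- y - fitted(glm(y ~ z, family = binomial()))
  xi  <- x - fitted(glm(x ~ z, family = binomial()))
  r   <- eps * xi
  tstat <- sqrt(length(r)) * mean(r) / sd(r)
  c(stat = tstat, pval = 2 * pnorm(abs(tstat), lower.tail = FALSE))
}
```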
Proposition
Consider a PLLM and let \(X \in \mathbb{R}^{d_X}\) with \(\operatorname{Var}(X \mid Z)\) a.s. positive definite. Then \[\beta = 0 \Leftrightarrow \mathbb{E}[\operatorname{Cov}(Y,X \mid Z)] = 0.\]
Proposition
Consider a PLLM and let \(X \in \mathbb{R}\) with \(\operatorname{Var}(X \mid Z) > 0\) a.s. Then \[\operatorname{sign}(\beta) = \operatorname{sign}(\mathbb{E}[\operatorname{Cov}(Y,X \mid Z)]).\]
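A quick simulation check of the sign statement (a sketch with an arbitrary nonlinear \(g\) and made-up parameters, using the true conditional means to isolate the population claim):

```r
# Simulation sketch: in a PLLM the population GCM carries the sign of beta.
set.seed(1)
n <- 1e5
z <- runif(n)
px <- plogis(2 * z - 1); x <- rbinom(n, 1, px)   # X depends on Z
beta <- -0.8; g <- function(z) sin(2 * pi * z)   # negative effect, nonlinear g
y <- rbinom(n, 1, plogis(beta * x + g(z)))
ey <- px * plogis(beta + g(z)) + (1 - px) * plogis(g(z))  # E[Y | Z]
mean((y - ey) * (x - px))  # estimate of E[Cov(Y, X | Z)]: negative, as sign(beta)
```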
Takeaways:
Use rGAX instead of GAX for shooting skill evaluation:
rMetrics: Framework generalizable to any metric of the form
\[\sum_{j=1}^{N} (Y_j - \hat{h}(Z_j))X_j\]
rMetrics (and GCM test results) conveniently obtained via the comets R package (Kook and Lundborg 2024)
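For concreteness, a sketch of the raw computation behind rGAX; the comets package wraps this and the accompanying test, while the fitted-value columns `xg` for \(\hat h(Z_j)\) and `fx` for \(\hat f(Z_j) = \hat{\mathbb{E}}[X_j \mid Z_j]\) are hypothetical names:

```r
# Sketch: GAX vs. rGAX for one player with shot indicator x_i.
#   GAX_i  = sum_j (Y_j - xg_j) * x_ij            (only player i's shots count)
#   rGAX_i = sum_j (Y_j - xg_j) * (x_ij - fx_j)   (all shots contribute)
gax_i  <- sum((shots$goal - shots$xg) * shots$x_i)
rgax_i <- sum((shots$goal - shots$xg) * (shots$x_i - shots$fx))
```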
Freely available event stream data from Hudl-Statsbomb
Shot-specific features \(Z\):
xG model (\(\hat h\)):
Model for \(X\) regression (\(\hat f\)):
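A hedged sketch of how \(\hat h\) and \(\hat f\) might be fitted with random forests (the ranger package and the feature names are assumptions, not the talk's exact specification; cross-fitting is omitted for brevity):

```r
library(ranger)
# h_hat: probability forest for P(goal | shot features).
shots$goal <- factor(shots$goal)
h_fit <- ranger(goal ~ dist_to_goal + angle + body_part,
                data = shots, probability = TRUE)
shots$xg <- predict(h_fit, data = shots)$predictions[, "1"]
# f_hat: analogous regression of the player indicator X on Z.
shots$x_i <- factor(shots$x_i)
f_fit <- ranger(x_i ~ dist_to_goal + angle + body_part,
                data = shots, probability = TRUE)
shots$fx <- predict(f_fit, data = shots)$predictions[, "1"]
```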
Are rGAX really better than GAX?
Recall recent criticism of GAX (Baron et al. 2024; Davis and Robberechts 2024):
Instability and limited replicability (over seasons)
Low (effective) sample size and hence high uncertainty
Biases arising from data
How are GAX and rGAX affected by data selection?
Illustrative Example: Messi data \(\rightarrow\) Hudl-Statsbomb provides event stream data from all Messi matches at FC Barcelona.
Fit 3 different xG models:
Model trained on all data:
Model trained on 2015/16 data of top 5 European leagues:
Model trained on shots from players with fewer than 30 shots observed in the data
Survival setup:
Injuries above expectation (IAX): observed injuries minus a model-based expectation (a hedged sketch follows this list)
Residualized version (rIAX), constructed analogously to rGAX
Injury data from the injurytools package in R
Injury specific features \(Z\):
Models:
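The survival details are left implicit on the slide; one natural reading (an assumption on my part, not the authors' stated construction) computes IAX from martingale residuals of a Cox model, i.e. observed minus model-expected injuries:

```r
library(survival)
# Hedged sketch (assumption): IAX as summed martingale residuals.
# `injuries` is hypothetical: time, injury (0/1), age, workload, player.
fit <- coxph(Surv(time, injury) ~ age + workload, data = injuries)
m   <- residuals(fit, type = "martingale")  # D_i - H_hat(T_i | Z_i)
iax <- tapply(m, injuries$player, sum)      # injuries above expectation
```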
In a logistic regression model: GAX is directly related to a score test on a player's effect on the probability of a goal.
If you don’t believe the GLM setup: GAX using ML models does not allow valid inference! \(\Rightarrow\) Residualize \(X\) as well, i.e. use rGAX.
If you want interpretation: rGAX directly related to parameter in popular semi-parametric model!
Outlook:
General framework usable beyond player evaluation via GAX:
Thank you for your attention!
Earliest version of xG dates back to Pollard and Reep (1997):
Logistic regression model on binary shot outcome
Most important features: shot location and goal angle
Distinction between kicked and headed shots
Modern xG Models (Robberechts and Davis 2020; Anzer and Bauer 2021; Hewitt and Karakuş 2023):
Flexible machine learning methods \(\Rightarrow\) account for non-linearities and interactions:
Extreme gradient boosting machines (XGBoost)
Random forests
Neural networks
Broad set of shot-specific features:
Classical features: Distance to goal, angle, body part
Extended features from event and tracking data: distances to defenders and goalkeeper, shot type and technique, speed of and space for the shooter