rMetrics: A statistically motivated framework for player evaluation using residualized scores

Deriving a unified framework for common metrics in sports analytics.

Authors

Robert Bajons

Joint work with Lucas Kook

Published

September 1, 2025

Table 1
| Resource | Date | Link |
| --- | --- | --- |
| Talk presented at the 2nd Workshop on Sports Analytics in Soccer at TU Dortmund | 2025-11-25 | Presentation (Dortmund 25) |
| Talk presented at the 2025 New England Symposium on Statistics in Sports (NESSIS) | 2025-09-27 | Presentation (NESSIS 25) |
| Preprint on arXiv: “Rethinking player evaluation in sports: Goals above expectation and beyond” | 2025 | arXiv preprint |

Note

This project is an extension and generalization of the project: rGAX: Rethinking goals above expectation (GAX). As such, it is also intimately related to the project: Machine learning based statistical inference in sports analytics.

Overview

To efficiently assess and identify undervalued players, it is crucial to measure a player’s skills accurately. In dynamic games such as soccer, basketball, or ice hockey, a popular quantitative approach to evaluating player performance is to compare an actual outcome with an expected outcome estimated by a statistical (or machine learning) model. Examples include goals (saved) above expectation in soccer and ice hockey and shooter impact in basketball. Typically, analysts rely on flexible machine learning models, which can handle the complex structures present in sports data, to estimate expected outcomes. While machine learning models typically outperform traditional statistical models in terms of predictive ability, using them for inferential tasks such as player evaluation is often inappropriate because their estimates can be biased and come without uncertainty quantification.
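As a minimal illustration of the "actual minus expected" idea described above, the sketch below sums, per player, the difference between each shot's outcome and a model-predicted scoring probability. The function name `goals_above_expectation`, the tuple layout, and the toy numbers are assumptions made for this example and are not taken from the project; in practice the expected values would come from a fitted expected-goals model.

```python
from collections import defaultdict


def goals_above_expectation(shots):
    """Sum (actual goal - expected goal probability) per player.

    `shots` is an iterable of (player, goal, xg) tuples, where `goal` is 0 or 1
    and `xg` is a predicted scoring probability (hypothetical inputs).
    """
    totals = defaultdict(float)
    for player, goal, xg in shots:
        totals[player] += goal - xg
    return dict(totals)


# Toy data, invented purely for illustration.
shots = [
    ("A", 1, 0.30),
    ("A", 0, 0.10),
    ("A", 1, 0.60),
    ("B", 0, 0.40),
    ("B", 0, 0.25),
]

gax = goals_above_expectation(shots)
for player, value in sorted(gax.items()):
    print(f"{player}: {value:+.2f}")
```

A positive total means the player scored more than the model expected from those shots. The number alone says nothing about how much of it is noise, which is the gap the framework addresses.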

In this work, we present a unifying framework for metrics of the above type. We first show that, when using parametric models to estimate the expected outcome, these metrics are directly related to score test statistics. Hence, valid statistical inference is possible under parametric assumptions. Motivated by this finding, we propose a natural extension to this framework using residualized versions of the original metrics. In this way, valid uncertainty quantification can be achieved when using machine learning models. Furthermore, we relate the proposed procedure to player-specific effect estimates in interpretable semiparametric models. This allows for a different view on and a deeper understanding of these popular player evaluation metrics. We present various use cases of our framework.
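To make the idea of residualization concrete, here is a rough sketch under explicit assumptions; it is not the authors' implementation, and the text above does not spell out the exact formula. The assumption is that both the outcome and the indicator for the player of interest are residualized on the shot context, and that the products of the two residuals are summed and standardized into a test-like statistic. Per-zone averages stand in for the machine learning models, there is no sample splitting, and the names `group_means`, `residualized_score`, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def group_means(keys, values):
    """Average `values` within each distinct key (a crude stand-in for an ML model)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, value in zip(keys, values):
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def residualized_score(zones, goals, players, player):
    """Return (raw score, standardized statistic) for one player.

    Assumed form: sum over shots of (goal - E[goal | zone]) * (D - E[D | zone]),
    where D indicates whether the shot was taken by `player`.
    """
    d = [1.0 if p == player else 0.0 for p in players]
    h = group_means(zones, goals)  # estimate of E[goal | zone]
    f = group_means(zones, d)      # estimate of P(shot taken by player | zone)
    prods = [(y - h[z]) * (di - f[z]) for z, y, di in zip(zones, goals, d)]
    n = len(prods)
    mean = sum(prods) / n
    var = sum((r - mean) ** 2 for r in prods) / n
    stat = math.sqrt(n) * mean / math.sqrt(var) if var > 0 else float("nan")
    return sum(prods), stat


# Toy data, invented purely for illustration.
zones = ["box", "box", "box", "box", "far", "far", "far", "far"]
goals = [1, 0, 1, 0, 0, 0, 1, 0]
players = ["A", "B", "A", "B", "A", "B", "B", "B"]

raw_a, stat_a = residualized_score(zones, goals, players, "A")
raw_b, stat_b = residualized_score(zones, goals, players, "B")
print(f"A: raw={raw_a:+.3f}, standardized={stat_a:+.3f}")
print(f"B: raw={raw_b:+.3f}, standardized={stat_b:+.3f}")
```

Because the zone averages here are computed in-sample, the outcome residuals sum to zero within each zone, so the raw score equals the player's plain actual-minus-expected total. With flexible, regularized learners the two need not coincide. The standardized value is what would be compared against a reference distribution to quantify uncertainty, under whatever conditions the chosen procedure requires.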

Sneak Peek

The most prominent use case of our framework is rGAX, a residualized version of GAX. Other use cases include evaluating the goal-stopping ability of goalkeepers via a residualized version of goals saved above expectation (rGSAX), evaluating shooting skill in basketball via a residualized version of the commonly used quantified shooter impact (rqSI), evaluating quarterback passing skill in American football via a residualized version of completion percentage above expectation (rCPAE), and measuring the injury-proneness of soccer players with residualized injuries above expectation (rIAX, an application of the framework to survival analysis). Furthermore, the framework can be extended to metrics derived by taking the difference between game states before and after an action, such as expected points added (EPA) in American football or possession value models (VAEP, OBV) in soccer.