Evaluating football (soccer) players

Developing a player rating system using possession sequences derived from event stream data.

Authors

Robert Bajons

Joint work with Kurt Hornik

Published

November 1, 2022

Resources on the Project

Resource	Date	Link
Paper published in IMA Journal of Management Mathematics	2025	DOI
Preprint on arXiv	2024	arXiv preprint
Presentation at the Austrian Statistic Days 2023	2023-10-05	Presentation (ASD 23)
Presentation at the International Conference on Mathematics and Statistics 2023 (won best presentation award)	2023-07-15	Presentation (ICoMS 23)
Contribution to proceedings of the International Conference on Mathematics and Statistics 2023	2023-07-15	Short Paper (ICoMS 23)
Presentation at the Sports Analytics Workshop (SAW) 2023 of AUEB	2023-05-04	Presentation (SAW 23)

Overview

This project introduces a novel framework for evaluating players in association football (soccer). The method uses possession sequences, i.e. sequences of consecutive on-ball actions, to derive estimates of player strengths. On the surface, the methodology is similar to classical plus-minus models, which mainly use regularized regression techniques. However, by analyzing possessions, our framework is able to distinguish between on-ball and off-ball contributions of players to the game. From a methodological viewpoint, the framework explores four different penalization schemes, which exploit football-specific structures such as the grouping of players into position groups as well as into common strength groups. Furthermore, we provide a regression approach that uses debiased machine learning techniques in order to obtain more interpretable strength estimates for players. We compare similarities as well as particular use cases of each model and provide guidance for practitioners on using the individual model specifications. Finally, we conclude our analysis by providing a domain-specific statistical evaluation framework, which highlights the potential of the penalized regression approaches for evaluating players.

Some Details

For further information on the project, you can consult the resources linked above. Apart from the slides from the Austrian Statistic Days 2023, these documents focus mainly on the debiased machine learning approach and do not cover the whole project, so some further details are provided in the following.

The idea of our framework for evaluating the strength of football players is similar to plus-minus (PM) models (see e.g. Kharrat, McHale, and Peña (2020), Hvattum (2019)); however, we provide two novel adaptations. First, the setup relies on possession sequences, which are sequences of consecutive on-ball actions of a team derived from event stream data. Thus, we consider much more granular segments of the game than the usual PM models. This leads to a data structure in which it is possible to separate the effects of being part of a possession, i.e. actually performing on-ball actions, from the effects of only being on the pitch. From a domain-specific viewpoint, this is highly attractive, as it is well known that off-ball actions have a substantial impact in soccer (cf. Spearman (2018), Fernández, Bornn, and Cervone (2021)). Second, we incorporate four different penalization structures into the regression setup in order to derive a rating. As a start, we follow the classical approach and estimate player strength using a ridge penalty term. This is similar to existing plus-minus models; however, analyzing possessions allows for interesting novel insights. Further, the natural grouping of players into position groups (forward, midfielder, defender, goalkeeper) is exploited via a group lasso penalty (see e.g. Simon et al. (2013)), which allows for ridge shrinkage between groups instead of on all players jointly. The third penalization structure is the exclusive lasso (cf. Campbell and Allen (2017)), which again utilizes the group structure but uses a lasso-type penalty within these groups, such that strong (resp. weak) players are selected, whereas average players are set to zero. As a final regularization structure, we apply the generalized lasso penalty, which implicitly tries to group players of equal strength together (an analysis of this penalty for Gaussian response variables can be found in e.g. Tibshirani and Taylor (2011)).
Lastly, a different regression modeling structure based on a debiased machine learning approach is discussed.
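
To make the ideas above more concrete, the following minimal Python sketch shows one possible way to encode a possession into on-ball and off-ball indicators and computes the four penalties for a toy coefficient vector. It is not the implementation used in the project. The function names, the player labels, the toy coefficients, the square-root group-size weighting in the group lasso, the factor 1/2 in the exclusive lasso, and the use of pairwise differences as rows of the generalized lasso penalty matrix are all assumptions made for illustration.

```python
import math

def possession_row(players, on_pitch, on_ball):
    """Encode one possession (hypothetical layout): per player, one indicator
    for performing on-ball actions and one for only being on the pitch."""
    row = {}
    for p in players:
        involved = p in on_ball
        row[("on_ball", p)] = 1.0 if involved else 0.0
        row[("off_ball", p)] = 1.0 if (p in on_pitch and not involved) else 0.0
    return row

def ridge_penalty(beta):
    """Sum of squared coefficients over all players jointly."""
    return sum(b * b for b in beta.values())

def group_lasso_penalty(beta, groups):
    """Sum over groups of sqrt(group size) times the Euclidean norm
    of the group's coefficients (weighting is an assumption)."""
    total = 0.0
    for members in groups.values():
        norm = math.sqrt(sum(beta[p] ** 2 for p in members))
        total += math.sqrt(len(members)) * norm
    return total

def exclusive_lasso_penalty(beta, groups):
    """Half the sum over groups of the squared L1 norm within each group
    (the factor 1/2 is an assumption)."""
    return 0.5 * sum(
        sum(abs(beta[p]) for p in members) ** 2 for members in groups.values()
    )

def generalized_lasso_penalty(beta, pairs):
    """L1 norm of D @ beta, where each row of D is the difference of two
    coefficients (the choice of pairs is an assumption)."""
    return sum(abs(beta[a] - beta[b]) for a, b in pairs)

# Toy coefficients for four hypothetical players in two position groups.
beta = {"A": 3.0, "B": 4.0, "C": 0.0, "D": -1.0}
groups = {"forward": ["A", "B"], "defender": ["C", "D"]}
pairs = [("A", "B"), ("C", "D")]

ridge = ridge_penalty(beta)                       # 9 + 16 + 0 + 1 = 26.0
group = group_lasso_penalty(beta, groups)         # sqrt(2) * 5 + sqrt(2) * 1
exclusive = exclusive_lasso_penalty(beta, groups) # 0.5 * (7**2 + 1**2) = 25.0
fused = generalized_lasso_penalty(beta, pairs)    # |3 - 4| + |0 + 1| = 2.0

# Player "A" acts on the ball, "B" is on the pitch only, "C" is not playing.
row = possession_row(["A", "B", "C"], on_pitch={"A", "B"}, on_ball={"A"})
```

In a regularized regression, each of these penalties would be multiplied by a tuning parameter and added to the loss of the model fitted on the possession-level rows.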

While the results of the evaluation framework provide interesting football-specific insight, we also emphasize the importance of objectively analyzing whether a novel metric is doing a good job. This task is not straightforward, as there is no ground truth for the strength of players, and the opinions of practitioners and experts may vary substantially on that matter. In order to statistically validate the ratings derived from the regression methods, we rely on ideas from Franks et al. (2016) and Hvattum and Gelade (2021), who focus on the “relevancy” of ratings. In terms of football, a metric is relevant if the prediction of match results is improved by taking into account the strengths of the players involved in these matches. Thus, we set up a match prediction framework, which predicts results with the help of the ratings as covariates. If using the ratings as predictors increases prediction accuracy, the rating is doing a good job. A more detailed description can be found in the short paper linked above (Bajons (n.d.)). It turns out that the regularized approaches perform well not only on a subjective basis but also under the described validity framework. They perform even better than the debiased machine learning (DML) approach; however, the DML methodology has the advantage of easier interpretation, which is often desired by practitioners.
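
The relevancy idea can be illustrated with a small Python sketch. It is a simplified stand-in for the validation framework, not the one used in the project: it assumes a binary outcome (home win or not), a team strength equal to the sum of the player ratings, a one-covariate logistic model fitted by plain gradient descent, and log loss on held-out matches as the accuracy measure. All ratings, lineups, and results below are made-up toy data, constructed so that the ratings are informative.

```python
import math

def team_strength(lineup, ratings):
    """Sum the (hypothetical) player ratings of one lineup."""
    return sum(ratings[p] for p in lineup)

def rating_diff(home, away, ratings):
    """Covariate: home team strength minus away team strength."""
    return team_strength(home, ratings) - team_strength(away, ratings)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, steps=2000, lr=0.1):
    """Fit P(y = 1) = sigmoid(a + b * x) by gradient descent on the log loss."""
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        preds = [sigmoid(a + b * xi) for xi in x]
        a -= lr * sum(p - yi for p, yi in zip(preds, y)) / n
        b -= lr * sum((p - yi) * xi for p, yi, xi in zip(preds, y, x)) / n
    return a, b

def log_loss(probs, outcomes):
    """Average negative log-likelihood of binary outcomes."""
    return -sum(
        math.log(p if yi == 1 else 1.0 - p) for p, yi in zip(probs, outcomes)
    ) / len(outcomes)

# Toy ratings and matches: (home lineup, away lineup, 1 if home win else 0).
ratings = {"p1": 0.6, "p2": 0.4, "p3": 0.1, "p4": -0.1, "p5": -0.4, "p6": -0.6}
train_matches = [
    (["p1", "p2"], ["p5", "p6"], 1),
    (["p5", "p6"], ["p1", "p2"], 0),
    (["p1", "p3"], ["p4", "p6"], 1),
    (["p4", "p6"], ["p1", "p3"], 0),
    (["p3", "p5"], ["p2", "p6"], 1),  # the slightly weaker side wins
    (["p2", "p6"], ["p3", "p5"], 0),
]
test_matches = [
    (["p1", "p4"], ["p3", "p6"], 1),
    (["p5", "p6"], ["p2", "p3"], 0),
]

x_train = [rating_diff(h, a, ratings) for h, a, _ in train_matches]
y_train = [y for _, _, y in train_matches]
x_test = [rating_diff(h, a, ratings) for h, a, _ in test_matches]
y_test = [y for _, _, y in test_matches]

# Baseline without ratings: predict the training frequency of home wins.
base_p = sum(y_train) / len(y_train)
loss_base = log_loss([base_p] * len(y_test), y_test)

# Model with ratings: logistic regression on the rating difference.
a, b = fit_logistic(x_train, y_train)
loss_ratings = log_loss([sigmoid(a + b * xi) for xi in x_test], y_test)

ratings_are_relevant = loss_ratings < loss_base
```

In this toy setting the held-out log loss with the rating covariate is lower than the baseline, which is the pattern a relevant rating should show. With real data, the comparison would be run over many matches and with a model suited to the actual outcome scale.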

More details, as well as use cases and guidance for practitioners on using the framework, are included in the upcoming paper. As always, if you have questions, simply drop me an email.

Update: We now have a preprint on arXiv that you can check out as well (see resources table above).

References

Bajons, Robert. n.d. “Evaluating Player Performances in Football: A Debiased Machine Learning Approach.” Proceedings of 2023 6th International Conference on Mathematics and Statistics (ICoMS 2023).

Campbell, Frederick, and Genevera I. Allen. 2017. “Within Group Variable Selection Through the Exclusive Lasso.” Electronic Journal of Statistics 11 (2): 4220–57. https://doi.org/10.1214/17-EJS1317.

Fernández, Javier, Luke Bornn, and Daniel Cervone. 2021. “A Framework for the Fine-Grained Evaluation of the Instantaneous Expected Value of Soccer Possessions.” Machine Learning 110 (6): 1389–1427. https://doi.org/10.1007/s10994-021-05989-6.

Franks, Alexander M., Alexander D’Amour, Daniel Cervone, and Luke Bornn. 2016. “Meta-Analytics: Tools for Understanding the Statistical Properties of Sports Metrics.” Journal of Quantitative Analysis in Sports 12 (4): 151–65. https://doi.org/10.1515/jqas-2016-0098.

Hvattum, Lars Magnus. 2019. “A Comprehensive Review of Plus-Minus Ratings for Evaluating Individual Players in Team Sports.” International Journal of Computer Science in Sport 18 (1): 1–23. https://doi.org/10.2478/ijcss-2019-0001.

Hvattum, Lars Magnus, and Garry A. Gelade. 2021. “Comparing Bottom-up and Top-down Ratings for Individual Soccer Players.” International Journal of Computer Science in Sport 20 (1): 23–42. https://doi.org/10.2478/ijcss-2021-0002.

Kharrat, Tarak, Ian G. McHale, and Javier López Peña. 2020. “Plus–Minus Player Ratings for Soccer.” European Journal of Operational Research 283 (2): 726–36. https://doi.org/10.1016/j.ejor.2019.11.026.

Simon, Noah, Jerome Friedman, Trevor Hastie, and Robert Tibshirani. 2013. “A Sparse-Group Lasso.” Journal of Computational and Graphical Statistics 22 (2): 231–45. https://doi.org/10.1080/10618600.2012.681250.

Spearman, William. 2018. “Beyond Expected Goals.” In MIT Sloan Sports Analytics Conference.

Tibshirani, Ryan J., and Jonathan Taylor. 2011. “The Solution Path of the Generalized Lasso.” The Annals of Statistics 39 (3): 1335–71. http://www.jstor.org/stable/23033600.