Evaluating football (soccer) players
Developing a player rating system using possession sequences derived from event stream data.
Resources on the Project
| Resource | Date | Link |
|---|---|---|
| Paper published in IMA Journal of Management Mathematics | 2025 | DOI |
| Preprint on arXiv | 2024 | arXiv preprint |
| Presentation at the Austrian Statistic Days 2023 | 2023-10-05 | Presentation (ASD 23) |
| Presentation at the International Conference on Mathematics and Statistics 2023 (won best presentation award) | 2023-07-15 | Presentation (ICoMS 23) |
| Contribution to proceedings of the International Conference on Mathematics and Statistics 2023 | 2023-07-15 | Short Paper (ICoMS 23) |
| Presentation at the Sports Analytics Workshop (SAW) 2023 of AUEB | 2023-05-04 | Presentation (SAW 23) |
Overview
This project introduces a novel framework for evaluating players in association football (soccer). The method uses possession sequences, i.e. sequences of consecutive on-ball actions, for deriving estimates for player strengths. On the surface, the methodology is similar to classical plus-minus models using mainly regularized regression techniques. However, by analyzing possessions, our framework is able to distinguish on-ball and off-ball contributions of players to the game. From a methodological viewpoint, the framework explores four different penalization schemes, which exploit football-specific structures such as the grouping of players into position groups as well as into common strength groups. Furthermore, we provide a regression approach that uses debiased machine learning techniques in order to obtain more interpretable strength estimates from players. We compare similarities as well as particular use cases of each model and provide guidance for practitioners when using the individual model specifications. Finally, we conclude our analysis by providing a domain-specific statistical evaluation framework, which highlights the potential of the penalized regression approaches for evaluating players.
Some Details
For further information on the project, you can consult the resources linked above. Apart from the slides from the Austrian Statistic Days 2023, these documents focus mainly on the debiased machine learning approach and do not cover every aspect of the project, so some further details are provided below.
The idea of our framework for evaluating the strength of football players is similar to plus-minus (PM) models (see e.g. Kharrat, McHale, and Peña (2020), Hvattum (2019)); however, we provide two novel adaptations. First, the setup relies on possession sequences, which are sequences of consecutive on-ball actions of a team derived from event stream data. Thus, we consider much more granular segments of the game than the usual PM models. This leads to a data structure in which it is possible to separate the effect of being part of a possession, i.e. actually performing on-ball actions, from the effect of only being on the pitch. From a domain-specific viewpoint, this is highly attractive, as it is well known that off-ball actions have a substantial impact in soccer (cf. Spearman (2018), Fernández, Bornn, and Cervone (2021)).
Second, we incorporate four different penalization structures into the regression setup in order to derive a rating. As a start, we follow the classical approach and estimate player strength using a ridge penalty term. This is similar to existing plus-minus models; however, analyzing possessions allows for interesting novel insights. Further, the natural grouping of players into position groups (forward, midfielder, defender, goalkeeper) is exploited via a group lasso penalty (see e.g. Simon et al. (2013)), which allows for ridge shrinkage between groups instead of on all players jointly. The third penalization structure is the exclusive lasso (cf. Campbell and Allen (2017)), which again utilizes the group structure but applies a lasso-type penalty within these groups, such that strong (resp. weak) players are selected, whereas the coefficients of average players are set to zero. As a final regularization structure, we apply the generalized lasso penalty, which implicitly tries to group players of equal strength together (an analysis of this penalty for Gaussian response variables can be found in e.g. Tibshirani and Taylor (2011)).
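To make the first of these ideas more concrete, the following is a minimal sketch of a ridge-penalized regression on possession-level data with separate on-ball and off-ball columns per player. It is not the implementation used in the project: the design-matrix encoding, the continuous possession value used as response, the synthetic data, and all names (`ridge_fit`, `true_on_ball`, `on_pitch_per_team`, and so on) are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_players = 30          # hypothetical player pool
n_possessions = 5000    # synthetic possession sequences
on_pitch_per_team = 5   # kept small so the toy example stays readable

# Hypothetical "true" strengths, used only to simulate data.
true_on_ball = rng.normal(0.0, 0.5, n_players)
true_off_ball = rng.normal(0.0, 0.2, n_players)

# Two columns per player: [on-ball effects | off-ball (on-pitch only) effects].
X = np.zeros((n_possessions, 2 * n_players))
for i in range(n_possessions):
    lineup = rng.choice(n_players, size=2 * on_pitch_per_team, replace=False)
    attackers = lineup[:on_pitch_per_team]
    defenders = lineup[on_pitch_per_team:]
    # A random subset of the attackers actually performs on-ball actions.
    involved = rng.choice(attackers, size=3, replace=False)
    passive = np.setdiff1d(attackers, involved)
    X[i, involved] = 1.0                  # part of the possession (on-ball)
    X[i, n_players + passive] = 1.0       # attacking team, only on the pitch
    X[i, n_players + defenders] = -1.0    # defending team, only on the pitch

beta_true = np.concatenate([true_on_ball, true_off_ball])
# Assumed continuous possession value with Gaussian noise (a simplification).
y = X @ beta_true + rng.normal(0.0, 1.0, n_possessions)


def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam * I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


beta_hat = ridge_fit(X, y, lam=10.0)
on_ball_hat = beta_hat[:n_players]
off_ball_hat = beta_hat[n_players:]

# How well do the estimates recover the simulated strengths?
corr_on = np.corrcoef(on_ball_hat, true_on_ball)[0, 1]
corr_off = np.corrcoef(off_ball_hat, true_off_ball)[0, 1]
print(f"on-ball correlation:  {corr_on:.2f}")
print(f"off-ball correlation: {corr_off:.2f}")
```

The group lasso, exclusive lasso, and generalized lasso variants would replace the squared-norm penalty with their respective penalty terms and generally require an iterative solver rather than a closed form.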
Lastly, a different regression modeling structure based on a debiased machine learning approach is discussed.
While the results of the evaluation framework provide interesting football-specific insights, we also emphasize the importance of objectively analyzing whether a novel metric is doing a good job. This task is not straightforward, as there is no ground truth for the strength of players, and the opinions of practitioners and experts may vary substantially on the matter. In order to statistically validate the ratings derived from the regression methods, we rely on ideas from Franks et al. (2016) and Hvattum and Gelade (2021), who focus on the “relevancy” of ratings. In terms of football, a metric is relevant if the prediction of match results improves when the strengths of the players involved in these matches are taken into account. Thus, we set up a match prediction framework that predicts results using the ratings as covariates. If prediction accuracy increases when the ratings are used as predictors, the metric is doing a good job. A more detailed description can be found in the short paper linked above (Bajons (n.d.)). It turns out that the regularized approaches perform well not only on a subjective basis but also under the described validity framework. They even outperform the debiased machine learning approach; however, the DML methodology has the advantage of easier interpretation, which is often desired by practitioners.
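The relevancy check described above can be sketched roughly as follows. This is an illustrative toy version under strong assumptions, not the framework from the short paper: match results are reduced to a binary home-win indicator, a team's strength is assumed to be the mean of its players' ratings, the data are simulated, and helper names such as `fit_logistic` and `log_loss` are hypothetical. The comparison is between a baseline model with an intercept only and a model that also uses the rating difference, judged by out-of-sample log loss.

```python
import numpy as np

rng = np.random.default_rng(1)

n_teams, squad_size, n_matches = 20, 11, 4000

# Hypothetical player ratings, e.g. the output of a penalized regression.
ratings = rng.normal(0.0, 1.0, (n_teams, squad_size))
team_strength = ratings.mean(axis=1)  # assumed aggregation: squad mean

home = rng.integers(0, n_teams, n_matches)
away = (home + rng.integers(1, n_teams, n_matches)) % n_teams  # never == home
rating_diff = team_strength[home] - team_strength[away]

# Simulated outcome: 1 if the home team wins (home advantage + rating effect).
p_home = 1.0 / (1.0 + np.exp(-(0.3 + 2.0 * rating_diff)))
home_win = (rng.random(n_matches) < p_home).astype(float)


def fit_logistic(X, y, n_iter=25):
    """Unregularized logistic regression fitted by Newton-Raphson."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        w += np.linalg.solve(hess, grad)
    return w


def log_loss(X, y, w):
    """Mean negative log-likelihood of binary outcomes under weights w."""
    p = np.clip(1.0 / (1.0 + np.exp(-X @ w)), 1e-12, 1.0 - 1e-12)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


ones = np.ones(n_matches)
X_base = ones[:, None]                          # intercept only
X_rated = np.column_stack([ones, rating_diff])  # intercept + ratings

split = n_matches // 2  # first half for fitting, second half for evaluation
w_base = fit_logistic(X_base[:split], home_win[:split])
w_rated = fit_logistic(X_rated[:split], home_win[:split])

loss_base = log_loss(X_base[split:], home_win[split:], w_base)
loss_rated = log_loss(X_rated[split:], home_win[split:], w_rated)
print(f"held-out log loss without ratings: {loss_base:.3f}")
print(f"held-out log loss with ratings:    {loss_rated:.3f}")
```

In this sense, a rating is considered relevant if the model that includes it achieves a lower held-out loss than the baseline.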
More details, as well as use cases and guidance for practitioners on using the framework, are included in the upcoming paper. As always, if you have questions, simply drop me an email.
Update: We now have a preprint on arXiv that you can check out as well (see resources table above).