The path to a goal: Understanding possessions via path signatures
Using path signatures and deep learning to gain insights into possession value.
| Resource | Date | Link |
|---|---|---|
| Preprint on arXiv | 2025 | arXiv preprint |
| Code on GitHub | 2025 | GitHub repo |
Overview
We present a novel framework for predicting next actions in soccer possessions by leveraging path signatures to encode their complex spatio-temporal structure. Unlike existing approaches, we do not rely on fixed historical windows and handcrafted features, but rather encode the entire recent possession, thereby avoiding the inclusion of potentially irrelevant or misleading historical information. Path signatures naturally capture the order and interaction of events, providing a mathematically grounded feature encoding for variable-length time series of irregular sampling frequencies without the necessity for manual feature engineering. Our proposed approach outperforms a transformer-based benchmark across various loss metrics and considerably reduces computational cost. Building on these results, we introduce a new possession evaluation metric based on well-established frameworks in soccer analytics, incorporating both predicted action type probabilities and action location. Our metric shows greater reliability than existing metrics in domain-specific comparisons. Finally, we validate our approach through a detailed analysis of the 2017/18 Premier League season and discuss further applications and future extensions.
Main contributions
We propose a novel methodology for action prediction models in soccer using path signatures. The signature of a path is a sequence of iterated integrals that encodes progressively finer spatio-temporal detail, to the point where the path obtained by interpolating the data is uniquely characterized. Informally, the signature can be viewed as a feature extraction tool for time series that preserves the order and interaction of events over time. Our methodology is intuitive for the problem at hand, outperforms existing benchmark approaches across a variety of evaluation metrics, and is computationally more efficient.
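To make the idea of iterated integrals concrete, the following minimal sketch computes the first two signature levels of a piecewise-linear path with NumPy. It is an illustration only, not the implementation used in the paper: the function name `signature_depth2`, the truncation at depth 2, and the toy paths are assumptions made for this example.

```python
import numpy as np


def signature_depth2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    path: array-like of shape (n_points, d), one row per observation.
    Returns (S1, S2) with shapes (d,) and (d, d).
    Hypothetical helper for illustration; truncated at depth 2.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    S1 = np.zeros(d)        # level 1: total increment per channel
    S2 = np.zeros((d, d))   # level 2: iterated integrals over pairs of channels
    for delta in np.diff(path, axis=0):
        # Chen's identity for appending one linear segment:
        # S2 <- S2 + outer(S1, delta) + outer(delta, delta) / 2
        S2 += np.outer(S1, delta) + 0.5 * np.outer(delta, delta)
        S1 += delta
    return S1, S2


# Two toy paths with the same start and end point but a different order of moves.
right_then_up = [[0, 0], [1, 0], [1, 1]]
up_then_right = [[0, 0], [0, 1], [1, 1]]

S1_a, S2_a = signature_depth2(right_then_up)
S1_b, S2_b = signature_depth2(up_then_right)
```

Both paths share the same level-1 term (the total displacement), while their level-2 terms differ in the off-diagonal entries. This is the sense in which the signature preserves the order of events rather than only their net effect.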
Action prediction models have recently become increasingly important in soccer modeling, as they are a main building block of possession value models, and they have attracted considerable interest from researchers (Simpson et al. 2022; Mendes-Neves, Meireles, and Mendes-Moreira 2024; Yeung, Sit, and Fujii 2025). Because possessions are sequential, action prediction has mainly relied on techniques that are predominant in NLP tasks, such as RNNs, LSTMs, or transformers. For soccer possessions, however, path signatures are a natural choice, as they can handle possessions of different lengths and irregular sampling frequencies. Furthermore, they implicitly extract all relevant information in a time series, avoiding the need for explicit feature engineering.
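The claim about variable lengths and irregular sampling can be illustrated with a short sketch. It assumes possessions are stored as rows of (time, x, y) and uses a depth-2 truncated signature. The function name `signature_features`, the depth, and the event coordinates are all hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def signature_features(path):
    """Flattened depth-2 signature of a piecewise-linear path.

    path: array-like of shape (n_points, d).
    Returns a vector of length d + d*d, independent of n_points.
    Hypothetical helper for illustration only.
    """
    path = np.asarray(path, dtype=float)
    inc = np.diff(path, axis=0)      # increment of each segment, shape (n-1, d)
    start = path[:-1] - path[0]      # displacement accumulated before each segment
    level1 = inc.sum(axis=0)
    # Sum over segments of outer(start, inc) + outer(inc, inc) / 2
    level2 = start.T @ inc + 0.5 * (inc.T @ inc)
    return np.concatenate([level1, level2.ravel()])


# Made-up possessions with columns (time in s, x, y); values are illustrative.
short_possession = [[0, 10, 30], [2, 25, 35], [5, 40, 20]]
long_possession = [[0, 5, 50], [1, 12, 48], [4, 30, 40],
                   [5, 33, 22], [9, 60, 25], [10, 75, 34]]
# Same path as short_possession, with an extra sample halfway along the first segment.
resampled = [[0, 10, 30], [1, 17.5, 32.5], [2, 25, 35], [5, 40, 20]]

f_short = signature_features(short_possession)
f_long = signature_features(long_possession)
f_resampled = signature_features(resampled)
```

In this sketch, the three-event and six-event possessions both map to vectors of length 12, so a downstream model can consume them without padding or windowing. The extra sample on a straight segment leaves the features unchanged, because the signature depends on the interpolated path rather than on how densely it was observed.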
We use our methodology to predict the type and location of actions in possessions. To make the best use of our model in practice, we devise a novel possession value metric inspired by domain-specific knowledge, which takes into account both action type probabilities and predicted locations. We show that our metric has domain-specific advantages over existing metrics. We conclude our paper with a comprehensive study of the 2017/18 Premier League season.