Summary

  • Fast, flexible and robust implementation of the adaptive generalized logistic lasso (AGLL) model using conic programming methods.

  • Application to ranking in sports \(\Rightarrow\) high-dimensional and sparse data setup.

Work in progress and extensions:

  • Implementation in R via the package ROI \(\Rightarrow\) flexible, but with computational overhead (work in progress).

  • Generalization to other GLM-type models and to nonlinear effects is possible.

Thank you for your attention!

References

Boyd, Stephen, and Lieven Vandenberghe. 2004. Convex Optimization. Cambridge University Press.
Gramacy, Robert B., Shane T. Jensen, and Matt Taddy. 2013. “Estimating Player Contribution in Hockey with Regularized Logistic Regression.” Journal of Quantitative Analysis in Sports 9 (1): 97–111. https://doi.org/10.1515/jqas-2012-0001.
Hvattum, Lars Magnus. 2019. “A Comprehensive Review of Plus-Minus Ratings for Evaluating Individual Players in Team Sports.” International Journal of Computer Science in Sport 18 (1): 1–23. https://doi.org/10.2478/ijcss-2019-0001.
Masarotto, Guido, and Cristiano Varin. 2012. “The Ranking Lasso and Its Application to Sport Tournaments.” The Annals of Applied Statistics 6 (4): 1949–70. https://doi.org/10.1214/12-AOAS581.
Schwendinger, Benjamin, Florian Schwendinger, and Laura Vana. 2024. “Holistic Generalized Linear Models.” Journal of Statistical Software 108 (7): 1–49. https://doi.org/10.18637/jss.v108.i07.
Schwendinger, Florian, Bettina Grün, and Kurt Hornik. 2021. “A Comparison of Optimization Solvers for Log Binomial Regression Including Conic Programming.” Computational Statistics 36 (3): 1721–54. https://doi.org/10.1007/s00180-021-01084-5.
Theußl, Stefan, Florian Schwendinger, and Kurt Hornik. 2020. “ROI: An Extensible R Optimization Infrastructure.” Journal of Statistical Software 94 (15): 1–64. https://doi.org/10.18637/jss.v094.i15.
Tibshirani, Ryan J., and Jonathan Taylor. 2011. “The Solution Path of the Generalized Lasso.” The Annals of Statistics 39 (3): 1335–71. https://doi.org/10.1214/11-AOS878.
Zou, Hui. 2006. “The Adaptive Lasso and Its Oracle Properties.” Journal of the American Statistical Association 101 (476): 1418–29. https://doi.org/10.1198/016214506000000735.

Appendix

Cone examples

Original problem:

\[ \begin{align*} \text{minimize}_y & \quad \exp(2y_1+y_2) \\ \text{s.t. } & \quad 5y_1+2 y_2 \le 4. \end{align*} \]

Epigraph form:

\[ \begin{align*} \text{minimize}_{y,t} & \quad t \\ \text{s.t. } & \quad \exp(2y_1+y_2) \le t \\ & \quad \quad 5y_1+2 y_2 \le 4. \end{align*} \]

  • Two convex cones are needed to write the problem in conic form:

    • Exponential cone \(K_{\exp} = \{x \in \mathbb{R}^3 \mid x_1 \ge x_2 \exp(x_3/x_2),\, x_2 > 0\}\):

      \[\exp(2y_1+y_2) \le t \Leftrightarrow (t,1,2y_1+y_2) \in K_{\rm exp}\]

    • Linear cone:

      \[5y_1+2 y_2 \le 4 \Leftrightarrow 4-(5y_1+2 y_2) \in K_{\text{lin}}\]
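
As a concrete illustration, the epigraph problem above can be passed to ROI directly. The following is a minimal sketch, assuming ROI's conic interface as described in Theußl, Schwendinger, and Hornik (2020) — constraints of the form \(\text{rhs} - Lx \in K\), cones combined with `c()` — and an installed solver plugin with exponential-cone support such as ROI.plugin.scs. Note that ROI orders the exponential cone triplet with the epigraph variable last, i.e., \((x_1, x_2, x_3)\) with \(x_2 \exp(x_1/x_2) \le x_3\), the reverse of the slide notation above.

```r
library(ROI)
## Assumes a plugin with exponential-cone support is installed,
## e.g. ROI.plugin.scs.

## Variables (y1, y2, t); minimize t.
## ROI conic constraints have the form: rhs - L %*% x in cone.
op <- OP(
  objective = L_objective(c(0, 0, 1)),
  constraints = C_constraint(
    L = rbind(c(-2, -1,  0),   # rows 1-3: (2*y1 + y2, 1, t) in K_expp,
              c( 0,  0,  0),   #   i.e. exp(2*y1 + y2) <= t
              c( 0,  0, -1),   #   (ROI puts the epigraph variable last)
              c( 5,  2,  0)),  # row 4: 4 - (5*y1 + 2*y2) >= 0 (linear cone)
    cones = c(K_expp(1), K_lin(1)),  # one exp cone, one linear cone (assumed counts)
    rhs = c(0, 1, 0, 4)),
  bounds = V_bound(ld = -Inf, nobj = 3L))  # all three variables free

sol <- ROI_solve(op, solver = "scs")
solution(sol)
```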

Details on reformulation of the AGLL

  • Likelihood terms: \[ \begin{align} & -y_i \log\left(\frac{1}{1+\exp(-\boldsymbol{\beta}^{\top} x_i)}\right)-(1-y_i)\log\left(\frac{1}{1+\exp(\boldsymbol{\beta}^{\top} x_i)}\right) \le t_i \\ \Leftrightarrow\ & \begin{cases}\log(1+\exp(-\boldsymbol{\beta}^\top x_i)) \le t_i, & y_i = 1,\\ \log(1+\exp(\boldsymbol{\beta}^\top x_i)) \le t_i, & y_i = 0.\end{cases} \end{align} \]

  • Each such constraint can be rewritten: \[t \ge \log(1+\exp(u)) \Leftrightarrow \exp(u-t) + \exp(-t) \le 1\]

  • Exponential functions can be modeled via the exponential cone: \[ \begin{align} z_1 \ge \exp(u-t) &\Leftrightarrow (z_1,1,u-t) \in K_{\rm exp}, \\ z_2 \ge \exp(-t) &\Leftrightarrow (z_2,1,-t) \in K_{\rm exp}. \end{align} \]

  • An additional linear cone is needed for \(z_1 + z_2 \le 1\).
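
The same mapping can be sketched for a single likelihood term. Below is a hypothetical helper (the variable order \((u, t, z_1, z_2)\) and the cone-count arguments are assumptions, following the ROI conventions noted above) that builds the two exponential cone blocks and the linear cone row for \(t \ge \log(1+\exp(u))\):

```r
library(ROI)

## Cone blocks for t >= log(1 + exp(u)), via exp(u - t) + exp(-t) <= 1.
## Hypothetical helper; assumed variable order x = (u, t, z1, z2) and
## ROI conic constraints of the form rhs - L %*% x in cone.
softplus_blocks <- function() {
  L <- rbind(
    c(-1,  1,  0,  0),  # rows 1-3: (u - t, 1, z1) in K_expp => z1 >= exp(u - t)
    c( 0,  0,  0,  0),
    c( 0,  0, -1,  0),
    c( 0,  1,  0,  0),  # rows 4-6: (-t, 1, z2) in K_expp => z2 >= exp(-t)
    c( 0,  0,  0,  0),
    c( 0,  0,  0, -1),
    c( 0,  0,  1,  1))  # row 7: 1 - (z1 + z2) >= 0 => z1 + z2 <= 1
  list(L = L,
       cones = c(K_expp(2), K_lin(1)),  # two exp cones, one linear (assumed counts)
       rhs = c(0, 1, 0, 0, 1, 0, 1))
}
```

In the full AGLL problem such blocks are stacked over all observations, with \(u = \pm\boldsymbol{\beta}^\top x_i\) depending on \(y_i\) and one epigraph variable \(t_i\) per observation.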

Implementation in R

  • Three steps are necessary (cf. the cone constructions sketched above):

    • Set up the optimization problem: a linear objective on the augmented set of variables.
    • Set up the linear constraint matrix.
    • Set up the exponential cone constraint matrix.
  • Advantage of ROI:

    • One syntax gives access to a variety of solvers.
    • Designed for R users.
  • Weights: Masarotto and Varin (2012) suggest weights inversely proportional to a modified MLE obtained with a small ridge penalty on the parameters (a computational sketch follows the formula below).

\[w_{j}=\left|\left(D\tilde{\boldsymbol{\beta}}_{\epsilon}\right)_j\right|^{-1}, \quad \tilde{\boldsymbol{\beta}}_{\epsilon}={\arg \min}_{\boldsymbol{\beta}} \left\{-\ell(\boldsymbol{\beta})+\epsilon \sum_{i}\beta_i^2\right\}.\]
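
A minimal sketch of the weight computation, assuming glmnet is used for the ridge-penalized logistic fit (glmnet's penalty scaling differs from \(\epsilon\) above, so this is an approximation); `X`, `y`, and the penalty matrix `D` come from the application at hand:

```r
library(glmnet)

## Adaptive weights w_j = |(D beta~)_j|^{-1} from a ridge-penalized
## logistic fit (a sketch; glmnet's penalty scaling differs from eps).
agll_weights <- function(X, y, D, eps = 1e-4) {
  fit <- glmnet(X, y, family = "binomial", alpha = 0,   # alpha = 0: ridge
                lambda = eps, intercept = FALSE)
  beta_tilde <- as.numeric(coef(fit))[-1]               # drop intercept row
  1 / abs(as.numeric(D %*% beta_tilde))
}
```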

Selection of Lasso Penalty

  • Compute the AGLL solution for a range of values of \(\lambda\).

  • Selection of \(\lambda\) through minimization of an AIC-type criterion (Tibshirani and Taylor 2011): \[{\rm AIC}(\lambda) = -2\ell(\boldsymbol{\beta})+2{\rm enp}(\lambda),\] where \({\rm enp}(\lambda)\) is the effective number of parameters, estimated as the number of distinct coefficient groups at \(\lambda\) (see the sketch after this list).

  • Classical CV procedure (also sketched below):

    • Fit model on part of data.
    • Evaluate performance on the hold-out sample.
    • Average performance value over all folds \(\Rightarrow\) select \(\lambda\) with best performance.
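
Both selection rules can be sketched as follows, where `agll_fit()` is a hypothetical wrapper around the conic AGLL solver (not part of any package) and coefficient groups are counted by rounding to a tolerance:

```r
## Bernoulli log-likelihood for logistic regression.
loglik <- function(beta, X, y) {
  eta <- drop(X %*% beta)
  sum(y * eta - log1p(exp(eta)))
}

## AIC-type selection: enp(lambda) = number of distinct coefficient groups.
select_lambda_aic <- function(lambdas, X, y, D, w, tol = 1e-6) {
  aic <- sapply(lambdas, function(lambda) {
    beta <- agll_fit(X, y, D, w, lambda)       # hypothetical AGLL solver call
    enp  <- length(unique(round(beta / tol)))  # distinct groups up to tol
    -2 * loglik(beta, X, y) + 2 * enp
  })
  lambdas[which.min(aic)]
}

## Classical k-fold cross-validation with log-loss on the hold-out folds.
select_lambda_cv <- function(lambdas, X, y, D, w, k = 10) {
  folds <- sample(rep_len(seq_len(k), nrow(X)))
  loss <- sapply(lambdas, function(lambda) {
    mean(sapply(seq_len(k), function(f) {
      test <- folds == f
      beta <- agll_fit(X[!test, , drop = FALSE], y[!test], D, w, lambda)
      p    <- plogis(drop(X[test, , drop = FALSE] %*% beta))
      -mean(y[test] * log(p) + (1 - y[test]) * log(1 - p))  # log-loss
    }))
  })
  lambdas[which.min(loss)]
}
```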