SLIDE 1

Sparsity with multi-type Lasso regularized GLMs

Sander Devriendt (email: sander.devriendt@kuleuven.be)
Joint work with K. Antonio, T. Reynkens, E. Frees, R. Verbelen
eRum 2018, Budapest, May 15, 2018

SLIDE 2

Motivation

Claim frequency and claim severity as a function of nominal / numeric (∼ ordinal) / spatial features.

Sparse modeling with multi-type variables – Sander Devriendt

SLIDE 3

Research questions

◮ Generalized Linear Models (GLMs) for frequency (∼ Poisson) and severity (∼ Gamma).

◮ How to:

(1) select variables or features?
(2) cluster (or bin or fuse) levels within a variable? E.g. age groups / postal code clusters / clusters of car models.

◮ Procedure should be data driven and scalable to large (big) data.

◮ End product is interpretable, within the actuarial comfort zone.

SLIDE 4

Research questions rephrased

◮ Generalized Linear Models (GLMs) for frequency (∼ Poisson) and severity (∼ Gamma).

◮ How to:

(1) avoid overfitting with too many variables or levels?
(2) avoid underfitting with a priori binning/selection?

SLIDE 5

A stepwise solution

Henckaerts, Antonio et al., 2018 (Scandinavian Actuarial Journal)

Stepwise procedure:

1. Do an exhaustive search through the variables to find the best GAM.
2. Use a well-chosen clustering algorithm to bin the 2D spatial effect.
3. Use evolutionary trees to bin 1D continuous effects and interactions.
4. Fit a GLM with the bins and clusters obtained in the previous steps.

R packages: mgcv, classInt, evtree, rpart

SLIDE 6

[Figure: fitted effects f̂ for ageph and power, each shown next to its binned counterpart with GLM coefficients (−0.07, −0.021, 0.035, 0.064 and −0.329, −0.204, −0.155, 0.199).]

SLIDE 7

Sparsity with multi-type Lasso regularized GLMs

Devriendt, Antonio, Reynkens, Frees, Verbelen, 2018 (in progress)

SLIDE 8

Regularization

Standard GLM: fit the data as well as possible, no constraint on the parameters.

Regularized GLM: trade-off between fit and interpretability/sparsity/stability, with a constraint on the parameters.

SLIDE 9

Lasso

◮ Less is more (Hastie, Tibshirani & Wainwright, 2015): a sparse model is easier to estimate and interpret than a dense model.

◮ Regularize (with budget constraint t, or regularization parameter λ):

$$\min_{\beta_0, \beta} \{ -L(\beta_0, \beta) \} \quad \text{subject to} \quad \|\beta\|_1 \le t,$$

or equivalently

$$\min_{\beta_0, \beta} \Big\{ -L(\beta_0, \beta) + \lambda \cdot \sum_{j=1}^{p} |\beta_j| \Big\}.$$

Shrinks coefficients and even sets some to zero.
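
A minimal sketch of why the lasso both shrinks and zeroes coefficients. It assumes the simplest textbook setting (squared-error loss 1/2·‖y − Xβ‖² with an orthonormal design), where the lasso solution is the soft-thresholded least-squares estimate; this is a simplification of the GLM setting on the slide. The names `soft_threshold` and `lasso_orthonormal` and the toy numbers are illustrative only.

```python
def soft_threshold(beta, lam):
    """Move beta towards zero by lam; anything within [-lam, lam] becomes exactly 0."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0


def lasso_orthonormal(ols_coefs, lam):
    """Lasso solution under the orthonormal-design assumption stated above."""
    return [soft_threshold(b, lam) for b in ols_coefs]


# Toy least-squares estimates (made up for illustration).
coefs = [0.9, -0.3, 0.05, -1.2]
shrunk = lasso_orthonormal(coefs, lam=0.25)
# Large coefficients are shrunk by lam; the small third one is set to zero.
```

Raising `lam` zeroes more coordinates, which mirrors a smaller budget t in the constrained form.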

SLIDE 10

Lasso visualization

Regularization = limited budget for β1, β2, β3.

‘Statistical Learning with Sparsity’ - Hastie et al. (2015)

SLIDE 11

Lasso plot

Package glmnet

[Figure: coefficient paths, coordinates of β against λ (overfitting ← λ → underfitting).]

SLIDE 12

Lasso and friends

◮ Adjust lasso regularization to the type of variable:

  • Determine type (nominal / numeric ∼ ordinal / spatial);
  • Allocate logical penalty.

◮ Thus, for J variables, each with regularization term P_j(·), we want to minimize:

$$-L(\beta_1, \ldots, \beta_J) + \lambda \cdot \sum_{j=1}^{J} P_j(\beta_j).$$
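
The idea of one penalty per variable type can be sketched as follows. The slides name the lasso, the fused lasso (ordinal example) and the generalized fused lasso (nominal example); the function names, the optional `edges` argument for a neighbour graph, and the toy coefficients are assumptions made for this illustration, not the package's interface.

```python
from itertools import combinations


def lasso_penalty(beta):
    """Sum of |beta_i|: selects individual coefficients."""
    return sum(abs(b) for b in beta)


def fused_lasso_penalty(beta):
    """Sum of |beta_{i+1} - beta_i|: fuses adjacent levels of an ordinal variable."""
    return sum(abs(b2 - b1) for b1, b2 in zip(beta, beta[1:]))


def generalized_fused_lasso_penalty(beta, edges=None):
    """Sum of |beta_i - beta_k| over pairs of levels.

    By default all pairs are used (nominal variable); passing `edges`
    restricts the pairs, e.g. to neighbouring regions (an assumption here).
    """
    if edges is None:
        edges = combinations(range(len(beta)), 2)
    return sum(abs(beta[i] - beta[k]) for i, k in edges)


def total_penalty(blocks, lam):
    """lambda * sum_j P_j(beta_j) for a list of (penalty, coefficients) pairs."""
    return lam * sum(penalty(beta) for penalty, beta in blocks)


# Toy coefficient blocks (made up): an ordinal 'age' and a nominal 'fuel'.
age = [0.0, 0.1, 0.1, 0.4]
fuel = [0.2, 0.2, -0.1]
blocks = [(fused_lasso_penalty, age), (generalized_fused_lasso_penalty, fuel)]
value = total_penalty(blocks, lam=2.0)
```

Levels with equal coefficients contribute nothing to a fusion penalty, which is what encourages bins and clusters to form.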

SLIDE 13

Lasso and friends: visualization

Different variable type → different penalty budget.

‘Statistical Learning with Sparsity’ - Hastie et al. (2015)

SLIDE 14

Fused Lasso

Package genlasso

[Figure: ordinal penalty example; coordinates of β for var 1 to var 10 against λ (overfitting ← λ → underfitting).]

SLIDE 15

Generalized Fused Lasso

Package genlasso

[Figure: nominal penalty example; coordinates of β for var 1 to var 10 against λ (overfitting ← λ → underfitting).]

SLIDE 16

Unified GLM framework with multiple types of penalties

◮ Gertheiss & Tutz (2010) and Oelker & Gertheiss (2017):

  • GLMs with various penalties.
  • R package available: gvcm.cat (not maintained).

◮ Uses local quadratic approximations of penalties and PIRLS:

  • non-exact selection or fusion;
  • computationally intensive.

SLIDE 17

Unified GLM framework with multiple types of penalties

◮ Our contribution:

  • implements an efficient algorithm (with proximal operators):
      - code bottleneck in C++ (Rcpp);
      - efficient linear algebra (RcppArmadillo);
      - parallel computations (parallel);
  • scalable to big data (splits into smaller sub-problems);
  • flexible regularization:
      - penalty takes the type of variable into account;
      - works for all popular penalties.

⇒ Package under construction.
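
To illustrate what "with proximal operators" can mean, here is a generic proximal gradient loop for a lasso-penalized Poisson GLM. It is a textbook scheme in the spirit of Parikh and Boyd (2013), not the authors' algorithm, which the slides do not spell out. The function names, fixed step size, absence of an intercept and offset, and toy data are all assumptions.

```python
import math


def soft_threshold(z, t):
    """Proximal operator of t * |z|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def poisson_gradient(counts, X, beta):
    """Gradient of the Poisson negative log-likelihood with log link."""
    grad = [0.0] * len(beta)
    for n_i, x_i in zip(counts, X):
        mu = math.exp(sum(x * b for x, b in zip(x_i, beta)))
        for j, x in enumerate(x_i):
            grad[j] += (mu - n_i) * x
    return grad


def proximal_gradient(counts, X, lam, step=0.05, iters=500):
    """Alternate a gradient step on -L with the proximal step of the lasso penalty."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        grad = poisson_gradient(counts, X, beta)
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta


# Toy data (made up): two groups of four policies, one indicator column each.
counts = [3, 2, 4, 3, 1, 2, 1, 1]
X = [[1, 0]] * 4 + [[0, 1]] * 4
beta = proximal_gradient(counts, X, lam=2.0)
# The first group keeps a shrunken positive effect; the second is set exactly to zero.
```

Because the proximal step is exact, weak effects land on zero exactly rather than approximately, in contrast to the non-exact selection noted for local quadratic approximations.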

SLIDE 18

Case study: MTPL data

◮ Frequency (and severity) information for n = 163,234 policyholders.
◮ 14 variables: binary, ordinal and nominal.
◮ Exposure modeled as offset.
◮ Fit Poisson GLM for frequency data with different penalties:

  • $N_i \sim \text{Poisson}(\mu_i)$
  • $\log(\mu_i) = \log(\text{exposure}_i) + \beta_0 + \sum_{j=1}^{14} X_j \beta_j$
  • $O(\beta) = -L(\beta_0, \beta_1, \ldots, \beta_{14}) + \lambda \cdot \sum_{j=1}^{14} P_j(\beta_j)$
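
The model above can be written out as a small sketch: a Poisson negative log-likelihood with log(exposure) as offset, plus λ times a penalty on the non-intercept coefficients. The function names, the single lasso penalty standing in for the 14 type-specific P_j, and the toy portfolio are assumptions for illustration.

```python
import math


def poisson_neg_loglik(counts, exposures, X, beta0, beta):
    """-L for N_i ~ Poisson(mu_i), log(mu_i) = log(exposure_i) + beta0 + x_i' beta."""
    nll = 0.0
    for n_i, e_i, x_i in zip(counts, exposures, X):
        eta = math.log(e_i) + beta0 + sum(x * b for x, b in zip(x_i, beta))
        mu = math.exp(eta)
        # Poisson log-density: n * log(mu) - mu - log(n!)
        nll -= n_i * eta - mu - math.lgamma(n_i + 1)
    return nll


def lasso_penalty(beta):
    return sum(abs(b) for b in beta)


def objective(counts, exposures, X, beta0, beta, lam, penalty):
    """O(beta) = -L + lambda * penalty(beta); the intercept beta0 is not penalized."""
    return poisson_neg_loglik(counts, exposures, X, beta0, beta) + lam * penalty(beta)


# Toy portfolio (made up): claim counts, exposure in years, two dummy features.
counts = [0, 1, 0, 2]
exposures = [1.0, 0.5, 1.0, 1.0]
X = [[1, 0], [0, 1], [1, 0], [0, 1]]
beta0, beta = -1.0, [0.2, 0.5]

unpenalized = objective(counts, exposures, X, beta0, beta, 0.0, lasso_penalty)
penalized = objective(counts, exposures, X, beta0, beta, 3.0, lasso_penalty)
```

The offset enters the linear predictor with a fixed coefficient of one, so a policy observed for half a year has half the expected claim count, all else equal.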

SLIDE 19

Case study: MTPL data

[Figure: Payment Frequency parameters against Lambda (1 to 10000, log scale).]

SLIDE 20

Case study: MTPL data

[Figure: Age parameters at Lambda = 1; parameter value against Age (20 to 90).]

SLIDE 21

Case study: MTPL data

◮ Settings:

  • Incorporate adaptive (GLM) and standardization weights for better consistency and predictive performance.
  • Tune λ with out-of-sample MSE ($\hat{\lambda} = 380$).

◮ Re-estimate the final sparse GLM with standard GLM routines (from 164 to 38 parameters).
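
The slides do not define the adaptive weights. A common construction, assumed here, is the adaptive-lasso idea of weighting each penalty term by the inverse of the corresponding quantity in an initial unpenalized GLM fit. The sketch shows this for an ordinal (fused) penalty; the function names, the `eps` safeguard and the toy estimates are hypothetical, and standardization weights are left out.

```python
def adaptive_fused_weights(initial, eps=1e-8):
    """Inverse absolute differences of adjacent initial (GLM) estimates.

    Small initial differences get large weights, so those levels fuse more easily.
    eps guards against division by zero (an assumption of this sketch).
    """
    return [1.0 / max(abs(b2 - b1), eps)
            for b1, b2 in zip(initial, initial[1:])]


def weighted_fused_penalty(beta, weights):
    """Sum of w_i * |beta_{i+1} - beta_i|."""
    diffs = [abs(b2 - b1) for b1, b2 in zip(beta, beta[1:])]
    return sum(w * d for w, d in zip(weights, diffs))


# Toy initial GLM estimates for four ordered levels (made up).
glm_fit = [0.0, 0.5, 0.5, 1.0]
weights = adaptive_fused_weights(glm_fit)

fused = weighted_fused_penalty([0.0, 0.5, 0.5, 1.0], weights)
split = weighted_fused_penalty([0.0, 0.5, 0.6, 1.0], weights)
# Splitting the two middle levels, which the initial fit saw as equal, is penalized heavily.
```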

SLIDE 22

MTPL claim frequency with multiple types of penalties

[Figure: four panels (Age, Power (kW), Bonus-Malus scale, Car age) comparing the GAM fit, the penalized GLM fit and the GLM refit with new clusters.]

SLIDE 23

MTPL claim frequency with multiple types of penalties

[Figure: parameter estimates for sex, use, fuel, sport, fleet, monovolume and 4x4 (first panel) and for payfreq2, payfreq3, payfreq4, coverage2 and coverage3 (second panel), comparing the GAM fit, the penalized GLM fit and the GLM refit with new clusters.]

SLIDE 24

Wrap-up

◮ Less is more.
◮ Flexible regularization can help predictive modeling.
◮ R package combines a general framework with an efficient algorithm.
◮ Package and working paper to be finalized.

SLIDE 25

Thank you

Ageas Continental Europe

+ Tom Reynkens and colleagues

SLIDE 26

References

Henckaerts, R., Antonio, K., Clijsters, M. and Verbelen, R. (2018). A data driven strategy for the construction of insurance tariff classes. Scandinavian Actuarial Journal, published online.

Wood, S. (2006). Generalized additive models: an introduction with R. Chapman and Hall/CRC Press.

Gertheiss, J. and Tutz, G. (2010). Sparse modeling of categorial explanatory variables. The Annals of Applied Statistics, 4(4), 2150-2180.

Oelker, M. and Gertheiss, J. (2017). A uniform framework for the combination of penalties in generalized structured models. Advances in Data Analysis and Classification, 11(1), 97-120.

SLIDE 27

References

Parikh, N. and Boyd, S. (2013). Proximal algorithms. Foundations and Trends in Optimization, 1(3), 123-231.

Hastie, T., Tibshirani, R. and Wainwright, M. (2015). Statistical learning with sparsity: the Lasso and generalizations. Chapman and Hall/CRC Press.
