Chapter 5: Generalized Linear Models by Curtis Gary Dean, FCAS (PowerPoint presentation)

SLIDE 1

www.ICA2014.org

Chapter 5: Generalized Linear Models

by Curtis Gary Dean, FCAS, MAAA, CFA
Ball State University: Center for Actuarial Science and Risk Management

SLIDE 2

My Interest in Predictive Modeling

• 1989 article in Science
• “Clinical Versus Actuarial Judgment”
• Summarized in 1990 in Contingencies

SLIDE 3

“Clinical Versus Actuarial Judgment”

• “In the clinical method the decision-maker combines or processes information in his or her head.”
• “In the actuarial or statistical method the human judge is eliminated and conclusions rest solely on empirically established relations between data and the condition or event of interest.”

SLIDE 4

“Clinical Versus Actuarial Judgment”

• “…with a sample of about 100 studies and the same outcome obtained in almost every case, it is reasonable to conclude that the actuarial advantage is not exceptional but general and likely encompasses many of the unstudied judgment tasks.”

SLIDE 5

“Clinical Versus Actuarial Judgment”

• “To be truly actuarial, interpretations must be both automatic (that is, prespecified or routinized) and based on empirically established relations.”
• Gary’s statement: “This is predictive modeling (predictive analytics).”

SLIDE 6

“Clinical Versus Actuarial Judgment”

• “Even when given an information edge, the clinical judge still fails to surpass the actuarial method; in fact, access to additional information often does nothing to close the gap between the two methods.”

SLIDE 7

Why Use Generalized Linear Models?

• Can readily see link between predictors and outcomes
• Useful statistical tests for coefficients and fit of model
• Easier to explain than some other methods
• Software is widely available

SLIDE 8

Classical Multiple Linear Regression

• μi = E[Yi] = a0 + a1Xi1 + … + amXim
• Yi is a Normally distributed random variable with constant variance σ²
• Want to estimate μi = E[Yi] for each i

SLIDE 9

[Figure: Response Yi has Normal Distribution, centered at μi]

SLIDE 10

Problems with Traditional Model

• Number of claims is discrete
• Claim sizes are skewed to the right
• Probability of an event is in [0,1]
• Variance is not constant across data points i
• Nonlinear relationship between X’s and Y’s

SLIDE 11

Generalized Linear Models - GLMs

• Fewer restrictions
• Y can model number of claims, probability of renewing, loss severity, loss ratio, etc.
• Large and small policies can be put into one model
• Y can be nonlinear function of X’s
• Classical linear regression model is a special case

SLIDE 12

Generalized Linear Models - GLMs

• g(μi) = a0 + a1Xi1 + … + amXim
• g( ) is the link function
• E[Yi] = μi = g⁻¹(a0 + a1Xi1 + … + amXim)
• Yi can be Normal, Poisson, Gamma, Binomial, Compound Poisson, …
• Variance can be modeled
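The relationship between the linear predictor and the mean can be sketched in a few lines. This is a minimal illustration that assumes a log link; the coefficient values, the design matrix, and the names `coef`, `g`, and `g_inv` are hypothetical and do not come from the presentation.

```python
import numpy as np

# Hypothetical coefficients a0, a1, a2 (illustrative values only).
coef = np.array([-2.0, 0.5, 0.3])

# Design matrix: a leading column of 1s for the intercept a0,
# followed by two predictors X_i1 and X_i2 for three records.
X = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
])

def g(mu):
    """Log link (an assumed choice): maps the mean to the linear-predictor scale."""
    return np.log(mu)

def g_inv(eta):
    """Inverse of the log link."""
    return np.exp(eta)

eta = X @ coef      # linear predictor a0 + a1*X_i1 + a2*X_i2
mu = g_inv(eta)     # E[Y_i] = g^{-1}(linear predictor)
```

Applying `g` to `mu` recovers `eta`, which is the defining property of the link function.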
SLIDE 13

Exponential Family of Distributions – Canonical Form

$$f(y;\theta,\phi) = \exp\left[\frac{y\theta - b(\theta)}{a(\phi)} + c(y,\phi)\right]$$

$$E[Y] = b'(\theta), \qquad \mathrm{Var}[Y] = b''(\theta)\,a(\phi)$$

θ is the parameter of interest! φ is often called a nuisance parameter.

SLIDE 14

Why Exponential Family?

• Distributions in Exponential Family can model a variety of problems
• Standard algorithm for finding coefficients a0, a1, …, am

SLIDE 15

Normal Distribution in Exponential Family

$$
\begin{aligned}
f(y;\mu,\sigma^2) &= \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{(y-\mu)^2}{2\sigma^2}\right] \\
&= \exp\left[\ln\frac{1}{\sqrt{2\pi\sigma^2}}\right]\exp\left[-\frac{y^2 - 2y\mu + \mu^2}{2\sigma^2}\right] \\
&= \exp\left[\frac{y\mu - \mu^2/2}{\sigma^2} - \frac{y^2}{2\sigma^2} - \frac{\ln(2\pi\sigma^2)}{2}\right]
\end{aligned}
$$

SLIDE 16

Normal Distribution in Exponential Family

$$f(y;\mu,\sigma^2) = \exp\left[\frac{y\mu - \mu^2/2}{\sigma^2} - \frac{y^2}{2\sigma^2} - \frac{\ln(2\pi\sigma^2)}{2}\right]$$

In the numerator of the first term, μ plays the role of θ and μ²/2 plays the role of b(θ).

Let θ = μ and a(φ) = σ²; then b(θ) = θ²/2, b'(θ) = θ = μ, and Var[Y] = b''(θ) a(φ) = 1 · σ².

SLIDE 17

Poisson Distribution in Exponential Family

$$
\begin{aligned}
\Pr[Y=y] &= \frac{\mu^y e^{-\mu}}{y!} \\
&= \exp\left[\ln\frac{\mu^y e^{-\mu}}{y!}\right] \\
&= \exp\left[\frac{y(\ln\mu) - \mu}{1} - \ln(y!)\right]
\end{aligned}
$$

Here θ = ln μ.
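The rewriting of the Poisson probability in exponential-family form can be checked numerically. The sketch below assumes θ = ln μ, b(θ) = exp(θ), a(φ) = 1, and c(y, φ) = −ln(y!); the function names and the value μ = 2.5 are illustrative only.

```python
import math

def poisson_pmf(y, mu):
    """Standard Poisson probability mu^y * exp(-mu) / y!."""
    return mu ** y * math.exp(-mu) / math.factorial(y)

def poisson_expfam(y, theta):
    """Same probability in canonical form, assuming theta = ln(mu),
    b(theta) = exp(theta), a(phi) = 1, c(y, phi) = -ln(y!)."""
    b = math.exp(theta)
    # math.lgamma(y + 1) equals ln(y!) for non-negative integers y.
    return math.exp((y * theta - b) / 1.0 - math.lgamma(y + 1))

mu = 2.5                 # illustrative mean
theta = math.log(mu)

# Largest absolute gap between the two forms over y = 0, ..., 19.
max_diff = max(
    abs(poisson_pmf(y, mu) - poisson_expfam(y, theta)) for y in range(20)
)
```

With b(θ) = exp(θ), both b'(θ) and b''(θ) equal exp(θ) = μ, which agrees with the Poisson mean and variance both being μ.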

SLIDE 18

Compound Poisson Distribution

• Y = C1 + C2 + . . . + CN
• N is Poisson random variable
• Ci are i.i.d. with Gamma distribution
• This is an example of a Tweedie distribution
• Y is member of Exponential Family
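A small simulation can make the compound Poisson construction concrete. This is a sketch under assumed parameters: the Poisson mean `lam` and the Gamma `shape` and `scale` are made-up values, and the function name is hypothetical. It relies on the fact that a sum of N i.i.d. Gamma(shape, scale) variables is Gamma(N·shape, scale).

```python
import numpy as np

rng = np.random.default_rng(seed=0)

lam = 0.8                    # assumed Poisson mean for the claim count N
shape, scale = 2.0, 500.0    # assumed Gamma parameters for each claim size C_i

def simulate_aggregate_loss(n_policies):
    """Draw Y = C_1 + ... + C_N for each policy, with Y = 0 when N = 0."""
    counts = rng.poisson(lam, size=n_policies)
    totals = np.zeros(n_policies)
    has_claims = counts > 0
    # Sum of N i.i.d. Gamma(shape, scale) claims is Gamma(N * shape, scale).
    totals[has_claims] = rng.gamma(shape * counts[has_claims], scale)
    return totals

y = simulate_aggregate_loss(200_000)
```

The simulated Y has a point mass at zero (policies with no claims) plus a continuous positive part, and its sample mean should be close to lam × shape × scale = 800.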

SLIDE 19

Members of the Exponential Family

  • Normal
  • Poisson
  • Binomial
  • Gamma
  • Inverse Gaussian
  • Compound Poisson
SLIDE 20

V(μ)

• Normal: μ^0
• Poisson: μ
• Binomial: μ(1-μ)
• Tweedie: μ^p, 1<p<2
• Gamma: μ^2
• Inverse Gaussian: μ^3

Var[Yi] = Φ V(μi)/wi
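The variance functions and the variance formula translate directly into code. The sketch below is illustrative: the function names, the family labels, and the default Tweedie power p = 1.5 are assumptions chosen for the example, not part of the presentation.

```python
def variance_function(family, mu, p=1.5):
    """Return V(mu) for the named family; p applies only to Tweedie (1 < p < 2)."""
    if family == "normal":
        return 1.0                  # mu^0
    if family == "poisson":
        return mu
    if family == "binomial":
        return mu * (1.0 - mu)
    if family == "tweedie":
        return mu ** p
    if family == "gamma":
        return mu ** 2
    if family == "inverse_gaussian":
        return mu ** 3
    raise ValueError(f"unknown family: {family}")

def response_variance(family, mu, phi=1.0, weight=1.0, p=1.5):
    """Var[Y_i] = phi * V(mu_i) / w_i."""
    return phi * variance_function(family, mu, p) / weight

# Example: Gamma severity with mean 100, dispersion 0.5, weight 2.
gamma_var = response_variance("gamma", 100.0, phi=0.5, weight=2.0)
```

A larger weight shrinks the variance, which is how one model can accommodate observations based on different amounts of exposure.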

SLIDE 21

Variance of Yi and Fit at Data Point i

• Var(Yi) is big → looser fit at data point i
• Var(Yi) is small → tighter fit at data point i

Tightness of fit ∝ 1 / Var(Yi)

SLIDE 22

Estimating Coefficients a1, a2, …, am

• Classical linear regression uses least squares
• GLMs use Maximum Likelihood Method
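One common way to carry out maximum likelihood for a GLM is iteratively reweighted least squares (IRLS). The presentation does not say which algorithm it has in mind, so the sketch below is an assumption: it fits a Poisson GLM with a log link, and the function name, iteration limit, and tiny data set are illustrative only.

```python
import numpy as np

def fit_poisson_glm(X, y, n_iter=25, tol=1e-10):
    """Maximum likelihood fit of a Poisson GLM with log link via IRLS.

    X must already contain a column of 1s for the intercept a0,
    and y must have a positive mean.
    """
    coef = np.zeros(X.shape[1])
    # Start from an intercept-only guess: every mean equals the overall mean.
    mu = np.full(len(y), y.mean())
    eta = np.log(mu)
    for _ in range(n_iter):
        # Working response and weights for the log link with V(mu) = mu.
        z = eta + (y - mu) / mu
        w = mu
        # Weighted least squares step: solve (X'WX) coef = X'Wz.
        XtW = X.T * w
        new_coef = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ new_coef
        mu = np.exp(eta)
        converged = np.max(np.abs(new_coef - coef)) < tol
        coef = new_coef
        if converged:
            break
    return coef

# Tiny illustrative data set: an intercept plus one 0/1 indicator.
X = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 3.0, 4.0, 8.0])
coef = fit_poisson_glm(X, y)
```

With a single indicator predictor, the fitted means exp(a0) and exp(a0 + a1) equal the two group averages of y (2 and 6 here), which gives a quick check on the fit.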

SLIDE 23

Which Exponential Family Distribution?

• Frequency: Poisson
• Severity: Gamma
• Loss ratio: Compound Poisson
• Pure Premium: Compound Poisson
• How many policies will renew: Binomial

SLIDE 24

What link function?

• Additive model: identity
• Multiplicative model: natural log
• Modeling probability of event: logistic
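The effect of each link choice can be seen by pushing the same linear predictor through the three inverse links. This is a minimal sketch; the dictionary `LINKS`, the helper name, and the numbers 0.2 and 0.7 are illustrative assumptions.

```python
import math

# Each entry holds (g, g_inv): g maps a mean to the linear-predictor scale,
# g_inv maps the linear predictor back to the mean.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
}

def mean_from_predictor(link, eta):
    """Apply the inverse link to a linear predictor value."""
    return LINKS[link][1](eta)

base, effect = 0.2, 0.7   # illustrative terms of a linear predictor

additive = mean_from_predictor("identity", base + effect)    # 0.2 + 0.7
multiplicative = mean_from_predictor("log", base + effect)   # exp(0.2) * exp(0.7)
probability = mean_from_predictor("logit", base + effect)    # always in (0, 1)
```

Under the identity link the terms add on the scale of the mean, under the log link each term contributes a multiplicative factor, and under the logit link the result always lies strictly between 0 and 1.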

SLIDE 25

Chapter 5: Generalized Linear Models

• Intended as a first exposure to GLMs
• Tried to make it accessible and self-contained
• Hard to squeeze everything into one chapter – at Ball State the topic spans a semester-long course

SLIDE 26

5.1 Introduction to Generalized Linear Models

5.1.1 Assumptions of Linear Model
  • Shortcomings for actuarial applications

5.1.2 Generalized Linear Model Assumptions

SLIDE 27

5.2 Exponential Family of Distributions

5.2.1 The Variance Function and the Relationship between Variances and Means

5.3 Link Functions

5.4 Maximum Likelihood Estimation

5.4.1 Quasi-likelihood

5.5 Generalized Linear Model Review

SLIDE 28

5.6 Applications

5.6.1 Modeling Probability of Cross Selling with Logit Link
5.6.2 Claim Frequency with Offset
5.6.3 Severity with Weights
5.6.4 Modeling Pure Premiums or Loss Ratios

SLIDE 29

5.7 Comparing Models

5.7.1 Deviance
5.7.2 Log-likelihood, AIC, AICC, and BIC

SLIDE 30

The End