SLIDE 1

Introduction Logit Generalized Extreme Value (GEV)

Qualitative Response Models

Michael R. Roberts

Department of Finance The Wharton School University of Pennsylvania

January 21, 2009

Michael R. Roberts Qualitative Response Models 1/59

SLIDE 2

Introduction Logit Generalized Extreme Value (GEV) The Choice Set & Choice Probabilities Identification Aggregation Regression

Discrete Choice Framework

The choice set must exhibit 3 characteristics:

1. Alternatives must be mutually exclusive.

2. The choice set must be exhaustive.

3. The number of alternatives must be finite.

Characteristics 1 and 2 can usually be satisfied with appropriate classifications. Characteristic 3 is the defining feature of discrete choice models.

SLIDE 3

Random Utility Models

Decision maker (agent, firm, person, etc.) $i$ faces $J$ alternatives (alts).
Decision maker obtains a certain utility or profit from each alt.
Utility that agent $i$ obtains from alt $j$ is $U_{ij}$.
Agent chooses the alt that provides the highest utility: choose $j$ if $U_{ij} > U_{ik}\ \forall k \neq j$.

SLIDE 4

Empirical Implementation

Utility for agent $i$ and alternative $j$, $U_{ij} = V_{ij} + \varepsilon_{ij}$, is decomposed into two components:

$V_{ij} = V(x_{ij}, s_i, \theta)$ = indirect utility observed by the researcher. Function of alternative attributes $x_{ij}$, agent attributes $s_i$, and parameters $\theta$.

$\varepsilon_{ij}$ = factors affecting utility that are unobservable to the researcher. Defined relative to the researcher’s representation of the choice situation (i.e., $V$).

Unobserved components $\varepsilon_{ij}$ are assumed random with density $f(\varepsilon_{ij})$. The vector of errors across alternatives is described by the joint density $f(\varepsilon_i)$:

$$\varepsilon_i = (\varepsilon_{i1}, \ldots, \varepsilon_{iJ}) \sim f(\varepsilon_i)$$

SLIDE 5

Probability of Agent’s Choice

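With $f$ we can make probabilistic statements about the agent’s choice:

$$P_{ij} = \Pr(U_{ij} > U_{ik}\ \forall k \neq j) = \Pr(V_{ij} + \varepsilon_{ij} > V_{ik} + \varepsilon_{ik}\ \forall k \neq j) = \Pr(\varepsilon_{ij} - \varepsilon_{ik} > V_{ik} - V_{ij}\ \forall k \neq j)$$

This last expression is a CDF:

$$P_{ij} = \int I(\varepsilon_{ik} - \varepsilon_{ij} < V_{ij} - V_{ik}\ \forall k \neq j)\, f(\varepsilon_i)\, d\varepsilon_i$$

This is a multidimensional ($J$) integral over the domain of $\varepsilon_i$ (e.g., $\mathbb{R}^J$).
Note: Independence implies $f(\varepsilon_i) = f(\varepsilon_{i1}) \times \cdots \times f(\varepsilon_{iJ})$.
Different distributions ⇒ different models:

$\varepsilon_i$ i.i.d. extreme value ⇒ logit (closed form)

$\varepsilon_i$ generalized extreme value ⇒ nested logit (closed form)

$\varepsilon_i$ multivariate normal ⇒ probit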
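As a rough numerical illustration of how an error distribution pins down the choice probabilities, the sketch below approximates $P_{ij}$ by simulation under i.i.d. extreme value errors and compares the result with the logit closed form $e^{V_{ij}}/\sum_k e^{V_{ik}}$. The utilities in `V`, the number of draws, and all function names are illustrative assumptions, not part of the slides.

```python
import math
import random

def gumbel_draw(rng):
    """One draw from the standard (Type I) extreme value distribution."""
    u = rng.random()
    while u <= 0.0:          # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def logit_probs(v):
    """Closed-form logit probabilities exp(V_j) / sum_k exp(V_k)."""
    m = max(v)
    w = [math.exp(x - m) for x in v]
    total = sum(w)
    return [x / total for x in w]

def simulate_probs(v, n_draws=50_000, seed=0):
    """Share of draws in which each alt has the highest utility V_j + eps_j."""
    rng = random.Random(seed)
    counts = [0] * len(v)
    for _ in range(n_draws):
        u = [vj + gumbel_draw(rng) for vj in v]
        counts[u.index(max(u))] += 1
    return [c / n_draws for c in counts]

V = [1.0, 0.5, 0.0]          # hypothetical representative utilities
closed = logit_probs(V)
sim = simulate_probs(V)
for j, (c, s) in enumerate(zip(closed, sim)):
    print(f"alt {j}: closed form = {c:.3f}, simulated = {s:.3f}")
```

The simulated shares should agree with the closed form up to sampling noise; replacing `gumbel_draw` with correlated normal draws would give probit-style probabilities, for which no closed form is available.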

SLIDE 6

Model Identification

Only differences in utility matter; the level of utility doesn’t matter. The choice probability is

$$P_{ij} = \Pr(U_{ij} > U_{ik}\ \forall k \neq j) = \Pr(U_{ij} - U_{ik} > 0\ \forall k \neq j),$$

which depends only on the difference in utility, not its absolute level. Similarly,

$$P_{ij} = \Pr(V_{ij} + \varepsilon_{ij} > V_{ik} + \varepsilon_{ik}\ \forall k \neq j) = \Pr(\varepsilon_{ij} - \varepsilon_{ik} > V_{ik} - V_{ij}\ \forall k \neq j),$$

which also depends only on differences. In general, the only parameters that can be estimated are those that capture differences across alternatives.

SLIDE 7

Alternative-Specific Constants

Assume $V_{ij} = x_{ij}\beta + k_j\ \forall j$.
$k_j$ captures the average effect on utility of all factors not in the model (like the intercept in linear regression).
Including $k_j$ forces $\varepsilon_{ij}$ to have zero mean.
“Only differences matter” ⇒ only the differences $k_j - k_k$ matter, not the absolute level of each.
Normalize one of the constants to zero. With $J$ alternatives, $J - 1$ constants can be estimated.

SLIDE 8

Sociodemographic Variables

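Consider choosing between commuting via bus or car:

$$U_c = \alpha T_c + \beta M_c + \theta_c Y + \varepsilon_c$$
$$U_b = \alpha T_b + \beta M_b + \theta_b Y + k_b + \varepsilon_b$$

where $T$ = commute time, $M$ = commute cost, $Y$ = income. We can only estimate differences: $\theta_c - \theta_b$ (or vice versa). So, either normalize one $\theta$ to 0,

$$U_c = \alpha T_c + \beta M_c + \varepsilon_c$$
$$U_b = \alpha T_b + \beta M_b + \tilde{\theta}_b Y + k_b + \varepsilon_b$$

where $\tilde{\theta}_b = \theta_b - \theta_c$, or interact income with alternative-specific variables:

$$U_c = \alpha T_c + \beta M_c / Y + \varepsilon_c$$
$$U_b = \alpha T_b + \beta M_b / Y + \theta_b Y + k_b + \varepsilon_b$$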

SLIDE 9

Independent Error Terms

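Recall that the choice probability is a $J$-dimensional integral:

$$P_{ij} = \int I(\varepsilon_{ik} - \varepsilon_{ij} < V_{ij} - V_{ik}\ \forall k \neq j)\, f(\varepsilon_i)\, d\varepsilon_i$$

We can write it in terms of a $(J-1)$-dimensional integral:

$$P_{ij} = \Pr(\tilde{\varepsilon}_{ijk} > V_{ik} - V_{ij}\ \forall k \neq j) = \int I(\tilde{\varepsilon}_{ijk} > V_{ik} - V_{ij}\ \forall k \neq j)\, g(\tilde{\varepsilon}_{ij})\, d\tilde{\varepsilon}_{ij}$$

$\tilde{\varepsilon}_{ijk} = \varepsilon_{ij} - \varepsilon_{ik}$ = difference in errors for alts $j$ and $k$

$\tilde{\varepsilon}_{ij} = (\tilde{\varepsilon}_{ij1}, \ldots, \tilde{\varepsilon}_{ijJ})$ = $(J-1)$-dim vector of error differences over all alternatives except $j$.

$g(\tilde{\varepsilon}_{ij})$ is the $(J-1)$-dimensional density of error differences.

Since choice probs can be expressed in terms of $g(\tilde{\varepsilon}_{ij})$, one dimension of $f(\varepsilon_i)$ is not identified and must be normalized.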

SLIDE 10

Scale of Utility is Irrelevant

Multiplying utility by a positive constant doesn’t affect choice. The following two models are equivalent $\forall \lambda > 0$:

$$U^0_{ij} = V_{ij} + \varepsilon_{ij}\ \forall j$$
$$U^1_{ij} = \lambda V_{ij} + \lambda \varepsilon_{ij}\ \forall j$$

Address by normalizing the variance of the error terms.

i.i.d. errors: The following two models are equivalent:

$$U^0_{ij} = x_{ij}\beta + \varepsilon^0_{ij}, \quad \mathrm{Var}(\varepsilon^0_{ij}) = \sigma^2$$
$$U^1_{ij} = x_{ij}(\beta/\sigma) + \varepsilon^1_{ij}, \quad \mathrm{Var}(\varepsilon^1_{ij}) = 1$$

The normalizing constant is important when comparing coeffs across models (e.g., probit and logit) or datasets where scale varies.

Heteroskedastic errors: Normalize one of the variances ⇒ normalizes the variance of the error differences.

Correlated errors: Normalize the variance of one of the error differences.

SLIDE 11

Average Response

Linear model $f$ ⇒ $E[f(x)] = f(E[x])$, so we can insert aggregate or average values into the model, e.g., $\bar{y} = \alpha + \beta \bar{x}$.
Nonlinear model $f$: $E[f(x)] \neq f(E[x])$.
The prob at avg utility can over- or under-estimate the avg prob, depending on where the individual choice probs are (convex or concave portion of curve).

SLIDE 12

Average Marginal Effects

The derivative is small at points a and b, but the derivative at their average is large. Solution: To get the aggregate outcome, average the individual probs. To get the average marginal effect, average the individual MEs (average partial effect, APE).
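The aggregation problem can be made concrete with a small sketch. It assumes a binary logit and two hypothetical agents whose representative utilities are chosen purely for illustration, and compares "evaluate at the average" with "average the individual values" for both the probability and its derivative.

```python
import math

def logistic(v):
    """Binary logit probability for representative utility v."""
    return 1.0 / (1.0 + math.exp(-v))

def logistic_slope(v):
    """Derivative of the logit probability with respect to v."""
    p = logistic(v)
    return p * (1.0 - p)

# Hypothetical representative utilities for two agents, a and b.
utilities = [-2.0, 4.0]
v_bar = sum(utilities) / len(utilities)

prob_at_average = logistic(v_bar)
average_prob = sum(logistic(v) for v in utilities) / len(utilities)

slope_at_average = logistic_slope(v_bar)
average_slope = sum(logistic_slope(v) for v in utilities) / len(utilities)

print(f"P(avg V) = {prob_at_average:.3f}, avg of P = {average_prob:.3f}")
print(f"slope at avg V = {slope_at_average:.3f}, avg slope = {average_slope:.3f}")
```

With these particular numbers both the probability and the marginal effect evaluated at the average exceed the averages of the individual values; other configurations can produce the opposite ranking for the probability.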

SLIDE 13

A Regression Perspective

Consider the binary case: two mutually exclusive outcomes captured by $Y \in \{0, 1\}$:

$$\Pr(Y = 1 | x) = F(x, \beta), \quad \Pr(Y = 0 | x) = 1 - F(x, \beta)$$

Assume $F(x, \beta) = x'\beta$. Then

$$E(Y | x) = 0 \cdot \Pr(Y = 0 | x) + 1 \cdot \Pr(Y = 1 | x) = \Pr(Y = 1 | x) = x'\beta,$$

so the regression model is

$$y = E(Y | x) + (y - E(Y | x)) = x'\beta + \varepsilon$$

SLIDE 14

Two Problems

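1. Heteroskedastic errors

$$\begin{aligned}
V(\varepsilon | x) &= E(\varepsilon^2 | x) - (E(\varepsilon | x))^2 \\
&= E((y - x'\beta)^2 | x) \\
&= E(y^2 - 2 y x'\beta + (x'\beta)^2 | x) \\
&= E(y^2 | x) - 2 E(y | x)\, x'\beta + (x'\beta)^2 \\
&= x'\beta - 2 (x'\beta)^2 + (x'\beta)^2 \\
&= x'\beta (1 - x'\beta)
\end{aligned}$$

The second line uses $E(\varepsilon | x) = 0$ and the fifth uses $y^2 = y$ for binary $y$. (FGLS solves this.)

2. Predicted values are not constrained to [0, 1], which implies

nonsense probabilities

negative variances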
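A minimal sketch of the second problem: the implied error variance $x'\beta(1 - x'\beta)$ turns negative as soon as a fitted value leaves the unit interval. The fitted values below are hypothetical and chosen only to show the sign change.

```python
def lpm_error_variance(fitted):
    """Implied Var(eps | x) = x'b * (1 - x'b) in the linear probability model."""
    return fitted * (1.0 - fitted)

# Hypothetical fitted values x'b; the first and last lie outside [0, 1].
fitted_values = [-0.2, 0.3, 0.5, 1.1]
variances = [lpm_error_variance(f) for f in fitted_values]

for f, v in zip(fitted_values, variances):
    print(f"fitted = {f:+.2f}   implied variance = {v:+.3f}")
```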

SLIDE 15

Solution to Unbounded Predictions

Choose $F$ such that $\lim_{z \to \infty} F(z) = 1$ and $\lim_{z \to -\infty} F(z) = 0$. Examples:

$F(z) = \Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt$ (Probit, symmetric)

$F(z) = \Lambda(z) = \dfrac{e^z}{1 + e^z}$ (Logit, symmetric)

$F(z) = G(z) = \exp[-\exp(-z)]$ (Gumbel, asymmetric)

$F(z) = L(z) = 1 - \exp[-\exp(z)]$ (Complementary log-log, asymmetric)

where $\phi$ is the standard normal density, $\phi(t) = (1/2\pi)^{0.5} \exp(-0.5 t^2)$.
Little guidance on choice.
Asymmetry refers to $\varepsilon$, not $Y$.
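The four candidate functions can be compared directly. The sketch below uses the standard textbook forms of each CDF (the normal CDF is written with the error function from Python's `math` module); the evaluation points are arbitrary and only meant to show the limits and the symmetry property $F(-z) = 1 - F(z)$.

```python
import math

def probit(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit(z):
    """Logistic CDF e^z / (1 + e^z)."""
    return 1.0 / (1.0 + math.exp(-z))

def gumbel(z):
    """Gumbel (Type I extreme value) CDF."""
    return math.exp(-math.exp(-z))

def cloglog(z):
    """Complementary log-log CDF."""
    return 1.0 - math.exp(-math.exp(z))

links = {"probit": probit, "logit": logit, "gumbel": gumbel, "cloglog": cloglog}
symmetric = {}
for name, F in links.items():
    # A CDF is symmetric about zero when F(-z) = 1 - F(z).
    symmetric[name] = abs(F(-1.0) - (1.0 - F(1.0))) < 1e-12
    print(f"{name:8s} F(-10)={F(-10.0):.4f}  F(0)={F(0.0):.4f}  "
          f"F(10)={F(10.0):.4f}  symmetric={symmetric[name]}")
```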

SLIDE 16

Index Model

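Consumer weighs unobservable marginal costs and benefits of a decision:

$$y^* = x'\beta + \varepsilon$$

where $x'\beta$ is the index function and the observation equation is

$$y = 1 \text{ if } y^* > 0, \quad y = 0 \text{ if } y^* \le 0$$

Note:

Variance of $\varepsilon$ is unidentified (data depend only on sign): $y^* = x'\beta + \sigma\varepsilon \iff y^*/\sigma = x'(\beta/\sigma) + \varepsilon$

Zero threshold is irrelevant. Consider a threshold $a$ and an index with intercept $\alpha$: $\Pr(y^* > a | x) = \Pr((\alpha - a) + x'\beta + \varepsilon > 0 | x)$, so the threshold is absorbed into the intercept.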

See Brock and Durlauf (2000) for examples.

SLIDE 17

Specification Concerns

Yatchew and Griliches (1984) show that, unlike in linear regression:

Omitted variables: even if the omitted variable is uncorrelated with the included ones, the estimated coefficients will be inconsistent

Heteroskedasticity results in inconsistent MLEs and an inappropriate covariance matrix

MLE is also inconsistent if

there is unmeasured heterogeneity

the functional form of the index is nonlinear but assumed linear

the distributional assumption (e.g., normal, logit) is incorrect

Punchline: specification errors are more serious in nonlinear setting


SLIDE 18

Introduction Logit Generalized Extreme Value (GEV) Choice Probabilities Power & Limitations of Logit Interpreting and Using the Model Estimation and Inference

Error Distribution

Model same as before: $U_{ij} = V_{ij} + \varepsilon_{ij}\ \forall j$. Logit assumes that $\varepsilon_{ij}$ is i.i.d. extreme value (a.k.a. Gumbel, Type I extreme value) across agents $i$ and alternatives $j$.

Density is $f(\varepsilon_{ij}) = e^{-\varepsilon_{ij}} e^{-e^{-\varepsilon_{ij}}}$.
CDF is $F(\varepsilon_{ij}) = e^{-e^{-\varepsilon_{ij}}}$.
Variance of this distribution is $\pi^2/6$, which implicitly normalizes the scale of utility.
Mean $\neq 0$, but this is irrelevant since only differences in utility matter and the difference of two random vars with the same mean has mean 0.
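The moments of the extreme value density can be checked numerically. The sketch below integrates the standard Gumbel density with a plain trapezoid rule; the integration limits and step count are assumptions chosen so that the omitted tails are negligible. The mean of the standard Gumbel is the Euler-Mascheroni constant (about 0.5772) and the variance is $\pi^2/6$.

```python
import math

def gumbel_pdf(e):
    """Standard Gumbel density exp(-e) * exp(-exp(-e))."""
    return math.exp(-e - math.exp(-e))

def integrate(func, lo=-10.0, hi=40.0, steps=50_000):
    """Trapezoid rule on [lo, hi]; the density is negligible outside."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

mass = integrate(gumbel_pdf)
mean = integrate(lambda e: e * gumbel_pdf(e))
variance = integrate(lambda e: (e - mean) ** 2 * gumbel_pdf(e))

print(f"total mass = {mass:.6f}")
print(f"mean       = {mean:.6f}")
print(f"variance   = {variance:.6f}  (pi^2/6 = {math.pi ** 2 / 6:.6f})")
```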

SLIDE 19

Error Difference Distribution

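If $\varepsilon_{ij}$ is i.i.d. extreme value, then $\tilde{\varepsilon}_{ijk} = \varepsilon_{ij} - \varepsilon_{ik}$ is logistic:

CDF is $F(\tilde{\varepsilon}_{ijk}) = \dfrac{e^{\tilde{\varepsilon}_{ijk}}}{1 + e^{\tilde{\varepsilon}_{ijk}}}$

This is the distribution for binary logit. Similar to normal but with fatter tails.

Key assumption is independence of errors. OK if the model is “well-specified.” If errors are correlated, then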

use different model to allow for corr

respecify representative utility V to capture corr

consider model only as approximation

Violation of indep assumption less important for estimating avg preferences, than for forecasting substitution patterns.

SLIDE 20

Logit Choice Probabilities I

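McFadden (1974):

$$P_{ij} = \Pr(V_{ij} + \varepsilon_{ij} > V_{ik} + \varepsilon_{ik}\ \forall k \neq j) = \Pr(\varepsilon_{ik} < V_{ij} - V_{ik} + \varepsilon_{ij}\ \forall k \neq j)$$

Fix $\varepsilon_{ij}$; this is the CDF of each $\varepsilon_{ik}$ evaluated at $V_{ij} - V_{ik} + \varepsilon_{ij}$. Independence of the $\varepsilon$’s ⇒ the conditional probability is the product of these CDFs:

$$P_{ij} | \varepsilon_{ij} = \prod_{k \neq j} e^{-e^{-(\varepsilon_{ij} + V_{ij} - V_{ik})}}$$

This is the conditional joint CDF of all $\varepsilon$’s except $\varepsilon_{ij}$, the conditioning variable.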

SLIDE 21

Logit Choice Probabilities II

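To get the unconditional probability, integrate out $\varepsilon_{ij}$:

$$P_{ij} = \int \left( \prod_{k \neq j} e^{-e^{-(\varepsilon_{ij} + V_{ij} - V_{ik})}} \right) e^{-\varepsilon_{ij}} e^{-e^{-\varepsilon_{ij}}}\, d\varepsilon_{ij} = \frac{e^{V_{ij}}}{\sum_k e^{V_{ik}}}$$

If representative utility $V$ is linear in parameters, then

$$P_{ij} = \frac{e^{x_{ij}\beta}}{\sum_k e^{x_{ik}\beta}}$$

Choice probs sum to 1. The relation between $P$ and $V$ is sigmoid shaped.

The point at which an increase in $V$ has the largest effect on the prob of an alternative being chosen is where prob ≈ 0.5. Small changes tip the balance.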

SLIDE 22

Coefficients

Consider the choice between gas and electric heating systems:

$$U_g = \beta_1 PP_g + \beta_2 OC_g$$
$$U_e = \beta_1 PP_e + \beta_2 OC_e$$

where $PP$ = purchase price and $OC$ = operating cost. The signs of the coefficients indicate the effect on utility (expect $\beta_1$ and $\beta_2$ < 0). The ratio of coefficients, $\beta_2/\beta_1$, is the willingness to pay for operating-cost reductions. E.g., $\beta_2 = -1.14$ and $\beta_1 = -0.20$ ⇒ $\beta_2/\beta_1 = 5.70$: pay \$5.70 more for a system whose annual OCs are \$1 less. To see this, take the total derivative of utility and set it to 0:

$$dU = \beta_1\, dPP + \beta_2\, dOC = 0 \implies \partial PP / \partial OC = -\beta_2/\beta_1$$
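The willingness-to-pay arithmetic can be reproduced in a few lines. The coefficient values come from the slide; the variable names are illustrative. The sketch also confirms that paying the computed premium for a \$1 operating-cost reduction leaves utility unchanged.

```python
beta_pp = -0.20   # coefficient on purchase price (value from the slide)
beta_oc = -1.14   # coefficient on annual operating cost (value from the slide)

# Along an indifference curve, dU = beta_pp * dPP + beta_oc * dOC = 0,
# so dPP = -(beta_oc / beta_pp) * dOC.
d_oc = -1.0                               # operating cost falls by $1 per year
d_pp = -(beta_oc / beta_pp) * d_oc        # offsetting rise in purchase price
d_utility = beta_pp * d_pp + beta_oc * d_oc

print(f"extra purchase price accepted: ${d_pp:.2f}")
print(f"net change in utility: {d_utility:.2e}")
```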

SLIDE 23

Multinomial Logit

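Reconsider the choice probabilities for the heating example on the previous slide:

$$P_g = \frac{e^{\beta_1 PP_g + \beta_2 OC_g}}{e^{\beta_1 PP_g + \beta_2 OC_g} + e^{\beta_1 PP_e + \beta_2 OC_e}} = \frac{1}{1 + e^{(\beta_1 PP_e + \beta_2 OC_e) - (\beta_1 PP_g + \beta_2 OC_g)}}$$

Consider a third option, oil heating:

$$P_g = \frac{e^{\beta_1 PP_g + \beta_2 OC_g}}{e^{\beta_1 PP_g + \beta_2 OC_g} + e^{\beta_1 PP_e + \beta_2 OC_e} + e^{\beta_1 PP_o + \beta_2 OC_o}}$$

In binomial or multinomial logit, if we have variables that are constant across alternatives, we must normalize one of the alternative-specific coefficients to zero.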

SLIDE 24

Scale Invariance

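Logit assumes type I extreme value errors with variance $\pi^2/6$. Suppose utility is $U^*_{ij} = V_{ij} + \varepsilon^*_{ij}$, where $\mathrm{Var}(\varepsilon^*_{ij}) = \sigma^2(\pi^2/6)$.

Scale irrelevance ⇒ divide by $\sigma$ without changing behavior: $U_{ij} = V_{ij}/\sigma + \varepsilon_{ij}$, where $\varepsilon_{ij} = \varepsilon^*_{ij}/\sigma$ and $\mathrm{Var}(\varepsilon_{ij}) = \pi^2/6$. The choice prob is

$$P_{ij} = \frac{e^{V_{ij}/\sigma}}{\sum_k e^{V_{ik}/\sigma}} = \frac{e^{(\beta/\sigma) x_{ij}}}{\sum_k e^{(\beta/\sigma) x_{ik}}}$$

Estimated coeffs = original params divided by $\sigma$ ⇒ be careful when interpreting magnitudes (Ben-Akiva & Morikawa ’90, Swait & Louviere ’93).
Can’t identify the scale $\sigma$ ⇒ normalize to 1.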

SLIDE 25

Applicability of Logit Models

1. Logit can represent systematic (i.e., related to observables) taste variation, not random (i.e., unrelated to observables) taste variation.

2. Logit implies proportional substitution across alternatives (IIA).

3. Logit can handle state dependence and repeated choice but can’t handle serially correlated errors.

SLIDE 26

Taste Variation - Observables

Consider car choice:

$$U_{ij} = \alpha_i SR_j + \beta_i PP_j + \varepsilon_{ij}$$

where $SR$ = shoulder room and $PP$ = purchase price. Note that the parameters $(\alpha_i, \beta_i)$ are agent-specific. E.g., $\alpha_i = \rho M_i$ and $\beta_i = \theta / I_i$, where $M$ = # of family members and $I$ = income. Then

$$U_{ij} = \rho (M_i SR_j) + \theta (PP_j / I_i) + \varepsilon_{ij}$$

As long as taste varies with observables, no problem.

SLIDE 27

Taste Variation - Unobservables

Now let the parameters $(\alpha_i, \beta_i)$ vary with unobservables:

$$\alpha_i = \rho M_i + \mu_i, \quad \beta_i = \theta / I_i + \eta_i$$

where $\mu_i$ = value of shoulder room unobserved by the researcher (e.g., size of people, frequency of traveling together), and $\eta_i$ = value of purchase price unobserved by the researcher. Now

$$U_{ij} = \rho (M_i SR_j) + \theta (PP_j / I_i) + \tilde{\varepsilon}_{ij}$$

where $\tilde{\varepsilon}_{ij} = \mu_i SR_j + \eta_i PP_j + \varepsilon_{ij}$. The new error term can’t be i.i.d.:

$\mu_i$ and $\eta_i$ ⇒ correlation across alts.

$SR_j$ and $PP_j$ ⇒ heteroskedasticity across alts.

Solution: Probit or Mixed Logit.

SLIDE 28

Substitution Patterns

Increase in prob of choosing one alt ⇒ decrease in prob of choosing other alternatives (probs sum to 1). Logit model implies a specific substitution pattern. Can be seen as

restriction on the ratios of probs, and/or

restriction on the cross-elasticities of probs.

SLIDE 29

Independence of Irrelevant Alternatives (IIA)

Ratio of logit probs for any 2 alts:

$$\frac{P_{ij}}{P_{ik}} = \frac{e^{V_{ij}} / \sum_l e^{V_{il}}}{e^{V_{ik}} / \sum_l e^{V_{il}}} = \frac{e^{V_{ij}}}{e^{V_{ik}}} = e^{V_{ij} - V_{ik}}$$

The ratio does not depend on any other alts. IIA can be viewed as a property of a properly specified model.

SLIDE 30

IIA: Red Bus-Blue Bus

Travel by car or blue bus: $P_c = P_{bb} = 1/2$ ⇒ $P_c / P_{bb} = 1$.
Introduce an identical red bus ⇒ $P_{rb} = P_{bb}$ ⇒ $P_{rb} / P_{bb} = 1$.
IIA ⇒ $P_c / P_{bb} = 1$ still, since the new alt has no effect on the ratio.
The only probs satisfying $P_c / P_{bb} = 1$ and $P_{rb} / P_{bb} = 1$ are $P_c = P_{bb} = P_{rb} = 1/3$.
In real life, this is silly. $P_c$ shouldn’t change, and $P_{rb} = P_{bb}$ if they’re identical. So, $P_c = 1/2$ and $P_{rb} = P_{bb} = 1/4$.

IIA leads to overestimating bus and underestimating car.
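The red bus-blue bus arithmetic follows directly from the logit formula. In this sketch all three alternatives are given the same representative utility (set to 0 for convenience, an arbitrary choice), and the alternative labels are illustrative.

```python
import math

def logit_probs(utilities):
    """Logit choice probabilities for a dict {alternative: V}."""
    weights = {alt: math.exp(v) for alt, v in utilities.items()}
    total = sum(weights.values())
    return {alt: w / total for alt, w in weights.items()}

before = logit_probs({"car": 0.0, "blue_bus": 0.0})
after = logit_probs({"car": 0.0, "blue_bus": 0.0, "red_bus": 0.0})

ratio_before = before["car"] / before["blue_bus"]
ratio_after = after["car"] / after["blue_bus"]

print("before:", before)
print("after: ", after)
print("car / blue bus ratio before and after:", ratio_before, ratio_after)
```

The ratio of car to blue bus is unchanged, so logit pushes the car share down to 1/3 rather than leaving it at 1/2.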

SLIDE 31

Proportional Substitution

Same idea in terms of cross-elasticities of logit probs. How does a change in the characteristics of alt $k$ affect the probs of the other alts? The elasticity of $P_{ij}$ wrt a variable $z_{ik}$ in the representative utility of alt $k$ (see below for the derivation) is

$$E_{j z_{ik}} = \frac{\partial P_{ij}}{\partial z_{ik}} \frac{z_{ik}}{P_{ij}} = -\beta_z z_{ik} P_{ik}$$

Note the cross-elasticity is the same $\forall j \neq k$ since $j$ does not enter the formula. This means an improvement in the attributes of one alt reduces the probs of all other alts by the same %. I.e., an improvement in one alt draws proportionately from all other alts (a.k.a. proportionate shifting, a manifestation of IIA).

SLIDE 32

Advantages & Tests of IIA

With lots of alts, we can focus on subset. Two types of tests:

Hausman and McFadden (1984): Parameter ests from a subset of alts are same as parameter ests from full set of alts.

McFadden (1987) and Train et al. (1989)

SLIDE 33

Panel Data

If unobserved factors affecting agents are independent over repeated choices, logit is kosher.
Dynamics related to observed factors are easily accommodated:

State dependence, where the agent’s past choices affect the current choice: $V_{ijt} = \alpha y_{ij}(t-1) + \beta x_{ijt}$, or $V_{ijt} = \alpha \sum_{s \lt t} y_{ij}(s) + \beta x_{ijt}$, where $y_{ij}(t) = 1$ if alt $j$ is chosen in period $t$.

Lagged response to changes in attributes: $V_{ijt} = \beta x_{ij(t-1)}$, or $V_{ijt} = \beta_1 x_{ij(t-1)} + \beta_2 x_{ij(t-2)} + \cdots$

Dynamics related to unobserved factors cannot be handled because of the independence assumption.

SLIDE 34

Dynamic Models

$$y_{it} = I(x'_{it}\beta + \alpha_i + \gamma y_{i,t-1} + \varepsilon_{it} > 0)$$

Lagged effects or persistence arise from three sources:

serial correlation in $\varepsilon$

heterogeneity, $\alpha_i$ (some individuals are inherently more likely to choose or experience the event at all times)

state dependence, $y_{i,t-1}$ (occurrence of a past event or decision influences the prob of the current event or decision)

(see Heckman (1978, 1981)) Example: Choice of restaurant

Someone barfed outside Taco Bell yesterday, turning me off

I don’t like fast food

I ate at Taco Bell yesterday and don’t want the same thing twice

Initial conditions have a big impact on the entire path in short panels

SLIDE 35

Consumer Surplus

An agent’s consumer surplus is defined as the utility, in \$, that the person receives from a choice situation. Consumer surplus is

$$CS_i = (1/\alpha_i) \max_j (U_{ij}\ \forall j)$$

where $\alpha_i = dU_i/dY_i$ is the marginal utility of income and $Y_i$ is the income of agent $i$. (Division by the MU of income translates utility into \$.) We observe $V$, not $U$, so we can only compute expected CS:

$$E(CS_i) = (1/\alpha_i)\, E_{\varepsilon}[\max_j (U_{ij}\ \forall j)]$$

If $U$ is linear in income, then $\alpha_i$ is constant wrt income and

$$E(CS_i) = \frac{1}{\alpha_i} \underbrace{\ln\left( \sum_{j=1}^{J} e^{V_{ij}} \right)}_{\text{Log-Sum Term}} + \underbrace{C}_{\text{Unknown Constant}}$$
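Because the constant $C$ is unknown, the log-sum expression is mostly useful for comparing two choice situations, where $C$ cancels. The sketch below computes such a difference under assumed values: the marginal utility of income `alpha` and the representative utilities are hypothetical. As a sanity check, making every alternative \$1 cheaper (raising each $V$ by `alpha`) should raise expected surplus by exactly \$1.

```python
import math

def log_sum(utilities):
    """Log-sum term ln(sum_j exp(V_j))."""
    return math.log(sum(math.exp(v) for v in utilities))

def change_in_expected_cs(v_before, v_after, alpha):
    """(1/alpha) * [log-sum after - log-sum before]; the constant C cancels."""
    return (log_sum(v_after) - log_sum(v_before)) / alpha

alpha = 0.5                       # hypothetical marginal utility of income
v0 = [1.0, 0.2, -0.3]             # hypothetical representative utilities

# Scenario 1: every alternative becomes $1 cheaper, so each V rises by alpha.
v1 = [v + alpha * 1.0 for v in v0]
gain_uniform = change_in_expected_cs(v0, v1, alpha)

# Scenario 2: a new alternative with V = 0.5 is added to the choice set.
v2 = v0 + [0.5]
gain_new_alt = change_in_expected_cs(v0, v2, alpha)

print(f"gain from $1 cost cut on all alts: {gain_uniform:.4f}")
print(f"gain from adding an alternative:   {gain_new_alt:.4f}")
```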

SLIDE 36

Policy Analysis

$E(CS_i)$ = avg CS in the subpopulation of people with the same $V$’s as person $i$.
Total CS in the population is a weighted sum of $E(CS_i)$ over the sample.
The change in CS due to a change in alternatives is

$$\Delta E(CS_i) = \frac{1}{\alpha_i} \left[ \ln\left( \sum_{j=1}^{J^1} e^{V^1_{ij}} \right) - \ln\left( \sum_{j=1}^{J^0} e^{V^0_{ij}} \right) \right]$$

where the 0 and 1 superscripts refer to pre- and post-change.
To get $\alpha_i$, can use the coef on a price or cost variable. E.g., the cost coef, $\beta$, should be < 0. $-\beta$ is the amount utility rises due to a \$1 decrease in cost, which is equivalent to a \$1 increase in income since the person can spend the dollar saved just as if he had received an extra \$1 in income. So $-\beta$ is the increase in utility from a \$1 increase in income (i.e., the MU of income).
If the MU of income is a function of income, see, e.g., McFadden (1999), Karlstrom (2000).

SLIDE 37

Own Alternative Derivatives

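Change in prob that agent $i$ chooses alt $j$ given a change in an observed factor $z_{ij}$ entering that alt is

$$\frac{\partial P_{ij}}{\partial z_{ij}} = \frac{\partial \left( e^{V_{ij}} / \sum_k e^{V_{ik}} \right)}{\partial z_{ij}} = \frac{\partial V_{ij}}{\partial z_{ij}} P_{ij} (1 - P_{ij})$$

If $V$ is linear in params, then $\partial V_{ij} / \partial z_{ij} = \beta$.
Derivative is largest when $P_{ij} = 0.5$ and smaller as $P_{ij}$ approaches 0 or 1.
Intuition: Change matters most when choice probs indicate a high degree of uncertainty.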

SLIDE 38

Cross Alternative Derivatives

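Change in prob that agent $i$ chooses alt $j$ given a change in an observed factor $z_{ik}$ entering a different alt is

$$\frac{\partial P_{ij}}{\partial z_{ik}} = \frac{\partial \left( e^{V_{ij}} / \sum_l e^{V_{il}} \right)}{\partial z_{ik}} = -\frac{\partial V_{ik}}{\partial z_{ik}} P_{ij} P_{ik}$$

If $V$ is linear in params, then $\partial V_{ik} / \partial z_{ik} = \beta$.
If $z$ is desirable so that $\beta > 0$, then raising $z$ increases the prob of that alt but lowers the prob of other alts.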

SLIDE 39

Elasticities

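Elasticity of $P_{ij}$ wrt $z_{ij}$ entering the utility of alt $j$ is

$$E_{j z_{ij}} = \frac{\partial P_{ij}}{\partial z_{ij}} \frac{z_{ij}}{P_{ij}} = \frac{\partial V_{ij}}{\partial z_{ij}} z_{ij} (1 - P_{ij})$$

Elasticity of $P_{ij}$ wrt $z_{ik}$ entering the utility of alt $k$ is

$$E_{j z_{ik}} = \frac{\partial P_{ij}}{\partial z_{ik}} \frac{z_{ik}}{P_{ij}} = -\frac{\partial V_{ik}}{\partial z_{ik}} z_{ik} P_{ik}$$

The cross-elasticity is the same $\forall j \neq k$ ⇒ changing an attribute of alt $k$ changes the probs of all other alts by the same %.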
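The derivative and elasticity formulas can be checked against finite differences. The sketch assumes a utility that is linear in a single attribute, $V_{ij} = \beta z_{ij}$, with a hypothetical coefficient and attribute values; `k` is the alternative whose attribute is perturbed.

```python
import math

def logit_probs(v):
    """Logit probabilities for a list of representative utilities."""
    w = [math.exp(x) for x in v]
    total = sum(w)
    return [x / total for x in w]

beta = 0.8                        # hypothetical coefficient on z
z = [1.0, 2.0, 0.5]               # hypothetical attribute of each alt

def probs_from_z(z_values):
    return logit_probs([beta * zj for zj in z_values])

p = probs_from_z(z)
k = 1                             # alt whose attribute changes

# Analytic derivatives of P_j wrt z_k: beta*P_k*(1-P_k) for j = k,
# and -beta*P_j*P_k for j != k.
analytic = [beta * p[j] * ((1.0 if j == k else 0.0) - p[k]) for j in range(len(z))]

# Central finite differences as a check.
h = 1e-6
z_up = list(z)
z_up[k] += h
z_dn = list(z)
z_dn[k] -= h
numeric = [(u - d) / (2 * h) for u, d in zip(probs_from_z(z_up), probs_from_z(z_dn))]

# Elasticities of P_j wrt z_k.
elasticities = [analytic[j] * z[k] / p[j] for j in range(len(z))]

for j in range(len(z)):
    print(f"alt {j}: dP/dz_k analytic = {analytic[j]:+.6f}, "
          f"numeric = {numeric[j]:+.6f}, elasticity = {elasticities[j]:+.4f}")
```

The two cross-elasticities (alts 0 and 2) coincide, which is the proportional substitution property.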

SLIDE 40

Derivatives and Elasticities: Details

Derivatives and elasticities are functions of the data (that’s the $P_{ij}$, the $i$ subscript) ⇒ choose a value at which to evaluate them (e.g., mean, median). But this choice can have a big impact on the deriv/elast.
APE = average of the MEs over all observations. Need to estimate its SE, which is valid only in “large” samples.
For dummy variables, $d$, compute the difference in probabilities at $d = 0$ and $d = 1$, holding all other variables fixed (at mean, median, etc.).
Can use this technique for continuous variables to estimate the change in prob for a large movement in $x$ (e.g., 25th to 75th percentile, 1 SD below mean to 1 SD above, etc.).
Do not multiply the marginal effect (valid locally) by the SD (a big change).

SLIDE 41

Maximum Likelihood

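Assume a random sample and exogenous explanatory vars. The probability of agent $i$ choosing the alt he actually chose is

$$\prod_j (P_{ij})^{y_{ij}}$$

where $y_{ij} = 1$ if person $i$ chose alt $j$, 0 otherwise. This reduces to the prob of the chosen alt since $y_{ij} = 0\ \forall$ alts other than the chosen one. (Log) likelihood function:

$$L(\beta) = \prod_{i=1}^{N} \prod_j (P_{ij})^{y_{ij}}, \qquad \ln L(\beta) = \sum_{i=1}^{N} \sum_j y_{ij} \ln P_{ij}$$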

SLIDE 42

MLE Interpretations

Can interpret the MLEs in several ways:

It can be shown that the MLEs of the $\beta$s are those that make the predicted avg of each explanatory variable equal to the observed avg in the sample.

This property ⇒ the MLEs of the alt-specific constants make the predicted share of each alt equal to the share of agents who chose it. I.e., predicted shares = actual shares.

MLEs are values βs that make residuals (yij − Pij) uncorrelated with explanatory variables.

SLIDE 43

Binary Case: Likelihood Function

Recall the log likelihood: $\sum_{i=1}^{N} \sum_j y_{ij} \ln P_{ij}$.

For the binary case, we have $j \in \{0, 1\}$ and, from the regression approach earlier,

$$P_{i1} = \Pr(Y_i = 1 | X) = F(X, \beta), \quad P_{i0} = \Pr(Y_i = 0 | X) = 1 - F(X, \beta)$$

For the binary case, the log likelihood is

$$\ln L = \ln \Pr(Y_1 = y_1, \ldots, Y_n = y_n | x) = \sum_{i=1}^{n} \left\{ y_i \ln F(x_i'\beta) + (1 - y_i) \ln[1 - F(x_i'\beta)] \right\}$$

Note: Symmetric dist. ⇒ $1 - F(x_i'\beta) = F(-x_i'\beta)$

Note: Let $q_i = 2y_i - 1$ ⇒ for a symmetric dist., $\ln L = \sum_i \ln F(q_i x_i'\beta)$
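The two ways of writing the binary log likelihood can be compared numerically. The sketch below assumes a logit $F$ (which is symmetric) and uses a small hypothetical dataset and parameter vector; it evaluates the likelihood at a given $\beta$ rather than maximizing it.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def index(beta, xi):
    """Linear index x_i' beta."""
    return sum(b * x for b, x in zip(beta, xi))

def loglik_standard(beta, X, y):
    """sum_i y_i ln F(x_i'b) + (1 - y_i) ln[1 - F(x_i'b)]."""
    total = 0.0
    for xi, yi in zip(X, y):
        f = logistic(index(beta, xi))
        total += yi * math.log(f) + (1 - yi) * math.log(1.0 - f)
    return total

def loglik_q_form(beta, X, y):
    """sum_i ln F(q_i x_i'b) with q_i = 2 y_i - 1 (valid for symmetric F)."""
    return sum(math.log(logistic((2 * yi - 1) * index(beta, xi)))
               for xi, yi in zip(X, y))

# Hypothetical data: a constant and one regressor.
X = [(1.0, 0.5), (1.0, -1.2), (1.0, 2.0), (1.0, 0.1)]
y = [1, 0, 1, 0]
beta = (0.2, 0.9)

ll_standard = loglik_standard(beta, X, y)
ll_q = loglik_q_form(beta, X, y)
print(f"standard form: {ll_standard:.6f}")
print(f"q form:        {ll_q:.6f}")
```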

SLIDE 44

Binary Case: Model Fit

Lots of measures, little guidance

χ2 test of slope coefficient significance (like a regression F-test)

Pseudo-R2 = 1 − (lnL/lnL0) where ln L = log likelihood of model and ln L0 = log likelihood of constant only model (not same as R2 from linear regression)

2 x 2 table of percentage hits and misses:

                  Predicted Value
Observed Value    Ŷ = 0     Ŷ = 1
Y = 0
Y = 1

where Ŷ = 1 if F̂ > F* and F* is a threshold (e.g., 50%).
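A minimal sketch of two of these fit measures, the pseudo-R² and the hit/miss table. The outcomes and fitted probabilities are hypothetical; the constant-only log likelihood uses the fact that a constant-only binary model fits every observation with the sample share of ones.

```python
import math

def log_likelihood(y, p):
    """Binary log likelihood for outcomes y and fitted probabilities p."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
               for yi, pi in zip(y, p))

y = [1, 0, 1, 1, 0, 0, 1, 0]                          # hypothetical outcomes
p_hat = [0.8, 0.3, 0.6, 0.4, 0.2, 0.7, 0.9, 0.1]      # hypothetical fitted probs

share = sum(y) / len(y)                # fitted prob in the constant-only model
lnL = log_likelihood(y, p_hat)
lnL0 = log_likelihood(y, [share] * len(y))
pseudo_r2 = 1.0 - lnL / lnL0

threshold = 0.5                        # F* in the slide's notation
table = {(obs, pred): 0 for obs in (0, 1) for pred in (0, 1)}
for yi, pi in zip(y, p_hat):
    table[(yi, int(pi > threshold))] += 1

print(f"lnL = {lnL:.4f}, lnL0 = {lnL0:.4f}, pseudo-R2 = {pseudo_r2:.4f}")
for obs in (0, 1):
    print(f"Y = {obs}:  predicted 0 -> {table[(obs, 0)]},  predicted 1 -> {table[(obs, 1)]}")
```

Dividing each cell by the number of observations gives the percentages referred to in the slide.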

SLIDE 45

Naming Convention: “Multinomial Logit”

The multinomial logit often refers to models in which the data are individual specific (no variation across alts). Since only differences across alts matter, and individual variables don’t vary across alts, we can’t identify coefficients on these vars:

$$P_{ij} = \frac{e^{x_i \beta}}{\sum_{k=1}^{J} e^{x_i \beta}} = \frac{1}{J}$$

Solution: Interact these vars with dummies that vary across alts. Equivalent to letting the parameters vary across alts:

$$P_{ij} = \frac{e^{x_i \beta_j}}{\sum_k e^{x_i \beta_k}}$$

SLIDE 46

Naming Convention: “Conditional Logit”

The conditional logit often refers to models in which the data vary across agents and alts. We can restrict the coefs to be constant across alts:

$$P_{ij} = \frac{e^{x_{ij}\beta}}{\sum_k e^{x_{ik}\beta}}$$

Derivative wrt $x_{im}$:

$$\frac{\partial P_{ij}}{\partial x_{im}} = [P_{ij} (I(j = m) - P_{im})]\, \beta, \quad m = 1, \ldots, J$$

Elasticities. The effect of attribute $k$ of choice $m$ on $P_{ij}$:

$$\frac{\partial \ln P_{ij}}{\partial \ln x_{mk}} = x_{mk} [I(j = m) - P_{im}]\, \beta_k$$

The distinction between multinomial and conditional is artificial.


SLIDE 47

Introduction Logit Generalized Extreme Value (GEV) Nested Logit Extensions

Introduction

Logit assumes IIA, imposing proportional substitution.
GEV models relax IIA, allowing for a variety of substitution patterns.
Key assumption: the unobserved portions of utility for all alts are jointly GEV, allowing for correlation over alts.
Examples of GEV models:

Nested Logit

Paired Combinatorial Logit (PCL)

Generalized Nested Logit (GNL)

Advantage of GEV = choice probs in closed form.

SLIDE 48

Substitution Patterns

Nested logit works when set of alts can be partitioned into subsets (i.e., nests):

IIA holds within a nest. (I.e., for any 2 alts in the same nest, the ratio of probs is indep of the attributes or existence of all other alts.)

IIA doesn’t hold across nests. (I.e., the ratio of probs can depend on attributes of other alts in the two nests.)

E.g., Choices = {drive alone, carpool, bus, rail}. How would probs change if one choice were removed?

Prob with alternative removed (proportional change in parentheses):

Alt       Orig   Alone        Carpool       Bus          Rail
Alone     .40    –            .45 (+.125)   .52 (+.30)   .48 (+.20)
Carpool   .10    .20 (+1.0)   –             .13 (+.30)   .12 (+.20)
Bus       .30    .48 (+.60)   .33 (+.10)    –            .40 (+.33)
Rail      .20    .32 (+.60)   .22 (+.10)    .35 (+.70)   –

SLIDE 49

Substitution Patterns (Con’t)

To determine the partition, look at how probs change when removing an option.

Note:

Bus and Rail probs always change by the same proportion

Alone and Carpool probs always change by the same proportion

Note: Remove Alone and the prob of Carpool rises proportionately more (1.0) than the prob of Bus (.60) or Rail (.60).
IIA holds within a nest but not across nests. Remove an alt outside the nest and the probs of alts within the nest all change by the same proportion.

SLIDE 50

Choice Probabilities I

Partition alts into K nonoverlapping subsets (nests): B1, ..., BK. Utility is still: Uij = Vij + εij. Nested logit comes from the assumption that εi = (εi1, ..., εiJ) has joint cumulative distribution

$$\exp\left( -\sum_{k=1}^{K} \left( \sum_{j \in B_k} e^{-\varepsilon_{ij}/\lambda_k} \right)^{\lambda_k} \right)$$

Marginal dist of each εij is univariate extreme value. εij are correlated within nests; λk is a measure of the degree of independence among alts in nest k. 1 − λk is an indicator of correlation among alts in nest k. λk = 1 ∀k ⇒ standard multinomial logit.

SLIDE 51

Choice Probabilities II

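Choice probability for alternative j in nest Bk:

$$P_{ij} = \frac{e^{V_{ij}/\lambda_k} \left( \sum_{s \in B_k} e^{V_{is}/\lambda_k} \right)^{\lambda_k - 1}}{\sum_{l=1}^{K} \left( \sum_{s \in B_l} e^{V_{is}/\lambda_l} \right)^{\lambda_l}}$$

Denominator is just a sum over all nests of the sum over all alts in a nest. Consider two alts, j ∈ Bk and m ∈ Bl:

$$\frac{P_{ij}}{P_{im}} = \frac{e^{V_{ij}/\lambda_k} \left( \sum_{s \in B_k} e^{V_{is}/\lambda_k} \right)^{\lambda_k - 1}}{e^{V_{im}/\lambda_l} \left( \sum_{s \in B_l} e^{V_{is}/\lambda_l} \right)^{\lambda_l - 1}}$$

If the two alts are in the same nest (k = l):

$$\frac{P_{ij}}{P_{im}} = \frac{e^{V_{ij}/\lambda_k}}{e^{V_{im}/\lambda_l}}$$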
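A minimal numerical sketch of the choice probability formula, using NumPy. The function name nested_logit_probs, the utilities, the nest structure and the λ values are all illustrative assumptions, not taken from the slides.

```python
import numpy as np

def nested_logit_probs(V, nests, lam):
    """Nested logit choice probabilities for one agent.

    V     : 1-D array of observed utilities V_ij, one per alternative
    nests : list of lists of alternative indices (nonoverlapping, exhaustive)
    lam   : one lambda_k per nest
    """
    V = np.asarray(V, dtype=float)
    P = np.zeros_like(V)
    # Nest-level sums: sum over s in B_k of exp(V_is / lambda_k)
    sums = [np.exp(V[idx] / l).sum() for idx, l in zip(nests, lam)]
    # Denominator: sum over nests of (nest sum) ** lambda_l
    denom = sum(s ** l for s, l in zip(sums, lam))
    for idx, l, s in zip(nests, lam, sums):
        P[idx] = np.exp(V[idx] / l) * s ** (l - 1.0) / denom
    return P

V = np.array([1.0, 0.2, 0.5, 0.1])   # illustrative utilities
nests = [[0, 1], [2, 3]]             # two nests of two alts each
lam = [0.5, 0.8]
P = nested_logit_probs(V, nests, lam)

# Same-nest ratio depends only on the two alts' own utilities.
same_nest_ratio = P[0] / P[1]
# With lambda_k = 1 for all k the formula collapses to standard logit.
logit_P = nested_logit_probs(V, nests, [1.0, 1.0])
softmax = np.exp(V) / np.exp(V).sum()
print(P, same_nest_ratio)
```

For very large utilities or very small λk the exponentials can overflow; a log-sum-exp formulation would be the safer choice in estimation code.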

SLIDE 52

IIA in the Nested Logit

Ratio for two alts in same nest is indep of other alts (factors in parentheses cancel). Ratio for two alts in different nests depends on attributes of all alts in the nests containing j and m. Doesn’t depend on attributes of alts in nests other than those containing j and m. A form of IIA holds in Nested Logit, IIN = Independence from Irrelevant Nests. Drop an alternative from one nest and all alts in another nest change prob in same proportion.
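IIN can be illustrated by dropping one alt and comparing probs before and after. This is a sketch under assumed values: the helper nl_probs, the six alts, the three nests and the λ values are hypothetical and chosen only for illustration.

```python
import numpy as np

def nl_probs(V, nests, lam):
    """Nested logit probs; V maps alt name -> utility, nests is a list of lists of names."""
    s = [sum(np.exp(V[a] / l) for a in nest) for nest, l in zip(nests, lam)]
    denom = sum(sk ** l for sk, l in zip(s, lam))
    return {a: np.exp(V[a] / l) * sk ** (l - 1.0) / denom
            for nest, l, sk in zip(nests, lam, s) for a in nest}

# Hypothetical utilities and a hypothetical three-nest structure.
V = {"alone": 1.0, "carpool": 0.0, "bus": 0.6, "rail": 0.3, "bike": -0.2, "walk": -0.5}
nests = [["alone", "carpool"], ["bus", "rail"], ["bike", "walk"]]
lam = [0.5, 0.7, 0.9]

before = nl_probs(V, nests, lam)

# Drop "carpool" from the first nest and recompute.
V_drop = {a: v for a, v in V.items() if a != "carpool"}
nests_drop = [["alone"], ["bus", "rail"], ["bike", "walk"]]
after = nl_probs(V_drop, nests_drop, lam)

# Proportional change in each remaining alt's prob.
change = {a: after[a] / before[a] for a in after}
print(change)
```

In this formula the alts in the untouched nests all change by one common factor, because only the denominator moves, while the alt sharing a nest with the dropped one changes by a different factor.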

SLIDE 53

λk

λk can vary over nests k, reflecting different correlation among unobserved factors within each nest. For the model to be consistent with utility-maximizing behavior for all values of the explanatory vars, λk ∈ (0, 1) ∀k. For λk > 1, the model is consistent with utility-maximizing behavior only for a range of explanatory vars. λk < 0 ⇒ model inconsistent with utility-maximizing behavior and implies that improving attributes of an alt can decrease the prob of that alt being chosen. λk can be specified as a fxn of demographic characteristics (e.g., λk = exp(αzi)).

SLIDE 54

Decomposition in Two Logits I

WLOG, the observed component of util, V, can be decomposed into 2 parts:

Wik constant across alts within nest k

Yij varies across alts within nest k

Uij = Wik + Yij + εij, j ∈ Bk

This decomposition enables us to write the nested logit prob as the product of 2 standard logits:

Pij = Pij|Bk PiBk

where Pij|Bk = conditional prob of choosing alt j given that an alt in nest Bk was chosen, and PiBk = marginal (over alts in nest Bk) prob of choosing an alt in nest Bk.

SLIDE 55

Decomposition in Two Logits II

Conditional (Pij|Bk) and marginal (PiBk) distributions are logits:

$$P_{iB_k} = \frac{e^{W_{ik} + \lambda_k I_{ik}}}{\sum_{l=1}^{K} e^{W_{il} + \lambda_l I_{il}}}; \qquad P_{ij|B_k} = \frac{e^{Y_{ij}/\lambda_k}}{\sum_{s \in B_k} e^{Y_{is}/\lambda_k}}$$

where

$$I_{ik} = \ln \sum_{s \in B_k} e^{Y_{is}/\lambda_k}$$

Prob of choosing an alt in Bk is a logit over nests and includes all vars that vary over nests, Wik, but not vars that vary over alternatives, Yij. Conditional prob of choosing alt j given an alt in Bk was chosen is also a logit over alts in Bk and includes all vars that vary over alts in the nest, Yij, but not vars that vary over nests, Wik.
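The decomposition can be verified numerically: the product of the upper and lower logits should match the nested logit formula evaluated directly with Vij = Wik + Yij. The values of W, Y and λ below are made up for illustration.

```python
import numpy as np

# Illustrative (assumed) inputs: two nests with 2 and 3 alts.
W = np.array([0.4, -0.1])                 # W_ik: one value per nest
Y = [np.array([0.6, -0.2]),               # Y_ij for alts in nest 1
     np.array([0.3, 0.0, -0.5])]          # Y_ij for alts in nest 2
lam = np.array([0.5, 0.8])                # lambda_k per nest

# Lower model: inclusive values I_ik and conditional probs P_ij|Bk.
I = np.array([np.log(np.exp(y / l).sum()) for y, l in zip(Y, lam)])
P_cond = [np.exp(y / l) / np.exp(y / l).sum() for y, l in zip(Y, lam)]

# Upper model: logit over nests with W_ik + lambda_k * I_ik as the index.
u = np.exp(W + lam * I)
P_nest = u / u.sum()

# Product of the two logits, stacked nest by nest.
P_prod = np.concatenate([pc * pn for pc, pn in zip(P_cond, P_nest)])

# Direct nested logit formula with V_ij = W_ik + Y_ij.
V = [w + y for w, y in zip(W, Y)]
sums = np.array([np.exp(v / l).sum() for v, l in zip(V, lam)])
denom = (sums ** lam).sum()
P_direct = np.concatenate(
    [np.exp(v / l) * s ** (l - 1.0) / denom for v, l, s in zip(V, lam, sums)]
)

print(P_prod, P_direct)
```

The two vectors agree because Wik is constant within a nest: it cancels out of the conditional prob and factors out of the nest-level sum.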

SLIDE 56

Inclusive Value Term

Iik is the log of the denominator of the conditional prob. Called the inclusive value of nest Bk. λkIik = expected utility that agent i receives from choice among alts in nest Bk. Marginal prob (choice of nest) = upper model. Conditional prob (choice of alt | nest) = lower model. Some people don’t divide by λk in lower model (STATA). Intuition: Inclusive value term enters upper model as explanatory var because choosing nest Bk depends on

Expected util regardless of chosen alt, Wik, plus

Expected util from being able to choose the best alt in nest, λkIik.

SLIDE 57

Marginal Effects

Change in attribute r in the utility function for alt K in ∂ ln Prob(alt = j, nest = k) ∂x(

SLIDE 58

Estimation

MLE works but the likelihood fxn is not globally concave, unlike standard logit. Check ests by varying starting values. Can estimate sequentially ⇒ consistent but inefficient ests. Sequential est performed “bottom up.”

Estimate lower model. I.e., estimate separate logits on each nest.

Use coef ests from the lower model to compute inclusive value terms.

Estimate upper model (choice of nest), with inclusive value entering as explanatory vars.

Two problems with sequential estimation:

SEs biased downward because of estimation error in IV terms. (can correct for this, Ben-Akiva and Lerman (1985)).

Some parameters appear in several submodels and equality restrictions may be violated.

Use FIML if possible
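A rough sketch of FIML with several starting values, using NumPy and SciPy on simulated data. Everything here is an assumption made for illustration: the two-nest structure NESTS, the single-coefficient utility Vij = β xij, the true parameter values, the bounds and the starting points. It shows the joint likelihood approach only, not the sequential estimator.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

NESTS = [np.array([0, 1]), np.array([2, 3])]   # hypothetical two-nest structure

def log_probs(theta, X):
    """Log nested logit probs with V_ij = beta * x_ij; theta = (beta, lam_1, lam_2)."""
    beta, lam = theta[0], theta[1:]
    V = beta * X                                # shape (N, J)
    logP = np.empty_like(V)
    # Inclusive values: log of sum over s in B_k of exp(V_is / lambda_k)
    I = [logsumexp(V[:, idx] / l, axis=1) for idx, l in zip(NESTS, lam)]
    # Log denominator: log of sum over nests of exp(lambda_l * I_l)
    log_denom = logsumexp(np.column_stack([l * Ik for l, Ik in zip(lam, I)]), axis=1)
    for idx, l, Ik in zip(NESTS, lam, I):
        logP[:, idx] = V[:, idx] / l + ((l - 1.0) * Ik - log_denom)[:, None]
    return logP

def neg_loglik(theta, X, y):
    """Negative log-likelihood of observed choices y (alt index per agent)."""
    return -log_probs(theta, X)[np.arange(len(y)), y].sum()

# Simulate choices directly from the model's probabilities (assumed truth).
rng = np.random.default_rng(0)
N = 2000
X = rng.normal(size=(N, 4))
true_theta = np.array([1.0, 0.5, 0.8])
P_true = np.exp(log_probs(true_theta, X))
y = np.array([rng.choice(4, p=p) for p in P_true])

# Likelihood need not be globally concave, so try several starting values.
bounds = [(-10.0, 10.0), (0.05, 1.5), (0.05, 1.5)]
starts = [np.array([0.0, 1.0, 1.0]),
          np.array([2.0, 0.3, 0.3]),
          np.array([0.5, 0.9, 0.4])]
fits = [minimize(neg_loglik, s, args=(X, y), method="L-BFGS-B", bounds=bounds)
        for s in starts]
best = min(fits, key=lambda r: r.fun)

for r in fits:
    print(np.round(r.x, 3), round(r.fun, 2))
```

If the runs from different starting values end at noticeably different likelihood values, that signals local optima, and the fit with the lowest negative log-likelihood is the one to keep.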

SLIDE 59

GEV More Broadly

Can have more than two levels. Can have overlapping nests, i.e., alts appear in more than one nest (cross-nested logit, ordered GEV, paired combinatorial logit, generalized nested logit).
