Getting to Regression: The Workhorse of Quantitative Political Analysis (PowerPoint presentation)



SLIDE 1

Correlation Regression

Getting to Regression: The Workhorse of Quantitative Political Analysis

Department of Government London School of Economics and Political Science

SLIDE 2

1 Correlation
2 Regression

SLIDE 3

1 Correlation
2 Regression

SLIDE 4

Correlation as Measure of Bivariate Relationship

Covariance:

Cov(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}

SLIDE 5

Correlation as Measure of Bivariate Relationship

Covariance:

Cov(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}

Correlation:

Corr(X, Y) = r_{x,y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n - 1) s_x s_y}

where s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
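Translated directly into code, the two formulas look like this. A pure-Python sketch (the function names are ours; the data are the six points used in the worked example later in the deck):

```python
import math

def covariance(x, y):
    # Sample covariance: sum of cross-deviations divided by n - 1
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)

def sd(x):
    # Sample standard deviation: sqrt of sum of squared deviations over n - 1
    n = len(x)
    xbar = sum(x) / n
    return math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))

def correlation(x, y):
    # Pearson correlation: covariance rescaled by both standard deviations
    return covariance(x, y) / (sd(x) * sd(y))

x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 6, 2, 7]
print(round(covariance(x, y), 3), round(correlation(x, y), 3))  # 2.4 0.542
```

Rescaling by s_x and s_y is what bounds the correlation between -1 and 1, which raw covariance is not.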

SLIDE 6

Correlation is linear!

Source: Wikimedia

SLIDE 7

Guess the Correlation!

1 Go to:

http://guessthecorrelation.com/

2 Play a few rounds

SLIDE 8

1 Correlation
2 Regression

SLIDE 9

Regression

Definition: a statistical method for measuring the relationship between one variable and one or more other variables

SLIDE 10

Regression

Definition: a statistical method for measuring the relationship between one variable and one or more other variables

Uses of Regression

1 Description
2 Prediction
3 Causal Inference

SLIDE 11

Regression

Definition: a statistical method for measuring the relationship between one variable and one or more other variables

Uses of Regression

1 Description
2 Prediction
3 Causal Inference

Ordinary least squares (OLS) regression

SLIDE 12

Interpretations of OLS

SLIDE 13

Interpretations of OLS

1 Line (or surface) of best fit
2 Ratio of Cov(X, Y) and Var(X)
3 Minimizing the residual sum of squares (SSR)

SLIDE 14

Interpretations of OLS

1 Line (or surface) of best fit
2 Ratio of Cov(X, Y) and Var(X)
3 Minimizing the residual sum of squares (SSR)
4 Estimating unit-level causal effects

SLIDE 15

Bivariate Regression I

Y is continuous
X is a randomized treatment indicator/dummy (0, 1)
How do we know if X had an effect on Y?
SLIDE 16

Bivariate Regression I

Y is continuous
X is a randomized treatment indicator/dummy (0, 1)
How do we know if X had an effect on Y?

Look at the outcome mean difference: E[Y | X = 1] - E[Y | X = 0]

SLIDE 17

Bivariate Regression I

The mean difference E[Y | X = 1] - E[Y | X = 0] is the slope of the regression line
The slope (\beta) is defined as \Delta Y / \Delta X

SLIDE 18

Bivariate Regression I

The mean difference E[Y | X = 1] - E[Y | X = 0] is the slope of the regression line
The slope (\beta) is defined as \Delta Y / \Delta X

\Delta Y = E[Y | X = 1] - E[Y | X = 0]
\Delta X = 1 - 0 = 1
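This equivalence is easy to check numerically. A pure-Python sketch with toy data invented for illustration (the values are not from the slides):

```python
# With a binary treatment x, the OLS slope reproduces the difference in
# group means E[Y | X=1] - E[Y | X=0]. Toy data for illustration only.
x = [0, 0, 0, 1, 1, 1]
y = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

mean1 = sum(yi for xi, yi in zip(x, y) if xi == 1) / x.count(1)
mean0 = sum(yi for xi, yi in zip(x, y) if xi == 0) / x.count(0)

# OLS slope: Cov(x, y) / Var(x)
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)

print(mean1 - mean0, slope)  # both equal 3.0
```

Because \Delta X = 1 for a dummy, the slope and the mean difference coincide exactly.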

SLIDE 19

Three Equations

1 Population:

Y = \beta_0 + \beta_1 X (+ \epsilon)

SLIDE 20

Three Equations

1 Population:

Y = \beta_0 + \beta_1 X (+ \epsilon)

2 Sample estimate:

y = \hat{\beta}_0 + \hat{\beta}_1 x + e

SLIDE 21

Three Equations

1 Population:

Y = \beta_0 + \beta_1 X (+ \epsilon)

2 Sample estimate:

y = \hat{\beta}_0 + \hat{\beta}_1 x + e

3 Unit:

y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + e_i = \bar{y}_0 + (y_{1i} - y_{0i}) x_i + (y_{0i} - \bar{y}_0)

SLIDE 22

[Figure: scatterplot of y against binary x]

SLIDE 23

[Figure: scatterplot with group means \bar{y}_0 and \bar{y}_1 marked]

SLIDE 24

[Figure: scatterplot with \bar{y}_0, \bar{y}_1, \Delta x, and \Delta y marked]

SLIDE 25

[Figure: scatterplot showing \Delta y / \Delta x = \beta_1 and the intercept \hat{\beta}_0]

SLIDE 26

[Figure: fitted line with \hat{\beta}_0 and \hat{\beta}_1 labeled]

\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x

SLIDE 27

[Figure: fitted line]

\hat{y} = 2 + 3x

SLIDE 28

[Figure: fitted line with residuals e_i marked]

\hat{y} = 2 + 3x
y_i = 2 + 3x_i + e_i

SLIDE 29

Questions?

SLIDE 30

Continuous X

If x is continuous, the calculation is more involved
Rather than a mean difference in outcomes, \beta_1 is the slope across all values of x

\hat{\beta}_1 = Cov(x, y) / Var(x)

SLIDE 31

Calculations

x_i   y_i   x_i − x̄   y_i − ȳ   (x_i − x̄)(y_i − ȳ)   (x_i − x̄)²
 1     1      ?          ?               ?                 ?
 2     5      ?          ?               ?                 ?
 3     3      ?          ?               ?                 ?
 4     6      ?          ?               ?                 ?
 5     2      ?          ?               ?                 ?
 6     7      ?          ?               ?                 ?
x̄ = ?  ȳ = ?                        Cov(x, y) = ?      Var(x) = ?

SLIDE 32

[Figure: scatterplot of the six (x, y) points]

SLIDE 33

[Figure: scatterplot with the means x̄ and ȳ marked]

SLIDE 34

[Figure: scatterplot with the means x̄ and ȳ marked]

SLIDE 35

[Figure: scatterplot with the means x̄ and ȳ marked]

SLIDE 36

Calculations

x_i   y_i   x_i − x̄   y_i − ȳ   (x_i − x̄)(y_i − ȳ)   (x_i − x̄)²
 1     1      ?          ?               ?                 ?
 2     5      ?          ?               ?                 ?
 3     3      ?          ?               ?                 ?
 4     6      ?          ?               ?                 ?
 5     2      ?          ?               ?                 ?
 6     7      ?          ?               ?                 ?
x̄ = ?  ȳ = ?                        Cov(x, y) = ?      Var(x) = ?

SLIDE 37

Calculations

If x is continuous, the calculation is more involved:

\hat{\beta}_1 = Cov(x, y) / Var(x)

x_i   y_i   x_i − x̄   y_i − ȳ   (x_i − x̄)(y_i − ȳ)   (x_i − x̄)²
 1     1     −2.5       −3             7.5              6.25
 2     5     −1.5       +1            −1.5              2.25
 3     3     −0.5       −1             0.5              0.25
 4     6     +0.5       +2             1.0              0.25
 5     2     +1.5       −2            −3.0              2.25
 6     7     +2.5       +3             7.5              6.25
x̄ = 3.5  ȳ = 4                     Sum = 12          Sum = 17.5

SLIDE 38

Calculations

If x is continuous, the calculation is more involved:

\hat{\beta}_1 = Cov(x, y) / Var(x) = 12 / 17.5 \approx 0.686

(The n − 1 in Cov and Var cancels, so the slope is the ratio of the two column sums.)

x_i   y_i   x_i − x̄   y_i − ȳ   (x_i − x̄)(y_i − ȳ)   (x_i − x̄)²
 1     1     −2.5       −3             7.5              6.25
 2     5     −1.5       +1            −1.5              2.25
 3     3     −0.5       −1             0.5              0.25
 4     6     +0.5       +2             1.0              0.25
 5     2     +1.5       −2            −3.0              2.25
 6     7     +2.5       +3             7.5              6.25
x̄ = 3.5  ȳ = 4                     Sum = 12          Sum = 17.5
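Recomputing the column sums from the six data points takes a few lines of Python, a quick check of the table arithmetic:

```python
x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 6, 2, 7]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n   # 3.5 and 4.0

# Column sums: cross-products of deviations and squared x-deviations
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)

# The (n - 1) in Cov and Var cancels, so the slope is the ratio of sums
beta1 = sxy / sxx
print(sxy, sxx, round(beta1, 4))  # 12.0 17.5 0.6857
```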

SLIDE 39

Intercept ˆ β0

Simple formula: \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}

SLIDE 40

Intercept ˆ β0

Simple formula: \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
Intuition: the OLS fit always runs through the point (\bar{x}, \bar{y})

SLIDE 41

Intercept ˆ β0

Simple formula: \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
Intuition: the OLS fit always runs through the point (\bar{x}, \bar{y})
Ex.: \hat{\beta}_0 = 4 - 0.686 \times 3.5 = 1.6

SLIDE 42

Intercept ˆ β0

Simple formula: \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
Intuition: the OLS fit always runs through the point (\bar{x}, \bar{y})
Ex.: \hat{\beta}_0 = 4 - 0.686 \times 3.5 = 1.6

\hat{y} = 1.6 + 0.686x
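The intercept formula and the mean-point property can both be checked directly from the raw data (pure-Python sketch, same six points):

```python
x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 6, 2, 7]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

beta1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
beta0 = ybar - beta1 * xbar   # intercept: ybar minus slope times xbar

# The fitted line always passes through the point (xbar, ybar)
assert abs((beta0 + beta1 * xbar) - ybar) < 1e-12
print(round(beta0, 2), round(beta1, 3))  # 1.6 0.686
```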

SLIDE 43

[Figure: fitted OLS line passing through (\bar{x}, \bar{y})]

SLIDE 44

Systematic versus unsystematic components

SLIDE 45

Systematic versus unsystematic components

Systematic: Regression line (slope)

Linear regression estimates the conditional means of the population data (i.e., E[Y |X])

SLIDE 46

Systematic versus unsystematic components

Systematic: Regression line (slope)

Linear regression estimates the conditional means of the population data (i.e., E[Y |X])

Unsystematic: Error term is the deviation of observations from the line

The difference between each observed value y_i and its fitted value \hat{y}_i is the residual e_i
OLS produces the estimate of \beta that minimizes the residual sum of squares
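One way to see the "minimizes the residual sum of squares" claim is to perturb the OLS coefficients and confirm the SSR only goes up. A sketch of our own, using the deck's six data points:

```python
x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 6, 2, 7]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

def ssr(b0, b1):
    # Residual sum of squares for the candidate line b0 + b1 * x
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

best = ssr(b0, b1)
# Any perturbation of the OLS coefficients increases (or ties) the SSR,
# because SSR is convex with its minimum at the OLS solution
for d0 in (-0.5, 0.0, 0.5):
    for d1 in (-0.2, 0.0, 0.2):
        assert ssr(b0 + d0, b1 + d1) >= best
```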

SLIDE 47

Why are there residuals?

SLIDE 48

Why are there residuals?

Fundamental randomness

SLIDE 49

Why are there residuals?

Fundamental randomness
Measurement error

SLIDE 50

Why are there residuals?

Fundamental randomness
Measurement error
Omitted variables

SLIDE 51

Minimum Mathematical Requirements

1 Do we need variation in X?

SLIDE 52

Minimum Mathematical Requirements

1 Do we need variation in X?

Yes, otherwise we would be dividing by zero (Var(x) = 0)

SLIDE 53

Minimum Mathematical Requirements

1 Do we need variation in X?

Yes, otherwise we would be dividing by zero (Var(x) = 0)

2 Do we need variation in Y ?

No, \hat{\beta}_1 can equal zero (Corr(X, Y) = 0)

SLIDE 54

Minimum Mathematical Requirements

1 Do we need variation in X?

Yes, otherwise we would be dividing by zero (Var(x) = 0)

2 Do we need variation in Y ?

No, \hat{\beta}_1 can equal zero (Corr(X, Y) = 0)

SLIDE 55

Minimum Mathematical Requirements

1 Do we need variation in X?

Yes, otherwise we would be dividing by zero (Var(x) = 0)

2 Do we need variation in Y ?

No, \hat{\beta}_1 can equal zero (Corr(X, Y) = 0)

3 How many observations do we need?

SLIDE 56

Minimum Mathematical Requirements

1 Do we need variation in X?

Yes, otherwise we would be dividing by zero (Var(x) = 0)

2 Do we need variation in Y ?

No, \hat{\beta}_1 can equal zero (Corr(X, Y) = 0)

3 How many observations do we need?

n ≥ k, where k is the number of parameters to be estimated

SLIDE 57

Correlation/Regression Equivalence

Definition: Corr(x, y) = \hat{r}_{x,y} = \frac{Cov(x, y)}{s_x s_y}

The slope \hat{\beta}_1 and the correlation \hat{r}_{x,y} are simply different scalings of Cov(x, y)

SLIDE 58

Correlation/Regression Equivalence

Definition: Corr(x, y) = \hat{r}_{x,y} = \frac{Cov(x, y)}{s_x s_y}

The slope \hat{\beta}_1 and the correlation \hat{r}_{x,y} are simply different scalings of Cov(x, y)

R^2 = \hat{r}_{x,y}^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}
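Both equivalences can be verified numerically on the six data points (pure-Python sketch):

```python
import math

x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 6, 2, 7]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)

r = sxy / math.sqrt(sxx * syy)   # correlation
b1 = sxy / sxx                   # regression slope
# Slope and correlation are different scalings of the same covariance:
# b1 = r * (s_y / s_x), and the (n - 1) terms cancel in the ratio
assert abs(b1 - r * math.sqrt(syy / sxx)) < 1e-12

# R^2 from the ANOVA identity equals the squared correlation
b0 = ybar - b1 * xbar
ssr = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - ssr / syy
assert abs(r_squared - r ** 2) < 1e-12
print(round(r_squared, 3))  # 0.294
```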

SLIDE 59

Questions about OLS?

SLIDE 60

Are Estimates Any Good?

SLIDE 61

Are Estimates Any Good?

1 Works mathematically
2 Linear relationship between X and Y
3 X is measured without error
4 No missing data (or MCAR)
5 No confounding (next week)

SLIDE 62

Linear Relationship

If linear, no problems
If non-linear, we need to transform:

Power terms (e.g., x^2, x^3)
Logarithms (e.g., log(x))
Other transformations
If categorical: convert to a set of indicators
Multivariate interactions (next week)
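A sketch of the transformation idea with data we simulate ourselves: when y is linear in log(x) rather than in x, regressing y on the transformed variable recovers the linear relationship exactly.

```python
import math

# Simulated, noise-free data where y = 2 + 3 * log(x): non-linear in x itself
x = [1, 2, 4, 8, 16, 32]
y = [2 + 3 * math.log(xi) for xi in x]

def ols(x, y):
    # Bivariate OLS via the usual covariance/variance ratio
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    return ybar - b1 * xbar, b1

# Regressing y on log(x) recovers intercept 2 and slope 3
log_x = [math.log(xi) for xi in x]
b0, b1 = ols(log_x, y)
print(round(b0, 3), round(b1, 3))  # 2.0 3.0
```

The recovery is exact here only because the simulated data contain no noise; with real data the transformed fit is still linear in the new variable but carries residuals as usual.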

SLIDE 63

Coefficient Interpretation

Four types of variables:

1 Indicator (0, 1)
2 Categorical
3 Ordinal
4 Interval

How do we interpret a coefficient on each of these types of variables?

SLIDE 64

Interpretation: Indicator

y = \hat{\beta}_0 + \hat{\beta}_1 x + e

\hat{\beta}_0 is the estimate of \bar{y} when x = 0
\hat{\beta}_1 is the difference: \bar{y}_{x=1} - \bar{y}_{x=0}

SLIDE 65

Interpretation: Categorical

y = \hat{\beta}_0 + \hat{\beta}_1 1[x=1] + \hat{\beta}_2 1[x=2] + \cdots + e

\hat{\beta}_0 is the estimate of \bar{y} when x = 0 (the reference category)
\hat{\beta}_1 is the difference: \bar{y}_{x=1} - \bar{y}_{x=0}
\hat{\beta}_2 is the difference: \bar{y}_{x=2} - \bar{y}_{x=0}
Need to select one category as the reference category!
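With a full set of indicators (one category left out as the reference), the OLS fitted values are just the category means, so each coefficient is a mean difference from the reference group. A sketch with made-up categories and outcomes:

```python
# Toy data: outcome y and a three-level categorical x (levels 0, 1, 2);
# level 0 serves as the reference category
x = [0, 0, 1, 1, 2, 2]
y = [1.0, 3.0, 4.0, 6.0, 7.0, 9.0]

def group_mean(cat):
    vals = [yi for xi, yi in zip(x, y) if xi == cat]
    return sum(vals) / len(vals)

# With dummies for levels 1 and 2, the OLS fit reproduces the group means,
# so the coefficients reduce to:
beta0 = group_mean(0)                  # mean of the reference category
beta1 = group_mean(1) - group_mean(0)  # difference from the reference
beta2 = group_mean(2) - group_mean(0)  # difference from the reference
print(beta0, beta1, beta2)  # 2.0 3.0 6.0
```

This is why changing the reference category changes every coefficient's value but not the fitted means themselves.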

SLIDE 66

Interpretation: Interval

y = \hat{\beta}_0 + \hat{\beta}_1 x + e

\hat{\beta}_0 is the estimate of \bar{y} when x = 0
\hat{\beta}_1 is the slope of the relationship between x and y

The slope is constant across the full domain of x

SLIDE 67

Interpretation: Ordinal

Two options:

1 y = \hat{\beta}_0 + \hat{\beta}_1 x + e
2 y = \hat{\beta}_0 + \hat{\beta}_1 1[x=1] + \hat{\beta}_2 1[x=2] + \cdots + e

Have to choose whether to treat an ordinal variable as categorical or interval
SLIDE 68

Questions?

SLIDE 69

What type of x variable is involved and how do we interpret the coefficient(s) on x for each of the following scenarios?

1 Body Mass Index (BMI) regressed on height
2 Monthly income ($) regressed on gender
3 Years of schooling regressed on birth region
4 Feeling thermometer toward Theresa May regressed on party affiliation
5 Weekly hours worked regressed on civil service pay grade

SLIDE 70

SLIDE 71

OLS Minimizes SSR

Total Sum of Squares (SST): \sum_{i=1}^{n} (y_i - \bar{y})^2

We can partition SST into two parts (ANOVA):

Explained Sum of Squares (SSE)
Residual Sum of Squares (SSR)

SST = SSE + SSR
OLS is the line with the lowest SSR
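The partition can be verified directly on the running six-point example (pure-Python sketch):

```python
x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 6, 2, 7]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)               # total
sse = sum((yh - ybar) ** 2 for yh in yhat)            # explained
ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # residual

# ANOVA partition: SST = SSE + SSR
assert abs(sst - (sse + ssr)) < 1e-9
print(round(sst, 1), round(sse, 2), round(ssr, 2))
```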

SLIDE 72

[Figure: scatterplot with the means x̄ and ȳ and a candidate fitted line]

SLIDE 73

[Figure: scatterplot with the means x̄ and ȳ and a candidate fitted line]

SLIDE 74

[Figure: scatterplot with the means x̄ and ȳ and a candidate fitted line]

SLIDE 75

[Figure: scatterplot with the means x̄ and ȳ and a candidate fitted line]

SLIDE 76

[Figure: scatterplot with the means x̄ and ȳ and a candidate fitted line]

SLIDE 77

[Figure: scatterplot with the means x̄ and ȳ and a candidate fitted line]

SLIDE 78

[Figure: scatterplot with the means x̄ and ȳ and a candidate fitted line]

SLIDE 79

RMSE (\hat{\sigma})

Definition: \hat{\sigma} = \sqrt{\frac{SSR}{n - p}}, where p is the number of parameters estimated

Interpretation:

How far, on average, the observed y values are from their corresponding fitted values \hat{y}
sd(y) is how far, on average, a given y_i is from \bar{y}
\hat{\sigma} is how far, on average, a given y_i is from \hat{y}_i

Units: same as y (range 0 to sd(y))
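Applied to the running six-point example (p = 2: intercept and slope), the RMSE comes out below sd(y), consistent with the units note. A pure-Python sketch:

```python
import math

x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 6, 2, 7]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

# Residual sum of squares, then RMSE with p = 2 estimated parameters
ssr = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
p = 2
rmse = math.sqrt(ssr / (n - p))

# sd(y) is the analogous average distance of each y_i from the overall mean
sd_y = math.sqrt(sum((yi - ybar) ** 2 for yi in y) / (n - 1))
print(round(rmse, 3), round(sd_y, 3))  # 2.223 2.366
```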

SLIDE 80