

SLIDE 1

ST 370 Probability and Statistics for Engineers

Multiple Linear Regression

Often more than one predictor variable can be used to predict the value of a response variable. The basic approach of the simple linear regression model may be extended to include multiple predictors. The principles carry over, but the computations are more tedious, and hand calculation is largely infeasible.

SLIDE 2

For example, consider the data on the strength of the bond between a component and its frame:

wireBond <- read.csv("Data/Table-01-02.csv")
pairs(wireBond)

Clearly Length (x₁) could be used to predict Strength (y), but also possibly Height (x₂) or Length² (x₁²).

SLIDE 3

Multiple Linear Regression Model

The multiple linear regression model with k predictors is

Y = β₀ + β₁x₁ + β₂x₂ + ··· + βₖxₖ + ε

Notation

When we have n observations on data like these, we write them

Yᵢ = β₀ + Σⱼ₌₁ᵏ xᵢ,ⱼβⱼ + εᵢ,  i = 1, 2, …, n;

that is, xᵢ,ⱼ is the value of the jth predictor xⱼ in the ith observation.
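As a concrete illustration of the model (using simulated data with hypothetical coefficients, not the wire bond measurements), we can generate observations from a k = 2 predictor model and check that lm() recovers the parameters:

```r
# Hypothetical simulated example of Y = b0 + b1*x1 + b2*x2 + eps
# with (b0, b1, b2) = (5, 2, -3) and eps ~ N(0, 1)
set.seed(370)
n   <- 200
x1  <- runif(n)
x2  <- runif(n)
eps <- rnorm(n)
y   <- 5 + 2 * x1 - 3 * x2 + eps

fit <- lm(y ~ x1 + x2)
coef(fit)   # estimates land near the true values (5, 2, -3)
```

With n = 200 observations, the least squares estimates are close to the true coefficients, as the theory predicts.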

SLIDE 4

Predictors and Variables

Each term in the equation is a predictor, but is not necessarily a separate independent variable. For example, consider the relationship between Strength and Length:

plot(Strength ~ Length, wireBond)

Strength increases with Length, and roughly linearly, so we could use the single-variable equation Y = β₀ + β₁x + ε.

SLIDE 5

Close examination suggests that the relationship may be curved, not linear, so we might want to fit the quadratic equation Y = β₀ + β₁x + β₂x² + ε. If we write x₁ = x, x₂ = x², this becomes Y = β₀ + β₁x₁ + β₂x₂ + ε, the multiple regression model with k = 2 predictors. But the equation brings in only one independent variable, Length.
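The rewriting can be checked directly in R (on simulated stand-in data, since the wire bond file may not be at hand): defining x₁ = x and x₂ = x² and fitting the two-predictor model gives exactly the same fit as the quadratic formula.

```r
set.seed(370)
x <- runif(100, 0, 10)
y <- 1 + 0.5 * x + 0.2 * x^2 + rnorm(100)

x1 <- x
x2 <- x^2
fit.multiple  <- lm(y ~ x1 + x2)       # multiple regression form
fit.quadratic <- lm(y ~ x + I(x^2))    # quadratic-in-x form

# The two parameterizations are the same model, so the
# coefficient estimates agree
all.equal(unname(coef(fit.multiple)), unname(coef(fit.quadratic)))
```

Both calls build the same design matrix, so the least squares solutions coincide.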

SLIDE 6

Least Squares

As with the single-predictor model, we usually find parameter estimates using the least squares approach. For any proposed values b₀, b₁, …, bₖ we form the predicted values

b₀ + b₁xᵢ,₁ + ··· + bₖxᵢ,ₖ,  i = 1, 2, …, n

and the residuals

eᵢ = yᵢ − (b₀ + b₁xᵢ,₁ + ··· + bₖxᵢ,ₖ),  i = 1, 2, …, n.

The sum of squares to be minimized is

L(b₀, b₁, …, bₖ) = Σᵢ₌₁ⁿ eᵢ² = Σᵢ₌₁ⁿ [yᵢ − (b₀ + b₁xᵢ,₁ + ··· + bₖxᵢ,ₖ)]².
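A quick sketch (again on simulated data) of forming the residuals and the sum of squares by hand, and checking them against what lm() computes:

```r
set.seed(370)
n  <- 50
x1 <- runif(n)
x2 <- runif(n)
y  <- 1 + 2 * x1 + 3 * x2 + rnorm(n)

fit <- lm(y ~ x1 + x2)
b   <- coef(fit)

# residuals e_i = y_i - (b0 + b1*x_i1 + b2*x_i2), computed by hand
e   <- y - (b[1] + b[2] * x1 + b[3] * x2)
SSE <- sum(e^2)

all.equal(e, unname(residuals(fit)))  # matches lm()'s residuals
all.equal(SSE, deviance(fit))         # deviance() returns SSE for lm fits
```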

SLIDE 7

The least squares estimates β̂₀, β̂₁, …, β̂ₖ that minimize L(b₀, b₁, …, bₖ) cannot in general be written out in closed form, but have to be found by solving a set of equations. The residual sum of squares is again

SSE = Σᵢ₌₁ⁿ eᵢ²,

but the degrees of freedom for residuals are n − (k + 1), so the estimate of σ² is

σ̂² = MSE = SSE / [n − (k + 1)].
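The residual degrees of freedom and σ̂ reported by lm() can be checked directly (simulated data; sigma() and df.residual() are the standard extractors):

```r
set.seed(370)
n  <- 40
k  <- 2
x1 <- runif(n)
x2 <- runif(n)
y  <- 2 + x1 - x2 + rnorm(n)

fit <- lm(y ~ x1 + x2)
SSE <- sum(residuals(fit)^2)

df.residual(fit)              # n - (k + 1) = 37
MSE <- SSE / (n - (k + 1))
all.equal(MSE, sigma(fit)^2)  # sigma() is the residual standard error
```

The "residual standard error" in summary(lm(...)) output is exactly σ̂ = √MSE on n − (k + 1) degrees of freedom.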

SLIDE 8

Fitting the model

Use lm() to fit the multiple regression model:

# the quadratic model:
summary(lm(Strength ~ Length + I(Length^2), wireBond))
# the two-variable model:
summary(lm(Strength ~ Length + Height, wireBond))
# quadratic in Length, plus Height:
summary(lm(Strength ~ Length + I(Length^2) + Height, wireBond))

Note

The arithmetic operators "+", "-", "*", "/", and "^" have special meanings within a formula, so the predictor term Length^2 must be wrapped in the identity function I(); otherwise "^" is interpreted as a formula operator rather than as squaring.
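The misparse is easy to see: in a formula, Length^2 expands to the interaction Length crossed with itself, which collapses back to Length, so without I() the "quadratic" model is just the linear one. Illustrated on simulated stand-in data (the wire bond file may not be available):

```r
set.seed(370)
d <- data.frame(Length = runif(30, 0, 20))
d$Strength <- 5 + 2 * d$Length + 0.1 * d$Length^2 + rnorm(30)

# Without I(): ^ is a formula operator, and Length^2 reduces to Length
fit.wrong <- lm(Strength ~ Length^2, d)
# With I(): ^ is ordinary arithmetic, giving a genuine quadratic term
fit.right <- lm(Strength ~ Length + I(Length^2), d)

length(coef(fit.wrong))   # 2: intercept and Length only
length(coef(fit.right))   # 3: intercept, Length, and Length^2
```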
