
Course 02429 Analysis of correlated data: Mixed Linear Models
Module 1: Introduction to mixed models

Per Bruun Brockhoff
DTU Compute, Building 324, room 220
Technical University of Denmark
2800 Lyngby, Denmark
e-mail: perbb@dtu.dk

Per Bruun Brockhoff (perbb@dtu.dk) Mixed Linear Models, Module 1 Fall 2014 1 / 20 Overview

Overview

1. Overview
2. Simple intro example
3. The mixed model
4. Missing value example
5. Why use mixed models

Course Preface

Applied statistics: Analysis of variance and regression analysis

  • Limited settings in basic courses
  • Restrictive assumptions

This course:

  • Wider settings
  • Relaxed assumptions on independence and variance homogeneity

Tool: Mixed Linear (Normal) Models

Provides the basis for continuing to:

  • Non-normal data (including binary, categorical and ordinal scales)
  • Non-linear models

Modules of the course

Module 1: General introduction to mixed models. Randomized blocks design.
Module 2: Factor structure diagrams.
Module 3: Drying of beech wood - a case study, part I.
Module 4: Mixed model theory, part I.
Module 5: Hierarchical random effects.
Module 6: Model diagnostics.
Module 7: The analysis of split-plot design data.
Module 8: Analysis of covariance.
Module 9: Random coefficient models.
Module 10: Mixed model theory, part II.
Module 11: Repeated measures, simple methods.
Module 12: Repeated measures, advanced methods.

Content of Module 1:

  • Course material preface
  • Introductory example
  • Randomized blocks design
  • Random versus fixed block effect
  • Comparison of fixed and mixed model
  • Example with missing values
  • Why use mixed models?
  • R introduction

Introductory example: NIR predictions of HPLC measurements

Data:

             HPLC    NIR   Difference
Tablet 1     10.4   10.1      0.3
Tablet 2     10.6   10.8     -0.2
Tablet 3     10.2   10.2      0.0
Tablet 4     10.1    9.9      0.2
Tablet 5     10.3   11.0     -0.7
Tablet 6     10.7   10.5      0.2
Tablet 7     10.3   10.2      0.1
Tablet 8     10.9   10.9      0.0
Tablet 9     10.1   10.4     -0.3
Tablet 10     9.8    9.9     -0.1

Aim: Study the method differences.

Simple analysis by paired t-test

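The uncertainty of the estimated difference:
$$SE_{\bar d} = \frac{s_d}{\sqrt{n}} = \frac{0.2953}{\sqrt{10}} = 0.0934$$

Hypothesis test:
$$t = \frac{\bar d}{SE_{\bar d}} = \frac{-0.05}{0.0934} = -0.535$$

A 95% confidence interval:
$$\bar d \pm t_{0.975}(9)\, SE_{\bar d} \iff -0.05 \pm 0.21$$

Result: No significant method difference (P-value = 0.61).

The statistical model:
$$d_i = \mu + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2)$$

Model parameter estimates: $\hat\mu = \bar d$, $\hat\sigma = s_d$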
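The paired t-test numbers can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative, and the quantile $t_{0.975}(9) = 2.262$ is hard-coded from a standard t-table rather than computed.

```python
import math
import statistics

# HPLC and NIR values for tablets 1-10, copied from the data table.
hplc = [10.4, 10.6, 10.2, 10.1, 10.3, 10.7, 10.3, 10.9, 10.1, 9.8]
nir = [10.1, 10.8, 10.2, 9.9, 11.0, 10.5, 10.2, 10.9, 10.4, 9.9]

# Per-tablet differences (HPLC minus NIR), as in the "Difference" column.
diffs = [h - x for h, x in zip(hplc, nir)]
n = len(diffs)

d_bar = statistics.mean(diffs)   # estimated mean difference
s_d = statistics.stdev(diffs)    # sample standard deviation of the differences
se_d = s_d / math.sqrt(n)        # standard error of the mean difference
t_stat = d_bar / se_d            # paired t statistic with n - 1 = 9 df

# Assumption: t_{0.975}(9) taken from a standard t-table.
T_975_DF9 = 2.262
half_width = T_975_DF9 * se_d    # half-width of the 95% confidence interval

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.4f}, SE = {se_d:.4f}")
print(f"t = {t_stat:.3f}, 95% CI: {d_bar:.2f} +/- {half_width:.2f}")
```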


Simple analysis by an ANOVA approach

Randomized blocks setup with 10 "blocks" (the tablets) and two "treatments" (the methods):
$$y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}, \quad \varepsilon_{ij} \sim N(0, \sigma^2)$$

Same analysis as the paired t-test:

Source of    Degrees of   Sums of    Mean
variation    freedom      squares    squares    F       P
Tablets      9            2.0005     0.2223     5.10    0.0118
Methods      1            0.0125     0.0125     0.29    0.6054
Residual     9            0.3925     0.0436

The uncertainty of the average method difference:
$$SE(\bar y_2 - \bar y_1) = \sqrt{\hat\sigma^2 \left(\frac{1}{10} + \frac{1}{10}\right)} = 0.0934$$
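The sums of squares in the randomized blocks table can be recomputed directly from the data. The sketch below uses only the Python standard library, with illustrative variable names; it does not compute the P-values, which would need an F-distribution routine.

```python
import math

# HPLC and NIR values for tablets 1-10, copied from the data table.
hplc = [10.4, 10.6, 10.2, 10.1, 10.3, 10.7, 10.3, 10.9, 10.1, 9.8]
nir = [10.1, 10.8, 10.2, 9.9, 11.0, 10.5, 10.2, 10.9, 10.4, 9.9]

rows = list(zip(hplc, nir))          # one row per tablet, one column per method
n_tab, n_met = len(rows), 2

grand = sum(sum(r) for r in rows) / (n_tab * n_met)
tablet_means = [sum(r) / n_met for r in rows]
method_means = [sum(r[j] for r in rows) / n_tab for j in range(n_met)]

# Sums of squares for the two-way layout without replication.
ss_tablets = n_met * sum((m - grand) ** 2 for m in tablet_means)
ss_methods = n_tab * sum((m - grand) ** 2 for m in method_means)
ss_total = sum((y - grand) ** 2 for r in rows for y in r)
ss_resid = ss_total - ss_tablets - ss_methods

df_tablets, df_methods = n_tab - 1, n_met - 1
df_resid = df_tablets * df_methods

ms_tablets = ss_tablets / df_tablets
ms_methods = ss_methods / df_methods
ms_resid = ss_resid / df_resid

f_tablets = ms_tablets / ms_resid
f_methods = ms_methods / ms_resid

# Standard error of the difference between the two method averages.
se_diff = math.sqrt(ms_resid * (1 / n_tab + 1 / n_tab))

print(f"Tablets : SS = {ss_tablets:.4f}, MS = {ms_tablets:.4f}, F = {f_tablets:.2f}")
print(f"Methods : SS = {ss_methods:.4f}, MS = {ms_methods:.4f}, F = {f_methods:.2f}")
print(f"Residual: SS = {ss_resid:.4f}, MS = {ms_resid:.4f}")
print(f"SE(difference) = {se_diff:.4f}")
```

The standard error agrees with the paired t-test because the residual mean square is half the variance of the per-tablet differences.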


The problem of the ANOVA approach

The uncertainty of the average NIR value within the model:
$$SE(\bar y_1) = \frac{\hat\sigma}{\sqrt{10}} = 0.066$$

This is "wrong"! The tablet-to-tablet variation is ignored.

Using ONLY the 10 NIR-values gives:
$$s_1 = 0.4012, \quad SE(\bar y_1) = \frac{s_1}{\sqrt{10}} = 0.127$$

The ANOVA approach:

  • Is valid only for statements about the 10 specific tablets in the experiment.

The 2nd approach:

  • Considers the 10 tablets as a random sample. (But ignores the information from the HPLC measurements.)
  • Is valid for tablets in general.
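The two competing standard errors for the average NIR value can be put side by side. A small sketch, assuming the residual mean square 0.0436 is taken as given from the ANOVA table; the names are illustrative.

```python
import math
import statistics

# NIR values for tablets 1-10, copied from the data table.
nir = [10.1, 10.8, 10.2, 9.9, 11.0, 10.5, 10.2, 10.9, 10.4, 9.9]
n = len(nir)

# Residual mean square from the randomized blocks ANOVA table.
MS_RESIDUAL = 0.0436

# Within the fixed-effects ANOVA model: only the residual variance enters.
se_anova = math.sqrt(MS_RESIDUAL / n)

# Treating the 10 NIR values as a simple random sample:
# the tablet-to-tablet variation is now part of the spread.
s1 = statistics.stdev(nir)
se_sample = s1 / math.sqrt(n)

print(f"SE within the ANOVA model : {se_anova:.3f}")
print(f"s1 = {s1:.4f}, SE from NIR values only: {se_sample:.3f}")
```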


The solution:

Combine the random sample assumption with a model for the entire data set!

The mixed linear model considers the tablet differences as random effects:
$$y_{ij} = \mu + a_i + \beta_j + \varepsilon_{ij}, \quad \varepsilon_{ij} \sim N(0, \sigma^2), \quad a_i \sim N(0, \sigma_T^2)$$

Note: Statements about method DIFFERENCES are the same for the two kinds of validity.

Comparison of fixed and mixed model

A model can be characterized by three features:

1. The expected value of the ijth observation $y_{ij}$
2. The variance of the ijth observation $y_{ij}$
3. The relation between two different observations (covariance/correlation)

A comparison of the two models:

                                                    Fixed model                   Mixed model
1. $E(y_{ij})$                                      $\mu + \alpha_i + \beta_j$    $\mu + \beta_j$
2. $\mathrm{var}(y_{ij})$                           $\sigma^2$                    $\sigma_T^2 + \sigma^2$
3. $\mathrm{cov}(y_{ij}, y_{i'j'})$, $j \neq j'$    $0$                           $\sigma_T^2$ (if $i = i'$), $0$ (if $i \neq i'$)

In summary (for the mixed model):

  • The tablet differences become a part of the variance structure.
  • The observations are no longer independent.
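The covariance structures of the two models can be written as small functions. This is an illustrative sketch: the function names are hypothetical, and the numeric values plugged in are the variance component estimates reported for this example ($\hat\sigma_T^2 = 0.0894$, $\hat\sigma^2 = 0.0436$), used here only to make the structure concrete.

```python
def cov_fixed(i, j, i2, j2, sigma2):
    """Covariance between y_ij and y_i2j2 in the fixed-effects model."""
    # Only an observation with itself has nonzero (co)variance.
    return sigma2 if (i, j) == (i2, j2) else 0.0


def cov_mixed(i, j, i2, j2, sigma2_t, sigma2):
    """Covariance between y_ij and y_i2j2 in the mixed model."""
    if i != i2:
        return 0.0                 # different tablets: independent
    if j == j2:
        return sigma2_t + sigma2   # same observation: total variance
    return sigma2_t                # same tablet, different method


# Illustrative values: the estimates reported for the tablet example.
SIGMA2_T, SIGMA2 = 0.0894, 0.0436

# Correlation between the two measurements on the same tablet.
within_tablet_corr = (
    cov_mixed(1, 1, 1, 2, SIGMA2_T, SIGMA2) / cov_mixed(1, 1, 1, 1, SIGMA2_T, SIGMA2)
)

print(f"var(y_ij), mixed model      : {cov_mixed(1, 1, 1, 1, SIGMA2_T, SIGMA2):.4f}")
print(f"cov within tablet, mixed    : {cov_mixed(1, 1, 1, 2, SIGMA2_T, SIGMA2):.4f}")
print(f"cov between tablets, mixed  : {cov_mixed(1, 1, 2, 1, SIGMA2_T, SIGMA2):.4f}")
print(f"cov within tablet, fixed    : {cov_fixed(1, 1, 1, 2, SIGMA2):.4f}")
print(f"within-tablet correlation   : {within_tablet_corr:.3f}")
```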


Analysis by mixed model

ANOVA table as for the fixed model:

Source of    Degrees of   Sums of    Mean
variation    freedom      squares    squares    E(MS)
Tablets      9            2.0005     0.2223     $2\sigma_T^2 + \sigma^2$
Methods      1            0.0125     0.0125     $\sigma^2 + 10 \sum_j \beta_j^2$
Residual     9            0.3925     0.0436     $\sigma^2$

Variance components estimated by:
$$\hat\sigma^2 = 0.0436, \quad \hat\sigma_T^2 = \frac{0.2223 - 0.0436}{2} = 0.0894$$

The uncertainty of the average NIR-value in the mixed model is:
$$SE(\bar y_1) = \frac{\sqrt{\hat\sigma_T^2 + \hat\sigma^2}}{\sqrt{10}} = \frac{\sqrt{0.0894 + 0.0436}}{\sqrt{10}} = 0.115$$
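The variance component estimates follow from equating the observed mean squares to their expectations. A minimal sketch of that arithmetic, taking the two mean squares from the ANOVA table as given; the names are illustrative.

```python
import math

# Mean squares from the ANOVA table.
MS_TABLETS = 0.2223
MS_RESIDUAL = 0.0436
N_METHODS = 2    # each tablet is measured by two methods
N_TABLETS = 10

# E(MS_residual) = sigma^2
sigma2_hat = MS_RESIDUAL

# E(MS_tablets) = 2 * sigma_T^2 + sigma^2, solved for sigma_T^2.
sigma2_t_hat = (MS_TABLETS - MS_RESIDUAL) / N_METHODS

# Standard error of the average NIR value in the mixed model:
# each observation has variance sigma_T^2 + sigma^2, and the
# 10 NIR values come from 10 different (independent) tablets.
se_nir_mean = math.sqrt((sigma2_t_hat + sigma2_hat) / N_TABLETS)

print(f"sigma^2 estimate   : {sigma2_hat:.4f}")
print(f"sigma_T^2 estimate : {sigma2_t_hat:.4f}")
print(f"SE(mean NIR)       : {se_nir_mean:.3f}")
```

The result, about 0.115, is close to the 0.127 obtained from the NIR values alone, and far from the 0.066 of the fixed-effects model.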


Example with missing values

Data:

             HPLC    NIR   Difference
Tablet 1     10.4   10.1      0.3
Tablet 2     10.6   10.8     -0.2
Tablet 3     10.2   10.2      0.0
Tablet 4     10.1    9.9      0.2
Tablet 5     10.3   11.0     -0.7
Tablet 6     10.7   10.5      0.2
Tablet 7     10.3   10.2      0.1
Tablet 8     10.9   10.9      0.0
Tablet 9     10.1   10.4     -0.3
Tablet 10     9.8    9.9     -0.1

Tablets 11-20 have only one measurement each:

Tablet 11    10.8
Tablet 12     9.8
Tablet 13    10.5
Tablet 14    10.3
Tablet 15     9.7
Tablet 16    10.3
Tablet 17     9.6
Tablet 18    10.0
Tablet 19    10.2
Tablet 20     9.9

Analysis by fixed effects ANOVA

ANOVA table:

Source of    Degrees of   Sums of    Mean
variation    freedom      squares    squares    F       P
Tablets      19           3.7230     0.1959     4.49    0.0129
Methods      1            0.0125     0.0125     0.29    0.6054
Residual     9            0.3925     0.0436

The results for the method differences are EXACTLY as before:
$$SE(\hat\beta_2 - \hat\beta_1) = \sqrt{\hat\sigma^2 \left(\frac{1}{10} + \frac{1}{10}\right)} = 0.0934$$

The information in Tablets 11-20 is NOT used!

Analysis by mixed model

The results (as given by PROC MIXED in SAS):
$$\hat\sigma^2 = 0.0435, \quad \hat\sigma_T^2 = 0.1019$$
$$\hat\beta_2 - \hat\beta_1 = -0.07211, \quad SE(\hat\beta_2 - \hat\beta_1) = 0.0870$$

Information from all data is used!

Consider ONLY tablets 11-20: Two (independent) samples t-test setup with 5 tablets in each group:
$$\bar y_1 = 10.22, \quad s_1 = 0.4658, \quad \bar y_2 = 10.00, \quad s_2 = 0.2739$$

The results from the two separate analyses can be summarized as:

              Tablets 1-10   Tablets 11-20
Difference    -0.05          -0.22
SE^2          0.00872        0.0584

The mixed model analysis combines these two sets of information!
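One way to see how two such estimates can be combined is a precision-weighted average, where each difference is weighted by the inverse of its squared standard error. The sketch below is a back-of-envelope check under stated assumptions, not the REML computation that PROC MIXED performs: it assumes both differences are oriented the same way (as in the summary table) and that tablets 11-15 and 16-20 form the two groups. The result lands close to the PROC MIXED values quoted above.

```python
import math
import statistics

# Tablets 1-10: paired data, copied from the data table.
hplc = [10.4, 10.6, 10.2, 10.1, 10.3, 10.7, 10.3, 10.9, 10.1, 9.8]
nir = [10.1, 10.8, 10.2, 9.9, 11.0, 10.5, 10.2, 10.9, 10.4, 9.9]
paired_diffs = [h - x for h, x in zip(hplc, nir)]

est_paired = statistics.mean(paired_diffs)
var_paired = statistics.variance(paired_diffs) / len(paired_diffs)  # SE^2

# Tablets 11-20: one measurement each.
# Assumption: tablets 11-15 form group 1 and tablets 16-20 form group 2.
group_1 = [10.8, 9.8, 10.5, 10.3, 9.7]
group_2 = [10.3, 9.6, 10.0, 10.2, 9.9]

est_unpaired = statistics.mean(group_2) - statistics.mean(group_1)
var_unpaired = (
    statistics.variance(group_1) / len(group_1)
    + statistics.variance(group_2) / len(group_2)
)  # SE^2 of the two-sample difference

# Inverse-variance (precision) weights.
w_paired = 1.0 / var_paired
w_unpaired = 1.0 / var_unpaired

combined = (w_paired * est_paired + w_unpaired * est_unpaired) / (w_paired + w_unpaired)
se_combined = math.sqrt(1.0 / (w_paired + w_unpaired))

print(f"Tablets 1-10 : difference = {est_paired:.2f}, SE^2 = {var_paired:.5f}")
print(f"Tablets 11-20: difference = {est_unpaired:.2f}, SE^2 = {var_unpaired:.4f}")
print(f"Combined     : difference = {combined:.4f}, SE = {se_combined:.4f}")
```

The paired estimate gets most of the weight because its squared standard error is much smaller, which is why the combined difference stays close to -0.05.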


Why use mixed models?

  • To avoid mistakes.
  • To broaden the statistical inference.
  • To recover all relevant information in the data.
  • To be able to handle correlation structures in the data.
  • To be able to handle non-homogeneous variances.
  • To use the only reasonable model for the data!