SLIDE 1

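Simultaneous confidence statements and multiple comparisons (cf. section 5.4)

We assume that $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ are i.i.d. $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$.

We have

$$\mathbf{S} = \frac{1}{n-1} \sum_{j=1}^{n} (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})', \qquad \bar{\mathbf{X}} = \frac{1}{n} \sum_{j=1}^{n} \mathbf{X}_j$$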

Consider Hotelling's statistic

$$T^2 = n(\bar{\mathbf{X}} - \boldsymbol{\mu})' \mathbf{S}^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu})$$

$T^2$ is distributed as $\frac{(n-1)p}{n-p} F_{p,n-p}$, where $F_{p,n-p}$ is F-distributed with $p$ and $n-p$ d.f.
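As an illustration of the quantities defined so far, the following is a minimal sketch that computes the sample mean vector, the sample covariance matrix and Hotelling's $T^2$ with NumPy. The function name `hotelling_t2` and the toy data are hypothetical and not part of the slides.

```python
import numpy as np

def hotelling_t2(X, mu0):
    """Return (xbar, S, T2) for an n x p data matrix X and a mean vector mu0."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    xbar = X.mean(axis=0)                 # sample mean vector
    S = np.cov(X, rowvar=False, ddof=1)   # sample covariance, divisor n - 1
    d = xbar - np.asarray(mu0, dtype=float)
    T2 = n * d @ np.linalg.solve(S, d)    # n (xbar - mu0)' S^{-1} (xbar - mu0)
    return xbar, S, T2

# Hypothetical toy data: n = 4 observations on p = 2 variables.
X_demo = np.array([[1.0, 2.0], [3.0, 1.0], [2.0, 4.0], [4.0, 3.0]])
xbar_demo, S_demo, T2_demo = hotelling_t2(X_demo, mu0=[2.0, 2.0])
```

Solving the linear system with `np.linalg.solve` avoids forming $\mathbf{S}^{-1}$ explicitly, which is usually the more stable choice.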

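From this we find that a 100(1-α)% confidence region for the mean vector $\boldsymbol{\mu}$ is the ellipsoid determined by all $\boldsymbol{\mu}$ such that

$$n(\bar{\mathbf{X}} - \boldsymbol{\mu})' \mathbf{S}^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu}) \le \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha)$$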
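Whether a candidate mean vector lies inside this ellipsoid can be checked directly from the inequality. The sketch below assumes that $F_{p,n-p}(\alpha)$ denotes the upper $\alpha$ quantile of the F distribution (the usual textbook convention, not stated explicitly in the slides) and uses `scipy.stats.f.ppf` to obtain it. The function name `in_confidence_region` is hypothetical.

```python
import numpy as np
from scipy import stats

def in_confidence_region(X, mu0, alpha=0.05):
    """True if mu0 lies inside the 100(1 - alpha)% confidence ellipsoid."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    d = X.mean(axis=0) - np.asarray(mu0, dtype=float)
    S = np.cov(X, rowvar=False, ddof=1)
    T2 = n * d @ np.linalg.solve(S, d)
    # Assumption: F_{p,n-p}(alpha) is the upper alpha quantile.
    crit = (n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p)
    return bool(T2 <= crit)
```

The centre of the ellipsoid, $\bar{\mathbf{X}}$, always satisfies the inequality because the left-hand side is then zero.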

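We will see how this confidence region can be used to derive simultaneous confidence intervals for linear combinations $\mathbf{a}'\boldsymbol{\mu}$ of the mean vector.

Let $\mathbf{a}$ be a p-dimensional vector and define

$$Z_j = \mathbf{a}'\mathbf{X}_j = a_1 X_{j1} + a_2 X_{j2} + \cdots + a_p X_{jp}, \qquad j = 1, 2, \ldots, n$$

The $Z_j$ are i.i.d. random variables, and

$$Z_j \sim N(\mu_z, \sigma_z^2)$$

with

$$\mu_z = \mathbf{a}'\boldsymbol{\mu} \quad \text{and} \quad \sigma_z^2 = \mathbf{a}'\boldsymbol{\Sigma}\mathbf{a}$$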
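The sample versions of these two identities hold exactly: the mean of the $Z_j$ equals $\mathbf{a}'\bar{\mathbf{X}}$ and their sample variance equals $\mathbf{a}'\mathbf{S}\mathbf{a}$. A short numerical check, using simulated data and an arbitrary vector `a` (both are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(25, 3))      # hypothetical sample: n = 25, p = 3
a = np.array([1.0, -2.0, 0.5])    # arbitrary coefficient vector

Z = X @ a                          # Z_j = a' X_j for j = 1, ..., n
xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False, ddof=1)

z_mean = Z.mean()                  # agrees with a' xbar
z_var = Z.var(ddof=1)              # agrees with a' S a
```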

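A 100(1-α)% confidence interval for $\mu_z = \mathbf{a}'\boldsymbol{\mu}$ for a given vector $\mathbf{a}$ is based on the t-statistic

$$t = \frac{\bar{Z} - \mu_z}{S_z/\sqrt{n}} = \frac{\sqrt{n}\,(\mathbf{a}'\bar{\mathbf{X}} - \mathbf{a}'\boldsymbol{\mu})}{\sqrt{\mathbf{a}'\mathbf{S}\mathbf{a}}}$$

The 100(1-α)% confidence interval for $\mu_z = \mathbf{a}'\boldsymbol{\mu}$ becomes

$$\bar{Z} - t_{n-1}(\alpha/2)\,\frac{S_z}{\sqrt{n}} \le \mu_z \le \bar{Z} + t_{n-1}(\alpha/2)\,\frac{S_z}{\sqrt{n}}$$

or

$$\mathbf{a}'\bar{\mathbf{X}} - t_{n-1}(\alpha/2)\,\frac{\sqrt{\mathbf{a}'\mathbf{S}\mathbf{a}}}{\sqrt{n}} \le \mathbf{a}'\boldsymbol{\mu} \le \mathbf{a}'\bar{\mathbf{X}} + t_{n-1}(\alpha/2)\,\frac{\sqrt{\mathbf{a}'\mathbf{S}\mathbf{a}}}{\sqrt{n}}$$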

For example we may let $\mathbf{a} = [1, 0, 0, \ldots, 0]'$ to obtain a confidence interval for $\mu_1$.

Or we may let $\mathbf{a} = [-1, 1, 0, \ldots, 0]'$ to obtain a confidence interval for $\mu_2 - \mu_1$.
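The one-at-a-time interval for $\mathbf{a}'\boldsymbol{\mu}$ and the two choices of $\mathbf{a}$ mentioned here can be sketched as follows. The helper `t_interval` and the simulated data are hypothetical. `scipy.stats.t.ppf(1 - alpha / 2, n - 1)` is used for $t_{n-1}(\alpha/2)$, read as an upper quantile.

```python
import numpy as np
from scipy import stats

def t_interval(X, a, alpha=0.05):
    """One-at-a-time 100(1 - alpha)% t-interval for a' mu."""
    X = np.asarray(X, dtype=float)
    a = np.asarray(a, dtype=float)
    n = X.shape[0]
    centre = a @ X.mean(axis=0)
    S = np.cov(X, rowvar=False, ddof=1)
    half = stats.t.ppf(1 - alpha / 2, n - 1) * np.sqrt(a @ S @ a / n)
    return centre - half, centre + half

rng = np.random.default_rng(1)
X = rng.normal(loc=[0.0, 1.0, 2.0], size=(30, 3))   # hypothetical data, p = 3
ci_mu1 = t_interval(X, [1, 0, 0])     # interval for mu_1
ci_diff = t_interval(X, [-1, 1, 0])   # interval for mu_2 - mu_1
```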

SLIDE 2

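For illustration we will consider the t-intervals for the p means for the particular case where $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ are i.i.d. $N_p(\boldsymbol{\mu}, \sigma^2 \mathbf{I})$.

For $\mathbf{a} = [1, 0, 0, \ldots, 0]'$ we obtain the standard t-interval:

$$\bar{X}_1 - t_{n-1}(\alpha/2)\sqrt{\frac{s_{11}}{n}} \le \mu_1 \le \bar{X}_1 + t_{n-1}(\alpha/2)\sqrt{\frac{s_{11}}{n}}$$

Since the p components are independent in this case, the p t-intervals are independent. Then

$$P(\text{all } t\text{-intervals contain the } \mu_i\text{'s}) = (1-\alpha)(1-\alpha)\cdots(1-\alpha) = (1-\alpha)^p$$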
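To get a feel for how quickly the joint coverage $(1-\alpha)^p$ drops, it can be evaluated for a few values of p. This is a small sketch with $\alpha = 0.05$ chosen for illustration; the function name `joint_coverage` is hypothetical.

```python
def joint_coverage(alpha, p):
    """P(all p independent 100(1 - alpha)% intervals cover their targets)."""
    return (1.0 - alpha) ** p

for p in (2, 5, 10):
    print(f"p = {p:2d}: joint coverage = {joint_coverage(0.05, p):.3f}")
```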

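This joint coverage probability $(1-\alpha)^p$ will be (much) smaller than $1-\alpha$.

For a given vector $\mathbf{a}$ the confidence interval given above is the set of values $\mathbf{a}'\boldsymbol{\mu}$ for which

$$|t| = \left| \frac{\sqrt{n}\,(\mathbf{a}'\bar{\mathbf{X}} - \mathbf{a}'\boldsymbol{\mu})}{\sqrt{\mathbf{a}'\mathbf{S}\mathbf{a}}} \right| \le t_{n-1}(\alpha/2)$$

or equivalently

$$t^2 = \frac{n\,(\mathbf{a}'\bar{\mathbf{X}} - \mathbf{a}'\boldsymbol{\mu})^2}{\mathbf{a}'\mathbf{S}\mathbf{a}} \le t_{n-1}^2(\alpha/2)$$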

We will derive confidence intervals that hold simultaneously for all possible choices of $\mathbf{a}$ with overall confidence coefficient 100(1-α)%. We then have to replace $t_{n-1}^2(\alpha/2)$ by a larger value.

We are then led to the determination of

$$\max_{\mathbf{a}} t^2 = \max_{\mathbf{a}} \frac{n\,(\mathbf{a}'(\bar{\mathbf{X}} - \boldsymbol{\mu}))^2}{\mathbf{a}'\mathbf{S}\mathbf{a}}$$

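Using the maximization lemma we obtain

$$\max_{\mathbf{a}} t^2 = n \max_{\mathbf{a}} \frac{(\mathbf{a}'(\bar{\mathbf{X}} - \boldsymbol{\mu}))^2}{\mathbf{a}'\mathbf{S}\mathbf{a}} = n(\bar{\mathbf{X}} - \boldsymbol{\mu})' \mathbf{S}^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu}) = T^2$$

with the maximum occurring for $\mathbf{a}$ proportional to $\mathbf{S}^{-1}(\bar{\mathbf{X}} - \boldsymbol{\mu})$.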
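The maximization result can be checked numerically: no direction $\mathbf{a}$ gives a larger $t^2$ than $T^2$, and the bound is attained at any multiple of $\mathbf{S}^{-1}(\bar{\mathbf{X}} - \boldsymbol{\mu})$. The following sketch uses simulated data with a known mean vector; all names and values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 20, 3
X = rng.normal(size=(n, p))           # hypothetical sample
mu = np.zeros(p)                      # mean vector used to generate X
d = X.mean(axis=0) - mu
S = np.cov(X, rowvar=False, ddof=1)

def t2(a):
    """Squared t-statistic n (a'(xbar - mu))^2 / (a' S a) for direction a."""
    return n * (a @ d) ** 2 / (a @ S @ a)

T2 = n * d @ np.linalg.solve(S, d)    # Hotelling's T^2
a_star = np.linalg.solve(S, d)        # maximising direction S^{-1}(xbar - mu)

# Largest t^2 over 1000 random directions; it should never exceed T^2.
random_best = max(t2(a) for a in rng.normal(size=(1000, p)))
```

Because $t^2$ is unchanged when $\mathbf{a}$ is rescaled, `t2(3.0 * a_star)` gives the same value as `t2(a_star)`.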

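It follows that simultaneously for all $\mathbf{a}$, the interval

$$\left( \mathbf{a}'\bar{\mathbf{X}} - \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)}\,\sqrt{\frac{\mathbf{a}'\mathbf{S}\mathbf{a}}{n}},\;\; \mathbf{a}'\bar{\mathbf{X}} + \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)}\,\sqrt{\frac{\mathbf{a}'\mathbf{S}\mathbf{a}}{n}} \right)$$

will contain $\mathbf{a}'\boldsymbol{\mu}$ with probability $1-\alpha$.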

In particular for $\mathbf{a} = [1, 0, 0, \ldots, 0]'$ one obtains the following confidence interval for $\mu_1$ (similarly for the other means):

$$\left( \bar{X}_1 - \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)}\,\sqrt{\frac{s_{11}}{n}},\;\; \bar{X}_1 + \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)}\,\sqrt{\frac{s_{11}}{n}} \right)$$

For p=2 the simultaneous confidence intervals for the means are the projections of the confidence ellipse (a similar result holds in higher dimensions).
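A minimal sketch of the simultaneous ($T^2$) interval for an arbitrary $\mathbf{a}$ is given below. It assumes that $F_{p,n-p}(\alpha)$ is the upper $\alpha$ quantile, obtained with `scipy.stats.f.ppf`. The function name `t2_interval` is hypothetical.

```python
import numpy as np
from scipy import stats

def t2_interval(X, a, alpha=0.05):
    """Simultaneous (T^2) 100(1 - alpha)% interval for a' mu."""
    X = np.asarray(X, dtype=float)
    a = np.asarray(a, dtype=float)
    n, p = X.shape
    S = np.cov(X, rowvar=False, ddof=1)
    # Squared multiplier p (n - 1) / (n - p) * F_{p,n-p}(alpha).
    c2 = p * (n - 1) / (n - p) * stats.f.ppf(1 - alpha, p, n - p)
    half = np.sqrt(c2 * (a @ S @ a) / n)
    centre = a @ X.mean(axis=0)
    return centre - half, centre + half
```

Calling it with the unit vectors, for example `t2_interval(X, [1, 0, 0])` for p = 3, gives the intervals for the individual means.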

SLIDE 3

Note that while the “one-at-a-time” confidence intervals multiply $\sqrt{\mathbf{a}'\mathbf{S}\mathbf{a}/n}$ by $t_{n-1}(\alpha/2)$, the simultaneous confidence intervals have to multiply it by

$$\sqrt{\frac{(n-1)p}{n-p} F_{p,n-p}(\alpha)}$$

Ratio for p = 2, 5 and 10:
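The ratio of the two multipliers can be tabulated numerically. This is a sketch under assumptions: $\alpha = 0.05$ and the sample sizes n below are illustrative choices that are not given in the text, and the quantiles are taken as upper quantiles via `scipy.stats`. The function names are hypothetical.

```python
import numpy as np
from scipy import stats

def t2_multiplier(n, p, alpha=0.05):
    """sqrt((n - 1) p / (n - p) * F_{p,n-p}(alpha)), the simultaneous multiplier."""
    return np.sqrt((n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p))

def t_multiplier(n, alpha=0.05):
    """t_{n-1}(alpha / 2), the one-at-a-time multiplier."""
    return stats.t.ppf(1 - alpha / 2, n - 1)

for p in (2, 5, 10):
    for n in (15, 25, 50, 100):    # sample sizes are an assumption
        ratio = t2_multiplier(n, p) / t_multiplier(n)
        print(f"p = {p:2d}, n = {n:3d}: ratio = {ratio:.2f}")
```

For p = 1 the two multipliers coincide, since $F_{1,n-1}(\alpha) = t_{n-1}^2(\alpha/2)$.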

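If we are only interested in a fixed number of linear combinations $\mathbf{a}_1'\boldsymbol{\mu}, \mathbf{a}_2'\boldsymbol{\mu}, \ldots, \mathbf{a}_m'\boldsymbol{\mu}$ we may use the Bonferroni method.

For $\mathbf{a}_i'\boldsymbol{\mu}$ we then use the modified t-interval

$$\left( \mathbf{a}_i'\bar{\mathbf{X}} - t_{n-1}(\alpha_i/2)\sqrt{\frac{\mathbf{a}_i'\mathbf{S}\mathbf{a}_i}{n}},\;\; \mathbf{a}_i'\bar{\mathbf{X}} + t_{n-1}(\alpha_i/2)\sqrt{\frac{\mathbf{a}_i'\mathbf{S}\mathbf{a}_i}{n}} \right)$$

Let $C_i$ denote the confidence statement about $\mathbf{a}_i'\boldsymbol{\mu}$. Then

$$P(C_i \text{ is true}) = 1 - \alpha_i$$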

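We obtain

$$\begin{aligned}
P(\text{all } C_i \text{ are true}) &= 1 - P(\text{at least one } C_i \text{ is false}) \\
&= 1 - P(\{C_1 \text{ is false}\} \cup \{C_2 \text{ is false}\} \cup \cdots \cup \{C_m \text{ is false}\}) \\
&\ge 1 - \sum_{i=1}^{m} P(C_i \text{ is false}) \\
&= 1 - (\alpha_1 + \alpha_2 + \cdots + \alpha_m)
\end{aligned}$$

If we let $\alpha_i = \alpha/m$ we obtain

$$P(\text{all } C_i \text{ are true}) \ge 1 - \alpha$$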
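The bound can be illustrated with a small calculation. The sketch below compares the Bonferroni lower bound with the exact joint coverage in the special case of independent statements, where the product formula applies. The values $\alpha = 0.05$ and $m = 5$ and the function names are illustrative assumptions.

```python
def bonferroni_lower_bound(alphas):
    """Lower bound 1 - (alpha_1 + ... + alpha_m) on P(all C_i are true)."""
    return 1.0 - sum(alphas)

def independent_coverage(alphas):
    """Exact joint coverage in the special case of independent statements."""
    prob = 1.0
    for a_i in alphas:
        prob *= 1.0 - a_i
    return prob

alpha, m = 0.05, 5
alphas = [alpha / m] * m                      # alpha_i = alpha / m
bound = bonferroni_lower_bound(alphas)        # 1 - alpha, up to rounding
exact_indep = independent_coverage(alphas)    # (1 - alpha / m)^m
```

The bound itself requires no independence. The independent case is shown only because its exact coverage is easy to compute and lies slightly above $1 - \alpha$.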

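In particular the Bonferroni confidence intervals for the p means become

$$\left( \bar{X}_i - t_{n-1}\!\left(\frac{\alpha}{2p}\right)\sqrt{\frac{s_{ii}}{n}},\;\; \bar{X}_i + t_{n-1}\!\left(\frac{\alpha}{2p}\right)\sqrt{\frac{s_{ii}}{n}} \right), \qquad i = 1, \ldots, p$$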
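These intervals are straightforward to compute. A minimal sketch, assuming $t_{n-1}(\alpha/2p)$ is the upper quantile returned by `scipy.stats.t.ppf(1 - alpha / (2 * p), n - 1)`. The function name `bonferroni_mean_intervals` is hypothetical.

```python
import numpy as np
from scipy import stats

def bonferroni_mean_intervals(X, alpha=0.05):
    """Bonferroni intervals for the p component means, each at level alpha / p."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)
    s_ii = X.var(axis=0, ddof=1)                      # diagonal of S
    t_crit = stats.t.ppf(1 - alpha / (2 * p), n - 1)  # t_{n-1}(alpha / 2p)
    half = t_crit * np.sqrt(s_ii / n)
    return np.column_stack([xbar - half, xbar + half])
```

Row i of the returned array holds the lower and upper limits for $\mu_i$.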

SLIDE 4

Note that while the simultaneous confidence intervals multiply $\sqrt{\mathbf{a}'\mathbf{S}\mathbf{a}/n}$ by

$$\sqrt{\frac{(n-1)p}{n-p} F_{p,n-p}(\alpha)}$$

the Bonferroni intervals multiply it by $t_{n-1}(\alpha/2m)$.

Ratio for p = 2, 5 and 10 (when m = p):
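The ratio of the Bonferroni multiplier to the simultaneous multiplier, with m = p, can be computed in the same way. As before, $\alpha = 0.05$ and the sample sizes are illustrative assumptions, and the function name `length_ratio` is hypothetical.

```python
import numpy as np
from scipy import stats

def length_ratio(n, p, alpha=0.05):
    """Bonferroni multiplier t_{n-1}(alpha / 2m) over the T^2 multiplier, m = p."""
    m = p
    bonf = stats.t.ppf(1 - alpha / (2 * m), n - 1)
    t2 = np.sqrt((n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p))
    return bonf / t2

for p in (2, 5, 10):
    for n in (15, 25, 50, 100):    # sample sizes are an assumption
        print(f"p = {p:2d}, n = {n:3d}: ratio = {length_ratio(n, p):.2f}")
```

A ratio below 1 means the Bonferroni intervals are shorter than the corresponding simultaneous intervals for the same data.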