SLIDE 1

Institut für Medizinische Statistik

MODA 8, Almagro, Spain, June 2007, Dipl.-Stat. Sven Stanzel

The Within-B-Swap (BS) Design is A- and D-optimal for estimating the linear contrast for the treatment effect in 3-factorial cDNA microarray experiments

Dipl.-Stat. Sven Stanzel

Prof. Dr. rer. nat. Ralf-Dieter Hilgers

Institute for Medical Statistics, RWTH Aachen

8th International Workshop on Model-Oriented Design and Analysis, Almagro, Spain, June 4-8, 2007

SLIDE 2

Outline

Introduction
Statistical model
Within-B-Swap Design
Equivalence Theorems
Results
Discussion

SLIDE 3

Design

3-factorial experimental setting:

N arrays
2 colours (red/green) [ → repeated observations at each exp. unit]
L treatments
K cell lines
conditions A and B to be compared on each array can be chosen from the L⋅K possible combinations of L treatments and K cell lines

balanced incomplete block designs [BIBD]:
first blocking factor:

condition

second blocking factor: dye

SLIDE 4

Statistical model

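Fixed-effects, gene-specific linear model for log-ratios (Landgrebe et al., 2006):

\[ Z = X\theta + \varepsilon \tag{1} \]

where

$Z = (Z_1,\dots,Z_N)^T$: $N \times 1$ vector of log-ratios observed on $N$ arrays
$\varepsilon = (\varepsilon_1,\dots,\varepsilon_N)^T \sim (0_N, \sigma^2 I_N)$: residual vector
$\theta = (\delta_g, \delta_r, \tau_{11},\dots,\tau_{KL})^T$: $(KL+2) \times 1$ parameter vector
$\delta_g$, $\delta_r$: (fixed) dye effects of green, red dye
$\tau_{kl}$: (fixed) combination effect of cell line $k$ ($k = 1,\dots,K$) and treatment $l$ ($l = 1,\dots,L$)
$X = [\, 1_N \mid -1_N \mid \tilde{X} \,]$: $N \times (KL+2)$ (concrete) design matrix, where $\tilde{X}$ has entries

\[
\tilde{x}_{tkl} =
\begin{cases}
+1, & \text{array } t:\ \text{cell line } k,\ \text{treatment } l \leftrightarrow \text{green (dye labelling)} \\
-1, & \text{array } t:\ \text{cell line } k,\ \text{treatment } l \leftrightarrow \text{red (dye labelling)} \\
\phantom{+}0, & \text{array } t:\ \text{cell line } k,\ \text{treatment } l \leftrightarrow \text{not used}
\end{cases}
\qquad (t = 1,\dots,N;\ k = 1,\dots,K;\ l = 1,\dots,L)
\]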
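A minimal sketch of how one row of X in model (1) could be filled in. The function name `design_row` and the input format (1-based `(cell line, treatment)` pairs for the green and red sample) are illustrative assumptions, not part of the talk; the column order of the combination effects is assumed to be τ11, …, τ1L, τ21, …, τKL.

```python
def design_row(green, red, K, L):
    """Row of X = [1 | -1 | X~] for one array.

    green, red: 1-based (k, l) pairs (cell line k, treatment l).
    """
    x_tilde = [0] * (K * L)
    kg, lg = green
    kr, lr = red
    x_tilde[(kg - 1) * L + (lg - 1)] = 1    # green-labelled condition: +1
    x_tilde[(kr - 1) * L + (lr - 1)] = -1   # red-labelled condition: -1
    return [1, -1] + x_tilde                # leading columns: delta_g, delta_r


# K = 2 cell lines, L = 3 treatments: cell line 1 / treatment 1 (green)
# against cell line 1 / treatment 2 (red).
row = design_row((1, 1), (1, 2), K=2, L=3)
print(row)  # [1, -1, 1, -1, 0, 0, 0, 0]
```

Stacking one such row per array gives the N × (KL+2) design matrix.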

SLIDE 5

Linear contrasts

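in general:
contrast matrix: $C$ of size $(KL+2) \times q$; $\operatorname{Rg}(C) := r$, $r < KL+2$, $r \le q$
linear contrasts: $C^T\theta$, $C^T 1_{KL+2} = 0_q$

Linear contrast for treatment effect:

\[ C^T = \left[\, 0_{\binom{L}{2}} \;\middle|\; 0_{\binom{L}{2}} \;\middle|\; 1_K^T \otimes \tilde{A}_L \,\right] \]

$\tilde{A}_L$ of size $\binom{L}{2} \times L$: matrix specifying all pairwise comparisons of $L$ treatments, e.g.

\[
\tilde{A}_3 = \begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & -1 \end{bmatrix},
\qquad
\tilde{A}_4 = \begin{bmatrix}
1 & -1 & 0 & 0 \\
1 & 0 & -1 & 0 \\
1 & 0 & 0 & -1 \\
0 & 1 & -1 & 0 \\
0 & 1 & 0 & -1 \\
0 & 0 & 1 & -1
\end{bmatrix}
\]

useful result:

\[ \tilde{A}_L^T \tilde{A}_L = L \cdot J_L, \quad \text{where } J_L = I_L - \tfrac{1}{L}\, 1_L 1_L^T \ \text{(centering matrix)} \]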
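The "useful result" can be checked numerically. The sketch below builds the pairwise-comparison matrix for several L and compares its Gram matrix with L·J_L = L·I_L − 1 1ᵀ; the helper names are illustrative, and the row order (pairs i < j in lexicographic order) is an assumption that does not affect the identity.

```python
from itertools import combinations


def pairwise_matrix(L):
    """A~_L: one row per treatment pair (i < j), +1 in column i, -1 in column j."""
    rows = []
    for i, j in combinations(range(L), 2):
        row = [0] * L
        row[i], row[j] = 1, -1
        rows.append(row)
    return rows


def gram(A):
    """A^T A for a matrix given as a list of rows."""
    n = len(A[0])
    return [[sum(r[a] * r[b] for r in A) for b in range(n)] for a in range(n)]


def scaled_centering(L):
    """L * J_L with J_L = I_L - (1/L) 1 1^T, i.e. L * I_L - 1 1^T."""
    return [[L * int(a == b) - 1 for b in range(L)] for a in range(L)]


for L in (3, 4, 5):
    assert gram(pairwise_matrix(L)) == scaled_centering(L)

print(pairwise_matrix(3))  # [[1, -1, 0], [1, 0, -1], [0, 1, -1]]
```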

SLIDE 6

Within-B-Swap (BS) Design

often used in practical situations
$S = 2K\binom{L}{2}$ equally weighted support points (SP)
pairwise comparison of L treatments (factor A) within each cell line (factor B)
estimability

→ candidate design ξ*
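As a rough illustration, the support points of the BS design can be enumerated as ordered (green, red) pairs of conditions. The function name `bs_support_points` and the 1-based `(cell line, treatment)` encoding are assumptions made for this sketch; the order in which the points are listed is arbitrary.

```python
from itertools import permutations
from math import comb


def bs_support_points(K, L):
    """Ordered (green, red) condition pairs of the BS design.

    Each pair compares two different treatments within the same cell line;
    both dye assignments are listed, so every comparison is dye-swapped.
    """
    return [((k, l1), (k, l2))
            for k in range(1, K + 1)
            for l1, l2 in permutations(range(1, L + 1), 2)]


K, L = 2, 3
points = bs_support_points(K, L)
assert len(points) == 2 * K * comb(L, 2)    # S = 2K * binom(L, 2)
weight = 1 / len(points)                    # equal weight 1/S per support point
print(len(points))  # 12
```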

SLIDE 7

Example

L = 3 treatments (1, 2, 3), K = 2 cell lines (a, b) → 12 support points (arrays)

Arrays (conditions compared at each support point):
SP 1: a1 a2     SP 2: a1 a3     SP 3: a2 a3
SP 4: a2 a1     SP 5: a3 a1     SP 6: a3 a2
SP 7: b1 b2     SP 8: b1 b3     SP 9: b2 b3
SP 10: b2 b1    SP 11: b3 b1    SP 12: b3 b2

design matrix:

SP   weight    g    r   a1   a2   a3   b1   b2   b3
 1    1/12     1   -1    1   -1    0    0    0    0
 2    1/12     1   -1    1    0   -1    0    0    0
 3    1/12     1   -1    0    1   -1    0    0    0
 4    1/12     1   -1   -1    1    0    0    0    0
 5    1/12     1   -1   -1    0    1    0    0    0
 6    1/12     1   -1    0   -1    1    0    0    0
 7    1/12     1   -1    0    0    0    1   -1    0
 8    1/12     1   -1    0    0    0    1    0   -1
 9    1/12     1   -1    0    0    0    0    1   -1
10    1/12     1   -1    0    0    0   -1    1    0
11    1/12     1   -1    0    0    0   -1    0    1
12    1/12     1   -1    0    0    0    0   -1    1

SP 1-3 and SP 7-9 carry $\tilde{A}_3$; SP 4-6 and SP 10-12 carry $-\tilde{A}_3$.

SLIDE 8

Notations (1)

$m := 2\binom{KL}{2}$: number of discrete design points $x_1,\dots,x_m$
$\Xi = \{x_1,\dots,x_m\}$: (discrete) design space
Ω: class of discrete designs

\[ \xi = \Big\{ x_1,\dots,x_m;\ p_1,\dots,p_m;\ 0 \le p_i \le 1\ \ \forall\, i = 1,\dots,m;\ \sum_{i=1}^{m} p_i = 1 \Big\} \]

moment matrix: $M := \sum_{i=1}^{m} p_i\, x_i x_i^T$
(corresponding) generalized inverse: $G := M^-$
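The size of the design space can be checked by enumerating all ordered pairs of distinct conditions, one pair per possible array. The helper name `design_space` and the 1-based `(cell line, treatment)` tuples are assumptions for this sketch.

```python
from itertools import permutations
from math import comb


def design_space(K, L):
    """All ordered pairs (green, red) of distinct conditions (k, l)."""
    conditions = [(k, l) for k in range(1, K + 1) for l in range(1, L + 1)]
    return list(permutations(conditions, 2))


K, L = 2, 3
Xi = design_space(K, L)
assert len(Xi) == 2 * comb(K * L, 2)   # m = 2 * binom(KL, 2)
print(len(Xi))  # 30
```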

SLIDE 9

Moment matrix – BS Design

\[
M = \begin{bmatrix}
1 & -1 & 0_{KL}^T \\
-1 & 1 & 0_{KL}^T \\
0_{KL} & 0_{KL} & \dfrac{L}{K\binom{L}{2}}\,\big(I_K \otimes J_L\big)
\end{bmatrix}
\]

($M$ is of size $(KL+2) \times (KL+2)$)
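A sketch that computes the moment matrix of the equally weighted BS design directly from its support points and compares it with a block form: dye block [[1, −1], [−1, 1]], zero off-diagonal blocks, and L / (K·binom(L,2)) · (I_K ⊗ J_L) for the combination effects. The constant is derived here from the definition of M and is an assumption about how the slide writes it; exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations
from math import comb


def bs_moment_matrix(K, L):
    """M = sum_i p_i x_i x_i^T for the equally weighted BS design."""
    n = K * L + 2
    rows = []
    for k in range(K):
        for l1, l2 in permutations(range(L), 2):
            x = [1, -1] + [0] * (K * L)
            x[2 + k * L + l1] = 1     # green-labelled condition
            x[2 + k * L + l2] = -1    # red-labelled condition
            rows.append(x)
    p = Fraction(1, len(rows))        # equal weights 1/S
    return [[p * sum(x[a] * x[b] for x in rows) for b in range(n)]
            for a in range(n)]


def closed_form(K, L):
    """Block matrix with dye block [[1,-1],[-1,1]] and c * (I_K (x) J_L)."""
    n = K * L + 2
    c = Fraction(L, K * comb(L, 2))
    M = [[Fraction(0)] * n for _ in range(n)]
    M[0][0] = M[1][1] = Fraction(1)
    M[0][1] = M[1][0] = Fraction(-1)
    for k in range(K):
        for a in range(L):
            for b in range(L):
                # entry (a, b) of J_L = I_L - (1/L) 1 1^T
                M[2 + k * L + a][2 + k * L + b] = c * (int(a == b) - Fraction(1, L))
    return M


for K, L in [(2, 3), (3, 4), (1, 5)]:
    assert bs_moment_matrix(K, L) == closed_form(K, L)
```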

SLIDE 10

Generalized inverse – BS Design

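\[
G = \begin{bmatrix}
G_\delta & 0 \\
0 & \dfrac{K\binom{L}{2}}{L}\,\big(I_K \otimes J_L\big)
\end{bmatrix}
\]

where $G_\delta$ is a generalized inverse of the $2 \times 2$ dye block $\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$ of $M$, and the off-diagonal blocks are zero.

($G$ is of size $(KL+2) \times (KL+2)$)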
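A sketch that checks the defining property M G M = M of a generalized inverse for one concrete block choice. It assumes that M has dye block [[1, −1], [−1, 1]] and combination block c·(I_K ⊗ J_L) with c = L / (K·binom(L,2)), and that G uses (1/c)·(I_K ⊗ J_L) together with [[1, 0], [0, 0]] for the dye block. The dye-block choice is an assumption made here; any generalized inverse of that 2 × 2 block would do.

```python
from fractions import Fraction
from math import comb


def block_matrix(K, L, dye_block, scale):
    """[[dye_block, 0], [0, scale * (I_K (x) J_L)]] as exact fractions."""
    n = K * L + 2
    A = [[Fraction(0)] * n for _ in range(n)]
    for a in range(2):
        for b in range(2):
            A[a][b] = Fraction(dye_block[a][b])
    for k in range(K):
        for a in range(L):
            for b in range(L):
                A[2 + k * L + a][2 + k * L + b] = scale * (int(a == b) - Fraction(1, L))
    return A


def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


K, L = 2, 3
c = Fraction(L, K * comb(L, 2))
M = block_matrix(K, L, [[1, -1], [-1, 1]], c)
# Assumed choice: [[1, 0], [0, 0]] is one generalized inverse of the dye block.
G = block_matrix(K, L, [[1, 0], [0, 0]], 1 / c)
assert matmul(matmul(M, G), M) == M   # defining property M G M = M
```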

SLIDE 11

Notations (2)

ξ*: candidate design under which $C^T\theta$ is estimable (here: BS Design)
proof of optimality: Equivalence Theorem for Matrix Means (Pukelsheim, 1993, p. 180)
extension for singular case (Pukelsheim, 1993, p. 205)
p = -1: A-optimality
p = 0: D-optimality

SLIDE 12

D-optimality

Theorem 1 (Generalized Equivalence Theorem for D-optimality):

The design ξ* ∈ Ω is D-optimal for estimating the linear contrast $C^T\theta$ if and only if there exists a generalized inverse $G = M^-$ of the moment matrix $M$ of ξ* that satisfies the normality inequality

\[ x^T G C\, (C^T G C)^+\, C^T G^T x \;\le\; \operatorname{tr}\!\big[ (C^T G C)\,(C^T G C)^+ \big] \tag{N1} \]

for all discrete design points x ∈ Ξ, with equality for all support points of ξ*.

(Pukelsheim, 1993, p. 180/205)

SLIDE 13

A-optimality

Theorem 2 (Generalized Equivalence Theorem for A-optimality):

The design ξ* ∈ Ω is A-optimal for estimating the linear contrast $C^T\theta$ if and only if there exists a generalized inverse $G = M^-$ of the moment matrix $M$ of ξ* that satisfies the normality inequality

\[ x^T G C\, (C^T G C)^+ (C^T G C)^2 (C^T G C)^+\, C^T G^T x \;\le\; \operatorname{tr}\!\big[ (C^T G C)^+ (C^T G C)^2 \big] \tag{N2} \]

for all discrete design points x ∈ Ξ, with equality for all support points of ξ*.

(Pukelsheim, 1993, p. 180/205)

SLIDE 14

Results

Theorem 3 (D-optimality of the BS design):

The Within-B-Swap (BS) design is D-optimal for estimating the linear contrast $C^T\theta$ for the treatment effect in model (1).

Theorem 4 (A-optimality of the BS design):

The Within-B-Swap (BS) design is A-optimal for estimating the linear contrast $C^T\theta$ for the treatment effect in model (1).

Proof:
shown: D-optimality (sketch !!)
similar: A-optimality

SLIDE 15

Proof – D-optimality (1)

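linear contrast: $C^T\theta$, with contrast matrix

\[ C^T = \left[\, 0_{\binom{L}{2}} \;\middle|\; 0_{\binom{L}{2}} \;\middle|\; 1_K^T \otimes \tilde{A}_L \,\right] \]

design points:

\[ x = [\, 1, -1, x_{11}, \dots, x_{KL} \,]^T \]

with

\[
x_{kl} =
\begin{cases}
+1, & \text{cell line } k,\ \text{treatment } l \leftrightarrow \text{(g)} \\
-1, & \text{cell line } k,\ \text{treatment } l \leftrightarrow \text{(r)} \\
\phantom{+}0, & \text{else}
\end{cases}
\qquad (k = 1,\dots,K;\ l = 1,\dots,L)
\]

and

\[ \bar{x}_{k\cdot} = \frac{1}{L} \sum_{l=1}^{L} x_{kl} \]

With some algebra, we obtain from C, M and G (as before):

\[ C^T G = \frac{K}{L}\binom{L}{2} \cdot \left[\, 0_{\binom{L}{2}} \;\middle|\; 0_{\binom{L}{2}} \;\middle|\; 1_K^T \otimes \tilde{A}_L \,\right], \]

\[ C^T G C = \frac{K^2}{L}\binom{L}{2} \cdot \tilde{A}_L \tilde{A}_L^T, \]

\[ (C^T G C)^+ = \frac{1}{K^2 L \binom{L}{2}} \cdot \tilde{A}_L \tilde{A}_L^T \]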
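The closed forms for CᵀGC and its Moore-Penrose inverse can be checked with exact arithmetic. This sketch assumes that the combination-effect block of G is (K·binom(L,2)/L)·(I_K ⊗ J_L) and that the off-diagonal blocks of G are zero, so only that block enters CᵀGC; all helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def scale(s, A):
    return [[s * v for v in row] for row in A]


K, L = 2, 3
q = comb(L, 2)

# A~_L: all pairwise treatment comparisons (+1 / -1 for each pair i < j)
A = []
for i, j in combinations(range(L), 2):
    row = [Fraction(0)] * L
    row[i], row[j] = Fraction(1), Fraction(-1)
    A.append(row)

# Treatment part of C^T: 1_K^T (x) A~_L (the two dye columns of C^T are zero)
Ct = [row * K for row in A]
J = [[int(a == b) - Fraction(1, L) for b in range(L)] for a in range(L)]
# Assumed combination block of G: (K * binom(L,2) / L) * (I_K (x) J_L)
G22 = [[Fraction(K * q, L) * J[a % L][b % L] * (a // L == b // L)
        for b in range(K * L)] for a in range(K * L)]

AAt = matmul(A, transpose(A))
B = matmul(matmul(Ct, G22), transpose(Ct))          # C^T G C
B_plus = scale(Fraction(1, K * K * q * L), AAt)     # claimed (C^T G C)^+

assert B == scale(Fraction(K * K * q, L), AAt)
# Penrose conditions; B and B_plus are symmetric, so B B+ symmetric implies B+ B symmetric
assert matmul(matmul(B, B_plus), B) == B
assert matmul(matmul(B_plus, B), B_plus) == B_plus
assert matmul(B, B_plus) == transpose(matmul(B, B_plus))
```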

SLIDE 16

Proof – D-optimality (2)

Using the properties of the matrices $\tilde{A}_L$, it can be shown that

\[ x^T G C\, \tilde{A}_L \tilde{A}_L^T\, C^T G^T x = K^2 \binom{L}{2}^{2} \cdot \sum_{k=1}^{K} \sum_{l=1}^{L} x_{kl} \Big[ \sum_{q=1}^{K} \big( x_{ql} - \bar{x}_{q\cdot} \big) \Big], \quad \text{where } \bar{x}_{q\cdot} = \frac{1}{L} \sum_{l=1}^{L} x_{ql}. \]

In summary, the normality inequality (N1) of general equivalence theorem 1 can be transformed into the equivalent form:

\[ \sum_{k=1}^{K} \sum_{l=1}^{L} x_{kl} \Big[ \sum_{q=1}^{K} \big( x_{ql} - \bar{x}_{q\cdot} \big) \Big] \;\le\; 2 \tag{$\Delta$} \]

To prove (∆), 3 cases have to be distinguished:
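The step from (N1) to (∆) can be written out as follows. This is a sketch under stated assumptions: it takes the closed forms for $C^T G$ and $(C^T G C)^+$ given with the contrast matrix, assumes $G$ is symmetric with zero off-diagonal blocks, and introduces the shorthand $s_l = \sum_k x_{kl}$ and $\bar{s} = \frac{1}{L}\sum_l s_l$, which does not appear on the slides.

```latex
\begin{align*}
C^T G^T x &= \tfrac{K}{L}\tbinom{L}{2}\, \tilde{A}_L s,
  \qquad s = (s_1,\dots,s_L)^T,\quad s_l = \sum_{k=1}^{K} x_{kl} \\[4pt]
x^T G C\,(C^T G C)^+\,C^T G^T x
  &= \Big(\tfrac{K}{L}\tbinom{L}{2}\Big)^{2} \cdot \frac{1}{K^2 L \binom{L}{2}}
     \; s^T \tilde{A}_L^T \tilde{A}_L \tilde{A}_L^T \tilde{A}_L\, s \\
  &= \frac{\binom{L}{2}}{L^3} \cdot L^2\, s^T J_L s
   \;=\; \frac{L-1}{2}\; s^T J_L s
   \qquad (\tilde{A}_L^T \tilde{A}_L = L J_L,\ J_L^2 = J_L) \\[4pt]
\operatorname{tr}\!\big[(C^T G C)(C^T G C)^+\big]
  &= \operatorname{tr}\!\big[\tfrac{1}{L}\tilde{A}_L \tilde{A}_L^T\big]
   = \operatorname{tr}(J_L) = L - 1 \\[4pt]
\text{(N1)} \iff s^T J_L s &\le 2,
  \qquad
  s^T J_L s = \sum_{l=1}^{L} s_l\,(s_l - \bar{s})
  = \sum_{k=1}^{K}\sum_{l=1}^{L} x_{kl}\Big[\sum_{q=1}^{K}\big(x_{ql} - \bar{x}_{q\cdot}\big)\Big]
\end{align*}
```

The last equality uses $\bar{s} = \sum_q \bar{x}_{q\cdot}$, so the bracket in (∆) is exactly $s_l - \bar{s}$.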

SLIDE 17

Proof – D-optimality (3)

Case 1: support points of BS (comparison of treatments j, j' in same cell line i)

\[ \sum_{k=1}^{K} \sum_{l=1}^{L} x_{kl} \Big[ \sum_{q=1}^{K} \big( x_{ql} - \bar{x}_{q\cdot} \big) \Big] = 1 \cdot 1 + (-1) \cdot (-1) = 2 \]

Case 2: design points specifying comparisons of treatments j, j' in two different cell lines i, i'

\[ \sum_{k=1}^{K} \sum_{l=1}^{L} x_{kl} \Big[ \sum_{q=1}^{K} \big( x_{ql} - \bar{x}_{q\cdot} \big) \Big] = 1 \cdot \Big[ \big(1 - \tfrac{1}{L}\big) + \big(0 + \tfrac{1}{L}\big) \Big] + (-1) \cdot \Big[ \big(0 - \tfrac{1}{L}\big) + \big(-1 + \tfrac{1}{L}\big) \Big] = 2 \]

Case 3: design points specifying comparisons of cell lines i, i' for the same treatment j

\[ \sum_{k=1}^{K} \sum_{l=1}^{L} x_{kl} \Big[ \sum_{q=1}^{K} \big( x_{ql} - \bar{x}_{q\cdot} \big) \Big] = 1 \cdot \Big[ \big(1 - \tfrac{1}{L}\big) + \big(-1 + \tfrac{1}{L}\big) \Big] + (-1) \cdot \Big[ \big(1 - \tfrac{1}{L}\big) + \big(-1 + \tfrac{1}{L}\big) \Big] = 0 \]

■
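The three cases can also be checked by brute force. The sketch below evaluates the left-hand side of (∆) for every design point of a small example (K = 3, L = 4, chosen arbitrarily) and groups the values by case; `delta_lhs` and `case` are illustrative names, and conditions are encoded as 0-based `(cell line, treatment)` pairs.

```python
from fractions import Fraction
from itertools import permutations


def delta_lhs(green, red, K, L):
    """Left-hand side of (Delta) for the design point with +1 at `green`
    and -1 at `red`; conditions are 0-based (cell line k, treatment l)."""
    x = [[0] * L for _ in range(K)]
    x[green[0]][green[1]] = 1
    x[red[0]][red[1]] = -1
    row_mean = [Fraction(sum(x[q]), L) for q in range(K)]   # x-bar_{q.}
    total = Fraction(0)
    for k in range(K):
        for l in range(L):
            inner = sum(x[q][l] - row_mean[q] for q in range(K))
            total += x[k][l] * inner
    return total


def case(green, red):
    if green[0] == red[0]:
        return 1    # same cell line, different treatments (BS support point)
    if green[1] != red[1]:
        return 2    # different cell lines, different treatments
    return 3        # different cell lines, same treatment


K, L = 3, 4
conditions = [(k, l) for k in range(K) for l in range(L)]
values = {1: set(), 2: set(), 3: set()}
for g, r in permutations(conditions, 2):
    values[case(g, r)].add(delta_lhs(g, r, K, L))

assert all(v <= 2 for vs in values.values() for v in vs)   # (Delta) holds on all of Xi
assert values[1] == {2}                                    # equality on BS support points
print({c: sorted(vs) for c, vs in values.items()})
```

Each case yields a single value, independent of which cell lines and treatments are involved, which is why the argument does not depend on N, K or L.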

SLIDE 18

Discussion

The Within-B-Swap (BS) Design is A- and D-optimal for estimating the linear contrast for the treatment effect in the Landgrebe model.

generalized equivalence theorems for matrix means
solution of optimization problem: independent of N, K, L !!!
robustness of BS Design
Further results:
φp-optimality for integer p ∈ (-∞;1]
E-optimality
interaction effect (solution conditioned on relation between K and L !!)
combination of treatment effect and interaction effect

SLIDE 19

Further research

φp-optimality for non-integer p
optimal design for estimating cell line effect
other contrast sets: Helmert contrasts / polynomial contrasts (→ poster Hilgers)
other multi-factorial experimental setups (4-factorial, …)
other models:
"overall" models (gene effect, correlation structures between genes??)
comparison of experimental conditions with fixed reference condition [including "Common Reference Design"]

SLIDE 20

References

Harville, D.A. (1997): Matrix algebra from a statistician's perspective. Springer, New York.
Landgrebe, J., Bretz, F., Brunner, E. (2006): Efficient design and analysis of two colour factorial microarray experiments. Computational Statistics and Data Analysis 50: 499-517.
Pukelsheim, F. (1993): Optimal Design of Experiments. Wiley, New York.
Searle, S. R. (1971): Linear Models. Wiley, New York.
Searle, S. R. (1982): Matrix Algebra useful for Statistics. Wiley, New York.
Speed, T. (2003): Statistical analysis of gene expression microarray data. Chapman and Hall, Boca Raton.
Stanzel, S., Hilgers, R.D. (2007): The Within-B-Swap (BS) Design is A- and D-optimal for estimating the Linear Contrast for the Treatment Effect in 3-Factorial cDNA Microarray Experiments. Proceedings of the 8th international workshop on Model-Oriented Design and Analysis. Almagro, Spain, June 4-8, 2007.