

SLIDE 1

Factor analysis (cf. section 9.3)

An observable random vector X of dimension p has mean vector μ and covariance matrix Σ. The (orthogonal) factor model postulates that X depends linearly on unobservable random variables (latent variables) F1, F2, ..., Fm, called common factors, and on p additional (unobservable) sources of variation ε1, ε2, ..., εp, called errors or specific factors.


Model formulation:

$$X_1 - \mu_1 = l_{11}F_1 + l_{12}F_2 + \cdots + l_{1m}F_m + \varepsilon_1$$

$$\vdots$$

$$X_p - \mu_p = l_{p1}F_1 + l_{p2}F_2 + \cdots + l_{pm}F_m + \varepsilon_p$$

The coefficient $l_{ij}$ is called the loading of the i-th variable on the j-th factor.

In matrix notation the factor model takes the form:

$$\mathbf{X} - \boldsymbol{\mu} = \mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon}$$

Here $\mathbf{L} = \{l_{ij}\}$ is the $p \times m$ matrix of factor loadings, $\mathbf{F} = [F_1, F_2, \ldots, F_m]'$ is the $m$-dimensional vector of common factors, and $\boldsymbol{\varepsilon} = [\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p]'$ is the $p$-vector of errors.

Assumptions:

$$E(\mathbf{F}) = \mathbf{0}, \qquad \mathrm{Cov}(\mathbf{F}) = E(\mathbf{F}\mathbf{F}') = \mathbf{I}$$

$$E(\boldsymbol{\varepsilon}) = \mathbf{0}, \qquad \mathrm{Cov}(\boldsymbol{\varepsilon}) = E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}') = \boldsymbol{\Psi} = \mathrm{diag}\{\psi_1, \psi_2, \ldots, \psi_p\}$$

$$\mathrm{Cov}(\boldsymbol{\varepsilon}, \mathbf{F}) = \mathbf{0}$$

The model implies that

$$\boldsymbol{\Sigma} = \mathrm{Cov}(\mathbf{X}) = E\{(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})'\} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$$

We may also write

$$\mathrm{Cov}(\mathbf{X}, \mathbf{F}) = E\{(\mathbf{X} - \boldsymbol{\mu})\mathbf{F}'\} = \mathbf{L}$$
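The two identities above are easy to check by simulation: generate F and ε under the stated assumptions and compare sample moments with LL′ + Ψ and L. A minimal sketch, assuming NumPy; the loadings and specific variances are hypothetical values, not taken from the text:

```python
import numpy as np

# Hypothetical loadings (p = 3 variables, m = 2 factors) and specific variances
L = np.array([[0.9, 0.1],
              [0.7, 0.5],
              [0.2, 0.8]])
Psi = np.diag([0.18, 0.26, 0.32])

rng = np.random.default_rng(0)
n = 200_000
F = rng.standard_normal((n, 2))                            # Cov(F) = I, E(F) = 0
eps = rng.standard_normal((n, 3)) * np.sqrt(np.diag(Psi))  # Cov(eps) = Psi, diagonal
X = F @ L.T + eps                                          # X - mu = L F + eps (take mu = 0)

Sigma_model = L @ L.T + Psi
Sigma_sample = np.cov(X, rowvar=False)      # should be close to L L' + Psi
CovXF_sample = X.T @ F / n                  # should be close to L
```

Up to Monte Carlo error, `Sigma_sample` matches `Sigma_model` and `CovXF_sample` matches `L`.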

This gives

$$\mathrm{Var}(X_i) = \sigma_{ii} = \sum_{j=1}^{m} l_{ij}^2 + \psi_i \overset{\text{def}}{=} h_i^2 + \psi_i$$

where $h_i^2 = \sum_{j=1}^{m} l_{ij}^2$ is called the communality and $\psi_i$ the specific variance. Further

$$\mathrm{Cov}(X_i, X_k) = \sum_{j=1}^{m} l_{ij} l_{kj} \qquad \mathrm{Cov}(X_i, F_j) = l_{ij}$$

Let T be an m × m orthogonal matrix. The factor model may then be reformulated as

$$\mathbf{X} - \boldsymbol{\mu} = \mathbf{L}^{*}\mathbf{F}^{*} + \boldsymbol{\varepsilon}$$

where $\mathbf{L}^{*} = \mathbf{L}\mathbf{T}$ and $\mathbf{F}^{*} = \mathbf{T}'\mathbf{F}$. It is impossible on the basis of observations to distinguish the loadings L from the loadings L*. Thus the factor loadings L are determined only up to an orthogonal matrix T.
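This rotation indeterminacy can be verified numerically: for any orthogonal T, the rotated loadings L* = LT reproduce exactly the same LL′, and hence the same Σ. A short sketch with a hypothetical loading matrix:

```python
import numpy as np

# Hypothetical 3x2 loading matrix and a 2x2 rotation (orthogonal) matrix T
L = np.array([[0.9, 0.1],
              [0.7, 0.5],
              [0.2, 0.8]])
theta = 0.4
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

L_star = L @ T
# T is orthogonal (T T' = I), so L* L*' = L T T' L' = L L':
orthogonal = np.allclose(T @ T.T, np.eye(2))
same_cov = np.allclose(L_star @ L_star.T, L @ L.T)
```

Both checks hold, which is why the model covariance LL′ + Ψ cannot identify L beyond an orthogonal rotation.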

SLIDE 2

There are different methods for estimating the factor model. We will consider:

  • estimation using principal components
  • maximum likelihood estimation


For both methods the solution may be rotated by multiplication by an orthogonal matrix to simplify the interpretation of the factors (to be described later).

Assume that Σ has eigenvalue-eigenvector pairs

$$(\lambda_1, \mathbf{e}_1), (\lambda_2, \mathbf{e}_2), \ldots, (\lambda_p, \mathbf{e}_p)$$

where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$. By the spectral decomposition we may write

$$\boldsymbol{\Sigma} = \lambda_1 \mathbf{e}_1 \mathbf{e}_1' + \lambda_2 \mathbf{e}_2 \mathbf{e}_2' + \cdots + \lambda_p \mathbf{e}_p \mathbf{e}_p'$$
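The spectral decomposition can be checked numerically: an eigendecomposition of a symmetric matrix rebuilds it exactly as the sum of rank-one terms. A sketch with a hypothetical matrix standing in for Σ, assuming NumPy:

```python
import numpy as np

# Hypothetical symmetric positive definite matrix standing in for Sigma
Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.5, 0.4],
                  [0.3, 0.4, 1.0]])

lam, E = np.linalg.eigh(Sigma)        # eigh: ascending eigenvalues, orthonormal columns
lam, E = lam[::-1], E[:, ::-1]        # reorder so lambda_1 >= ... >= lambda_p

# Rebuild Sigma as the sum of lambda_i * e_i e_i':
Sigma_rebuilt = sum(lam[i] * np.outer(E[:, i], E[:, i]) for i in range(3))
```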

We first consider estimation using principal components

This gives

$$\boldsymbol{\Sigma} = \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1 & \sqrt{\lambda_2}\,\mathbf{e}_2 & \cdots & \sqrt{\lambda_p}\,\mathbf{e}_p \end{bmatrix} \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1' \\ \sqrt{\lambda_2}\,\mathbf{e}_2' \\ \vdots \\ \sqrt{\lambda_p}\,\mathbf{e}_p' \end{bmatrix}$$

This is of the form $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$ with $m = p$ and $\boldsymbol{\Psi} = \mathbf{0}$.

The representation on the previous slide is not particularly useful since we seek a representation with just a few common factors. If the last p – m eigenvalues are small, we may neglect

$$\lambda_{m+1} \mathbf{e}_{m+1} \mathbf{e}_{m+1}' + \cdots + \lambda_p \mathbf{e}_p \mathbf{e}_p'$$

in the representation of Σ.

This gives:

$$\boldsymbol{\Sigma} \approx \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1 & \cdots & \sqrt{\lambda_m}\,\mathbf{e}_m \end{bmatrix} \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1' \\ \vdots \\ \sqrt{\lambda_m}\,\mathbf{e}_m' \end{bmatrix}$$

Allowing for specific factors (errors) we obtain

$$\boldsymbol{\Sigma} \approx \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi} = \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1 & \cdots & \sqrt{\lambda_m}\,\mathbf{e}_m \end{bmatrix} \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1' \\ \vdots \\ \sqrt{\lambda_m}\,\mathbf{e}_m' \end{bmatrix} + \boldsymbol{\Psi}$$

where Ψ is given by

$$\boldsymbol{\Psi} = \mathrm{diag}\{\psi_1, \psi_2, \ldots, \psi_p\} \qquad \text{with} \qquad \psi_i = \sigma_{ii} - \sum_{j=1}^{m} l_{ij}^2$$
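The principal component solution above amounts to a few lines of linear algebra: take the m leading eigenpairs of S (or R), scale the eigenvectors by the square-root eigenvalues, and fill Ψ from the diagonal. A sketch, assuming NumPy; the correlation matrix is a hypothetical example:

```python
import numpy as np

def pc_factor_estimates(S, m):
    """Principal component estimates of the factor model S ~ L L' + Psi.

    S : p x p sample covariance (or correlation) matrix
    m : number of common factors
    Returns (L_tilde, Psi_tilde), with L_tilde p x m and Psi_tilde diagonal.
    """
    lam, E = np.linalg.eigh(S)
    lam, E = lam[::-1], E[:, ::-1]                    # descending eigenvalues
    L = E[:, :m] * np.sqrt(lam[:m])                   # columns sqrt(lambda_j) e_j
    Psi = np.diag(np.diag(S) - np.sum(L**2, axis=1))  # psi_i = s_ii - sum_j l_ij^2
    return L, Psi

# Hypothetical correlation matrix for illustration:
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
L_t, Psi_t = pc_factor_estimates(R, m=1)
```

By construction, the diagonal of L̃L̃′ + Ψ̃ reproduces the diagonal of R exactly; only the off-diagonal elements are approximated.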

SLIDE 3


The total sample variance is

$$\mathrm{tr}(\mathbf{S}) = s_{11} + s_{22} + \cdots + s_{pp}$$

The contribution to this from the j-th factor is

$$\tilde{l}_{1j}^2 + \tilde{l}_{2j}^2 + \cdots + \tilde{l}_{pj}^2 = (\sqrt{\hat{\lambda}_j}\,\hat{\mathbf{e}}_j)'(\sqrt{\hat{\lambda}_j}\,\hat{\mathbf{e}}_j) = \hat{\lambda}_j$$

Thus the proportion of the total sample variance due to the j-th factor is

$$\frac{\hat{\lambda}_j}{s_{11} + s_{22} + \cdots + s_{pp}}$$

(i.e. as for principal components).

How do we determine the number of factors (if not given a priori)? We want the m factors to explain a fairly large proportion of the total sample variance, so we may choose m so that this is achieved (subjectively). For factor analysis of the correlation matrix R one may let m be the number of eigenvalues larger than 1.

One would also like the residual matrix

$$\mathbf{S} - (\tilde{\mathbf{L}}\tilde{\mathbf{L}}' + \tilde{\boldsymbol{\Psi}})$$

to have small off-diagonal elements (the diagonal elements are zero).

Example 9.3 (contd): In a consumer preference study a random sample of customers were asked to rate several attributes of a new product using a 7-point scale. [Sample correlation matrix shown on the slide.]

SLIDE 4

The first two eigenvalues are the only ones larger than 1, and a model with two common factors accounts for 93% of the total (standardized) sample variance.
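The "eigenvalues larger than 1" rule and the explained-variance proportion are straightforward to compute. A sketch with a hypothetical 5×5 correlation matrix (not the one from Example 9.3, which is only shown on the slide), assuming NumPy:

```python
import numpy as np

# Hypothetical correlation matrix with two blocks of correlated variables
R = np.array([[1.00, 0.70, 0.15, 0.10, 0.12],
              [0.70, 1.00, 0.12, 0.14, 0.10],
              [0.15, 0.12, 1.00, 0.65, 0.60],
              [0.10, 0.14, 0.65, 1.00, 0.55],
              [0.12, 0.10, 0.60, 0.55, 1.00]])

lam = np.sort(np.linalg.eigvalsh(R))[::-1]   # descending eigenvalues
m = int(np.sum(lam > 1))                     # rule: count eigenvalues larger than 1
explained = lam[:m].sum() / lam.sum()        # proportion of total (standardized) variance
```

For a correlation matrix the eigenvalues sum to p, so `explained` is the fraction of the total standardized variance carried by the m retained factors.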


Example 9.4: Weekly rates of return for five stocks on the New York Stock Exchange, Jan 2004 through Dec 2005.


We then consider estimation by maximum likelihood. We now assume that the common factors F and the specific factors ε are multivariate normal. Then X is multivariate normal with mean vector μ and covariance matrix of the form

$$\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$$

Likelihood (based on n observations x1, x2, ..., xn):

$$L(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{np/2} |\boldsymbol{\Sigma}|^{n/2}} \exp\left\{ -\frac{1}{2} \sum_{j=1}^{n} (\mathbf{x}_j - \boldsymbol{\mu})' \boldsymbol{\Sigma}^{-1} (\mathbf{x}_j - \boldsymbol{\mu}) \right\}$$

The likelihood may be maximized numerically (under the condition that $\mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L}$ is a diagonal matrix). This gives the maximum likelihood estimates $\hat{\mathbf{L}}, \hat{\boldsymbol{\Psi}}$ (and $\hat{\boldsymbol{\mu}} = \bar{\mathbf{x}}$).
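The numerical maximization can be sketched with a general-purpose optimizer: with μ̂ = x̄, maximizing the likelihood is equivalent to minimizing the discrepancy log|Σ| + tr(Σ⁻¹S) over Σ = LL′ + Ψ. This is only a sketch under simplifying assumptions (SciPy's generic L-BFGS-B, the uniqueness condition L′Ψ⁻¹L diagonal not imposed, a made-up one-factor target, and a starting value near the truth):

```python
import numpy as np
from scipy.optimize import minimize

def ml_factor_estimates(S, m, L0, psi0):
    """Sketch of numerical ML estimation for S ~ L L' + Psi.

    Minimizes log|Sigma| + tr(Sigma^{-1} S), which is equivalent to
    maximizing the normal likelihood with mu-hat = x-bar. The uniqueness
    condition (L' Psi^{-1} L diagonal) is NOT imposed here, so L is only
    determined up to an orthogonal rotation.
    """
    p = S.shape[0]

    def discrepancy(theta):
        L = theta[:p * m].reshape(p, m)
        Psi = np.diag(np.exp(theta[p * m:]))   # log-parameterization keeps psi_i > 0
        Sigma = L @ L.T + Psi
        _, logdet = np.linalg.slogdet(Sigma)
        return logdet + np.trace(np.linalg.solve(Sigma, S))

    theta0 = np.concatenate([L0.ravel(), np.log(psi0)])
    res = minimize(discrepancy, theta0, method="L-BFGS-B")
    L = res.x[:p * m].reshape(p, m)
    Psi = np.diag(np.exp(res.x[p * m:]))
    return L, Psi

# Hypothetical one-factor "population" covariance used as the target S:
L_true = np.array([[0.9], [0.8], [0.7], [0.6]])
Psi_true = np.diag([0.19, 0.36, 0.51, 0.64])
S = L_true @ L_true.T + Psi_true

# Start near the truth; in practice the principal component solution
# is a natural starting value.
L_hat, Psi_hat = ml_factor_estimates(S, m=1,
                                     L0=L_true + 0.1,
                                     psi0=np.diag(Psi_true) * 1.2)
```

Since S here has an exact one-factor structure, the fitted L̂L̂′ + Ψ̂ should reproduce S up to optimizer tolerance.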

SLIDE 5

The maximum likelihood estimates of the communalities are

$$\hat{h}_i^2 = \hat{l}_{i1}^2 + \hat{l}_{i2}^2 + \cdots + \hat{l}_{im}^2 \qquad (i = 1, \ldots, p)$$

The contribution of the j-th factor to the total sample variance is

$$\frac{\hat{l}_{1j}^2 + \hat{l}_{2j}^2 + \cdots + \hat{l}_{pj}^2}{s_{11} + s_{22} + \cdots + s_{pp}}$$

In general the correlation matrix

$$\boldsymbol{\rho} = \begin{bmatrix} 1 & \rho_{12} & \cdots & \rho_{1p} \\ \rho_{21} & 1 & \cdots & \rho_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{p1} & \rho_{p2} & \cdots & 1 \end{bmatrix}$$

may be written as

$$\boldsymbol{\rho} = \mathbf{V}^{-1/2} \boldsymbol{\Sigma} \mathbf{V}^{-1/2}$$

where

$$\mathbf{V}^{-1/2} = \begin{bmatrix} 1/\sqrt{\sigma_{11}} & & & \\ & 1/\sqrt{\sigma_{22}} & & \\ & & \ddots & \\ & & & 1/\sqrt{\sigma_{pp}} \end{bmatrix}$$

is the inverse of the standard deviation matrix $\mathbf{V}^{1/2}$.

When $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$ we have that

$$\boldsymbol{\rho} = \mathbf{V}^{-1/2} \boldsymbol{\Sigma} \mathbf{V}^{-1/2} = \mathbf{V}^{-1/2}(\mathbf{L}\mathbf{L}' + \boldsymbol{\Psi})\mathbf{V}^{-1/2} = (\mathbf{V}^{-1/2}\mathbf{L})(\mathbf{V}^{-1/2}\mathbf{L})' + \mathbf{V}^{-1/2}\boldsymbol{\Psi}\mathbf{V}^{-1/2} = \mathbf{L}_z \mathbf{L}_z' + \boldsymbol{\Psi}_z$$

where $\mathbf{L}_z = \mathbf{V}^{-1/2}\mathbf{L}$ and $\boldsymbol{\Psi}_z = \mathbf{V}^{-1/2}\boldsymbol{\Psi}\mathbf{V}^{-1/2}$ (diagonal). We obtain maximum likelihood estimates by inserting the maximum likelihood estimates for $\mathbf{L}$, $\boldsymbol{\Psi}$ and $\mathbf{V}^{-1/2}$.

Example 9.5: Weekly rates of return for five stocks on the New York Stock Exchange, Jan 2004 through Dec 2005.
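The rescaling from Σ to ρ preserves the factor structure, which can be verified directly: dividing by the standard deviations turns LL′ + Ψ into L_z L_z′ + Ψ_z with unit diagonal. A sketch with a hypothetical one-factor covariance, assuming NumPy:

```python
import numpy as np

# Hypothetical covariance built from a one-factor model Sigma = L L' + Psi
L = np.array([[1.8], [0.8], [0.5]])
Psi = np.diag([0.76, 0.36, 0.39])
Sigma = L @ L.T + Psi

# V^{-1/2}: diagonal matrix of reciprocal standard deviations
V_inv_sqrt = np.diag(1 / np.sqrt(np.diag(Sigma)))

rho = V_inv_sqrt @ Sigma @ V_inv_sqrt      # correlation matrix
Lz = V_inv_sqrt @ L                        # standardized loadings
Psi_z = V_inv_sqrt @ Psi @ V_inv_sqrt      # standardized specific variances (diagonal)
```

The diagonal of ρ is all ones, and ρ = L_z L_z′ + Ψ_z holds exactly.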

SLIDE 6

Under the assumption that X is multivariate normal with mean vector μ and covariance matrix Σ, we may test the hypothesis that a factor model with m factors holds:

$$H_0: \boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$$

We may use the likelihood ratio test. When we assume no structure on Σ, the maximum of the likelihood becomes

$$\max_{\boldsymbol{\mu}, \boldsymbol{\Sigma}} L(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{np/2} |\mathbf{S}_n|^{n/2}} e^{-np/2}$$

where $\mathbf{S}_n = (n-1)\mathbf{S}/n$. When $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$, one may show that the maximum of the likelihood becomes

$$\max L(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{np/2} |\hat{\boldsymbol{\Sigma}}|^{n/2}} e^{-np/2}$$

where $\hat{\boldsymbol{\Sigma}} = \hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}$. The likelihood ratio takes the form

$$\Lambda = \left( \frac{|\hat{\boldsymbol{\Sigma}}|}{|\mathbf{S}_n|} \right)^{-n/2}$$

For testing we may use that under H0

$$-2 \log \Lambda = n \log \frac{|\hat{\boldsymbol{\Sigma}}|}{|\mathbf{S}_n|}$$

is approximately chi-squared distributed with $p(p+1)/2 - [p(m+1) - m(m-1)/2]$ degrees of freedom.
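The test statistic and its degrees of freedom are simple to compute once Σ̂ has been fitted. A sketch, assuming NumPy/SciPy; the helper and the matrices passed to it are illustrative placeholders, not a fitted model:

```python
import numpy as np
from scipy.stats import chi2

def lr_factor_test(S, Sigma_hat, n, p, m):
    """Likelihood ratio test of H0: Sigma = L L' + Psi with m factors.

    S is the sample covariance, Sigma_hat = L-hat L-hat' + Psi-hat the
    fitted covariance. Returns the statistic n*log(|Sigma_hat|/|S_n|),
    its degrees of freedom, and the approximate chi-squared p-value.
    """
    Sn = (n - 1) * S / n
    stat = n * np.log(np.linalg.det(Sigma_hat) / np.linalg.det(Sn))
    df = p * (p + 1) // 2 - (p * (m + 1) - m * (m - 1) // 2)
    return stat, df, chi2.sf(stat, df)

# Illustrative call with placeholder matrices (Sigma_hat is not actually
# fitted here); for p = 5, m = 2 the df is 15 - 14 = 1:
stat, df, pval = lr_factor_test(np.eye(5), np.eye(5), n=100, p=5, m=2)
```

Note that the degrees of freedom can be zero or negative for large m, in which case the m-factor model is not testable this way.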