Deep Learning Basics Lecture 7: Factor Analysis
Princeton University COS 495 Instructor: Yingyu Liang
Supervised vs. unsupervised

Math formulation for supervised learning: given training data $\{(x_i, y_i) : 1 \le i \le n\}$, find a predictor $f$ with small loss.
Empirical loss: $\hat{L}(f) = \frac{1}{n} \sum_{i=1}^{n} l(f, x_i, y_i)$

Expected loss: $L(f) = \mathbb{E}_{(x,y) \sim D}[\, l(f, x, y) \,]$
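To make the two losses concrete, here is a minimal sketch (the predictor, the squared loss, and the data distribution below are illustrative assumptions, not from the slides): the empirical loss is an average over the $n$ training points, and it concentrates around the expected loss as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed predictor f and squared loss l(f, x, y).
def f(x):
    return 2.0 * x

def loss(f, x, y):
    return (f(x) - y) ** 2

# Training data drawn i.i.d. from an assumed distribution D: y = 2x + noise.
n = 1000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(scale=0.1, size=n)

# Empirical loss: average of l(f, x_i, y_i) over the n training points.
empirical_loss = np.mean([loss(f, xi, yi) for xi, yi in zip(x, y)])

# For this D the expected loss E_{(x,y)~D}[l(f,x,y)] equals the noise
# variance 0.01, and the empirical loss is close to it for large n.
print(empirical_loss)
```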
Dimension: 300 × 300 = 90,000
         Movie 1  Movie 2  Movie 3  Movie 4  Movie 5  Movie 6  …
User 1      5        ?        ?        1        3        ?
User 2      ?        ?        3        1        2        5
User 3      4        3        1        ?        5        1
…
Example from Nina Balcan
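The missing ratings can be predicted by assuming the matrix is (approximately) low rank. A minimal sketch, fitting a rank-2 factorization $R \approx U V^T$ to the observed entries by gradient descent (the rank, learning rate, and iteration count are illustrative choices, not from the slides):

```python
import numpy as np

# Ratings matrix from the slide; np.nan marks unobserved ("?") entries.
R = np.array([
    [5,      np.nan, np.nan, 1,      3,      np.nan],
    [np.nan, np.nan, 3,      1,      2,      5     ],
    [4,      3,      1,      np.nan, 5,      1     ],
])
observed = ~np.isnan(R)

# Hypothetical low-rank model R ~ U @ V.T with rank k = 2,
# fit by gradient descent on the observed entries only.
rng = np.random.default_rng(0)
k = 2
U = 0.1 * rng.normal(size=(R.shape[0], k))
V = 0.1 * rng.normal(size=(R.shape[1], k))

for _ in range(5000):
    E = np.where(observed, U @ V.T - R, 0.0)   # residual on observed entries
    U, V = U - 0.01 * E @ V, V - 0.01 * E.T @ U

fit = U @ V.T
# RMSE on the observed entries; the "?" cells of fit are the predictions.
err = np.sqrt(np.mean((fit[observed] - R[observed]) ** 2))
print(err)
```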
Principal Component Analysis (PCA): given a set of high-dimensional data $\{x_i : 1 \le i \le n\}$, project the data into a (typically lower-dimensional) subspace so that the variance of the projected data is maximized. Typically computed via the singular value decomposition of the data.
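A minimal sketch of this recipe with NumPy (the 2-D synthetic data below are an illustrative assumption; here data points are rows, whereas the slides use columns): center the data, take the SVD, and read off the top right-singular vector as the first principal component.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D data: large variance along (1,1)/sqrt(2),
# small variance along the orthogonal direction (1,-1)/sqrt(2).
n = 500
t = rng.normal(scale=3.0, size=n)
s = rng.normal(scale=0.3, size=n)
X = np.column_stack([t + s, t - s]) / np.sqrt(2)   # rows are data points

# Center the data, then take the top right-singular vector as the first PC.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
w = Vt[0]   # first principal component (up to sign)

print(np.abs(w))   # close to (0.707, 0.707), i.e. (1,1)/sqrt(2)
```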
Figure from isomorphismes @stackexchange
Figure from amoeba@stackexchange
The first PC is the direction of maximum variance in the data: it captures more variance than any other direction. The second PC is the direction of maximum variance among all directions orthogonal to the first PC, and so on for the later PCs.
Assume the data are centered: $\sum_{i=1}^{n} x_i = 0$, so the projections also have zero mean: $\sum_{i=1}^{n} w^T x_i = 0$.
First PC as an optimization problem:
$$\max_w \; \sum_{i=1}^{n} (w^T x_i)^2 \quad \text{s.t. } w^T w = 1$$
equivalent to
$$\max_w \; w^T X X^T w \quad \text{s.t. } w^T w = 1$$
where the columns of $X$ are the data points.
$$\max_w \; w^T X X^T w \quad \text{s.t. } w^T w = 1$$
Lagrangian: $\mathcal{L}(w, \lambda) = w^T X X^T w - \lambda (w^T w - 1)$. Setting $\frac{\partial \mathcal{L}}{\partial w} = 0$ gives
$$(X X^T - \lambda I) w = 0 \;\Rightarrow\; X X^T w = \lambda w,$$
so $w$ is an eigenvector of $X X^T$. The objective value at such a $w$ is $\lambda$, so the first PC is the eigenvector with the largest eigenvalue.
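This eigenvector characterization can be checked numerically. A small sketch (synthetic data; columns of $X$ are the data points, as in the derivation): the top eigenvector of $X X^T$ satisfies the eigen-equation, attains objective value $\lambda$, and beats random unit vectors.

```python
import numpy as np

rng = np.random.default_rng(1)

# Columns of X are n centered data points in d dimensions.
d, n = 5, 200
X = rng.normal(size=(d, n))
X = X - X.mean(axis=1, keepdims=True)

# Stationarity condition: X X^T w = lambda * w.
C = X @ X.T
eigvals, eigvecs = np.linalg.eigh(C)   # eigh: ascending eigenvalues
w = eigvecs[:, -1]                     # eigenvector with largest eigenvalue
lam = eigvals[-1]

print(np.allclose(C @ w, lam * w))     # True: w is an eigenvector
print(np.isclose(w @ C @ w, lam))      # True: objective value equals lambda

# No random unit vector achieves a larger objective value.
for _ in range(100):
    v = rng.normal(size=d)
    v /= np.linalg.norm(v)
    assert v @ C @ v <= lam + 1e-9
```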
Eigen-decomposition: $X X^T = U \Lambda U^T$, where $\Lambda$ is a diagonal matrix of eigenvalues.

SVD: $X = U \Sigma V^T$, where $U$ is an $m \times m$ orthogonal matrix, $\Sigma$ is a $m \times n$ rectangular diagonal matrix with non-negative real numbers on the diagonal, and $V$ is an $n \times n$ orthogonal matrix. The squared singular values of $X$ are the eigenvalues of $X X^T$.
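The relation between the SVD and the eigen-decomposition of $X X^T$ can be verified directly with `np.linalg.svd` (a sketch on a small random matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 6
X = rng.normal(size=(m, n))

# SVD: X = U @ S @ Vt with U (m x m) and V (n x n) orthogonal,
# S (m x n) rectangular diagonal with non-negative entries.
U, sigma, Vt = np.linalg.svd(X)
S = np.zeros((m, n))
np.fill_diagonal(S, sigma)

print(np.allclose(U @ S @ Vt, X))                         # True
# X X^T = U diag(sigma^2) U^T: the left singular vectors are the
# eigenvectors of X X^T, and sigma^2 are its eigenvalues.
print(np.allclose(X @ X.T, U @ np.diag(sigma**2) @ U.T))  # True
```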
Variance maximization
$$\max_w \; w^T X X^T w \quad \text{s.t. } w^T w = 1$$
is equivalent to minimum MSE reconstruction
$$\min_w \; \frac{1}{n} \sum_{i=1}^{n} \| x_i - w w^T x_i \|^2 \quad \text{s.t. } w^T w = 1.$$
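The equivalence follows from $\|x - w w^T x\|^2 = \|x\|^2 - (w^T x)^2$ whenever $w^T w = 1$, so minimizing reconstruction error and maximizing projected variance differ only by a constant. A quick numerical check (synthetic data, columns of $X$ as data points):

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 4, 100
X = rng.normal(size=(d, n))   # columns are data points

def recon_err(w):
    # (1/n) * sum_i ||x_i - w w^T x_i||^2 for a unit vector w
    P = np.outer(w, w)
    return np.mean(np.sum((X - P @ X) ** 2, axis=0))

def variance(w):
    # (1/n) * sum_i (w^T x_i)^2
    return np.mean((w @ X) ** 2)

# For any unit w: reconstruction error = total squared norm - variance,
# so the two objectives have the same optimizer.
total = np.mean(np.sum(X ** 2, axis=0))
for _ in range(20):
    w = rng.normal(size=d)
    w /= np.linalg.norm(w)
    assert np.isclose(recon_err(w), total - variance(w))
print("equivalence verified")
```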
Figure from Nina Balcan
Factor analysis view: model the data as $x \approx W h$ for a latent code $h$, i.e., specify $P(x \mid h)$.

PCA structure assumption: $h$ is low-dimensional. What about other structure assumptions, e.g., sparsity?

Sparse coding: $x \mid h = W h + \epsilon$ with Gaussian noise $\epsilon \sim N(0, \frac{1}{2\gamma} I)$, and $h$ is sparse, with a Laplace-type prior $p(h) \propto \exp(-\|h\|_1)$.

MAP inference of the code:
$$h^* = \arg\max_h \log p(h \mid x) = \arg\min_h \; \|h\|_1 + \gamma \|x - W h\|_2^2$$
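The MAP objective is a lasso-type problem and can be solved with proximal gradient descent (ISTA): gradient steps on the quadratic term followed by soft-thresholding for the $\ell_1$ term. A minimal sketch (the dictionary $W$, the value of $\gamma$, and the synthetic data are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical instance: d = 20 observations, K = 50 dictionary atoms.
d, K = 20, 50
W = rng.normal(size=(d, K)) / np.sqrt(d)

# Generate x from a genuinely sparse code plus small noise.
h_true = np.zeros(K)
h_true[rng.choice(K, size=3, replace=False)] = rng.normal(scale=3.0, size=3)
x = W @ h_true + 0.01 * rng.normal(size=d)

# ISTA on  min_h ||h||_1 + gamma * ||x - W h||_2^2.
gamma = 10.0
L = 2 * gamma * np.linalg.norm(W.T @ W, 2)   # Lipschitz const. of smooth part
h = np.zeros(K)
for _ in range(5000):
    grad = 2 * gamma * W.T @ (W @ h - x)     # gradient of the quadratic term
    z = h - grad / L
    h = np.sign(z) * np.maximum(np.abs(z) - 1.0 / L, 0.0)   # soft-threshold

print(np.sum(np.abs(h) > 1e-6))   # number of nonzero coefficients (sparse vs. K)
```

The soft-threshold step is what produces exact zeros in $h$, matching the sparsity assumption of the model.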
Olshausen, Bruno A., and David J. Field. "Emergence of simple-cell receptive field properties by learning a sparse code for natural images." Nature 381.6583 (1996): 607-609.