Machine Learning for Signal Processing
Fundamentals of Linear Algebra
Class 2. 6 Sep 2016 Instructor: Bhiksha Raj
11-755/18-797 1
Overview
– Vectors and matrices
– Basic vector/matrix operations
– Various matrix types
– Projections
– Appears repeatedly in the form of Eigen analysis, SVD, Factor analysis
– Appears through various properties of matrices that are used in machine learning
– Often used in the processing of data of various kinds
– Will use sound and images as examples
– Very small subset of all that’s used
– Important subset, intended to help you recollect
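    c = 0;
    for i = 1:n
        for j = 1:m
            c = c + y(j)*x(i)*a(i,j);   % accumulate a single scalar
        end
    end
    % the same result in one line, with x a 1 x n row vector and y an m x 1 column vector
    C = x*A*y

$\sum_i \sum_j y_j x_i a_{ij} = x^T A y$ (for column vectors x and y)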
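The equivalence between the nested loop and the matrix expression can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names `bilinear_loop` and `bilinear_matrix` and the sample values are illustrative assumptions.

```python
def bilinear_loop(x, A, y):
    """Accumulate sum_i sum_j x[i] * A[i][j] * y[j] with explicit loops."""
    c = 0.0
    for i in range(len(x)):
        for j in range(len(y)):
            c += x[i] * A[i][j] * y[j]
    return c


def bilinear_matrix(x, A, y):
    """Same quantity written as x^T (A y): a matrix-vector product, then a dot product."""
    Ay = [sum(a_ij * y_j for a_ij, y_j in zip(row, y)) for row in A]
    return sum(x_i * v for x_i, v in zip(x, Ay))


# Hypothetical sample data: x has 2 entries, A is 2 x 3, y has 3 entries.
x = [1.0, 2.0]
A = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 1.0]]
y = [4.0, 5.0, 6.0]

loop_result = bilinear_loop(x, A, y)
matrix_result = bilinear_matrix(x, A, y)
```

Both functions return the same scalar; the matrix form simply hides the two loops inside the products.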
[Figure: an image transformed by Rotation + Projection + Scaling + Perspective]
[Figure: spectrogram (Time vs. Frequency) from Bach’s Fugue in Gm, and its Decomposition (NMF)]
– a = 2, a = 3.14, a = -1000, etc.
scalars
[Examples: a matrix A and vectors a, such as a = [1 2 3]]
– Examples: [3 4 5], [a b c d], ..
– [3 4 5] != [4 3 5]: order is important
N-dimensional space
[Figure: the points (3,4,5) and (4,3,5) plotted on x, y, z axes]
numerical attributes
– X, Y, Z coordinates
– [height(cm) weight(kg)]
– [175 72]
– A location in Manhattan
[Figure: map locations [-2.5av 6st], [2av 4st], [1av 8st]]
– Represented as
– As the crow flies
– Assuming Euclidean Geometry
[Figure: map locations [-2av 17st], [-6av 10st]]
$\| [a \;\; b \;\; \ldots] \| = \sqrt{a^2 + b^2 + \ldots}$
[Figure: the vector (3,4,5); Length = sqrt(3^2 + 4^2 + 5^2)]
– 2.5 x (3,4,5) = (7.5, 10, 12.5)
– ||2.5 x (3,4,5)|| = 2.5x|| (3, 4, 5)||
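The length formula and the scaling rule can be tried out directly. This is a small Python sketch using only the standard library; the helper names `norm` and `scale` are illustrative choices, not names from the slides.

```python
import math


def norm(v):
    """Euclidean length: square root of the sum of squared components."""
    return math.sqrt(sum(c * c for c in v))


def scale(a, v):
    """Multiply every component of v by the scalar a."""
    return [a * c for c in v]


v = [3, 4, 5]
stretched = scale(2.5, v)

length_v = norm(v)                  # sqrt(3^2 + 4^2 + 5^2) = sqrt(50)
length_stretched = norm(stretched)  # 2.5 times the original length
```

Scaling changes the length by the factor 2.5 but leaves the direction unchanged.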
Multiplication by scalar “stretches” the vector: (3,4,5) becomes (7.5, 10, 12.5)
[Figure: vector addition, (3,4,5) + (3,-2,-3) = (6,2,2)]
– The set includes the zero vector (of all zeros)
– The set is “closed” under addition
– If X and Y are in the set, aX + bY is also in the set for any two scalars a and b
– For every X in the set, the set also includes the additive inverse Y = -X, such that X + Y = 0
– Note we used the term three-component, rather than three-dimensional
the entire set?
– There may be multiple such sets
– The set is a “basis” set
our discussions of signal representations
[Example matrices: a 2 x 2 array with entries a, b, c, d and a larger rectangular array with entries a through m]
– The space of all vectors that can be composed from the rows of the matrix is the row space of the matrix
– The space of all vectors that can be composed from the columns of the matrix is the column space of the matrix
$R = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}$
columns
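– c = 3x1 matrix: 3 rows and 1 column
– r = 1x3 matrix: 1 row and 3 columns
– S = 2 x 2 matrix
– R = 2 x 3 matrix
– Pacman = 321 x 399 matrix
$c = \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \quad r = \begin{bmatrix} a & b & c \end{bmatrix}$
$S = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad R = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}$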
– Row and Column = position
– Triples of x,y and value
– “Unraveling” the matrix
matrix that forms the image
– Representations 2 and 4 are equivalent
[Figure: the image as a 3 x N matrix whose rows are X, Y and the pixel value v, and as a matrix of values only, where X and Y are implicit]
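$a + b = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \\ a_3 + b_3 \end{bmatrix}$
$a + b = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} + \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix} = \begin{bmatrix} a_1 + b_1 & a_2 + b_2 & a_3 + b_3 \end{bmatrix}$
$A + B = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix}$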
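– A transposed row vector becomes a column (and vice versa)
– A transposed matrix gets all its row (or column) vectors transposed in order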
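$x = \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \quad x^T = \begin{bmatrix} a & b & c \end{bmatrix}$
$X = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}, \quad X^T = \begin{bmatrix} a & d \\ b & e \\ c & f \end{bmatrix}$
$y = \begin{bmatrix} a & b & c \end{bmatrix}, \quad y^T = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$
[Figure: an image M and its transpose M^T]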
– Vectors must have the same number of elements
– Row vector times column vector = scalar
– Column vector times row vector = matrix
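$\begin{bmatrix} a \\ b \\ c \end{bmatrix} \begin{bmatrix} d & e & f \end{bmatrix} = \begin{bmatrix} ad & ae & af \\ bd & be & bf \\ cd & ce & cf \end{bmatrix}$
$\begin{bmatrix} a & b & c \end{bmatrix} \begin{bmatrix} d \\ e \\ f \end{bmatrix} = ad + be + cf$
$d \cdot \begin{bmatrix} a & b & c \end{bmatrix} = \begin{bmatrix} da & db & dc \end{bmatrix}, \quad \begin{bmatrix} a \\ b \\ c \end{bmatrix} \cdot d = \begin{bmatrix} ad \\ bd \\ cd \end{bmatrix}$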
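The two vector products differ only in the order of the row and the column. A minimal Python sketch follows; the function names `inner` and `outer` are illustrative, and plain lists stand in for both row and column vectors.

```python
def inner(u, v):
    """Row vector times column vector: a single scalar."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of elements")
    return sum(a * b for a, b in zip(u, v))


def outer(u, v):
    """Column vector times row vector: a len(u) x len(v) matrix."""
    return [[a * b for b in v] for a in u]


u = [1, 2, 3]
v = [4, 5, 6]

scalar = inner(u, v)   # 1*4 + 2*5 + 3*6
matrix = outer(u, v)   # entry (i, j) is u[i] * v[j]
```

Note that `outer` accepts vectors of different lengths, while `inner` does not.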
– Coordinates are yards, not ave/st
– a = [200 1600], b = [770 300]
relates to the length of a projection
– How much of the first vector have we covered by following the second one?
– Must normalize by the length of the “target” vector
[Figure: a = [200yd 1600yd], norm ≈ 1612; b = [770yd 300yd], norm ≈ 826]
$\dfrac{a \cdot b^T}{\|a\|} = \dfrac{[200 \;\; 1600] \cdot [770 \;\; 300]^T}{\|[200 \;\; 1600]\|} \approx 393\,\text{yd}$
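The numbers in this example can be reproduced in a few lines. The sketch below uses only the standard library; the helper names `dot` and `norm` are illustrative, and a is treated as the “target” vector.

```python
import math


def dot(u, v):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(u, v))


def norm(v):
    """Euclidean length of v."""
    return math.sqrt(dot(v, v))


a = [200.0, 1600.0]   # target vector, in yards
b = [770.0, 300.0]    # vector being followed, in yards

norm_a = norm(a)
norm_b = norm(b)
# Length of b's projection onto a: dot product normalized by the target's length.
projection_length = dot(a, b) / norm_a
```

The result is in yards because the dot product has units of yards squared and is divided by a length in yards.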
– Energy at a discrete set of frequencies
– Actually 1 x 4096
– X axis is the index of the number in the vector
– Y axis is the value of the number in the vector
[Figure: spectra of the notes C, E and C2; x axis: frequency, y axis: Sqrt(energy)]
– How much can you fake a C by playing an E
– C.E / (|C| |E|) = 0.1
– Not very much
– C.C2 / (|C| |C2|) = 0.5
– Not bad, you can fake it
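The normalized dot product used here can be sketched as follows. The three 5-bin “spectra” are made-up toy values chosen for illustration only; they are not the 4096-point spectra from the slides and are not expected to reproduce the 0.1 and 0.5 figures.

```python
import math


def normalized_dot(u, v):
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    length_u = math.sqrt(sum(a * a for a in u))
    length_v = math.sqrt(sum(b * b for b in v))
    return dot / (length_u * length_v)


# Hypothetical toy spectra (energy in five frequency bins).
note_c = [4.0, 0.0, 2.0, 0.0, 1.0]
note_e = [0.0, 4.0, 0.0, 2.0, 0.0]
note_c2 = [0.0, 0.0, 4.0, 0.0, 2.0]

c_vs_e = normalized_dot(note_c, note_e)     # no shared bins in this toy data
c_vs_c2 = normalized_dot(note_c, note_c2)   # shares some bins with C
```

A value near 1 means one spectrum is close to a scaled copy of the other; a value near 0 means they share almost no energy in common bins.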
– Shows how the energy in each frequency varies with time
– The pattern in each column is a scaled version of the spectrum
– Each row is a scaled version of the modulation
$A \cdot B = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} = \begin{bmatrix} a_{11} b_1 + a_{12} b_2 + a_{13} b_3 + a_{14} b_4 \\ a_{21} b_1 + a_{22} b_2 + a_{23} b_3 + a_{24} b_4 \\ a_{31} b_1 + a_{32} b_2 + a_{33} b_3 + a_{34} b_4 \end{bmatrix}$
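$A \cdot B = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \cdot b = \begin{bmatrix} a_1 \cdot b \\ a_2 \cdot b \end{bmatrix}$, as in $\begin{bmatrix} a \\ b \\ c \end{bmatrix} \cdot d = \begin{bmatrix} ad \\ bd \\ cd \end{bmatrix}$
$A \cdot B = a \cdot \begin{bmatrix} b_1 & b_2 \end{bmatrix} = \begin{bmatrix} a \cdot b_1 & a \cdot b_2 \end{bmatrix}$, as in $d \cdot \begin{bmatrix} a & b & c \end{bmatrix} = \begin{bmatrix} da & db & dc \end{bmatrix}$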
vector to a column space vector
numbers in the vector
all vectors that can be formed by mixing its columns
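$\begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = x \begin{bmatrix} a \\ d \end{bmatrix} + y \begin{bmatrix} b \\ e \end{bmatrix} + z \begin{bmatrix} c \\ f \end{bmatrix}$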
– Converts a vector in the column space to one in the row space
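$\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} = x \begin{bmatrix} a & b & c \end{bmatrix} + y \begin{bmatrix} d & e & f \end{bmatrix}$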
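Both views of multiplication, as a mixture of columns and as a mixture of rows, can be verified on a small example. This is a Python sketch with illustrative function names (`matvec`, `combine_columns`, `vecmat`) and made-up numbers.

```python
def matvec(A, v):
    """Standard matrix times column vector: one dot product per row of A."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def combine_columns(A, v):
    """Same product, built as a weighted sum of the columns of A."""
    result = [0] * len(A)
    for j, weight in enumerate(v):
        for i in range(len(A)):
            result[i] += weight * A[i][j]
    return result


def vecmat(u, A):
    """Row vector times matrix, built as a weighted sum of the rows of A."""
    result = [0] * len(A[0])
    for i, weight in enumerate(u):
        for j in range(len(A[0])):
            result[j] += weight * A[i][j]
    return result


A = [[1, 2, 3],
     [4, 5, 6]]

right_product = matvec(A, [1, 0, -1])          # lives in the column space (2 entries)
column_mix = combine_columns(A, [1, 0, -1])    # identical, by construction
left_product = vecmat([2, 1], A)               # lives in the row space (3 entries)
```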
[Figure: a 2 x 2 matrix Y, shown with its Row space and Column space]
new axes
– X axis = normal to the second row vector
by the 1..k-1,k+1..N-th row vectors in the matrix
– Any set of K-1 vectors represents a hyperplane of dimension K-1 or less
the k-th row vector
– Expressed in inverse-lengths of the vector
$\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$
– The three column vectors of the matrix X are the spectra of three notes
– The multiplying column vector Y is just a mixing vector
– The result is a sound that is the mixture of the three notes
[Figure: X (the three note spectra as columns) times the mixing vector Y = [1 2 1]^T gives the spectrum of the mixture]
– The images are arranged as columns
– The result of the multiplication is rearranged as an image
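[Figure: two 200 x 200 images unraveled into the columns of a 40000 x 2 matrix and multiplied by a 2 x 1 weight vector (entries 0.25 and 0.75); the 40000 x 1 result is rearranged as a 200 x 200 image]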
and the number of columns from the second matrix
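$A \cdot B = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \cdot \begin{bmatrix} b_1 & b_2 \end{bmatrix} = \begin{bmatrix} a_1 \cdot b_1 & a_1 \cdot b_2 \\ a_2 \cdot b_1 & a_2 \cdot b_2 \end{bmatrix}$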
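$A \cdot B = \begin{bmatrix} a_{11} & \cdots & a_{1N} \\ \vdots & & \vdots \\ a_{M1} & \cdots & a_{MN} \end{bmatrix} \begin{bmatrix} b_{11} & \cdots & b_{1K} \\ \vdots & & \vdots \\ b_{N1} & \cdots & b_{NK} \end{bmatrix} = \begin{bmatrix} a_{11} \\ \vdots \\ a_{M1} \end{bmatrix} \begin{bmatrix} b_{11} & \cdots & b_{1K} \end{bmatrix} + \cdots + \begin{bmatrix} a_{1N} \\ \vdots \\ a_{MN} \end{bmatrix} \begin{bmatrix} b_{N1} & \cdots & b_{NK} \end{bmatrix}$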
The outer product of the first column of A and the first row of B + outer product of the second column of A and the second row of B + …
Sum of outer products
$A \cdot B = \begin{bmatrix} a_1 & a_2 \end{bmatrix} \cdot \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = a_1 b_1 + a_2 b_2$
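That the sum of outer products equals the usual row-times-column product can be checked numerically. The following Python sketch uses illustrative function names and made-up matrices; it is a demonstration of the identity, not an efficient implementation.

```python
def matmul(A, B):
    """Usual definition: entry (i, j) is row i of A dotted with column j of B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def outer(u, v):
    """Column vector u times row vector v."""
    return [[a * b for b in v] for a in u]


def matmul_outer_sum(A, B):
    """Same product as a sum over k of (column k of A) outer (row k of B)."""
    rows, cols = len(A), len(B[0])
    total = [[0] * cols for _ in range(rows)]
    for k in range(len(B)):
        column_k = [A[i][k] for i in range(rows)]
        term = outer(column_k, B[k])
        total = [[t + s for t, s in zip(t_row, s_row)]
                 for t_row, s_row in zip(total, term)]
    return total


A = [[1, 2],
     [3, 4],
     [5, 6]]        # 3 x 2
B = [[1, 0, 2],
     [0, 1, 3]]     # 2 x 3

product = matmul(A, B)
product_from_outer = matmul_outer_sum(A, B)
```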
[Figure: the matrix X of note spectra multiplied by a matrix Y whose rows give each note's weight over time, built up one outer product at a time]
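$\begin{bmatrix} i_1 & j_1 \\ i_2 & j_2 \\ \vdots & \vdots \end{bmatrix} \begin{bmatrix} 1 & 0.9 & 0.8 & 0.7 & 0.6 & 0.5 & 0.4 & 0.3 & 0.2 & 0.1 & 0 \\ 0 & 0.1 & 0.2 & 0.3 & 0.4 & 0.5 & 0.6 & 0.7 & 0.8 & 0.9 & 1 \end{bmatrix}$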
– The columns represent a sequence of images of decreasing intensity
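The product of a two-column image matrix with a two-row weight matrix can be sketched in Python as below. The reading of the result as a fade from image i to image j is an interpretation, and the function name `crossfade`, the `steps` parameter and the three-pixel “images” are hypothetical.

```python
def crossfade(image_i, image_j, steps=10):
    """Multiply the N x 2 matrix [i j] by a 2 x (steps + 1) weight matrix.

    Returns the columns of the product, one list of pixel values per frame.
    """
    weights = [
        [1 - k / steps for k in range(steps + 1)],   # 1, 0.9, ..., 0 (image i fades out)
        [k / steps for k in range(steps + 1)],       # 0, 0.1, ..., 1 (image j fades in)
    ]
    frames = []
    for k in range(steps + 1):
        frame = [image_i[p] * weights[0][k] + image_j[p] * weights[1][k]
                 for p in range(len(image_i))]
        frames.append(frame)
    return frames


# Hypothetical unraveled images with three pixels each.
image_i = [10, 20, 30]
image_j = [0, 40, 80]

frames = crossfade(image_i, image_j)
first, middle, last = frames[0], frames[5], frames[10]
```

The first column reproduces image i, the last reproduces image j, and the columns in between are weighted mixtures of the two.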
$\begin{bmatrix} i_1 \\ i_2 \\ \vdots \\ i_N \end{bmatrix} \begin{bmatrix} 1 & 0.9 & 0.8 & \cdots & 0 \end{bmatrix} = \begin{bmatrix} i_1 & 0.9\, i_1 & 0.8\, i_1 & \cdots & 0 \\ i_2 & 0.9\, i_2 & 0.8\, i_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ i_N & 0.9\, i_N & 0.8\, i_N & \cdots & 0 \end{bmatrix}$
– All diagonal elements are 1.0
– All off-diagonal elements are 0.0
$Y = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
– May flip axes
$Y = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$
[Figure: the 3 x N image matrix multiplied by the diagonal matrix diag(2, 1, 1)]
representation
the X axis
– The Y axis and pixel value are scaled by identity
[Figure: Newpic obtained by multiplying the image A with an N x 2N matrix D whose nonzero entries are 1 and 0.5; N is the width of the original image]
[Figure: Newpic obtained from P, the image with R, G and B components]
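– The row entries are axis vectors in a different order
– The result is a combination of rotations and reflections
arrangement of the elements in a vector
$\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} y \\ z \\ x \end{bmatrix}$
[Figure: (3,4,5) on axes X, Y, Z becomes (4,5,3) on axes X (old Y), Y (old Z), Z (old X)]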
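The reordering of (3,4,5) into (4,5,3) can be reproduced with a permutation matrix whose rows are the axis vectors in a different order. A minimal Python sketch, with an illustrative helper name `matvec`:

```python
def matvec(A, v):
    """Matrix times column vector: one dot product per row of A."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Rows are the unit axis vectors, reordered: new X = old Y, new Y = old Z, new Z = old X.
P = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]

permuted = matvec(P, [3, 4, 5])
```

Because every row and every column of P holds exactly one 1, the multiplication only rearranges the entries and never changes their values.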
[Figure: two permutation matrices P applied to the 3 x N image matrix]
– Object represented as a matrix of 3-Dimensional “position” vectors
– Positions identify each point on the surface
$\begin{bmatrix} x_1 & x_2 & \cdots & x_N \\ y_1 & y_2 & \cdots & y_N \\ z_1 & z_2 & \cdots & z_N \end{bmatrix}$
– The new axes are at an angle θ to the old ones
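$X = \begin{bmatrix} x \\ y \end{bmatrix}, \quad X_{new} = \begin{bmatrix} x' \\ y' \end{bmatrix}$
$R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$
$R_\theta X = X_{new}$
$x' = x\cos\theta - y\sin\theta$
$y' = x\sin\theta + y\cos\theta$
[Figure: the point (x,y) on axes X and Y, rotated through the angle θ to (x’,y’)]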
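A 2-D rotation can be sketched in Python as follows. The sketch assumes the common counterclockwise convention, x' = x cos θ − y sin θ and y' = x sin θ + y cos θ; the function names are illustrative.

```python
import math


def rotation_matrix(theta):
    """2 x 2 rotation matrix for an angle theta in radians (counterclockwise convention)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]


def rotate(point, theta):
    """Apply the rotation matrix to the column vector (x, y)."""
    R = rotation_matrix(theta)
    x, y = point
    return (R[0][0] * x + R[0][1] * y,
            R[1][0] * x + R[1][1] * y)


x_new, y_new = rotate((1.0, 0.0), math.radians(90))   # the X unit vector, turned a quarter turn
x_45, y_45 = rotate((3.0, 4.0), math.radians(45))
```

Rotation leaves the length of the vector unchanged, which is a quick sanity check for any rotation matrix.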
– Rotation only applies on the “coordinate” rows
– The value does not change
– Why is pacman grainy?
$R = \begin{bmatrix} \cos 45^\circ & -\sin 45^\circ & 0 \\ \sin 45^\circ & \cos 45^\circ & 0 \\ 0 & 0 & 1 \end{bmatrix}$
[Figure: the 3 x N image matrix before and after multiplication by R]
[Figure: 3-D rotation of the axes X, Y, Z to Xnew, Ynew, Znew through the angles θ and α]
looked at it from above the plane shown by the grid?
– Normal to the plane
– Answer: the figure to the right
– What is the corresponding vector on the plane that is “closest approximation” to it?
– What is the transform that converts the vector to its approximation on the plane?
[Figure: a vector, its projection on the plane, and the projection error at 90 degrees to the plane]
– Arranged as a matrix [W1 W2 ..]
– Any vector can be projected onto this plane
– The matrix P that rotates and scales the vector so that it becomes its projection is a projection matrix
[Figure: projection at 90 degrees onto the plane spanned by W1 and W2]
– $P = W (W^T W)^{-1} W^T$
expressed as a matrix will give you the same projection matrix
– $P = V (V^T V)^{-1} V^T$
– ANY two so long as they have different angles
the plane in 3D
– The result of the projection is a 3-D vector
– $P = W (W^T W)^{-1} W^T$ is 3 x 3, PX is 3 x 1
– The image must be rotated till the plane is in the plane of the paper
– PX = X if X is on the plane
– If the object is already on the plane, there is no further projection to be performed
– P(PX) = PX
– $P^2 = P$
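The projection formula P = W (W^T W)^-1 W^T and the property that projecting twice changes nothing can be checked on a small case. This Python sketch assumes W has exactly two columns, so that W^T W is 2 x 2 and can be inverted in closed form; all function names and the sample plane are illustrative.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("the two columns of W must point in different directions")
    return [[d / det, -b / det], [-c / det, a / det]]


def projection_matrix(W):
    """P = W (W^T W)^-1 W^T for a matrix W with exactly two columns."""
    Wt = transpose(W)
    return matmul(matmul(W, inverse_2x2(matmul(Wt, W))), Wt)


# Hypothetical plane: spanned by W1 = (1,0,0) and W2 = (1,1,0), the columns of W.
W = [[1.0, 1.0],
     [0.0, 1.0],
     [0.0, 0.0]]
P = projection_matrix(W)           # 3 x 3

X = [[2.0], [3.0], [4.0]]          # a 3 x 1 vector off the plane
PX = matmul(P, X)                  # its projection, still a 3 x 1 vector
PPX = matmul(P, PX)                # projecting again changes nothing
```

The two basis vectors are deliberately not perpendicular; the (W^T W)^-1 factor is what compensates for that.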
– We often cannot do so
– But we can explain a significant portion of it
– In our previous example, the “data” were all the points on a cone, and the bases were vectors on the plane
How much of the above music was composed of the above notes
I.e. how much can it be explained by the notes
M = spectrogram; W = note
$P = W (W^T W)^{-1} W^T$
Projected Spectrogram = P * M
[Figure: the spectrogram M, the note W, and the projected spectrogram]
Floored all matrix values below a threshold to zero
– Approximation: Vapprox = a*note1 + b*note2 + c*note3..
– Error vector E = V – Vapprox
– Squared error energy for V: e(V) = norm(E)^2
– Total error = sum over all V { e(V) } = $\sum_V e(V)$
is minimized
– It does not give you “a”, “b”, “c”.. though
$V_{approx} = \begin{bmatrix} note1 & note2 & note3 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix}$
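The error terms defined above can be computed once a projection matrix is available. In this Python sketch, P is assumed to have been computed already as W (W^T W)^-1 W^T for bases spanning the first two axes; the function names and the three data vectors are hypothetical.

```python
def project(P, v):
    """Vapprox = P V, the closest approximation of V on the plane."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


def squared_error(P, v):
    """e(V) = norm(E)^2, where E = V - Vapprox."""
    approx = project(P, v)
    return sum((x - a) ** 2 for x, a in zip(v, approx))


# Assumed projection matrix onto the plane of the first two axes.
P = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0]]

# Hypothetical data vectors (for a spectrogram, these would be its columns).
vectors = [[2.0, 3.0, 4.0],
           [1.0, 1.0, 0.0],
           [0.0, 0.0, -1.0]]

errors = [squared_error(P, v) for v in vectors]
total_error = sum(errors)
```

The second vector already lies on the plane, so its error is zero; the sketch computes the approximation and its error but not the individual weights a, b, c.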