Machine Learning for Signal Processing Clustering
Bhiksha Raj Class 11. 13 Oct 2016
Statistical Modelling and Latent Structure
Much of statistical modelling attempts to identify latent structure in the data
– Structure that is not immediately apparent from the observed data
– But which, if known, helps us explain it better, and make predictions from or about it

Clustering: structure identified from the proximity of data points
– First-level structure (as opposed to deep structure)
– Clustering is the determination of naturally occurring groupings of data/instances (with low within-group variability and high between-group variability)
– Find groupings of data such that the groups optimize a “within-group-variability” objective function of some kind
– The objective function used affects the nature of the discovered clusters
How do we perform the quantization?
[Figure: training stage and quantization stage]
– But how do you represent the video?
Training: each point is a video frame. Representation: each number is the number of frames assigned to that codeword (e.g. 30, 17, 4, 12, 16).
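This training/representation split can be sketched as follows; the frame features, codebook values, and function name are illustrative toy choices, not from the slides:

```python
import numpy as np

def video_histogram(frames, codebook):
    """Represent a video as a histogram of codeword assignments.

    frames: (n_frames, dim) array of frame feature vectors.
    codebook: (n_codewords, dim) array of codeword vectors.
    Returns a length-n_codewords vector: entry k counts the frames
    whose nearest codeword is k.
    """
    # Squared Euclidean distance from every frame to every codeword
    d = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d.argmin(axis=1)                  # quantize each frame
    return np.bincount(nearest, minlength=len(codebook))

frames = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
codebook = np.array([[0.0, 0.0], [5.0, 5.0]])
print(video_histogram(frames, codebook))  # → [2 1]
```

Two frames fall nearest the first codeword and one nearest the second, so the video is summarized by the count vector regardless of its length.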
– Total distance between each element in the cluster and every other element in the cluster
– Distance between the two farthest points in the cluster
– Total distance of every element in the cluster from the centroid of the cluster
– Distance measures are often weighted Minkowski metrics:

$dist = \left( w_1 |a_1 - b_1|^M + w_2 |a_2 - b_2|^M + \dots + w_n |a_n - b_n|^M \right)^{1/M}$
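The weighted Minkowski metric transcribes directly into code; the function name is illustrative:

```python
def weighted_minkowski(a, b, w, M=2):
    """dist = (sum_k w_k * |a_k - b_k|**M) ** (1/M)"""
    return sum(wk * abs(ak - bk) ** M
               for ak, bk, wk in zip(a, b, w)) ** (1.0 / M)

# With unit weights and M = 2 this reduces to the Euclidean distance:
print(weighted_minkowski([0, 0], [3, 4], [1, 1], M=2))  # → 5.0
```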
– Each digital value represents an equally wide range of analog values, regardless of the distribution of the data
– Digital-to-analog conversion is represented by a “uniform” table
Signal Value         Bits   Mapped to
S >= 3.75v           11     3 * const
3.75v > S >= 2.5v    10     2 * const
2.5v > S >= 1.25v    01     1 * const
1.25v > S >= 0v      00     0 * const
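A minimal sketch of the uniform table as code, assuming a 0–5 V input range and a reconstruction constant equal to the step size (the slide leaves `const` unspecified); the function name is illustrative:

```python
def uniform_quantize(s, v_max=5.0, bits=2):
    """Uniform quantizer: every code covers an equally wide analog range.
    The reconstruction constant is assumed equal to the step size."""
    n_levels = 2 ** bits                     # 4 levels for 2 bits
    step = v_max / n_levels                  # 1.25 V per level, as in the table
    code = min(int(s / step), n_levels - 1)  # digital code 0..3
    return code, code * step                 # (code, reconstructed value)

print(uniform_quantize(4.0))  # code 3 ("11"), as in the top row of the table
```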
[Figure: probability of analog value; arrows are the (uniform) quantization levels]
– Each digital value represents a different range of analog values
– Digital-to-analog conversion is represented by a “non-uniform” table
[Figure: probability of analog value; arrows are the (non-uniform) quantization levels]
Signal Value       Bits   Mapped to
S >= 4v            11     4.5
4v > S >= 2.5v     10     3.25
2.5v > S >= 1v     01     1.25
1.0v > S >= 0v     00     0.5
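One simple way to adapt such a table to the data distribution (a heuristic sketch, not the slides' exact recipe) is to place boundaries at quantiles of the data and reconstruction values at the per-bin means, so that dense regions get narrow bins:

```python
import numpy as np

def nonuniform_table(samples, bits=2):
    """Quantization boundaries at quantiles of the data,
    reconstruction values at the mean of each bin."""
    n = 2 ** bits
    edges = np.quantile(samples, np.linspace(0, 1, n + 1))
    bins = np.digitize(samples, edges[1:-1])       # bin index 0..n-1
    values = np.array([samples[bins == k].mean() for k in range(n)])
    return edges, values

rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0, size=10000)   # skewed data
edges, values = nonuniform_table(samples)
# Bins are narrow where the data is dense (near 0), wide in the tail
```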
Quantization points:
– Right column entries of the quantization table
– Map each analog value to the nearest quantization point
– Right column entries of quantization table
– Draw boundaries
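The "draw boundaries, reassign, recompute" loop can be sketched in one dimension. This is a Lloyd-style iteration, with boundaries placed midway between quantization points (an assumption, since the slide does not state how the boundaries are drawn); names are illustrative:

```python
import numpy as np

def lloyd_1d(samples, points, iters=20):
    """Iteratively refine scalar quantization points: draw boundaries
    midway between points, then move each point to the mean of the
    samples falling in its region."""
    points = np.sort(np.asarray(points, dtype=float))
    for _ in range(iters):
        boundaries = (points[:-1] + points[1:]) / 2   # draw boundaries
        region = np.digitize(samples, boundaries)     # nearest point
        for k in range(len(points)):
            sel = samples[region == k]
            if len(sel):
                points[k] = sel.mean()                # recompute point
    return points

samples = np.concatenate([np.full(50, 0.0), np.full(50, 10.0)])
print(lloyd_1d(samples, [4.0, 6.0]))  # → [ 0. 10.]
```

Starting from poorly placed points, the iteration pulls each quantization point onto the mean of its region.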
– MacQueen, J. 1967. “Some methods for classification and analysis of multivariate observations.” Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 281–297
– Initially group data into the required number of clusters somehow (initialization)
– Assign each data point to the closest cluster
– Once all data points are assigned to clusters, redefine the clusters
– Iterate
– Every cluster has a centroid
– The centroid represents the cluster:

$m_{cluster} = \frac{\sum_{i \in cluster} w_i x_i}{\sum_{i \in cluster} w_i}$

– Weight $w_i = 1$ for the basic scheme
1. Initialize a set of centroids randomly
2. For each data point x, find the distance from the centroid of each cluster:
   $distance_{cluster} = d(x, m_{cluster})$
3. Put the data point in the cluster of the closest centroid: the cluster for which the distance is minimum
4. When all data points are clustered, recompute the cluster centroids:
   $m_{cluster} = \frac{1}{N_{cluster}} \sum_{i \in cluster} x_i$
5. If not converged, go back to 2
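The five steps above can be sketched as a minimal NumPy implementation (function and variable names are illustrative):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Basic K-means: assign each point to the nearest centroid,
    then recompute each centroid as the mean of its points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # step 1
    for _ in range(iters):
        # steps 2-3: distance to each centroid, pick the minimum
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # step 4: recompute centroids (keep old one if a cluster empties)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):                        # step 5
            break
        centroids = new
    return centroids, labels

X = np.array([[0., 0.], [0., 1.], [10., 10.], [10., 11.]])
centroids, labels = kmeans(X, 2)
# The two well-separated pairs end up in different clusters
```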
With weighted data, step 4 recomputes each centroid as the weighted mean:

$m_{cluster} = \frac{\sum_{i \in cluster} w_i x_i}{\sum_{i \in cluster} w_i}$
$E = \sum_{cluster} \sum_{i \in cluster} \| x_i - m_{cluster} \|^2$
– Perturb the centroid of the cluster slightly (by < 5%) to generate two centroids
– E.g. distance from a circle
– I.e. kernel distances: kernel K-means
– Spectral clustering
– Normalized cuts
f([x, y]) → [x, y, z], with x = x, y = y, z = a(x² + y²)
– Transform the data into a space where the desired patterns become natural clusters, e.g. via the quadratic transform above
– Such transformed features may be expensive to compute
– Yet they only carry the same information as the lower-dimensional space
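A quick numerical illustration of the quadratic transform: all points on a circle share a single z value, so a ring and points near the origin separate cleanly along z (the radii and the constant a are illustrative):

```python
import numpy as np

def quadratic_map(xy, a=1.0):
    """f([x, y]) -> [x, y, a*(x^2 + y^2)]: points on a circle all get
    the same z, so a ring and its interior separate along the z axis."""
    x, y = xy[:, 0], xy[:, 1]
    return np.stack([x, y, a * (x ** 2 + y ** 2)], axis=1)

theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
ring = np.stack([5 * np.cos(theta), 5 * np.sin(theta)], axis=1)   # radius 5
inner = 0.1 * np.stack([np.cos(theta), np.sin(theta)], axis=1)    # near origin
z_ring = quadratic_map(ring)[:, 2]       # all ≈ 25
z_inner = quadratic_map(inner)[:, 2]     # all ≈ 0.01
```

In the transformed space an ordinary (Euclidean) K-means can separate the two groups, which no centroid placement in the original 2-D space could achieve.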
$K(x, y) = (x^T y + c)^2$
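For 2-D inputs this kernel corresponds to an explicit quadratic feature map, which can be verified numerically; the expansion below is the standard one for this kernel, not taken from the slides:

```python
import numpy as np

def poly_kernel(x, y, c=1.0):
    """K(x, y) = (x^T y + c)^2"""
    return (x @ y + c) ** 2

def phi(x, c=1.0):
    """Explicit feature map for the 2-D quadratic kernel,
    chosen so that phi(x) @ phi(y) == K(x, y)."""
    x1, x2 = x
    return np.array([x1 * x1, x2 * x2, np.sqrt(2) * x1 * x2,
                     np.sqrt(2 * c) * x1, np.sqrt(2 * c) * x2, c])

x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(poly_kernel(x, y), phi(x) @ phi(y))  # both ≈ 4.0
```

The kernel evaluates the inner product in the 6-dimensional feature space without ever forming the feature vectors, which is what kernel K-means exploits.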
$m_{cluster} = \frac{1}{N_{cluster}} \sum_{i \in cluster} \Phi(x_i)$, or with weights $m_{cluster} = \frac{\sum_{i \in cluster} w_i \Phi(x_i)}{\sum_{i \in cluster} w_i}$
RECALL: We may never actually be able to compute this mean, because Φ(x) is not known explicitly
– Cluster has 1 point
With $m_{cluster} = C \sum_{i \in cluster} w_i \Phi(x_i)$ and $C = \frac{1}{\sum_{i \in cluster} w_i}$:

$d(x, cluster) = \| \Phi(x) - m_{cluster} \|^2$
$= \Phi(x)^T \Phi(x) - 2C \sum_{i \in cluster} w_i \, \Phi(x)^T \Phi(x_i) + C^2 \sum_{i \in cluster} \sum_{j \in cluster} w_i w_j \, \Phi(x_i)^T \Phi(x_j)$
$= K(x, x) - 2C \sum_{i \in cluster} w_i K(x, x_i) + C^2 \sum_{i \in cluster} \sum_{j \in cluster} w_i w_j K(x_i, x_j)$
Computed entirely using only the kernel function!
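The kernel-only distance transcribes directly into code; `K` is any kernel function. The linear-kernel check at the end is a sanity check (not from the slides): with $K(a,b) = a^T b$ the expression reduces to the squared distance to the ordinary mean.

```python
import numpy as np

def kernel_distance(x, cluster, w, K):
    """d(x, cluster) = K(x,x) - 2C * sum_i w_i K(x, x_i)
                       + C^2 * sum_i sum_j w_i w_j K(x_i, x_j),
    with C = 1 / sum_i w_i: the squared distance to the (virtual)
    feature-space centroid, evaluated through the kernel alone."""
    C = 1.0 / np.sum(w)
    kx = np.array([K(x, xi) for xi in cluster])
    KK = np.array([[K(xi, xj) for xj in cluster] for xi in cluster])
    return K(x, x) - 2 * C * (w @ kx) + C ** 2 * (w @ KK @ w)

# Sanity check with a linear kernel: reduces to ||x - mean||^2
K_lin = lambda a, b: float(np.dot(a, b))
cluster = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
w = np.array([1.0, 1.0])
x = np.array([1.0, 1.0])
print(kernel_distance(x, cluster, w, K_lin))  # → 1.0 (= ||x - [1, 0]||^2)
```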
1. Initialize a set of clusters randomly
2. For each data point x, find the distance from the centroid of each cluster:
   $distance_{cluster} = d(x, m_{cluster})$
3. Put the data point in the cluster of the closest centroid: the cluster for which the distance is minimum
4. When all data points are clustered, recompute the cluster centroids:
   $m_{cluster} = \frac{\sum_{i \in cluster} w_i \Phi(x_i)}{\sum_{i \in cluster} w_i}$
5. If not converged, go back to 2

The centroids are virtual: we don't actually compute them explicitly! The distance is evaluated through the kernel alone:

$d(x, cluster) = K(x, x) - 2C \sum_{i \in cluster} w_i K(x, x_i) + C^2 \sum_{i \in cluster} \sum_{j \in cluster} w_i w_j K(x_i, x_j)$
Kernel K-means never needs to know the high-dimensional space explicitly; it only needs inner products in it.