Clusters and features from combinatorial stochastic processes
Tamara Broderick, Michael I. Jordan, Jim Pitman (UC Berkeley)
Clustering/Partition
“clusters”, “classes”, “blocks (of a partition)”

Clustering/Partition
[Figure: Pictures 1–7 assigned to disjoint clusters: Cat, Dog, Mouse, Lizard, Sheep]
Latent feature allocation
“features”, “topics”
[Figure: Pictures 1–7, each assigned a subset of the features Cat, Dog, Mouse, Lizard, Sheep; unlike clusters, a picture may have several features, or none]
Characterizations
Setting: exchangeable data points, each with a finite number of features.
- Exchangeable cluster distributions are characterized.
- What about exchangeable feature distributions?
Exchangeable probability functions
P(partition of data points 1, …, N) = p(S_{N,1}, …, S_{N,K}), where S_{N,k} is the size of the kth cluster.
The function p is the exchangeable partition probability function (EPPF) [Pitman 1995].
Exchangeable probability functions
Is there an analogous “exchangeable feature probability function” (EFPF)?
Example: Indian buffet process [Griffiths, Ghahramani 2005]
For n = 1, 2, …, N:
1. Data point n takes each existing feature k, which has already occurred S_{n−1,k} times, independently with probability S_{n−1,k} / (θ + n − 1).
2. Data point n then adds K_n^+ ∼ Poisson(γθ / (θ + n − 1)) new features.
[Figure: binary feature matrix with rows n = 1, …, N and columns k = 1, …, K]
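The two steps above can be sketched in Python. This is a minimal, hypothetical implementation (the Poisson helper, function names, and default parameter values are our own choices, not from the talk):

```python
import math
import random

def poisson(lam, rng):
    """Poisson(lam) draw via Knuth's method (adequate for the small rates here)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def sample_ibp(N, gamma=1.0, theta=1.0, rng=None):
    """One draw from the IBP: rows[n] is the set of features of data point n+1.

    counts[k] plays the role of S_{n-1,k}, the number of earlier data points
    possessing feature k.
    """
    rng = rng or random.Random(0)
    counts = []
    rows = []
    for n in range(1, N + 1):
        row = set()
        # 1. Take existing feature k with probability S_{n-1,k} / (theta + n - 1).
        for k, c in enumerate(counts):
            if rng.random() < c / (theta + n - 1):
                row.add(k)
        # 2. Add K_n^+ ~ Poisson(gamma * theta / (theta + n - 1)) new features.
        for _ in range(poisson(gamma * theta / (theta + n - 1), rng)):
            counts.append(0)
            row.add(len(counts) - 1)
        for k in row:
            counts[k] += 1
        rows.append(row)
    return rows

rows = sample_ibp(5)
print(rows)
```

Each row is almost surely finite (step 2 adds a Poisson number of features), matching the "finite number of features per data point" setting.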
Exchangeable probability functions
Example: the Indian buffet process has an “EFPF” [Broderick, Jordan, Pitman 2012]:

P(feature matrix) = (1 / K_N!) (θγ)^{K_N} exp( −θγ Σ_{n=1}^{N} (θ + n − 1)^{−1} ) Π_{k=1}^{K_N} Γ(S_{N,k}) Γ(N − S_{N,k} + θ) / Γ(N + θ)
= p(N; S_{N,1}, S_{N,2}, …, S_{N,K}),

a function only of the number of data points N, the number of features K_N, and the sizes S_{N,k} of the features.
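A short Python sketch of this EFPF (the helper name and parameter defaults are ours) makes the "function of N and the sizes only" property concrete:

```python
import math

def ibp_log_efpf(N, sizes, gamma=1.0, theta=1.0):
    """log p(N; S_1, ..., S_K) for the IBP EFPF displayed above.

    `sizes` lists the nonzero feature sizes S_{N,k}; K_N = len(sizes).
    """
    K = len(sizes)
    out = -math.lgamma(K + 1)                       # log of 1 / K_N!
    out += K * math.log(theta * gamma)              # log of (theta*gamma)^{K_N}
    out -= theta * gamma * sum(1.0 / (theta + n - 1) for n in range(1, N + 1))
    for s in sizes:                                 # product of Gamma ratios
        out += (math.lgamma(s) + math.lgamma(N - s + theta)
                - math.lgamma(N + theta))
    return out

print(ibp_log_efpf(4, [2, 1]))
```

Because the value depends only on N and the multiset of sizes, permuting the sizes changes nothing: `ibp_log_efpf(4, [2, 1])` equals `ibp_log_efpf(4, [1, 2])`.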
Exchangeable probability functions
Counterexample [Broderick, Jordan, Pitman 2012]: take two features and i.i.d. rows with
P(row = (1,0)) = p1, P(row = (0,1)) = p2, P(row = (1,1)) = p3, P(row = (0,0)) = p4.
The two-data-point configurations with rows (1,0), (0,1) and with rows (1,1), (0,0) have the same feature sizes, so an EFPF would force their probabilities to agree: p1 p2 = p3 p4. Generic (p1, p2, p3, p4) violate this identity, so not every exchangeable feature distribution has an EFPF.
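The obstruction is easy to check numerically. This sketch (with arbitrary illustrative probabilities of our choosing) exhibits two configurations with identical feature sizes but different probabilities:

```python
# i.i.d. rows over {(1,0), (0,1), (1,1), (0,0)} with probabilities p1..p4.
p1, p2, p3, p4 = 0.4, 0.3, 0.2, 0.1
assert abs(p1 + p2 + p3 + p4 - 1.0) < 1e-12

# Configuration A: data point 1 has feature 1 only, data point 2 feature 2 only.
# Configuration B: data point 1 has both features, data point 2 has neither.
# Both configurations have feature sizes S_{2,1} = S_{2,2} = 1.
prob_a = p1 * p2
prob_b = p3 * p4
print(prob_a, prob_b)  # equal sizes, unequal probabilities -> no EFPF
```

Any function p(N; S_1, S_2) would have to assign both configurations the same value, which fails here since 0.12 ≠ 0.02.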
Exchangeable probability functions [Broderick, Jordan, Pitman 2012]
Summary so far: exchangeable cluster distributions = cluster distributions with EPPFs. Exchangeable feature distributions strictly contain the feature distributions with EFPFs: the IBP has an EFPF, while the two-feature example does not.
Paintboxes
Exchangeable partition: Kingman paintbox [Kingman 1978].
Divide the unit interval into subintervals, one per cluster (Cat, Dog, Mouse, Lizard, Sheep, Horse, …). Draw u_1, u_2, … i.i.d. Uniform[0,1]; data points m and n belong to the same cluster exactly when u_m and u_n fall in the same subinterval.
[Figure: data points 1–7 dropped uniformly onto the partitioned unit interval]
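The construction can be sketched in a few lines of Python. For simplicity this assumes the weights sum to 1 (no leftover "dust" mass producing singletons); the weights and helper name are illustrative:

```python
import bisect
import random

def kingman_paintbox(weights, N, rng=None):
    """Cluster data points 1..N via the Kingman paintbox.

    `weights` are the subinterval lengths, one per cluster, assumed to sum
    to 1. Each data point draws u_n ~ Uniform[0,1] and joins the cluster
    whose subinterval contains u_n.
    """
    rng = rng or random.Random(0)
    edges = []
    total = 0.0
    for w in weights:
        total += w
        edges.append(total)          # right endpoints of the subintervals
    clusters = {}
    for n in range(1, N + 1):
        c = bisect.bisect_left(edges, rng.random())
        clusters.setdefault(c, []).append(n)
    return clusters

# Subinterval lengths for, say, Cat, Dog, and Mouse clusters:
print(kingman_paintbox([0.5, 0.3, 0.2], 7))
```

Because the u_n are i.i.d., permuting the data points leaves the partition's distribution unchanged, which is exactly the exchangeability being characterized.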
Paintboxes
Exchangeable feature allocation: feature paintbox [Broderick, Pitman, Jordan (submitted)].
Assign each feature (Cat, Dog, Mouse, Lizard, Sheep, Horse, …) a subset of the unit interval; unlike the Kingman paintbox's subintervals, the subsets may overlap. Draw u_1, u_2, … i.i.d. Uniform[0,1]; data point n receives exactly the features whose subsets contain u_n.
[Figure: data points 1–7 dropped uniformly onto overlapping feature subsets]
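A minimal Python sketch of a feature paintbox, assuming each feature's subset is a finite union of intervals (the interval endpoints and animal names are illustrative choices of ours):

```python
import random

def feature_paintbox(feature_sets, N, rng=None):
    """Feature allocation from a feature paintbox.

    `feature_sets` maps a feature name to a list of (left, right) intervals
    in [0, 1]; subsets of different features may overlap. Data point n holds
    every feature whose subset contains its uniform draw u_n.
    """
    rng = rng or random.Random(0)
    rows = []
    for _ in range(N):
        u = rng.random()
        rows.append({name for name, intervals in feature_sets.items()
                     if any(a <= u < b for a, b in intervals)})
    return rows

# The Cat and Dog subsets overlap on [0.3, 0.4), so a data point landing
# there holds both features; a point in [0.9, 1.0) holds none.
paintbox = {"Cat": [(0.0, 0.4)], "Dog": [(0.3, 0.7)], "Mouse": [(0.7, 0.9)]}
print(feature_paintbox(paintbox, 7))
```

Overlap is the crucial difference from the Kingman paintbox: it is what lets one data point hold several features at once.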
Paintboxes [Broderick, Pitman, Jordan (submitted)]
- Exchangeable cluster distributions = cluster distributions with EPPFs = Kingman paintbox partitions.
- Exchangeable feature distributions = feature paintbox allocations, which contain the feature distributions with EFPFs (e.g. the IBP) as well as the two-feature example.
Paintboxes
Two-feature example as a feature paintbox: the subsets for Feature 1 and Feature 2 overlap so that the regions with Feature 1 only, Feature 2 only, both features, and neither have lengths p1, p2, p3, p4.
[Figure: unit interval with overlapping Feature 1 and Feature 2 subsets]
Paintboxes
Indian buffet process: beta feature frequencies [Thibaux, Jordan 2007]
For m = 1, 2, …:
1. Draw K_m^+ ∼ Poisson(γθ / (θ + m − 1)); set K_m = Σ_{j=1}^{m} K_j^+.
2. For k = K_{m−1} + 1, …, K_m: draw a frequency q_k ∼ Beta(1, θ + m − 1).
[Figure: unit interval with feature frequencies q1, q2, q3, q4, q5, q6, …]
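The two steps above can be sketched directly in Python, truncating to finitely many rounds (the Poisson helper and parameter defaults are our own choices):

```python
import math
import random

def ibp_frequencies(rounds, gamma=1.0, theta=1.0, rng=None):
    """Feature frequencies q_1, q_2, ... from the recipe above, truncated
    after `rounds` rounds.

    Round m contributes K_m^+ ~ Poisson(gamma*theta/(theta+m-1)) new
    frequencies, each drawn as q_k ~ Beta(1, theta + m - 1).
    """
    rng = rng or random.Random(0)
    qs = []
    for m in range(1, rounds + 1):
        lam = gamma * theta / (theta + m - 1)
        # Poisson(lam) draw via Knuth's method (adequate for small lam).
        threshold, new, p = math.exp(-lam), 0, 1.0
        while True:
            p *= rng.random()
            if p <= threshold:
                break
            new += 1
        for _ in range(new):
            qs.append(rng.betavariate(1.0, theta + m - 1))
    return qs

qs = ibp_frequencies(20)
print(len(qs), qs[:3])
```

Later rounds draw from Beta(1, θ + m − 1) with larger second parameter, so the frequencies added late tend to be small, mirroring the IBP's decaying rate of new features.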
Paintboxes
The frequencies q1, q2, … define a feature paintbox: choose the feature subsets so that a uniform draw lands in subset k independently with probability q_k. Models of this form, in which each data point includes feature k independently with probability q_k, are “frequency models” [Broderick, Pitman, Jordan (submitted)].
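In code, a frequency model is nothing more than independent Bernoulli columns; a minimal sketch with illustrative frequencies of our choosing:

```python
import random

def frequency_model_matrix(q, N, rng=None):
    """N-by-len(q) binary matrix: entry (n, k) is 1 independently with
    probability q[k], i.e. data point n includes feature k with frequency q[k]."""
    rng = rng or random.Random(0)
    return [[int(rng.random() < qk) for qk in q] for _ in range(N)]

print(frequency_model_matrix([0.6, 0.3, 0.1], 5))
```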
Paintboxes
Two-feature example: not a frequency model. For generic (p1, p2, p3, p4), membership in Feature 1 and membership in Feature 2 are not independent, so no pair of frequencies (q1, q2) reproduces the four row probabilities.
Paintboxes [Broderick, Pitman, Jordan (submitted)]
- Exchangeable cluster distributions = cluster distributions with EPPFs = Kingman paintbox partitions.
- Exchangeable feature distributions = feature paintbox allocations, which contain the frequency models (including the IBP) and, outside them, the two-feature example.
Frequency models: EFPFs?
For a frequency model with frequencies q1, q2, … and a feature matrix on N data points with K features of sizes S_{N,1}, …, S_{N,K}:

P(feature matrix) = E[ Σ_{distinct i_1, …, i_K} (1/K!) Π_{k=1}^{K} q_{i_k}^{S_{N,k}} (1 − q_{i_k})^{N − S_{N,k}} · Π_{j ∉ {i_k}_{k=1}^{K}} (1 − q_j)^{N} ]
= p(N; S_{N,1}, S_{N,2}, …, S_{N,K}),

a function only of the number of data points N, the number of features K, and the feature sizes: an EFPF. [Broderick, Pitman, Jordan (submitted)]
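The claim can be verified exhaustively in a tiny case. The sketch below (two data points, two frequencies, all values ours rather than the talk's) enumerates every feature matrix and checks that two allocations with the same block sizes get equal probability; matching the 1/K! in the display, probabilities here are of a uniformly random ordering of the allocation's blocks:

```python
from itertools import product, permutations

q = [0.3, 0.6]   # illustrative feature frequencies
N = 2            # data points 0 and 1

def ordered_alloc_prob(target):
    """P(uniformly ordered feature allocation == target), by enumerating
    every assignment of a block (possibly empty) to each feature."""
    columns = [frozenset(s) for s in ([], [0], [1], [0, 1])]
    total = 0.0
    for cols in product(columns, repeat=len(q)):
        prob = 1.0
        for qk, c in zip(q, cols):       # independent Bernoulli columns
            prob *= qk ** len(c) * (1 - qk) ** (N - len(c))
        blocks = [c for c in cols if c]  # empty features are unobserved
        if len(blocks) != len(target):
            continue
        orderings = list(permutations(blocks))
        matches = sum(perm == target for perm in orderings)
        total += prob * matches / len(orderings)
    return total

a = ordered_alloc_prob((frozenset({0}), frozenset({1})))  # each datum alone
b = ordered_alloc_prob((frozenset({0}), frozenset({0})))  # datum 0 holds both
print(a, b)  # same block sizes (1, 1), so equal under the EFPF
```

Both allocations have sizes (1, 1), and the enumeration confirms they receive identical probability, as the EFPF requires.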
Frequency models: EFPFs? [Broderick, Pitman, Jordan (submitted)]
So frequency models have EFPFs. Conversely, are the feature distributions with EFPFs exactly the frequency models?
Distributions with EFPFs: frequencies?
Assume a feature allocation with an EFPF p(N; S_{N,1}, S_{N,2}) and K = 2 features for all N. Want to show: there exist frequencies q1, q2 such that, e.g., P(row = (1,0) | q_{1:2}) = q1(1 − q2).

The feature paintbox representation gives row probabilities
P(row = (1,0) | p_{1:4}) = p1, P(row = (0,1) | p_{1:4}) = p2, P(row = (1,1) | p_{1:4}) = p3, P(row = (0,0) | p_{1:4}) = p4,
with candidate frequencies q1 = p1 + p3 and q2 = p2 + p3.

The EFPF assigns the same probability P(4; 2, 2) to every four-data-point configuration with sizes S_{4,1} = S_{4,2} = 2, which forces
E[p1² p2²] = E[p3² p4²] = E[p1 p2 p3 p4],
hence E[(p1 p2 − p3 p4)²] = 0, i.e. p1 p2 = p3 p4 almost surely.

Algebra (using p1 + p2 + p3 + p4 = 1) then gives
p1 = (p1 + p3)(1 − [p2 + p3]) = q1(1 − q2) almost surely,
and the other rows follow similarly.
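The algebra step can be spot-checked numerically: points on the constraint surface {p1 + p2 + p3 + p4 = 1, p1 p2 = p3 p4} are obtained by solving a quadratic for p3, and the identity p1 = q1(1 − q2) then holds at each of them (the sampled pairs below are arbitrary choices of ours):

```python
import math

# If p1 + p2 + p3 + p4 = 1 and p1*p2 = p3*p4, then with q1 = p1 + p3 and
# q2 = p2 + p3 we should find p1 = q1 * (1 - q2). Given (p1, p2), the two
# constraints make p3 a root of p3^2 - s*p3 + p1*p2 = 0, where s = 1 - p1 - p2.
for p1, p2 in [(0.2, 0.1), (0.3, 0.1), (0.05, 0.4)]:
    s = 1.0 - p1 - p2
    disc = s * s - 4.0 * p1 * p2
    assert disc >= 0.0                        # these pairs admit a real root
    p3 = (s - math.sqrt(disc)) / 2.0
    p4 = s - p3
    assert abs(p1 * p2 - p3 * p4) < 1e-12     # on the constraint surface
    q1, q2 = p1 + p3, p2 + p3
    assert abs(p1 - q1 * (1.0 - q2)) < 1e-12  # the claimed identity
print("p1 = q1*(1 - q2) verified on sample points")
```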
Distributions with EFPFs: frequencies? [Broderick, Pitman, Jordan (submitted)]
- Feature distributions with EFPFs = frequency models.
- Exchangeable cluster distributions = cluster distributions with EPPFs = Kingman paintbox partitions.
- Exchangeable feature distributions = feature paintbox allocations, which contain the frequency models (including the IBP) and the two-feature example.
Conclusions
- Feature paintbox: characterization of exchangeable feature models
- Characterization of alternative correlation structure
- Remaining connections to fill in
- Other combinatorial structures

Summary of connections:
- Exchangeable clusters: models with EPPFs; Kingman paintbox; CRP; normalized completely random measures.
- Exchangeable features: feature paintbox; models with EFPFs = frequency models; IBP; two-feature example; completely random measures.
References
- T. Broderick, M. I. Jordan, and J. Pitman. Clusters and features from combinatorial stochastic processes. arXiv preprint arXiv:1206.5862, 2012.
- T. Broderick, J. Pitman, and M. I. Jordan. Feature allocations, probability functions, and paintboxes. Submitted.
- T. Broderick, L. Mackey, J. Paisley, and M. I. Jordan. Combinatorial clustering and the beta negative binomial process. arXiv preprint arXiv:1111.1802, 2011.
- T. Griffiths and Z. Ghahramani. Infinite latent feature models and the Indian buffet process. In Y. Weiss, B. Schölkopf, and J. Platt, editors, Advances in Neural Information Processing Systems 18, pages 475–482. MIT Press, Cambridge, MA, 2006.
- N. L. Hjort. Nonparametric Bayes estimators based on beta processes in models for life history data. Annals of Statistics, 18(3):1259–1294, 1990.
- Y. Kim. Nonparametric Bayesian estimators for counting processes. Annals of Statistics, 27(2):562–588, 1999.
- J. F. C. Kingman. The representation of partition structures. Journal of the London Mathematical Society, 2(2):374, 1978.
- J. Pitman. Exchangeable and partially exchangeable random partitions. Probability Theory and Related Fields, 102(2):145–158, 1995.
- R. Thibaux and M. I. Jordan. Hierarchical beta processes and the Indian buffet process. In Proceedings of the International Conference on Artificial Intelligence and Statistics, volume 11, 2007.
- M. Zhou, L. Hannah, D. Dunson, and L. Carin. Beta-negative binomial process and Poisson factor analysis. In Proceedings of the International Conference on Artificial Intelligence and Statistics, volume 15, 2012.