Posteriors, conjugacy, and exponential families for completely random measures
Tamara Broderick (MIT), Ashia C. Wilson (Berkeley), Michael I. Jordan (Berkeley)
Models
• Beta process, Bernoulli process (IBP)
• Gamma process, Poisson likelihood process (DP, CRP)
• Beta process, negative binomial process

Background
• Parametric exponential family conjugacy [Diaconis & Ylvisaker 1979]:
  p(x | θ) = θ^x (1 + θ)^(−1),  x ∈ {0, 1},  θ > 0
  p(θ) ∝ θ^α (1 + θ)^(−α−β) = BetaPrime(θ | α, β),  α > 0, β > 0
  p(θ | x) ∝ θ^(α+x) (1 + θ)^(−(α+x)−(β−x+1))
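The conjugacy example above can be checked numerically: multiplying the likelihood into the prior shifts the hyperparameters (α, β) to (α + x, β − x + 1) and changes nothing else. A minimal sketch, with the function names chosen here for illustration:

```python
import numpy as np

# Likelihood: p(x | theta) = theta^x / (1 + theta), x in {0, 1}
# Prior:      p(theta) ∝ theta^alpha * (1 + theta)^(-alpha - beta)   (beta-prime form)
# Posterior:  p(theta | x) ∝ theta^(alpha + x) * (1 + theta)^(-(alpha + x) - (beta - x + 1))

def log_prior(theta, alpha, beta):
    return alpha * np.log(theta) - (alpha + beta) * np.log1p(theta)

def log_lik(x, theta):
    return x * np.log(theta) - np.log1p(theta)

def log_post_unnorm(theta, x, alpha, beta):
    a, b = alpha + x, beta - x + 1  # the conjugate hyperparameter update
    return a * np.log(theta) - (a + b) * np.log1p(theta)

# Check: prior * likelihood and the updated prior differ by a constant in theta,
# i.e. they are the same (unnormalized) distribution.
alpha, beta, x = 2.0, 3.0, 1
thetas = np.linspace(0.1, 5.0, 50)
diff = (log_prior(thetas, alpha, beta) + log_lik(x, thetas)
        - log_post_unnorm(thetas, x, alpha, beta))
assert np.allclose(diff, 0.0)
```

This is the "integration ➞ addition" point in miniature: the posterior requires no integral, only adding the sufficient statistic x to the hyperparameters.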
Background
• Parametric exponential family conjugacy [Diaconis & Ylvisaker 1979]
• Likelihood ➞ conjugate prior, straightforward inference
• Integration ➞ addition
Want: One framework
• For Bayesian nonparametric models: likelihood ➞ conjugate prior, straightforward inference
Clustering
[figure: a grid of Documents 1–7 against the topics Technology, Sports, Health, Econ, Arts]
Feature allocation
[figure: the same Documents 1–7 against the topics Technology, Sports, Health, Econ, Arts]
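The contrast between the two slides can be encoded in a binary document-by-topic matrix: clustering puts exactly one 1 in each row, while a feature allocation places no such restriction. A small illustration (the particular assignments below are invented for the example, not taken from the slides):

```python
import numpy as np

topics = ["Technology", "Sports", "Health", "Econ", "Arts"]

# Clustering: each document belongs to exactly one topic.
clustering = np.array([
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
])

# Feature allocation: a document may exhibit zero, one, or several topics.
feature_alloc = np.array([
    [1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
])

assert (clustering.sum(axis=1) == 1).all()        # exactly one topic per row
assert not (feature_alloc.sum(axis=1) == 1).all() # rows with 0 or many topics
```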
Indian buffet process (IBP) [Griffiths & Ghahramani 2006]
For n = 1, 2, ..., N:
1. Data point n takes existing feature k, which has occurred S_{n−1,k} times so far, with probability S_{n−1,k} / (β + n − 1).
2. The number of new features for data point n is K_n^+ ∼ Poisson(γβ / (β + n − 1)).
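The two-step recipe above translates directly into a sampler for the feature matrix Z. A minimal sketch of the generative process, assuming the two-parameter IBP with mass parameter γ and concentration parameter β (the function name and signature are chosen here for illustration):

```python
import numpy as np

def sample_ibp(N, gamma=2.0, beta=1.0, rng=None):
    """Draw a binary feature matrix Z (N rows) from the IBP generative process."""
    rng = np.random.default_rng(rng)
    counts = []  # counts[k] = S_{n-1,k}: how many earlier points have feature k
    rows = []
    for n in range(1, N + 1):
        # 1. Take existing feature k with probability counts[k] / (beta + n - 1).
        row = [int(rng.random() < c / (beta + n - 1)) for c in counts]
        for k, z in enumerate(row):
            counts[k] += z
        # 2. Draw the number of brand-new features: Poisson(gamma*beta / (beta + n - 1)).
        k_new = rng.poisson(gamma * beta / (beta + n - 1))
        row += [1] * k_new
        counts += [1] * k_new
        rows.append(row)
    # Pad the ragged rows into a rectangular 0/1 matrix.
    K = len(counts)
    Z = np.zeros((N, K), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

Z = sample_ibp(N=10, gamma=2.0, beta=1.0, rng=0)
```

By construction every column of Z is used by at least one data point (a feature exists only once some point has drawn it), and the expected total number of features grows logarithmically in N for fixed β.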