SLIDE 1


STOCHASTIC PROXIMAL LANGEVIN ALGORITHM

Adil Salim Joint work with Dmitry Kovalev and Peter Richtárik

SLIDE 2


SAMPLING PROBLEM

μ⋆(dx) ∝ exp(−U(x)) dx, where U : ℝd → ℝ is convex.

SLIDE 3

LANGEVIN MONTE CARLO (LMC)

Assume U smooth, (Wk) i.i.d. standard Gaussian, and γ > 0:

xk+1 = xk − γ∇U(xk) + √(2γ) Wk+1 .

Gradient descent + Gaussian noise.

Typical non-asymptotic result: KL(μk|μ⋆) = 𝒪(1/√k).
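As a quick illustration (not from the slides), the LMC update above can be sketched in NumPy. The target here is an assumption chosen for simplicity: U(x) = ‖x‖²/2, so ∇U(x) = x and μ⋆ is the standard Gaussian.

```python
import numpy as np

def lmc(grad_U, x0, gamma, n_steps, rng):
    """LMC iteration: x <- x - gamma * grad_U(x) + sqrt(2 * gamma) * W."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - gamma * grad_U(x) + np.sqrt(2.0 * gamma) * rng.standard_normal(x.shape)
    return x

# Illustrative target (not from the slides): U(x) = ||x||^2 / 2, grad U(x) = x,
# so mu* = N(0, I). Run many chains in parallel, one per row.
rng = np.random.default_rng(0)
samples = lmc(lambda x: x, np.zeros((5000, 2)), gamma=0.05, n_steps=400, rng=rng)
print(samples.mean(), samples.var())  # near 0 and 1, up to O(gamma) discretization bias
```

The empirical variance does not match the target exactly: with a fixed step size γ, LMC samples from a biased approximation of μ⋆, which is what the non-asymptotic rates quantify.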

SLIDE 4


FIRST INTUITION FOR LMC

LMC can be seen as the Euler discretization of the Langevin equation:

dXt = −∇U(Xt) dt + √2 dWt .

Non-asymptotic results using this intuition: [Dalalyan 2017], [Durmus, Moulines 2017].

SLIDE 5


SECOND INTUITION FOR LMC

LMC can be seen as an (inexact) gradient descent for:

μ⋆ = argminμ ∫ U(x) dμ(x) + ∫ μ(x) log(μ(x)) dx = argminμ KL(μ|μ⋆) .

Non-asymptotic results using this intuition (+ extensions of LMC beyond GD): [Durmus et al. 2018], [Wibisono 2018], [Bernton 2018].
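The identity on this slide follows from a one-line computation: writing μ⋆ = e^{−U}/Z with normalizing constant Z = ∫ e^{−U(x)} dx, the free energy differs from the KL divergence only by the constant log Z:

```latex
\mathrm{KL}(\mu \,|\, \mu_\star)
  = \int \mu \log \frac{\mu}{\mu_\star}\,\mathrm{d}x
  = \int \mu \log \mu \,\mathrm{d}x - \int \mu \log \mu_\star \,\mathrm{d}x
  = \int U \,\mathrm{d}\mu + \int \mu \log \mu \,\mathrm{d}x + \log Z .
```

Since log Z does not depend on μ, the two functionals have the same minimizer, namely μ⋆.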

SLIDE 6


CONTRIBUTION: STOCHASTIC PROXIMAL LANGEVIN

Case 1: U(x) = Eξ(g(x, ξ)), with g(·, ξ) nonsmooth:

xk+1 = proxγg(⋅,ξk+1)(xk) + √(2γ) Wk+1 .

Stochastic prox + Gaussian noise.

SLIDE 7

CONTRIBUTION: STOCHASTIC PROXIMAL LANGEVIN

Case 2: U(x) = Eξ(f(x, ξ)) + ∑i Eξ(gi(x, ξ)), with f(·, ξ) smooth and gi(·, ξ) nonsmooth.

See our Poster #161.
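A minimal sketch of the Case-1 update, under the simplest assumption g(x, ξ) = |x| (no ξ-dependence, so the stochastic prox is deterministic): the prox of γ|·| is soft-thresholding, and the target is the Laplace density ∝ exp(−|x|).

```python
import numpy as np

def soft_threshold(x, t):
    # prox_{t|.|}(x): proximal operator of t * |x| (componentwise).
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def spla_abs(x0, gamma, n_steps, rng):
    """Case-1 update with g(x, xi) = |x|:
    x <- prox_{gamma * |.|}(x) + sqrt(2 * gamma) * W."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = soft_threshold(x, gamma) + np.sqrt(2.0 * gamma) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(1)
samples = spla_abs(np.zeros(20000), gamma=0.02, n_steps=600, rng=rng)
print(samples.mean(), samples.var())  # Laplace target: mean 0, variance 2
```

Note that the proximal step is well defined even though U is nonsmooth at 0, which is exactly the regime where plain LMC (which needs ∇U) does not apply.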

SLIDE 8


STOCHASTIC SUBGRADIENT VS STOCHASTIC PROX

Sampling μ⋆(dx) ∝ exp(−|x|) dx.

Stochastic subgradients [Durmus et al. 2018] vs. stochastic proximal [Us].
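For contrast with the proximal update, here is a hedged sketch of the subgradient alternative on the same Laplace target exp(−|x|): the subgradient step uses sign(x) (a subgradient of |x|) in place of the prox; step size and chain counts are illustrative choices, not values from the slides.

```python
import numpy as np

def subgrad_langevin_abs(x0, gamma, n_steps, rng):
    # Subgradient Langevin for U(x) = |x|:
    # x <- x - gamma * sign(x) + sqrt(2 * gamma) * W.
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - gamma * np.sign(x) + np.sqrt(2.0 * gamma) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(2)
samples = subgrad_langevin_abs(np.zeros(20000), gamma=0.02, n_steps=600, rng=rng)
print(samples.mean(), samples.var())  # Laplace target: mean 0, variance 2
```

Both iterations target the same law; the slide's point is to compare how the two updates behave on this nonsmooth potential.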

SLIDE 9


Thanks for your attention. See us at poster #161.