Approximate Posterior Inference for Differential Equation Models


SLIDE 1

Approximate Posterior Inference for Differential Equation Models

Jaeyong Lee

Department of Statistics Seoul National University

June 24, 2014. Jointly with Sarat C. Dass, Universiti Teknologi PETRONAS; Kyungjae Lee, Seoul National University; and Jonghun Park, Seoul National University.

SLIDE 2

Outline

Introduction to Differential Equation Models
First Proposal for Posterior Computation: Laplace + Euler + Grid Sampling
Second Proposal for Posterior Computation: Griddy Gibbs Sampling + Euler

SLIDE 3

Outline

Introduction to Differential Equation Models
First Proposal for Posterior Computation: Laplace + Euler + Grid Sampling
Second Proposal for Posterior Computation: Griddy Gibbs Sampling + Euler

SLIDE 4

Differential Equations and Statistics

Differential equations are widely used in the sciences and engineering, with many well-known examples. The statistical community, however, has been relatively silent in this area of research.

SLIDE 5

Ordinary Differential Equation Model

(Observation Equation) y(t) = x(t) + ϵ(t), ϵ(t) ∼ N(0, σ²Ip), t ∈ [T0, T1], where

▶ y(t) is the p-dimensional vector of observations at time t,
▶ x is the mean function and ϵ(t) is the error at time t, and
▶ actual data are observed at discrete times t1 < t2 < . . . < tn ∈ [T0, T1].

(Ordinary Differential Equation (ODE)) ẋ(t) = f(x, u, t; θ), t ∈ [T0, T1], where

▶ f is a smooth p-dimensional function,
▶ u is the input function, and
▶ θ is the q-dimensional parameter vector.
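Posterior computation in this talk repeatedly requires solving the ODE numerically; a minimal forward-Euler solver can be sketched as follows (illustrative only; the function name and the decay example are not from the slides):

```python
import numpy as np

def euler_solve(f, x0, t_grid, theta):
    """Forward-Euler solution of x'(t) = f(x, t; theta) on a time grid."""
    xs = [np.asarray(x0, dtype=float)]
    for t_prev, t_next in zip(t_grid[:-1], t_grid[1:]):
        h = t_next - t_prev                      # step size
        xs.append(xs[-1] + h * f(xs[-1], t_prev, theta))
    return np.array(xs)

# Toy check with a known solution: x'(t) = -theta * x(t), x(0) = 1
t = np.linspace(0.0, 1.0, 1001)
x = euler_solve(lambda x, t, th: -th * x, [1.0], t, theta=0.5)
# x[-1] approximates exp(-0.5)
```

The global error of this scheme is O(h), which is why the later slides study the effect of the step size on the approximate posterior.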

SLIDE 6

SIR Model

red : x1(t), susceptible blue : x2(t), infected green : x3(t), recovered

ẋ1(t) = −θ1x1(t)x2(t)
ẋ2(t) = θ1x1(t)x2(t) − θ2x2(t)
ẋ3(t) = θ2x2(t)

x1(t) : number of the susceptible
x2(t) : number of the infected
x3(t) : number of the recovered
θ1 : infection rate in (0, 1)
θ2 : recovery rate in (0, 1)
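As an illustration (not part of the slides), the SIR system above can be simulated with the forward Euler scheme used later in the talk; the parameter values, initial proportions, and step size below are arbitrary choices:

```python
import numpy as np

def sir_rhs(x, theta1, theta2):
    """Right-hand side of the SIR system: x = (susceptible, infected, recovered)."""
    s, i, r = x
    return np.array([-theta1 * s * i,
                     theta1 * s * i - theta2 * i,
                     theta2 * i])

def simulate_sir(x0, theta1, theta2, t_max=50.0, h=0.01):
    """Forward-Euler trajectory of the SIR model (illustrative parameters)."""
    n_steps = int(t_max / h)
    path = np.empty((n_steps + 1, 3))
    path[0] = x0
    for k in range(n_steps):
        path[k + 1] = path[k] + h * sir_rhs(path[k], theta1, theta2)
    return path

# Start with 99% susceptible, 1% infected; theta1 = 0.5, theta2 = 0.1
traj = simulate_sir([0.99, 0.01, 0.0], theta1=0.5, theta2=0.1)
# The three derivatives sum to zero, so population proportions are conserved.
```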

SLIDE 7

Lotka-Volterra Equation or Prey-Predator Model

red : x1(t), number of prey blue : x2(t), number of predators

ẋ1(t) = x1(t)(θ1 − θ2x2(t))
ẋ2(t) = −x2(t)(θ3 − θ4x1(t))

x1(t) : number of prey
x2(t) : number of predators
θ1 : the intrinsic rate of prey population increase
θ2 : the predation rate
θ3 : the predator mortality rate
θ4 : the offspring rate of the predator

SLIDE 8

Computational Methods in Literature — Frequentist Side

Nonlinear least squares with a numerical ODE solver (Bard, 1974)
Two-step approach (Varah, 1982; Ramsay and Silverman, 2005)
Generalized profiling method (Ramsay, Hooker, Campbell and Cao, 2007)

SLIDE 9

Computational Methods in Literature — Bayesian Side

MCMC with a numerical ODE solver (Gelman, Bois and Jiang, 1996; Huang, Liu and Wu, 2006)
Collocation tempering or smooth functional tempering (Campbell, 2007)

SLIDE 10

Outline

Introduction to Differential Equation Models
First Proposal for Posterior Computation: Laplace + Euler + Grid Sampling
Second Proposal for Posterior Computation: Griddy Gibbs Sampling + Euler

SLIDE 11

Differential Equation Model

(Observation Model) yi = x(ti) + ϵi, ϵi ∼ N(0, σ²Ip), i = 1, 2, . . . , n, where t1 < t2 < . . . < tn.

(Differential Equation) ẋ(t) = f(x(t), t|θ), θ ∈ Θ ⊂ Rq.

SLIDE 12

Prior

Unknowns in the model : θ, σ², x1

τ² = 1/σ² ∼ Gamma(a, b), a, b > 0
x1 | τ² ∼ N(µx1, cτ⁻²Ip), c > 0
θ ∼ π(θ)

SLIDE 13

Posterior

π(θ, τ², x1 | yn) ∝ p(yn | θ, τ², x1) π(x1 | τ²) π(τ²) π(θ)

= ∏_{i=1}^n |2πτ⁻²Ip|^{−1/2} exp(−(τ²/2) ||yi − xi(θ, x1)||²)
  × |2πcτ⁻²Ip|^{−1/2} exp(−(τ²/2c) ||x1 − µx1||²)
  × (b^a / Γ(a)) (τ²)^{a−1} exp(−bτ²) × π(θ)

∝ (τ²)^{(n+1)p/2 + a − 1} exp(−(τ²/2) [n gn(x1, θ) + (1/c) ||x1 − µx1||² + 2b]) × π(θ),

where yn = (y1, y2, . . . , yn), gn(x1) = gn(x1, θ) = (1/n) ∑_{i=1}^n ||yi − xi(θ, x1)||², and ||x|| denotes the Euclidean norm of the vector x.

SLIDE 14

First Proposal : Laplace approximation and grid sampling

Unknowns in the model : θ, σ², x1

Marginalization step : Laplace approximation and the Euler method to eliminate x1 from the posterior

Sampling θ and σ² :

▶ Sampling θ : grid sampling
▶ Sampling τ² = 1/σ² : sampling from a gamma distribution

SLIDE 15

Marginalization of x1

Using Tierney and Kadane (1986),

L(θ, τ²) = ∫ π(x1 | τ²) L(θ, τ², x1) dx1
∝ (τ²)^{np/2} exp(−(τ²/2) u(θ)) det(n g̈n(x̂1) + (2/c) Ip)^{−1/2} (1 + O(n^{−3/2})),

where

g̈n(x1) = ∂²gn(x1, θ) / ∂x1²,
x̂1 = x̂1(θ) = argmin_{x1} [n gn(x1, θ) + (1/c) ||x1 − µx1||²],
u(θ) = n gn(x̂1) + (1/c) ||x̂1 − µx1||².
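The ingredients x̂1(θ), u(θ), and the curvature term can be computed exactly for a toy model with a known ODE solution. The sketch below (illustrative, not from the slides) uses exponential decay ẋ = −θx, whose solution xi(θ, x1) = x1·exp(−θti) is linear in x1, so the penalized minimizer has a closed form; all names and parameter values are assumptions for the example:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy ODE with a known solution: x'(t) = -theta * x(t)  =>  x_i(theta, x1) = x1 * exp(-theta * t_i)
theta, x1_true, sigma = 0.7, 3.0, 0.1
t = np.linspace(0.1, 2.0, 25)
y = x1_true * np.exp(-theta * t) + sigma * rng.standard_normal(t.size)

mu_x1, c = 0.0, 10.0          # prior mean and scale for x1
n = t.size

def g_n(x1):
    """Average squared residual g_n(x1, theta) for fixed theta."""
    return np.mean((y - x1 * np.exp(-theta * t)) ** 2)

# Residuals are linear in x1, so the minimizer of
# n*g_n(x1) + (1/c)*(x1 - mu_x1)^2 is available in closed form.
s = np.exp(-theta * t)
x1_hat = (y @ s + mu_x1 / c) / (s @ s + 1.0 / c)
u_theta = n * g_n(x1_hat) + (x1_hat - mu_x1) ** 2 / c
hess = 2.0 * (s @ s) + 2.0 / c       # n*g''_n(x1) + 2/c, constant in x1 here
# x1_hat, u_theta, and hess are exactly what the Laplace factor L(theta, tau^2) needs.
```

For a nonlinear f, x̂1 would instead come from a numerical optimizer and g̈n from finite differences; the structure of the computation is unchanged.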

SLIDE 16

π(θ, τ² | yn) ∝ π(θ) × det(n g̈n(x̂1) + (2/c) Ip)^{−1/2} × (τ²)^{np/2 + a − 1} exp(−τ² (u(θ)/2 + b)).

SLIDE 17

π(τ² | θ, yn)

τ² | θ, yn ∼ Gamma(a*, b*), where a* = np/2 + a and b* = u(θ)/2 + b.

SLIDE 18

π(θ|yn)

π(θ | yn) = ∫ π(θ, τ² | yn) dτ²

∝ π(θ) / [ (u(θ)/2 + b)^{np/2 + a} det(n g̈n(x̂1) + (2/c) Ip)^{1/2} ].

Sampling θ is done by grid sampling.
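Grid sampling amounts to evaluating the unnormalized log posterior on a grid over Θ, normalizing, and drawing from the resulting discrete distribution. A minimal sketch (the function name and the Gaussian toy target are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

def grid_sample(log_post, grid, size=1):
    """Draw theta values from a discrete grid approximation of the posterior.

    log_post: callable returning the unnormalized log posterior at theta.
    grid:     1-D array of grid points covering the support of theta.
    """
    logp = np.array([log_post(th) for th in grid])
    logp -= logp.max()                     # stabilize before exponentiating
    w = np.exp(logp)
    w /= w.sum()                           # normalized grid probabilities
    return rng.choice(grid, size=size, p=w)

# Toy target: unnormalized log posterior with a N(1.0, 0.2^2) shape
grid = np.linspace(0.0, 2.0, 401)
draws = grid_sample(lambda th: -0.5 * ((th - 1.0) / 0.2) ** 2, grid, size=5000)
```

Given each θ draw, τ² can then be sampled directly from its Gamma(a*, b*) conditional on the previous slide.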

SLIDE 19

Second Proposal : griddy Gibbs sampling

Unknowns in the model : θ, σ², x1

Sampling x1 and θ : griddy Gibbs
Sampling τ² = 1/σ² : sampling from a gamma distribution

SLIDE 20

Newton’s Law of Cooling

(Differential Equation) ẋ(t) = θ1[x(t) − θ2], t ≥ 0, where

▶ x(t) is the temperature of an object in Celsius;
▶ θ2 is the temperature of the environment; and
▶ θ1 is the proportionality constant.

The analytic form of the solution is known : x(t) = θ2 − (θ2 − x1)e^{θ1(t−t1)}, t ≥ t1 ≥ 0, where x1 = x(t1).
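Because the analytic solution is known, the cooling model is a convenient benchmark for the Euler step size (the topic of the next slides). An illustrative comparison, with arbitrary parameter values:

```python
import numpy as np

def euler_cooling(theta1, theta2, x1, t_max, h):
    """Euler solution of x'(t) = theta1 * (x(t) - theta2), x(0) = x1."""
    x = x1
    for _ in range(int(round(t_max / h))):
        x = x + h * theta1 * (x - theta2)
    return x

theta1, theta2, x1, t_max = -0.3, 20.0, 90.0, 5.0
exact = theta2 - (theta2 - x1) * np.exp(theta1 * t_max)   # analytic solution at t_max
coarse = euler_cooling(theta1, theta2, x1, t_max, h=0.5)
fine = euler_cooling(theta1, theta2, x1, t_max, h=0.005)
# The finer grid should land much closer to the analytic value.
```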

SLIDE 21

Quality of Laplace Approximation

Figure : The true posterior density and the approximate posterior density obtained with the Laplace approximation, for sample size n = 20.

SLIDE 22

Step Size of Euler Method

SLIDE 23

FitzHugh-Nagumo Model

The FitzHugh-Nagumo model is a differential equation for the spike potential of the giant axon of squid neurons.

(Differential Equation)
ẋ1(t) = θ3[x1(t) − (1/3)x1³(t) + x2(t)]
ẋ2(t) = −(1/θ3)[x1(t) − θ1 + θ2x2(t)], where

▶ x1(t) (voltage) is the voltage across the membrane at time t,
▶ x2(t) (recovery) is the outward current.
▶ To obtain meaningful behavior, 1 − (2/3)θ2 < θ1 < 1, 0 < θ2 < 1 and θ2 < θ3² should be satisfied.

SLIDE 24

Step sizes of Euler Method and Approximate Posteriors

SLIDE 25
SLIDE 26

Convergence of Approximate Posterior

In actual computation, to improve the accuracy of the Euler method, we divide each interval [ti−1, ti] into m subintervals. Under certain conditions, for all θ and σ²,

lim_{m→∞} πm(θ, τ² | yn) = π0(θ, τ² | yn)(1 + O(n^{−3/2})).

SLIDE 27

Outline

Introduction to Differential Equation Models
First Proposal for Posterior Computation: Laplace + Euler + Grid Sampling
Second Proposal for Posterior Computation: Griddy Gibbs Sampling + Euler

SLIDE 28

Griddy Gibbs Sampling

The posterior of θ and x1, π(θ, x1|y), is approximated by a discrete distribution whose probabilities are proportional to π(θ, x1|y). When the dimension of (θ, x1) is not small, we apply griddy Gibbs sampling, i.e., Gibbs sampling on the discrete approximation of the posterior. When some of the parameters are highly correlated a posteriori, we use a blocked Gibbs sampler.
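A minimal griddy Gibbs sketch (illustrative; the toy target and grids are assumptions, not from the slides): each coordinate is updated in turn by discretizing its full conditional on a grid and drawing from the normalized grid probabilities.

```python
import numpy as np

rng = np.random.default_rng(1)

def griddy_gibbs(log_post, grids, n_iter=1000):
    """Griddy Gibbs sampler: each full conditional is discretized on a grid.

    log_post: callable on the full parameter vector (unnormalized log posterior).
    grids:    list of 1-D grids, one per coordinate.
    """
    state = np.array([g[len(g) // 2] for g in grids], dtype=float)
    samples = np.empty((n_iter, len(grids)))
    for it in range(n_iter):
        for j, grid in enumerate(grids):
            # Evaluate the unnormalized conditional of coordinate j on its grid
            logp = np.empty(len(grid))
            for k, v in enumerate(grid):
                state[j] = v
                logp[k] = log_post(state)
            w = np.exp(logp - logp.max())
            state[j] = rng.choice(grid, p=w / w.sum())
        samples[it] = state
    return samples

# Toy target: independent N(0, 1) x N(2, 0.5^2)
def log_post(v):
    return -0.5 * v[0] ** 2 - 0.5 * ((v[1] - 2.0) / 0.5) ** 2

grids = [np.linspace(-4, 4, 161), np.linspace(0, 4, 161)]
out = griddy_gibbs(log_post, grids, n_iter=1000)
```

A blocked variant would discretize a joint grid over the correlated coordinates instead of one coordinate at a time, at the cost of a grid whose size grows with the block dimension.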

SLIDE 29

Calcium oscillation model

ẋ1(t) = θ1 + θ2x1(t) − θ5x1(t)x2(t)/(x1(t) + θ6) − θ7x1(t)x3(t)/(x1(t) + θ8)
ẋ2(t) = θ3x1(t) − θ9x2(t)/(x2(t) + θ10)
ẋ3(t) = θ4x1(t) − θ11x3(t)/(x3(t) + θ12)

x1 : the concentration of the active subunit of the G-protein
x2 : the concentration of the activated form of PLC
x3 : the concentration of cytosolic calcium

SLIDE 30

Estimation and prediction

SLIDE 31

Thank You!