Stochastic Simulation Markov Chain Monte Carlo
Bo Friis Nielsen
Institute of Mathematical Modelling Technical University of Denmark 2800 Kgs. Lyngby – Denmark Email: bfni@dtu.dk
02443 – lecture 8 2
DTU
We simulated the system until “stochastic steady state”. We were then able to describe this steady state.
The model was a “state machine”, i.e. a Markov Chain. To obtain steady-state statistics, we used stochastic simulation, i.e. Monte Carlo.
A Markov chain {Xn} in discrete time on a discrete state space:
⋄ the next state is chosen by a decision rule depending on the value of Xn only
⋄ matrix of transition probabilities P = {pij}, pij = P(Xn+1 = j | Xn = i)
⋄ limiting distribution π, πj = P(X∞ = j)
⋄ π = πP (equilibrium distribution)
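These definitions can be illustrated with a small simulation (the 3-state transition matrix below is an assumed example, not from the slides): solve π = πP directly, then compare with the state frequencies of a Monte Carlo run.

```python
import numpy as np

# A small sketch (the 3-state matrix is an assumed example): find the
# equilibrium distribution pi solving pi = pi P, then compare it with
# the state frequencies of a simulated chain.
rng = np.random.default_rng(1)

P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])

# Solve pi (P - I) = 0 together with sum(pi) = 1 (least squares).
n = P.shape[0]
A = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.concatenate([np.zeros(n), [1.0]])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

# Monte Carlo: run the chain and count visits to each state.
x, steps = 0, 50_000
counts = np.zeros(n)
for _ in range(steps):
    x = rng.choice(n, p=P[x])
    counts[x] += 1
print(pi, counts / steps)   # the two distributions should roughly agree
```

The two distributions agree up to Monte Carlo noise, which is exactly the connection between the state machine and its steady-state statistics used above.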
⋄ Continuous sample space, or
⋄ Continuous time: exercise 4 is an example of a continuous-time Markov chain
With P(X0 = j) = µj(0) and µ(n) = {µj(n)} we find

µ(n) = µ(n−1)P = µ(0)P^n
P = | 1−p   p    0    0  |
    |  q    0    p    0  |
    |  0    q    0    p  |
    |  0    0    q   1−q |     with µ(0) = (1/3, 0, 0, 2/3)

(with q = 1 − p, so that every row sums to one)

µ(1) = µ(0)P = ( (1−p)/3, p/3, 2q/3, 2(1−q)/3 )
With µ(0) = (1/3, 0, 0, 2/3),

P² = | (1−p)²+pq   (1−p)p    p²        0         |
     | q(1−p)      2qp       0         p²        |
     | q²          0         2qp       p(1−q)    |
     | 0           q²        (1−q)q    (1−q)²+qp |
µ(2) = µ(0)P² = ( ((1−p)²+pq)/3, ((1−p)p+2q²)/3, (p²+2q(1−q))/3, (2qp+2(1−q)²)/3 )
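A quick numerical check of this example (p = 0.3 is an assumed value, and q = 1 − p so that every row of P sums to one):

```python
import numpy as np

# Numerical check of the 4-state example; p = 0.3 is an assumed value
# and q = 1 - p, so that P is a proper (stochastic) transition matrix.
p = 0.3
q = 1 - p
P = np.array([[1 - p,  p,    0,     0   ],
              [  q,    0,    p,     0   ],
              [  0,    q,    0,     p   ],
              [  0,    0,    q,   1 - q]])
mu0 = np.array([1/3, 0, 0, 2/3])

mu1 = mu0 @ P                              # ((1-p)/3, p/3, 2q/3, 2(1-q)/3)
mu2 = mu0 @ np.linalg.matrix_power(P, 2)   # equals mu1 @ P
print(mu1, mu2)
```

Both marginal distributions sum to one, and mu2 agrees with applying P twice, illustrating µ(n) = µ(0)P^n.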
We have a variable X with a “complicated” distribution, which we cannot sample directly. We aim to generate a sequence of Xi’s whose distribution converges to the distribution of X.
This is an inverse problem relative to the queueing exercise: we start with the distribution of X, and aim to design a state machine which has this as its steady-state distribution.
Prior distribution of the parameter: P ∼ U(0, 1), i.e. fP(p) = 1 for 0 ≤ p ≤ 1.
Distribution of the data, conditional on the parameter: X for given P = p is Binomial(n, p), i.e. the data has the conditional probabilities

P(X = i | P = p) = (n choose i) p^i (1 − p)^(n−i)
Conditional density of the parameter, given the observed data X = i:

fP|X=i(p) = fP(p) P(X = i | P = p) / P(X = i)

We need the unconditional probability of the observation:

P(X = i) = ∫₀¹ fP(p) (n choose i) p^i (1 − p)^(n−i) dp

We can evaluate this here; in more complex models we could not.
AIM: to sample from fP|X=i without evaluating P(X = i).
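For this simple model the integral really can be evaluated; a sanity check with assumed values n = 10, i = 3 (for a U(0,1) prior the integral equals 1/(n+1), a standard Beta-integral fact):

```python
from math import comb

import numpy as np

# Sanity check with assumed values n = 10, i = 3: evaluate P(X = i) by
# numerical integration of the U(0,1) prior times the binomial likelihood.
n, i = 10, 3
p_grid = np.linspace(0.0, 1.0, 100_001)
integrand = comb(n, i) * p_grid**i * (1 - p_grid)**(n - i)
PXi = integrand.mean()   # grid average approximates the integral over [0, 1]
print(PXi)               # equals 1/(n + 1) for a uniform prior
```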
[Figure: time series of the sampled parameter P against iteration no., and a histogram of Pvec (frequency against sampled value).]
The distribution is given by f(x) = c · g(x), where the unnormalised density g can be evaluated, but the normalising constant c cannot be evaluated (easily):

c = 1 / ∫ g(x) dx
This is frequently the case in Bayesian statistics - the posterior density is proportional to the likelihood function. Note (again) the similarity between simulation and evaluation of integrals.
Given the current state x, propose a new state y from the density h(x, y) and accept it with probability

min( 1, f(y)h(y, x) / (f(x)h(x, y)) ) = min( 1, g(y)h(y, x) / (g(x)h(x, y)) )

since the unknown constant c cancels. If the proposal is symmetric, h(x, y) = h(y, x), this reduces to min(1, g(y)/g(x)) (the Metropolis algorithm). The resulting chain has f as its stationary distribution.
Sampling from p.d.f. c · g(x) where c is unknown.
sampled independently from a symmetric distribution
Note that knowing c is not necessary!
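A minimal sketch of the Metropolis algorithm for the binomial example above (n = 10, i = 3, the step size and run length are assumed values): only the unnormalised posterior g(p) = p^i (1 − p)^(n−i) is ever evaluated.

```python
import numpy as np

# Minimal Metropolis sketch for the binomial example; n, i, the step
# size 0.1 and the run length are assumed values. The normalising
# constant of the posterior is never needed.
rng = np.random.default_rng(0)
n, i = 10, 3

def g(p):
    # uniform prior times binomial likelihood, without constants
    return p**i * (1 - p)**(n - i) if 0.0 < p < 1.0 else 0.0

x = 0.5
samples = []
for _ in range(50_000):
    y = x + rng.uniform(-0.1, 0.1)             # symmetric proposal
    if rng.uniform() < min(1.0, g(y) / g(x)):  # acceptance probability
        x = y
    samples.append(x)

samples = np.array(samples[5_000:])            # discard burn-in
print(samples.mean())   # posterior is Beta(i+1, n-i+1), mean (i+1)/(n+2)
```

The sample mean should be close to (i + 1)/(n + 2) = 1/3, the known posterior mean for a uniform prior.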
⋄ For any x, it is easy to sample from h(x, y)
⋄ It is easy to compute the acceptance probability
⋄ Each jump goes a reasonable distance in the parameter space
⋄ The proposals are not rejected too frequently
Gibbs sampling applies when the conditional distributions among the coordinates are known. The proposals modify only one coordinate at a time, cycling through the coordinates of the parameter space, that is, of x.
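As an illustration (the bivariate normal target with correlation rho is an assumed example, not from the slides), a Gibbs sweep redraws one coordinate at a time from its conditional distribution given the other:

```python
import numpy as np

# Gibbs sampling sketch for an assumed bivariate normal target with
# correlation rho: each conditional is normal with mean rho * (other)
# and variance 1 - rho**2.
rng = np.random.default_rng(2)
rho = 0.8
sd = np.sqrt(1 - rho**2)   # conditional standard deviation

x = y = 0.0
xs, ys = [], []
for _ in range(50_000):
    x = rng.normal(rho * y, sd)   # draw x | y
    y = rng.normal(rho * x, sd)   # draw y | x
    xs.append(x)
    ys.append(y)

xs, ys = np.array(xs), np.array(ys)
print(np.corrcoef(xs, ys)[0, 1])   # should be close to rho
```

No acceptance step is needed: each coordinate is drawn exactly from its conditional, so every proposal is accepted.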
analytically or by simulation
⋄ For a discrete distribution we know Kπ: construct a P, which in general has no physical interpretation, and obtain π by simulation.
⋄ For a continuous distribution we know g(x): construct a transition kernel P(x, y) and get f(x) by simulation.
⋄ Comparing within-run variance with between-run variance
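A sketch of that comparison (a Gelman-Rubin style statistic, used here as an assumed illustration with simulated stand-in runs):

```python
import numpy as np

# Within-run / between-run comparison sketch; the m independent "runs"
# below are simulated stand-ins for real MCMC output.
rng = np.random.default_rng(3)
m, n_len = 4, 5_000
chains = rng.normal(size=(m, n_len))          # m runs of length n_len

means = chains.mean(axis=1)
W = chains.var(axis=1, ddof=1).mean()         # within-run variance
B = n_len * means.var(ddof=1)                 # between-run variance
var_hat = (n_len - 1) / n_len * W + B / n_len
R_hat = np.sqrt(var_hat / W)                  # near 1 when the runs agree
print(R_hat)
```

Values of R_hat well above 1 indicate that the runs disagree, i.e. the chains have not yet converged to a common stationary distribution.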
http://www.mrc-bsu.cam.ac.uk/bugs/ and/or links given at the BUGS site
Analysis, Chapman & Hall 1998, ISBN 0 412 03991 5
Carlo in Practice, Chapman & Hall 1996, ISBN 0 412 05551 1
would change the acceptance probabilities.
time
then we can apply Gibbs sampling
which each interact only with a few others
given by a truncated Poisson distribution

P(i) = (A^i / i!) / ( Σ_{j=0}^{n} A^j / j! ),   0 ≤ i ≤ n
Metropolis-Hastings algorithm, verify with a χ2-test. You can use the parameter values from exercise 4.
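One possible sketch for the first part (A = 8 and n = 10 are assumed stand-ins for the exercise 4 values): a random-walk Metropolis chain on {0, ..., n} that only ever evaluates the unnormalised weights A^i / i!.

```python
import numpy as np
from math import factorial

# Sketch only; A = 8 and n = 10 are assumed stand-ins for the exercise 4
# values. Random-walk Metropolis on {0, ..., n} needs only
# g(i) = A**i / i!, never the normalising sum.
rng = np.random.default_rng(4)
A, n = 8, 10

def g(i):
    return A**i / factorial(i) if 0 <= i <= n else 0.0

x = 0
counts = np.zeros(n + 1)
for _ in range(200_000):
    y = x + (1 if rng.uniform() < 0.5 else -1)   # symmetric +/-1 proposal
    if rng.uniform() < min(1.0, g(y) / g(x)):
        x = y
    counts[x] += 1

target = np.array([g(i) for i in range(n + 1)])
target /= target.sum()
print(np.abs(counts / counts.sum() - target).max())   # small if it works
```

The empirical frequencies can then be compared with the target probabilities by a χ²-test, as the exercise asks.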
given by P(i, j) = (1/K) · (A1^i / i!) · (A2^j / j!),   0 ≤ i + j ≤ n
generate variates from this distribution. You can use A1 = A2 = 4
need to find the conditional distributions analytically.
can add restrictions on the different call types.