Principles of Probabilistic Programming
Lectures at EWSCS 2020 Winter School Joost-Pieter Katoen EWSCS 2020, Palmse, Estonia
Joost-Pieter Katoen Principles of Probabilistic Programming 1/222
Probabilistic programs
What? Programs with random assignments, conditioning and the usual control-flow constructs
Why?
- Random assignments: to describe randomised algorithms
- Conditioning: to describe stochastic decision making
Applications
Planning in AI: robot navigation
Uncertainty: noisy sensors and actuators, unknown environment [Evans et al., Modeling Agents with Probabilistic Programs, 2019]
Security: The RSA-OAEP protocol
Correctness proof took more than 20 years
Printer troubleshooting in Windows 95
How likely is it that your print is garbled given that the ps-file is not and the page orientation is portrait? [Ramanna et al., Emerging Paradigms in Machine Learning, 2013]
Languages
[Figure: logos of probabilistic programming languages, including Pyro]
Issue 1: Program correctness
- Classical programs:
  - A program is correct with respect to a (formal) specification, e.g. “for input array A, the output array B is sorted and contains all elements contained in A”
  - Defines a deterministic input-output relation
  - Partial correctness: if an output is produced, it is correct
  - Total correctness: in addition, the program terminates
- Probabilistic programs:
  - They do not always generate the same output
  - They generate a probability distribution over possible outputs
Issue 2: Termination
- Classical programs:
  - They terminate (on a given input, or on all inputs), or they do not
  - If they terminate, they take finitely many steps to do so
  - Program termination is undecidable (halting problem)
- Probabilistic programs:
  - They terminate (or not) with a certain likelihood
  - They may have diverging runs whose likelihood is zero
  - They may take infinitely many steps (on average) to terminate, even if they terminate with probability one!
  - Showing “probability-one” termination is “more” undecidable
  - Showing that they terminate in finite time on average is even more so!
Issue 3: The program’s runtime
- Classical programs:
  - They have a deterministic, fixed runtime for a given input
  - Runtimes of terminating programs in sequence are compositional: if P and Q terminate in n and k steps, then P;Q halts in n+k steps
  - Analysis techniques: recurrence equations, tree analysis, etc.
- Probabilistic programs:
  - Every runtime has a probability; their runtime is a distribution
  - Runtimes of “probability-one” terminating programs may not sum up: if P and Q terminate in n and k steps on average, then P;Q may need infinitely many steps on average
  - Analysis techniques: involve reasoning about expected values etc.
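The failure of compositionality can be checked on a small example. The sketch below is not taken from the slides: it assumes a hypothetical program P = `x := 1; c := true; while (c) { c := false [0.5] x := 2*x }` followed by Q = `while (x > 0) { x-- }`, counts loop iterations as runtime, and sums the first terms of both expected-runtime series exactly.

```python
from fractions import Fraction


def partial_expectations(n_terms):
    """Sum the first n_terms summands of two expected-runtime series.

    Assumed programs (hypothetical, for illustration only):
      P: x := 1; c := true; while (c) { c := false [0.5] x := 2*x }
      Q: while (x > 0) { x-- }
    """
    half = Fraction(1, 2)
    exp_p = Fraction(0)  # expected number of loop iterations of P
    exp_q = Fraction(0)  # expected number of loop iterations of Q run after P
    for k in range(n_terms):
        prob = half ** (k + 1)   # P doubles x exactly k times, then exits
        exp_p += prob * (k + 1)  # that run of P takes k + 1 iterations
        exp_q += prob * 2 ** k   # Q then counts down from x = 2**k
    return exp_p, exp_q
```

With 10 terms the partial sum for P is 509/256 (it approaches 2), while the partial sum for Q is already 5 and grows by 1/2 with every further term. Under these assumptions P has finite expected runtime and Q terminates in finitely many steps for every fixed x, yet P;Q needs infinitely many steps on average.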
This EWSCS 2020 tutorial
- The probabilistic guarded command language pGCL
  - examples, syntax, semantics (Markov chains), conditioning, recursion
- Proving correctness of probabilistic programs
  - weakest pre-conditions, loop invariants, post-conditions, conditioning
- Almost-sure termination
  - positive a.s.-termination, (a bit of) hardness, stochastic ranking functions
- Analysing runtimes of probabilistic programs
  - examples, finite versus infinite expected runtime, wp-reasoning
- Verification and runtime analysis of Bayesian networks
Overview
1. What is probabilistic programming?
2. Probabilistic Guarded Command Language
3. Weakest preconditions
4. Conditioning
5. Expected runtime analysis
6. Analysing Bayesian networks
7. Recursion
8. Loop invariant synthesis
9. Epilogue
Dijkstra’s guarded command language
- skip                      empty statement
- diverge                   divergence
- x := E                    assignment
- prog1 ; prog2             sequential composition
- if (G) prog1 else prog2   choice
- prog1 [] prog2            non-deterministic choice
- while (G) prog            iteration
Probabilistic GCL
[Photos: Kozen, McIver, Morgan]
- skip                      empty statement
- diverge                   divergence
- x := E                    assignment
- observe (G)               conditioning
- prog1 ; prog2             sequential composition
- if (G) prog1 else prog2   choice
- prog1 [p] prog2           probabilistic choice
- while (G) prog            iteration
Let’s start simple
    x := 0 [0.5] x := 1;
    y := -1 [0.5] y := 0
This program admits four runs and yields the outcome:
$\Pr[x=0, y=0] = \Pr[x=0, y=-1] = \Pr[x=1, y=0] = \Pr[x=1, y=-1] = 1/4$
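The four runs can be enumerated mechanically. The following is a minimal sketch, assuming the two probabilistic choices are independent and that y ranges over -1 and 0 as in the program text; the function name is made up for illustration.

```python
from fractions import Fraction
from itertools import product


def simple_program_outcome():
    """Exact outcome distribution of: x := 0 [0.5] x := 1; y := -1 [0.5] y := 0."""
    half = Fraction(1, 2)
    x_choice = [(0, half), (1, half)]   # x := 0 [0.5] x := 1
    y_choice = [(-1, half), (0, half)]  # y := -1 [0.5] y := 0
    outcome = {}
    for (x, px), (y, py) in product(x_choice, y_choice):
        # Each run is one pair of branch decisions; its probability is the product.
        outcome[(x, y)] = outcome.get((x, y), 0) + px * py
    return outcome
```

Each of the four final states (x, y) receives probability 1/4.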
A loopy program
For 0 < p < 1 an arbitrary probability:
    bool c := true;
    int i := 0;
    while (c) {
      i++;
      (c := false [p] c := true)
    }
The loopy program models a geometric distribution with parameter p:
$\Pr[i = N] = (1-p)^{N-1} \cdot p$ for $N > 0$
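The geometric shape can be reproduced by unrolling the loop a bounded number of times. This is a sketch rather than a general analysis: the function name and the cut-off `max_iter` are assumptions, and the returned remainder is the probability mass of runs that are still inside the loop after `max_iter` iterations.

```python
from fractions import Fraction


def loopy_distribution(p, max_iter):
    """Unroll the loop max_iter times; return (Pr[i = N] for N <= max_iter, remaining mass)."""
    dist = {}
    running = Fraction(1)  # probability of still being inside the loop
    for n in range(1, max_iter + 1):
        dist[n] = running * p  # branch c := false: the loop exits with i = n
        running *= 1 - p       # branch c := true: the loop continues
    return dist, running
```

For example, `loopy_distribution(Fraction(1, 3), 5)` gives Pr[i = 3] = 4/27, matching (1-p)^2 · p, and a remaining mass of (2/3)^5 = 32/243.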
On termination
    bool c := true;
    int i := 0;
    while (c) {
      i++;
      (c := false [p] c := true)
    }
This program does not always terminate: the run that always takes the branch c := true diverges. It almost surely terminates, because that run has probability zero.
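A short calculation makes the almost-sure termination claim concrete. It assumes only that each loop iteration independently continues with probability 1-p, as in the program above.

```latex
\Pr[\text{loop still running after } n \text{ iterations}] = (1-p)^n \xrightarrow{\; n \to \infty \;} 0,
\qquad
\Pr[\text{termination}] = \sum_{N \geq 1} (1-p)^{N-1} \, p = \frac{p}{1-(1-p)} = 1 .
```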
Conditioning
Let’s start simple
    x := 0 [0.5] x := 1;
    y := -1 [0.5] y := 0;
    observe (x + y = 0)
This program blocks two runs as they violate x+y = 0. Outcome:
$\Pr[x=0, y=0] = \Pr[x=1, y=-1] = 1/2$
Observations thus normalize the probability of the “feasible” program runs.
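Normalisation over the feasible runs can be sketched in a few lines. The code assumes the observation is x + y = 0, as stated in the text, and that y ranges over -1 and 0; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def conditioned_outcome():
    """Enumerate all runs, drop those violating x + y = 0, renormalise the rest."""
    half = Fraction(1, 2)
    runs = {}
    for (x, px), (y, py) in product([(0, half), (1, half)], [(-1, half), (0, half)]):
        runs[(x, y)] = px * py
    # Runs that violate the observation are blocked.
    feasible = {run: pr for run, pr in runs.items() if run[0] + run[1] == 0}
    mass = sum(feasible.values())  # total probability of the feasible runs
    return {run: pr / mass for run, pr in feasible.items()}
```

The two feasible runs each have prior probability 1/4, the feasible mass is 1/2, and dividing by it yields probability 1/2 for each.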
A loopy program
For 0 < p < 1 an arbitrary probability:
    bool c := true;
    int i := 0;
    while (c) {
      i++;
      (c := false [p] c := true)
    }
    observe (odd(i))
The feasible program runs have a probability $\sum_{N \geq 0} (1-p)^{2N} \cdot p = \frac{1}{2-p}$.
This program models the distribution:
$\Pr[i = 2N+1] = (1-p)^{2N} \cdot p \cdot (2-p)$ for $N \geq 0$
$\Pr[i = 2N] = 0$
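The value of the normalising constant follows from a geometric series. The derivation below assumes that the feasible runs are exactly those ending with an odd value of i.

```latex
\sum_{N \geq 0} (1-p)^{2N} \, p
  = \frac{p}{1 - (1-p)^2}
  = \frac{p}{p\,(2-p)}
  = \frac{1}{2-p},
\qquad
\Pr[i = 2N+1 \mid i \text{ odd}]
  = \frac{(1-p)^{2N} \, p}{1/(2-p)}
  = (1-p)^{2N} \, p \, (2-p).
```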
Why formal semantics matters
- Unambiguous meaning to (almost) all probabilistic programs
- Operational interpretation to weakest pre-expectations
- Basis for proving correctness
  - of programs
  - of program transformations
  - of program equivalence
  - of static analysis
  - of compilers
  - ...
[Photo: Andrei Andrejewitsch Markow]
Markov chains
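A Markov chain (MC) is a triple $(\Sigma, \sigma_I, \mathbf{P})$ with:
- $\Sigma$ being a countable set of states
- $\sigma_I \in \Sigma$ the initial state, and
- $\mathbf{P} : \Sigma \to \mathit{Dist}(\Sigma)$ the transition probability function
where $\mathit{Dist}(\Sigma)$ is the set of discrete probability distributions on $\Sigma$.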
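For a finite state space the definition translates directly into a data structure. The sketch below is one possible encoding, not the lecture's: states are strings, the transition function is a dictionary of dictionaries, and `P_EXAMPLE` is a made-up two-state chain. Countably infinite chains would need a lazily computed transition function instead.

```python
from fractions import Fraction


def is_distribution(mu):
    """A discrete distribution: non-negative weights that sum to one."""
    return all(pr >= 0 for pr in mu.values()) and sum(mu.values()) == 1


def is_markov_chain(states, initial, P):
    """Check the three components of the triple (states, initial state, P)."""
    return (initial in states
            and set(P) == set(states)
            and all(is_distribution(P[s]) and set(P[s]) <= set(states) for s in states))


def step(dist, P):
    """Push a distribution over states one transition forward."""
    nxt = {}
    for s, pr in dist.items():
        for t, q in P[s].items():
            nxt[t] = nxt.get(t, 0) + pr * q
    return nxt


# Hypothetical example: stay in "loop" with probability 2/3, move to the absorbing "exit" otherwise.
P_EXAMPLE = {
    "loop": {"loop": Fraction(2, 3), "exit": Fraction(1, 3)},
    "exit": {"exit": Fraction(1)},
}
```

Applying `step` twice to the initial distribution `{"loop": Fraction(1)}` leaves mass 4/9 in "loop" and 5/9 in "exit".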
Operational semantics
Aim: Model the behaviour of a program P by the MC $[\![ P ]\!]$.
This can be defined using Plotkin’s SOS-style semantics.
Approach:
- Take states of the form
  - $\langle Q, s \rangle$ with program Q or $\downarrow$ (the terminated program), and variable valuation $s : \mathit{Var} \to \mathbb{Q}$
  - $\langle \lightning \rangle$ models the violation of an observation, and
  - $\langle \mathit{sink} \rangle$ models program termination (successful or violated observation)
- Take initial state $\langle P, s \rangle$ where s fulfils the initial conditions
- Take transition relation as smallest relation satisfying the SOS rules
Some SOS rules
- $\langle \mathtt{skip}, s \rangle \to \langle \downarrow, s \rangle$
- $\langle \mathtt{diverge}, s \rangle \to \langle \mathtt{diverge}, s \rangle$
- If $s \models G$, then $\langle \mathtt{observe}(G), s \rangle \to \langle \downarrow, s \rangle$
- If $s \not\models G$, then $\langle \mathtt{observe}(G), s \rangle \to \langle \lightning \rangle$
- $\langle \downarrow, s \rangle \to \langle \mathit{sink} \rangle$, $\langle \lightning \rangle \to \langle \mathit{sink} \rangle$ and $\langle \mathit{sink} \rangle \to \langle \mathit{sink} \rangle$
- $\langle x := E, s \rangle \to \langle \downarrow, s[x := s([\![ E ]\!])] \rangle$
- $\langle P \, [p] \, Q, s \rangle \to \mu$ with $\mu(\langle P, s \rangle) = p$ and $\mu(\langle Q, s \rangle) = 1-p$
- If $\langle P, s \rangle \to \langle \lightning \rangle$, then $\langle P; Q, s \rangle \to \langle \lightning \rangle$
- If $\langle P, s \rangle \to \mu$, then $\langle P; Q, s \rangle \to \nu$ with $\nu(\langle P'; Q, s' \rangle) = \mu(\langle P', s' \rangle)$, where $\downarrow; Q = Q$
- If $s \models G$, then $\langle \mathtt{while}(G)\{P\}, s \rangle \to \langle P; \mathtt{while}(G)\{P\}, s \rangle$
- If $s \not\models G$, then $\langle \mathtt{while}(G)\{P\}, s \rangle \to \langle \downarrow, s \rangle$
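The rules can be turned into a small interpreter that unfolds the Markov chain of a program step by step. The sketch below makes several assumptions that are not in the slides: programs are nested tuples, guards and expressions are Python functions on the valuation, the rule for `if` (not listed among the rules here) picks a branch in one step, and the final transitions into the sink state are left out.

```python
from fractions import Fraction

DONE = "done"            # stands for the terminated program
VIOLATION = "violation"  # stands for the state reached by a failed observation


def step(prog, s):
    """One SOS step from (prog, s): a list of (probability, successor) pairs."""
    kind = prog[0]
    if kind == "skip":
        return [(Fraction(1), (DONE, s))]
    if kind == "diverge":
        return [(Fraction(1), (prog, s))]
    if kind == "assign":
        _, var, expr = prog
        return [(Fraction(1), (DONE, {**s, var: expr(s)}))]
    if kind == "observe":
        return [(Fraction(1), (DONE, s) if prog[1](s) else VIOLATION)]
    if kind == "choice":
        _, p, left, right = prog
        return [(p, (left, s)), (1 - p, (right, s))]
    if kind == "if":  # assumed rule, see lead-in
        _, guard, then, other = prog
        return [(Fraction(1), (then if guard(s) else other, s))]
    if kind == "while":
        _, guard, body = prog
        target = (("seq", body, prog), s) if guard(s) else (DONE, s)
        return [(Fraction(1), target)]
    if kind == "seq":
        _, first, second = prog
        result = []
        for prob, succ in step(first, s):
            if succ == VIOLATION:
                result.append((prob, VIOLATION))
            else:
                rest, s2 = succ
                # A terminated first component hands over to the second one.
                nxt = second if rest == DONE else ("seq", rest, second)
                result.append((prob, (nxt, s2)))
        return result
    raise ValueError(f"unknown statement: {kind}")


def run(prog, s, max_steps):
    """Unfold max_steps steps; return (terminated, violated, unfinished) masses."""
    frontier = [(Fraction(1), prog, s)]
    terminated, violated = {}, Fraction(0)
    for _ in range(max_steps):
        successors = []
        for prob, q, val in frontier:
            for p2, succ in step(q, val):
                if succ == VIOLATION:
                    violated += prob * p2
                elif succ[0] == DONE:
                    key = tuple(sorted(succ[1].items()))
                    terminated[key] = terminated.get(key, 0) + prob * p2
                else:
                    successors.append((prob * p2, succ[0], succ[1]))
        frontier = successors
    unfinished = sum((prob for prob, _, _ in frontier), Fraction(0))
    return terminated, violated, unfinished


HALF = Fraction(1, 2)
# x := 0 [0.5] x := 1; y := -1 [0.5] y := 0; observe (x + y = 0)
EXAMPLE = ("seq",
           ("choice", HALF, ("assign", "x", lambda s: 0), ("assign", "x", lambda s: 1)),
           ("seq",
            ("choice", HALF, ("assign", "y", lambda s: -1), ("assign", "y", lambda s: 0)),
            ("observe", lambda s: s["x"] + s["y"] == 0)))
```

Calling `run(EXAMPLE, {}, 10)` reports mass 1/4 for each of the final valuations (x=0, y=0) and (x=1, y=-1), mass 1/2 for violated observations, and nothing unfinished. Dividing the terminated masses by 1 minus the violated mass gives the conditioned outcome.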
The piranha problem
[Tijms, 2004]
The operational semantics
The good, the bad, and the ugly
Example operational semantics
    int cowboyDuel(float a, b) {
      int t := A [0.5] t := B;
      bool c := true;
      while (c) {
        if (t = A) {
          (c := false [a] t := B);
        } else {
          (c := false [b] t := A);
        }
      }
      return t;
    }
[Figure: the Markov chain of cowboyDuel, with transition probabilities a, 1-a, b and 1-b]
This (parametric) MC is finite. Once we count the number of shots before one of the cowboys dies, the MC becomes countably infinite.
Duelling cowboys
    int cowboyDuel(float a, b) {     // 0 < a < 1, 0 < b < 1
      int t := A [1] t := B;         // decide who shoots first
      bool c := true;
      while (c) {
        if (t = A) {
          (c := false [a] t := B);   // A shoots B with prob. a
        } else {
          (c := false [b] t := A);   // B shoots A with prob. b
        }
      }
      return t;                      // the survivor
    }
Claim: Cowboy A wins the duel with probability $\frac{a}{a+b-ab}$.
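The claim can be checked by summing over the rounds of the duel. The sketch assumes that cowboy A shoots first, as fixed by the choice with probability 1 in the program above; a round consists of A shooting and, if A misses, B shooting. The function names are made up for illustration.

```python
from fractions import Fraction


def prob_a_wins_within(a, b, rounds):
    """Probability that A has won within the given number of rounds (A shoots first)."""
    win = Fraction(0)
    both_alive = Fraction(1)  # probability that the duel is still going on at A's turn
    for _ in range(rounds):
        win += both_alive * a              # A hits
        both_alive *= (1 - a) * (1 - b)    # A misses, then B misses
    return win


def closed_form(a, b):
    """Solution of x = a + (1 - a) * (1 - b) * x."""
    return a / (a + b - a * b)
```

For a = 1/2 and b = 1/3 the closed form gives 3/4, and the truncated sum after 5 rounds falls short of it by exactly (3/4) · (1/3)^5 = 1/324, so the partial sums converge to the claimed value.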
The piranha puzzle
What is the probability that the original fish in the bowl was a piranha?
Equip the Markov chain with rewards. Consider expected rewards.
Rewards
Reasoning about expectations using the operational semantics: use rewards.
MC with rewards
An MC with rewards is a pair (M, r) with M an MC with state space $\Sigma$ and $r : \Sigma \to \mathbb{R}$ a function assigning a real reward to each state.
The reward $r(\sigma)$ stands for the reward earned on leaving state $\sigma$.
Cumulative reward for reachability
Let $\pi = \sigma_0 \ldots \sigma_n$ be a finite path in (M, r) and $T \subseteq \Sigma$ a set of target states. The cumulative reward along $\pi$ until reaching T is
$r_T(\pi) = r(\sigma_0) + \ldots + r(\sigma_{k-1})$ where $\sigma_i \notin T$ for all $i < k$ and $\sigma_k \in T$.
If $\pi \not\models \Diamond T$, then $r_T(\pi) = \infty$.
Expected reward reachability
Expected reward for reachability
The expected reward until reaching $T \subseteq \Sigma$ from $\sigma \in \Sigma$ is:
$\mathrm{ER}^{M}(\sigma, \Diamond T) = \sum_{\pi \models \Diamond T} \Pr^{M}(\hat{\pi}) \cdot r_T(\hat{\pi})$
where $\hat{\pi} = \sigma_0 \ldots \sigma_k$ is the shortest prefix of $\pi$ such that $\sigma_k \in T$ and $\sigma_0 = \sigma$.
Conditional expected reward
Let $\mathrm{ER}^{M}(\sigma, \Diamond T \mid \neg\Diamond F) = \frac{\mathrm{ER}^{M}(\sigma, \Diamond T \wedge \neg\Diamond F)}{1 - \Pr(\Diamond F)}$ be the conditional expected reward until reaching T while avoiding F.
The piranha puzzle
    f1 := gf [0.5] f1 := pir;
    f2 := pir;
    s := f1 [0.5] s := f2;
    observe (s = pir)
What is the probability that the original fish in the bowl was a piranha?
Consider the expected reward of successful termination without violating any observation:
$\mathrm{ER}^{[\![ P ]\!]}(\sigma_I, \Diamond\langle \mathit{sink} \rangle \mid \neg\Diamond\langle \lightning \rangle) = \frac{1 \cdot 1/2 + 0 \cdot 1/4}{1 - 1/4} = \frac{1/2}{3/4} = \frac{2}{3}$.
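The same number can be obtained by enumerating the runs of the piranha program. The sketch assumes that the observation is that the sampled fish s is a piranha, and that a run earns reward 1 exactly when the original fish f1 is a piranha; both assumptions are read off the calculation above, and the function name is hypothetical.

```python
from fractions import Fraction


def piranha_posterior():
    """Conditional expected reward: Pr[f1 = pir | the sampled fish is a piranha]."""
    half = Fraction(1, 2)
    feasible = Fraction(0)  # mass of runs that pass the observation
    reward = Fraction(0)    # mass of feasible runs in which f1 is a piranha
    for f1, p1 in (("gf", half), ("pir", half)):   # f1 := gf [0.5] f1 := pir
        f2 = "pir"                                 # f2 := pir
        for s, p2 in ((f1, half), (f2, half)):     # s := f1 [0.5] s := f2
            if s != "pir":
                continue                           # observation violated, run blocked
            feasible += p1 * p2
            if f1 == "pir":
                reward += p1 * p2
    return reward / feasible
```

The feasible mass is 3/4, the rewarded mass is 1/2, and their quotient is 2/3.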
On computing expected rewards
Expected rewards in finite Markov chains can be computed in polynomial time by solving a system of linear equations. The same holds for conditional expected rewards.
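The linear equation system has one unknown per non-target state: x(σ) = r(σ) + Σ P(σ, σ') · x(σ'), with x = 0 on target states. The sketch below solves it with exact Gauss-Jordan elimination. It assumes that the target set is reached with probability one from every state (otherwise the system need not have a unique solution), and the dictionary encoding of the chain is an illustrative choice.

```python
from fractions import Fraction


def solve(A, b):
    """Gauss-Jordan elimination over fractions; A is assumed square and invertible."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def expected_rewards(P, reward, targets):
    """Expected cumulative reward until reaching targets, for every non-target state."""
    states = [s for s in P if s not in targets]
    index = {s: i for i, s in enumerate(states)}
    A = [[Fraction(0)] * len(states) for _ in states]
    b = []
    for s in states:
        A[index[s]][index[s]] += 1              # coefficient of x(s) on the left-hand side
        for t, prob in P[s].items():
            if t not in targets:                # target states contribute x = 0
                A[index[s]][index[t]] -= prob
        b.append(Fraction(reward[s]))
    x = solve(A, b)
    return {s: x[index[s]] for s in states}
```

For a chain that stays in a state "loop" with probability 3/4 and exits with probability 1/4, with reward 1 per visit to "loop", the result is 4, the mean of the geometric distribution with parameter 1/4.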
Recursion: pushdown Markov chains
Take-home messages
Probabilistic programs:
- extend the expressive power of probabilistic graphical models
- are claimed to have many applications and great potential
- are classical programs with random assignment and conditioning
- have an operational semantics as (countably infinite) Markov chains
- recursive programs give rise to push-down Markov chains
Next lecture: how to prove properties of probabilistic programs?