Principles of Probabilistic Programming Lectures at EWSCS 2020 - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Principles of Probabilistic Programming

Lectures at EWSCS 2020 Winter School Joost-Pieter Katoen EWSCS 2020, Palmse, Estonia

Joost-Pieter Katoen Principles of Probabilistic Programming 1/222
slide-2
SLIDE 2 What is probabilistic programming?

Probabilistic programs

What? Programs with random assignments, conditioning and usual control-flow constructs

Why?
• Random assignments: to describe randomised algorithms
• Conditioning: to describe stochastic decision making
slide-3
SLIDE 3 What is probabilistic programming?

Applications

slide-4
SLIDE 4 What is probabilistic programming?

Planning in AI: robot navigation

Uncertainty: noisy sensors and actuators, unknown environment [Evans et al., Modeling Agents with Probabilistic Programs, 2019]
slide-5
SLIDE 5 What is probabilistic programming?

Security: The RSA-OAEP protocol

Correctness proof took more than 20 years

slide-6
SLIDE 6 What is probabilistic programming?

Printer troubleshooting in Windows 95

How likely is it that your print is garbled given that the ps-file is not and the page orientation is portrait? [Ramanna et al., Emerging Paradigms in Machine Learning, 2013]

slide-7
SLIDE 7 What is probabilistic programming?

Languages

[Figure: probabilistic programming languages, including Pyro]

slide-8
SLIDE 8 What is probabilistic programming?

Issue 1: Program correctness

• Classical programs:
  • A program is correct with respect to a (formal) specification, e.g. “for input array A, the output array B is sorted and contains all elements contained in A”
  • Defines a deterministic input-output relation
  • Partial correctness: if an output is produced, it is correct
  • Total correctness: in addition, the program terminates
• Probabilistic programs:
  • They do not always generate the same output
  • They generate a probability distribution over possible outputs
slide-9
SLIDE 9 What is probabilistic programming?

Issue 2: Termination

• Classical programs:
  • They terminate (on a given/all inputs), or they do not
  • If they terminate, they take finitely many steps to do so
  • Showing program termination is undecidable (halting problem)
• Probabilistic programs:
  • They terminate (or not) with a certain likelihood
  • They may have diverging runs whose likelihood is zero
  • They may take infinitely many steps (on average) to terminate, even if they terminate with probability one!
  • Showing “probability-one” termination is “more” undecidable
  • Showing that they terminate in finite time on average is even more so!

slide-10
SLIDE 10 What is probabilistic programming?

Issue 3: The program’s runtime

• Classical programs:
  • They have a deterministic, fixed run-time for a given input
  • Runtimes of terminating programs in sequence are compositional: if P and Q terminate in n and k steps, then P;Q halts in n+k steps
  • Analysis techniques: recurrence equations, tree analysis, etc.
• Probabilistic programs:
  • Every runtime has a probability; their runtime is a distribution
  • Runtimes of “probability-one” terminating programs may not sum up: if P and Q terminate in n and k steps on average, then P;Q may need infinitely many steps on average
  • Analysis techniques: involve reasoning about expected values etc.
slide-11
SLIDE 11 What is probabilistic programming?

This EWSCS 2020 tutorial

• The probabilistic guarded command language pGCL
  • examples, syntax, semantics (Markov chains), conditioning, recursion
• Proving correctness of probabilistic programs
  • weakest pre-conditions, loop invariants, post-conditions, conditioning
• Almost-sure termination
  • positive a.s.-termination, (a bit of) hardness, stochastic ranking functions
• Analysing runtimes of probabilistic programs
  • examples, finite versus infinite expected runtime, wp-reasoning
• Verification and runtime analysis of Bayesian networks
slide-12
SLIDE 12 What is probabilistic programming?

Overview

1. What is probabilistic programming?
2. Probabilistic Guarded Command Language
3. Weakest preconditions
4. Conditioning
5. Expected runtime analysis
6. Analysing Bayesian networks
7. Recursion
8. Loop invariant synthesis
9. Epilogue


slide-14
SLIDE 14 Probabilistic Guarded Command Language

Dijkstra’s guarded command language

• skip                      empty statement
• diverge                   divergence
• x := E                    assignment
• prog1 ; prog2             sequential composition
• if (G) prog1 else prog2   choice
• prog1 [] prog2            non-deterministic choice
• while (G) prog            iteration
slide-15
SLIDE 15 Probabilistic Guarded Command Language

Probabilistic GCL

Kozen McIver Morgan

• skip                      empty statement
• diverge                   divergence
• x := E                    assignment
• observe (G)               conditioning
• prog1 ; prog2             sequential composition
• if (G) prog1 else prog2   choice
• prog1 [p] prog2           probabilistic choice
• while (G) prog            iteration
slide-16
SLIDE 16 Probabilistic Guarded Command Language

Let’s start simple

x := 0 [0.5] x := 1;
y := -1 [0.5] y := 0

This program admits four runs and yields the outcome:

Pr[x = 0, y = 0] = Pr[x = 0, y = -1] = Pr[x = 1, y = 0] = Pr[x = 1, y = -1] = 1/4
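The four runs of this program can be enumerated exactly. The following Python sketch is only an illustration: the names `x_choice`, `y_choice` and `joint_distribution` are made up here and are not part of pGCL. Each probabilistic choice is modelled as a list of (value, probability) branches, and the two choices are treated as independent, as in the program above.

```python
from fractions import Fraction
from itertools import product

# Each probabilistic choice is a list of (value, probability) branches.
x_choice = [(0, Fraction(1, 2)), (1, Fraction(1, 2))]
y_choice = [(-1, Fraction(1, 2)), (0, Fraction(1, 2))]

def joint_distribution(x_branches, y_branches):
    """Enumerate all runs of two independent choices and sum their probabilities."""
    dist = {}
    for (x, px), (y, py) in product(x_branches, y_branches):
        dist[(x, y)] = dist.get((x, y), Fraction(0)) + px * py
    return dist

dist = joint_distribution(x_choice, y_choice)
for outcome, prob in sorted(dist.items()):
    print(outcome, prob)
```

Exact fractions are used instead of floats so that the probabilities can be compared with the values on the slide without rounding.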
slide-17
SLIDE 17 Probabilistic Guarded Command Language

A loopy program

For 0 < p < 1 an arbitrary probability:

bool c := true;
int i := 0;
while (c) {
  i++;
  (c := false [p] c := true)
}

The loopy program models a geometric distribution with parameter p:

$\Pr[i = N] = (1-p)^{N-1} \cdot p$ for $N > 0$
slide-18
SLIDE 18 Probabilistic Guarded Command Language

On termination

bool c := true;
int i := 0;
while (c) {
  i++;
  (c := false [p] c := true)
}

This program does not always terminate. It almost surely terminates.
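To see why the loop terminates almost surely even though it has a diverging run, one can add up the probability of terminating within a bounded number of iterations. This is a small sketch under the geometric reading of the loop; `termination_mass` is a hypothetical helper name and the value of `p` is chosen arbitrarily.

```python
from fractions import Fraction

def termination_mass(p, max_iterations):
    """Probability that the loop has terminated within max_iterations iterations.

    The loop stops in iteration N with probability (1 - p)**(N - 1) * p.
    """
    return sum((1 - p) ** (n - 1) * p for n in range(1, max_iterations + 1))

p = Fraction(1, 4)  # arbitrary example value with 0 < p < 1
for bound in (1, 5, 20):
    print(bound, float(termination_mass(p, bound)))

# The single diverging run picks "c := true" forever. The probability of
# still running after n iterations is (1 - p)**n, which tends to 0.
still_running_after_20 = (1 - p) ** 20
```

The terminated mass and the still-running mass always add up to one, so the diverging run is left with probability zero in the limit.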
slide-19
SLIDE 19 Probabilistic Guarded Command Language

Conditioning

slide-21
SLIDE 21 Probabilistic Guarded Command Language

Let’s start simple

x := 0 [0.5] x := 1;
y := -1 [0.5] y := 0;
observe (x+y = 0)

This program blocks two runs as they violate x+y = 0. Outcome:

Pr[x = 0, y = 0] = Pr[x = 1, y = -1] = 1/2

Observations thus normalize the probability of the “feasible” program runs.
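The normalisation performed by observe can be sketched in a few lines of Python. This is an illustration only: `condition` is a hypothetical helper, and the prior is built by hand from the two fair choices of the program (x from {0, 1}, y from {-1, 0}).

```python
from fractions import Fraction
from itertools import product

def condition(dist, predicate):
    """Drop outcomes that violate the observation and renormalise the rest."""
    feasible = {o: pr for o, pr in dist.items() if predicate(o)}
    mass = sum(feasible.values())
    if mass == 0:
        raise ValueError("observation is violated on every run")
    return {o: pr / mass for o, pr in feasible.items()}

half = Fraction(1, 2)
# Distribution before the observe statement: four runs of probability 1/4.
prior = {(x, y): half * half for x, y in product((0, 1), (-1, 0))}
# observe (x + y = 0)
posterior = condition(prior, lambda o: o[0] + o[1] == 0)
print(posterior)
```

The feasible runs carry mass 1/2 in total, so dividing by that mass doubles each surviving probability.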
slide-23
SLIDE 23 Probabilistic Guarded Command Language

A loopy program

For 0 < p < 1 an arbitrary probability:

bool c := true;
int i := 0;
while (c) {
  i++;
  (c := false [p] c := true)
}
observe (odd(i))

The feasible program runs have probability

$\sum_{N \geq 0} (1-p)^{2N} \cdot p = \frac{1}{2-p}$

This program models the distribution:

$\Pr[i = 2N+1] = (1-p)^{2N} \cdot p \cdot (2-p)$ for $N \geq 0$
$\Pr[i = 2N] = 0$
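The closed form 1/(2-p) and the renormalised distribution can be checked numerically. The sketch below is an illustration with made-up helper names (`feasible_mass`, `conditioned_prob`) and an arbitrary value of `p`; it truncates the infinite sum, so it only approximates the limit.

```python
from fractions import Fraction

def feasible_mass(p, terms):
    """Partial sum of sum_{N>=0} (1-p)^(2N) * p, the mass of runs with odd i."""
    return sum((1 - p) ** (2 * n) * p for n in range(terms))

def conditioned_prob(p, i):
    """Pr[i] after observe(odd(i)): zero for even i, renormalised for odd i."""
    if i % 2 == 0:
        return Fraction(0)
    return (1 - p) ** (i - 1) * p * (2 - p)

p = Fraction(1, 3)           # arbitrary example value with 0 < p < 1
closed_form = 1 / (2 - p)    # limit of the feasible mass
print(float(feasible_mass(p, 60)), float(closed_form))
```

The partial sums differ from the closed form only by the geometric tail, which vanishes as more terms are added.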
slide-24
SLIDE 24 Probabilistic Guarded Command Language

Why formal semantics matters

• Unambiguous meaning to (almost) all probabilistic programs
• Operational interpretation to weakest pre-expectations
• Basis for proving correctness
  • of programs
  • of program transformations
  • of program equivalence
  • of static analysis
  • of compilers
  • ...
slide-25
SLIDE 25 Probabilistic Guarded Command Language

Andrei Andrejewitsch Markow

slide-26
SLIDE 26 Probabilistic Guarded Command Language

Markov chains

A Markov chain (MC) is a triple $(\Sigma, \sigma_I, P)$ with:
• $\Sigma$ a countable set of states,
• $\sigma_I \in \Sigma$ the initial state, and
• $P : \Sigma \to \mathit{Dist}(\Sigma)$ the transition probability function,

where $\mathit{Dist}(\Sigma)$ is the set of discrete probability measures on $\Sigma$.
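One possible way to represent such a triple in code is a dictionary that maps each state to its successor distribution. The class below is a sketch under that assumption; `MarkovChain`, `step` and `distribution_after` are illustrative names, and the two-state chain is a hypothetical example in the spirit of the loopy program (stay in the loop with probability 1-p, leave it with probability p).

```python
from fractions import Fraction

class MarkovChain:
    """A discrete-time Markov chain given by an initial state and P."""

    def __init__(self, initial, transitions):
        self.initial = initial
        self.transitions = transitions  # state -> {successor: probability}
        for state, dist in transitions.items():
            if sum(dist.values()) != 1:
                raise ValueError(f"P({state}) is not a distribution")

    def step(self, dist):
        """Push a distribution over states one transition forward."""
        nxt = {}
        for state, mass in dist.items():
            for succ, pr in self.transitions[state].items():
                nxt[succ] = nxt.get(succ, Fraction(0)) + mass * pr
        return nxt

    def distribution_after(self, n):
        """Distribution over states after n transitions from the initial state."""
        dist = {self.initial: Fraction(1)}
        for _ in range(n):
            dist = self.step(dist)
        return dist

p = Fraction(1, 3)  # arbitrary example value
mc = MarkovChain("loop", {
    "loop": {"done": p, "loop": 1 - p},
    "done": {"done": Fraction(1)},
})
print(mc.distribution_after(2))
```

A dictionary only works for finitely many explicitly listed states; the countably infinite chains mentioned in the definition would need a transition function instead.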
slide-27
SLIDE 27 Probabilistic Guarded Command Language

Operational semantics

Aim: Model the behaviour of a program P by the MC $[\![ P ]\!]$.

This can be defined using Plotkin’s SOS-style semantics.

Approach:
• Take states of the form
  • $\langle Q, s \rangle$ with Q a program or $\downarrow$, and $s : \mathit{Var} \to \mathbb{Q}$ a variable valuation
  • $\langle \lightning \rangle$ models the violation of an observation, and
  • $\langle \mathit{sink} \rangle$ models program termination (successful or violated observation)
• Take initial state $\langle P, s \rangle$ where s fulfils the initial conditions
• Take the transition relation as the smallest relation satisfying the SOS rules
slide-30
SLIDE 30 Probabilistic Guarded Command Language

Some SOS rules

$\langle \mathtt{skip}, s \rangle \to \langle \downarrow, s \rangle$

$\langle \mathtt{diverge}, s \rangle \to \langle \mathtt{diverge}, s \rangle$

$\dfrac{s \models G}{\langle \mathtt{observe}(G), s \rangle \to \langle \downarrow, s \rangle}$

$\dfrac{s \not\models G}{\langle \mathtt{observe}(G), s \rangle \to \langle \lightning \rangle}$

$\langle \downarrow, s \rangle \to \langle \mathit{sink} \rangle$, $\langle \lightning \rangle \to \langle \mathit{sink} \rangle$, $\langle \mathit{sink} \rangle \to \langle \mathit{sink} \rangle$

$\langle x := E, s \rangle \to \langle \downarrow, s[x := s(\llbracket E \rrbracket)] \rangle$

$\langle P \, [p] \, Q, s \rangle \to \mu$ with $\mu(\langle P, s \rangle) = p$ and $\mu(\langle Q, s \rangle) = 1-p$

$\dfrac{\langle P, s \rangle \to \langle \lightning \rangle}{\langle P; Q, s \rangle \to \langle \lightning \rangle}$

$\dfrac{\langle P, s \rangle \to \mu}{\langle P; Q, s \rangle \to \nu}$ with $\nu(\langle P'; Q, s' \rangle) = \mu(\langle P', s' \rangle)$ where $\downarrow; Q = Q$

$\dfrac{s \models G}{\langle \mathtt{while}(G)\{P\}, s \rangle \to \langle P; \mathtt{while}(G)\{P\}, s \rangle}$

$\dfrac{s \not\models G}{\langle \mathtt{while}(G)\{P\}, s \rangle \to \langle \downarrow, s \rangle}$
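The rules above can be turned into a small-step interpreter. The Python sketch below is one possible encoding, not the lecture's own: programs are nested tuples, guards and expressions are Python functions on a valuation dict, and the strings "done", "violation" and "sink" stand in for the terminated program, the violation state and the sink state. All function names are hypothetical.

```python
from fractions import Fraction

DONE, VIOLATION, SINK = "done", "violation", "sink"  # stand-ins for special states

def freeze(env):
    """Turn a valuation dict into a hashable tuple."""
    return tuple(sorted(env.items()))

def merge(pairs):
    """Sum the probabilities of identical successor configurations."""
    dist = {}
    for config, pr in pairs:
        dist[config] = dist.get(config, Fraction(0)) + pr
    return dist

def step(config):
    """Successor distribution of one SOS step from config."""
    if config in (VIOLATION, SINK) or config[0] == DONE:
        return {SINK: Fraction(1)}
    prog, frozen = config
    env, kind = dict(frozen), prog[0]
    if kind == "skip":
        return {(DONE, frozen): Fraction(1)}
    if kind == "diverge":
        return {config: Fraction(1)}
    if kind == "assign":
        env[prog[1]] = prog[2](env)
        return {(DONE, freeze(env)): Fraction(1)}
    if kind == "observe":
        return {((DONE, frozen) if prog[1](env) else VIOLATION): Fraction(1)}
    if kind == "choice":
        _, p, left, right = prog
        return merge([((left, frozen), p), ((right, frozen), 1 - p)])
    if kind == "while":
        _, guard, body = prog
        nxt = ("seq", body, prog) if guard(env) else DONE
        return {(nxt, frozen): Fraction(1)}
    if kind == "seq":
        _, first, second = prog
        pairs = []
        for succ, pr in step((first, frozen)).items():
            if succ == VIOLATION:
                pairs.append((VIOLATION, pr))
            else:
                rest, env2 = succ
                # "done; Q" is identified with Q
                nxt = second if rest == DONE else ("seq", rest, second)
                pairs.append(((nxt, env2), pr))
        return merge(pairs)
    raise ValueError(f"unknown statement {kind!r}")

def outcomes(prog, env, max_steps=50):
    """Mass that reaches a terminated or violation state within max_steps steps."""
    dist, final = {(prog, freeze(env)): Fraction(1)}, {}
    for _ in range(max_steps):
        nxt = {}
        for config, mass in dist.items():
            if config == VIOLATION or config[0] == DONE:
                final[config] = final.get(config, Fraction(0)) + mass
                continue
            for succ, pr in step(config).items():
                nxt[succ] = nxt.get(succ, Fraction(0)) + mass * pr
        dist = nxt
    return final

half = Fraction(1, 2)
prog = ("seq",
        ("seq",
         ("choice", half, ("assign", "x", lambda e: 0), ("assign", "x", lambda e: 1)),
         ("choice", half, ("assign", "y", lambda e: -1), ("assign", "y", lambda e: 0))),
        ("observe", lambda e: e["x"] + e["y"] == 0))
result = outcomes(prog, {})
```

The example program is the conditioned two-choice program from earlier in the lecture. `outcomes` records the mass just before it would move to the sink state, because the sink no longer remembers the final valuation.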
slide-37
SLIDE 37 Probabilistic Guarded Command Language

The piranha problem

[Tijms, 2004]

slide-38
SLIDE 38 Probabilistic Guarded Command Language

The operational semantics

slide-40
SLIDE 40 Probabilistic Guarded Command Language

The good, the bad, and the ugly

slide-41
SLIDE 41 Probabilistic Guarded Command Language

Example operational semantics

int cowboyDuel(float a, b) {
  int t := A [0.5] t := B;
  bool c := true;
  while (c) {
    if (t = A) {
      (c := false [a] t := B);
    } else {
      (c := false [b] t := A);
    }
  }
  return t;
}

[Figure: Markov chain of cowboyDuel; transitions are labelled with the probabilities a, 1-a, b and 1-b]

This (parametric) MC is finite. Once we count the number of shots before one of the cowboys dies, the MC becomes countably infinite.

slide-44
SLIDE 44 Probabilistic Guarded Command Language

Duelling cowboys

int cowboyDuel(float a, b) {    // 0 < a < 1, 0 < b < 1
  int t := A [1] t := B;        // decide who shoots first
  bool c := true;
  while (c) {
    if (t = A) {
      (c := false [a] t := B);  // A shoots B with prob. a
    } else {
      (c := false [b] t := A);  // B shoots A with prob. b
    }
  }
  return t;                     // the survivor
}

Claim: Cowboy A wins the duel with probability $\frac{a}{a + b - ab}$.
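The claim can be checked both by simulation and by a one-line recurrence. With A shooting first, A either hits immediately (probability a) or both cowboys miss once and the duel starts over, so the winning probability W satisfies W = a + (1-a)(1-b)·W. The Python sketch below is an illustration only: the function names, the values of a and b, and the sample size are arbitrary choices.

```python
import random

def cowboy_duel(a, b, rng):
    """Transcription of cowboyDuel with A shooting first; returns the survivor."""
    t, c = "A", True
    while c:
        if t == "A":
            if rng.random() < a:
                c = False   # A hits B, the duel ends with t = A
            else:
                t = "B"     # A misses, B's turn
        else:
            if rng.random() < b:
                c = False   # B hits A, the duel ends with t = B
            else:
                t = "A"     # B misses, A's turn
    return t

def closed_form(a, b):
    """Claimed probability that A wins."""
    return a / (a + b - a * b)

def fixed_point(a, b, rounds=200):
    """Iterate W <- a + (1 - a) * (1 - b) * W starting from W = 0."""
    w = 0.0
    for _ in range(rounds):
        w = a + (1 - a) * (1 - b) * w
    return w

a, b = 0.3, 0.5
rng = random.Random(0)
samples = 20000
frequency = sum(cowboy_duel(a, b, rng) == "A" for _ in range(samples)) / samples
print(frequency, fixed_point(a, b), closed_form(a, b))
```

The simulation only estimates the probability, while the fixed-point iteration converges to the closed form because the factor (1-a)(1-b) is strictly below one.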
slide-45
SLIDE 45 Probabilistic Guarded Command Language

The piranha puzzle

What is the probability that the original fish in the bowl was a piranha?


slide-46
SLIDE 46 Probabilistic Guarded Command Language

The piranha puzzle

Equip the Markov chain with rewards. Consider expected rewards.

slide-48
SLIDE 48 Probabilistic Guarded Command Language

Rewards

Reasoning about expectations using the operational semantics: use rewards.

MC with rewards
An MC with rewards is a pair $(M, r)$ with M an MC with state space $\Sigma$ and $r : \Sigma \to \mathbb{R}$ a function assigning a real reward to each state.

The reward $r(\sigma)$ stands for the reward earned on leaving state $\sigma$.

Cumulative reward for reachability
Let $\pi = \sigma_0 \ldots \sigma_n$ be a finite path in $(M, r)$ and $T \subseteq \Sigma$ a set of target states. The cumulative reward along $\pi$ until reaching T is, for $\pi \models \Diamond T$:

$r_T(\pi) = r(\sigma_0) + \ldots + r(\sigma_{k-1})$ where $\sigma_i \notin T$ for all $i < k$ and $\sigma_k \in T$.

If $\pi \not\models \Diamond T$, then $r_T(\pi) = \infty$.
slide-50
SLIDE 50 Probabilistic Guarded Command Language

Expected reward reachability

Expected reward for reachability
The expected reward until reaching $T \subseteq \Sigma$ from $\sigma \in \Sigma$ is:

$ER^{M}(\sigma, \Diamond T) = \sum_{\pi \models \Diamond T} \Pr^{M}(\hat{\pi}) \cdot r_T(\hat{\pi})$

where $\hat{\pi} = \sigma_0 \ldots \sigma_k$ is the shortest prefix of $\pi$ such that $\sigma_k \in T$ and $\sigma_0 = \sigma$.

Conditional expected reward
Let

$ER^{M}(\sigma, \Diamond T \mid \neg\Diamond F) = \dfrac{ER^{M}(\sigma, \Diamond T \cap \neg\Diamond F)}{1 - \Pr(\Diamond F)}$

be the conditional expected reward until reaching T while avoiding F.
slide-53
SLIDE 53 Probabilistic Guarded Command Language

The piranha puzzle

f1 := gf [0.5] f1 := pir;
f2 := pir;
s := f1 [0.5] s := f2;
observe (s = pir)

What is the probability that the original fish in the bowl was a piranha?

Consider the expected reward of successful termination without violating any observation:

$ER^{[\![ P ]\!]}(\sigma_I, \Diamond\langle \mathit{sink} \rangle \mid \neg\Diamond\langle \lightning \rangle) = \dfrac{1 \cdot 1/2 + 0 \cdot 1/4}{1 - 1/4} = \dfrac{1/2}{3/4} = 2/3.$
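The value 2/3 can be reproduced by enumerating the four runs of the piranha program and conditioning on the observation. The sketch below is an illustration with a made-up function name; the reward of a run is taken to be 1 if the original fish f1 is a piranha and 0 otherwise, matching the calculation above.

```python
from fractions import Fraction

def piranha_posterior():
    """Pr[f1 = pir | s = pir] by exact enumeration of all runs."""
    half = Fraction(1, 2)
    accepted = Fraction(0)          # mass of runs satisfying observe(s = pir)
    accepted_and_pir = Fraction(0)  # accepted runs in which f1 = pir
    for f1 in ("gf", "pir"):        # f1 := gf [0.5] f1 := pir
        f2 = "pir"                  # f2 := pir
        for s in (f1, f2):          # s := f1 [0.5] s := f2
            mass = half * half
            if s == "pir":
                accepted += mass
                if f1 == "pir":
                    accepted_and_pir += mass
    return accepted_and_pir / accepted

print(piranha_posterior())
```

Only the run that picks the goldfish violates the observation, which is why the denominator is 1 - 1/4.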
slide-54
SLIDE 54 Probabilistic Guarded Command Language

On computing expected rewards

Expected rewards in finite Markov chains can be computed in polynomial time by solving a system of linear equations. The same holds for conditional expected rewards.

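As a sketch of the linear-equation approach: for every non-target state s, the expected reward x_s satisfies x_s = r(s) + Σ P(s, s')·x_s', with x_t = 0 for target states t. The Python code below solves this system by Gauss-Jordan elimination over exact fractions. It assumes that the target set is reached with probability one from every state, otherwise the system may be singular; the function names and the example chain are hypothetical.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination over fractions."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Assumes a non-zero pivot exists, i.e. the system is non-singular.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def expected_reward(transitions, reward, targets):
    """Expected cumulative reward until reaching targets, per non-target state."""
    states = [s for s in transitions if s not in targets]
    index = {s: i for i, s in enumerate(states)}
    matrix = [[Fraction(0)] * len(states) for _ in states]
    rhs = []
    for s in states:
        matrix[index[s]][index[s]] += 1           # x_s on the left-hand side
        for succ, pr in transitions[s].items():
            if succ not in targets:
                matrix[index[s]][index[succ]] -= pr
        rhs.append(Fraction(reward[s]))
    return dict(zip(states, solve(matrix, rhs)))

# Loop that is left with probability p per iteration; reward 1 per iteration.
p = Fraction(1, 4)
chain = {"loop": {"done": p, "loop": 1 - p}, "done": {"done": Fraction(1)}}
result = expected_reward(chain, {"loop": 1, "done": 0}, {"done"})
print(result)
```

For this chain the single equation is x = 1 + (1-p)·x, so the expected number of iterations is 1/p.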
slide-55
SLIDE 55 Probabilistic Guarded Command Language

Recursion: pushdown Markov chains


slide-56
SLIDE 56 Probabilistic Guarded Command Language

Take-home messages

Probabilistic programs:
• extend the expressive power of probabilistic graphical models
• are claimed to have many applications and great potential
• are classical programs with random assignment and conditioning
• have an operational semantics as (countably infinite) Markov chains
• give rise to push-down Markov chains when they are recursive

Next lecture: how to prove properties of probabilistic programs?