Tail Probabilities for Randomized Program Runtimes via Martingales


slide-1
SLIDE 1

Tail Probabilities for Randomized Program Runtimes via Martingales for Higher Moments

Satoshi Kura (1, 2), Natsuki Urabe (1), Ichiro Hasuo (1, 2)

(1) National Institute of Informatics, Tokyo, Japan
(2) The Graduate University for Advanced Studies (SOKENDAI), Kanagawa, Japan

April 10, 2019

1 / 36

slide-2
SLIDE 2

Our question

“What is an upper bound of the tail probability?”

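[Figure: a chain of states 1, 2, 3, ... whose transitions carry probabilities p and 1 − p]

  • How likely is it to terminate within 100 steps? (e.g. at least 90%)
  • How unlikely is it to not terminate within 100 steps? (e.g. at most 10%)

[Figure: probability plotted against the number of steps, with the tail probability Pr(T ≥ 100) marked from step 100 onwards]

Tail probability: Pr(T ≥ 100) ≤ ??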

2 / 36

slide-3
SLIDE 3

Related work

Supermartingale-based approach

  • Proving almost-sure termination

[Chakarov & Sankaranarayanan, CAV’13]

  • Overapproximating tail probabilities: Pr(T ≥ d) ≤ ??
    [Chatterjee & Fu, arxiv preprint], [Chatterjee et al., TOPLAS’18]
      • Azuma’s, Hoeffding’s and Bernstein’s inequalities
      • Markov’s inequality (wider applicability):

\[ \Pr(T \ge d) \le \frac{E[T]}{d} \]

3 / 36

slide-4
SLIDE 4

Our approach

  • Aim: overapproximating tail probabilities: Pr(T ≥ d) ≤ ??
  • Corollary of Markov’s inequality:

\[ \Pr(T \ge d) \le \frac{E[T^k]}{d^k} \]

  • Extends ranking supermartingale for higher moments $E[T^k]$ (k = 1, 2, ...)

4 / 36

slide-5
SLIDE 5

Our workflow

randomized program
  ↓ (our supermartingales)
upper bounds of higher moments: $(E[T], \dots, E[T^K]) \le (u_1, \dots, u_K)$
  ↓ (concentration inequality, given a deadline d)
upper bound of tail probability: Pr(T ≥ d) ≤ ?

5 / 36

slide-7
SLIDE 7

Randomized program

✓ sampling
✓ (demonic/termination avoiding) nondeterminism

Given as a pCFG (probabilistic control flow graph).

    x := 5;
    while x > 0 do
      if prob (0.4) then
        x := x + 1
      else
        x := x - 1
      fi
    od

pCFG of this program (locations l1, ..., l6):

  • l1 → l2: x := 5
  • l2 → l3: guard x > 0
  • l2 → l6: guard ¬(x > 0)
  • l3 → l4: probability 0.4
  • l3 → l5: probability 0.6
  • l4 → l2: x := x + 1
  • l5 → l2: x := x − 1

7 / 36
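To make the pCFG reading concrete, the following Python sketch simulates runs of the random-walk program above and counts transitions until the terminal location is reached. It is a Monte Carlo illustration only and not part of the method in these slides. The edge structure (l1 → l2 → l3 → l4 or l5 → l2, and l2 → l6 on exit) and the convention that every transition costs one step are assumptions read off the diagram; `run_once` and `estimate_tail` are hypothetical helper names.

```python
import random

def run_once(rng, x0=5, p_up=0.4):
    """Simulate one run of the random-walk pCFG; return the number of transitions."""
    steps = 1          # l1 -> l2 : x := 5
    x = x0
    while x > 0:
        steps += 1     # l2 -> l3 : guard x > 0
        steps += 1     # l3 -> l4 or l5 : probabilistic branch
        x += 1 if rng.random() < p_up else -1
        steps += 1     # l4 or l5 -> l2 : assignment
    steps += 1         # l2 -> l6 : guard not (x > 0)
    return steps

def estimate_tail(d, runs=10_000, seed=0):
    """Empirical frequency of runs whose runtime T is at least d."""
    rng = random.Random(seed)
    hits = sum(run_once(rng) >= d for _ in range(runs))
    return hits / runs

# A few sample runtimes, one per seed.
runtimes = [run_once(random.Random(s)) for s in range(200)]
```

Under these assumptions the shortest possible run takes 17 transitions (five straight decrements), so an estimate such as `estimate_tail(100)` only says something about the sampled runs; the supermartingale method is what gives a guaranteed upper bound.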

slide-15
SLIDE 15

Semantics

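  • Configuration: $(l, x) \in L \times \mathbb{R}^V$
      • L: finite set of locations
      • V: finite set of program variables
  • Run: sequence of configurations

[Figure: pCFG of the random-walk program, with locations l1, ..., l6 and edges labeled x := 5, ¬(x > 0), x > 0, 0.4, 0.6, x := x + 1, x := x − 1]

Example runs (the label on each arrow is the probability of the transition):

  (l1, [x ↦ 0]) --1--> (l2, [x ↦ 5]) --1--> (l3, [x ↦ 5]) --0.4--> (l4, [x ↦ 5]) --> ...
  (l1, [x ↦ 0]) --1--> (l2, [x ↦ 5]) --1--> (l3, [x ↦ 5]) --0.6--> (l5, [x ↦ 5]) --> ...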

8 / 36

slide-17
SLIDE 17

Ranking function

[Floyd, ’67]

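$r : L \times \mathbb{R}^V \to \mathbb{N} \cup \{\infty\}$

For each transition, r decreases by (at least) 1:

\[ (l, x) \to (l', x') \implies r(l', x') \le r(l, x) - 1 \]

Theorem

If $r(l, x) < \infty$, then the program is terminating from (l, x) within r(l, x) steps.

    x := 5;
    while x > 0 do
      x := x - 1
    od

[Figure: pCFG of this program with locations l1, l2, l3 and edges labeled x > 0, x := x − 1 and x ≤ 0; the ranking-function values 2x + 1 and 2x are annotated on the locations]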
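The decrease condition can be checked mechanically on the countdown loop above. The sketch below is a minimal illustration under stated assumptions: the location names `head`, `body` and `exit` are hypothetical, the value 2x + 1 is assumed to belong to the loop head and 2x to the loop body, the exit gets 0, and the run starts just after the initial assignment x := 5.

```python
def r(loc, x):
    """Candidate ranking function (location names are hypothetical)."""
    return {"head": 2 * x + 1, "body": 2 * x, "exit": 0}[loc]

def step(loc, x):
    """Deterministic successor of a configuration, or None at the exit."""
    if loc == "head":
        return ("body", x) if x > 0 else ("exit", x)
    if loc == "body":
        return ("head", x - 1)
    return None

def run_length(loc, x):
    """Follow transitions, checking that r drops by at least 1 on each one."""
    steps = 0
    while (nxt := step(loc, x)) is not None:
        assert r(*nxt) <= r(loc, x) - 1
        loc, x = nxt
        steps += 1
    return steps

steps_from_5 = run_length("head", 5)
```

Here the run from `("head", 5)` takes 11 transitions, which coincides with r("head", 5) = 11, so for this loop the bound given by the theorem is tight.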

10 / 36

slide-18
SLIDE 18

Ranking supermartingale

[Chakarov & Sankaranarayanan, CAV’13]

$\eta : L \times \mathbb{R}^V \to [0, \infty]$

For each transition, η decreases by (at least) 1 “on average”:

\[ (X\eta)(l, x) \le \eta(l, x) - 1 \quad \text{for each } (l, x) \]

where X is the next-time operator (the expected value after one transition):

\[ (X\eta)(l, x) := E[\eta(l', x') \mid (l, x) \to (l', x')]. \]

11 / 36

slide-19
SLIDE 19

Ranking supermartingale

Theorem

If $\eta(l, x) < \infty$, then the program is (positively) almost surely terminating from (l, x), with expected runtime at most η(l, x) steps.

This can be explained lattice-theoretically:

  • Expected runtime is a least fixed point (lfp)
  • Ranking supermartingale is a prefixed point
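As a small sanity check of the condition (Xη) ≤ η − 1, the sketch below verifies a hand-derived candidate η for the random-walk program from the "Randomized program" slide. Everything specific here is an assumption for illustration: the candidate η(l2, x) = 15x + 1, the edge structure of the pCFG, and the restriction to integer configurations with x ≥ 0 (the ones that are actually reachable). Exact rational arithmetic avoids floating-point noise in the comparison.

```python
from fractions import Fraction

P_UP = Fraction(2, 5)  # probability of x := x + 1 (0.4 in the slides)

def eta(loc, x):
    """Candidate ranking supermartingale (an assumption, derived by hand)."""
    base = 15 * x + 1  # value at the loop head l2
    return {"l1": 77, "l2": base, "l3": base - 1,
            "l4": base + 16, "l5": base - 14, "l6": 0}[loc]

def successors(loc, x):
    """(probability, next configuration) pairs of the random-walk pCFG."""
    if loc == "l1":
        return [(1, ("l2", 5))]
    if loc == "l2":
        return [(1, ("l3", x) if x > 0 else ("l6", x))]
    if loc == "l3":
        return [(P_UP, ("l4", x)), (1 - P_UP, ("l5", x))]
    if loc == "l4":
        return [(1, ("l2", x + 1))]
    if loc == "l5":
        return [(1, ("l2", x - 1))]
    return []  # l6 is terminal

def next_time(loc, x):
    """(X eta)(loc, x): expected value of eta after one transition."""
    return sum(p * eta(*cfg) for p, cfg in successors(loc, x))

def is_ranking_supermartingale(max_x=50):
    """Check (X eta) <= eta - 1 on configurations with 0 <= x < max_x."""
    checks = [("l1", 0)]
    checks += [("l2", x) for x in range(0, max_x)]
    checks += [(loc, x) for loc in ("l3", "l4", "l5") for x in range(1, max_x)]
    return all(next_time(loc, x) <= eta(loc, x) - 1 for loc, x in checks)

ok = is_ranking_supermartingale()
initial_bound = eta("l1", 0)
```

If the check passes, the theorem gives E[T] ≤ η(l1, ·) = 77 transitions from the initial configuration, under the step-counting convention assumed here.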

12 / 36

slide-20
SLIDE 20

Runtime before and after transition

Let T (l, x) be a random variable representing the runtime from (l, x).

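[Figure: from (l0, x0), with runtime T(l0, x0), one transition leads to (l1, x1) with probability p and another leads to (l2, x2) with probability 1 − p; the runtimes from there are T(l1, x1) and T(l2, x2)]

Runtime from (l0, x0):

  • T(l1, x1) + 1 with probability p
  • T(l2, x2) + 1 with probability 1 − p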

13 / 36

slide-21
SLIDE 21

Expected runtime is a fixed point

[Figure: (l0, x0) moves to (l1, x1) with probability p and to (l2, x2) with probability 1 − p]

\[
\begin{aligned}
E[T](l_0, x_0) &= p\,E[T(l_1, x_1) + 1] + (1 - p)\,E[T(l_2, x_2) + 1] \\
&= p\,(E[T(l_1, x_1)] + 1) + (1 - p)\,(E[T(l_2, x_2)] + 1) \\
&= E\bigl[\,E[T(l', x')] + 1 \mid (l_0, x_0) \to (l', x')\,\bigr] \\
&= \bigl(X(E[T] + 1)\bigr)(l_0, x_0)
\end{aligned}
\]

where $E[T] := \lambda(l, x).\, E[T(l, x)]$.

14 / 36

slide-22
SLIDE 22

Expected runtime is lfp

\[ E[T] = X(E[T] + 1) \]

In fact, E[T] is the “least” fixed point of $F_1(\eta) := X(\eta + 1)$.

  • $F_1$ is a monotone function on the complete lattice $[0, \infty]^{L \times \mathbb{R}^V}$
  • $F_1$ adds 1 unit of time, and then calculates the expected value after one transition

15 / 36

slide-23
SLIDE 23

Ranking supermartingale is prefixed point

η is a ranking supermartingale $\iff$ η is a prefixed point of $F_1$:

\[ F_1 \eta = X(\eta + 1) \le \eta \]

Theorem (Knaster–Tarski)

Let L be a complete lattice and F : L → L be a monotone function. The least fixed point µF is the least prefixed point. Therefore we have

\[ F\eta \le \eta \implies \mu F \le \eta. \]

It follows that

\[ \eta \text{ is a ranking supermartingale} \implies E[T] \le \eta. \]

16 / 36

slide-25
SLIDE 25

Our supermartingale

                        | [Chakarov & Sankaranarayanan, CAV’13] | Our supermartingale
lattice                 | L × R^V → [0, ∞]                      | L × R^V → [0, ∞]^K
monotone function F     | F_1                                   | F_K
lfp µF                  | E[T]                                  | (E[T], ..., E[T^K])†
prefixed point Fη ≤ η   | ranking supermartingale η             | ranking supermartingale for higher moments η
Knaster–Tarski µF ≤ η   | E[T] ≤ η                              | (E[T], ..., E[T^K]) ≤ η

†for a pCFG without nondeterminism

17 / 36

slide-28
SLIDE 28

Characterizing E[T^2] as lfp?

[Figure: (l0, x0) moves to (l1, x1) with probability p and to (l2, x2) with probability 1 − p]

\[
\begin{aligned}
E[T^2](l_0, x_0) &= p\,E[(T(l_1, x_1) + 1)^2] + (1 - p)\,E[(T(l_2, x_2) + 1)^2] \\
&= \bigl(X(E[T^2] + 2E[T] + 1)\bigr)(l_0, x_0)
\end{aligned}
\]

Calculate E[T] and E[T^2] simultaneously.

19 / 36

slide-29
SLIDE 29

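Characterizing E[T] and E[T^2] as lfp

\[
\begin{pmatrix} E[T] \\ E[T^2] \end{pmatrix}
= X\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix}
+ \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}
\begin{pmatrix} E[T] \\ E[T^2] \end{pmatrix} \right)
\]

In fact, (E[T], E[T^2]) is the “least” fixed point of

\[
F_2 \begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix}
:= X\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix}
+ \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}
\begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix} \right)
\]

where

  • $\eta_1, \eta_2 : L \times \mathbb{R}^V \to [0, \infty]$
  • $E[T] = \lambda(l, x).\, E[T(l, x)]$
  • $E[T^2] = \lambda(l, x).\, E[(T(l, x))^2]$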
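The least fixed point of F2 can be approximated by iterating F2 from the bottom element (0, 0). The sketch below does this for a deliberately tiny example that is not taken from the slides: a hypothetical one-location loop that exits with probability p on each step and otherwise stays where it is, so that T is geometrically distributed. The names `F2` and `least_fixed_point` and the iteration count are illustrative choices.

```python
def F2(eta1, eta2, p):
    """One application of F2 at the single non-terminal location.

    With probability p the next configuration is terminal (eta = 0 there);
    with probability 1 - p the loop stays at the same location.
    """
    # First component: X(eta1 + 1)
    new1 = p * (0 + 1) + (1 - p) * (eta1 + 1)
    # Second component: X(eta2 + 2 * eta1 + 1)
    new2 = p * (0 + 2 * 0 + 1) + (1 - p) * (eta2 + 2 * eta1 + 1)
    return new1, new2

def least_fixed_point(p, iterations=500):
    """Approximate the lfp by repeated application of F2 starting from (0, 0)."""
    eta1, eta2 = 0.0, 0.0
    for _ in range(iterations):
        eta1, eta2 = F2(eta1, eta2, p)
    return eta1, eta2

m1, m2 = least_fixed_point(0.5)
```

For p = 0.5 the iterates approach (2, 6), matching the closed forms E[T] = 1/p and E[T^2] = (2 − p)/p^2 of the geometric distribution. Note that the second component cannot be iterated on its own, because it depends on the first.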

20 / 36

slide-30
SLIDE 30

Characterizing higher moments as lfp

In the same way, we can define

\[ F_K : \bigl(L \times \mathbb{R}^V \to [0, \infty]^K\bigr) \to \bigl(L \times \mathbb{R}^V \to [0, \infty]^K\bigr) \]

that characterizes higher moments of runtime.

Lemma

  • For a pCFG without nondeterminism, $(E[T], \dots, E[T^K]) = \mu F_K$.
  • In general (with nondeterminism), $(E[T], \dots, E[T^K]) \le \mu F_K$.

21 / 36

slide-31
SLIDE 31

Supermartingale is a prefixed point

Definition

A ranking supermartingale for the K-th moment is a prefixed point $\eta = (\eta_1, \dots, \eta_K)$ of $F_K$:

\[ F_K \eta \le \eta \]

By the Knaster–Tarski theorem, η gives an upper bound (even with nondeterminism):

\[
\begin{pmatrix} E[T] \\ \vdots \\ E[T^K] \end{pmatrix}
\le \mu F_K \le
\begin{pmatrix} \eta_1 \\ \vdots \\ \eta_K \end{pmatrix}
\]

22 / 36

slide-33
SLIDE 33

Problem

Assume

  • $d > 0$,
  • T is a nonnegative random variable,
  • $(E[T], \dots, E[T^K]) \le (u_1, \dots, u_K)$,
  • but we do not know the exact values of $E[T], \dots, E[T^K]$.

How to obtain an upper bound of $\Pr(T \ge d)$?

24 / 36

slide-34
SLIDE 34

If K = 1 ...

Theorem (Markov’s inequality)

If T is a nonnegative r.v. and d > 0,

\[ \Pr(T \ge d) \le \frac{E[T]}{d}. \]

By $E[T] \le u_1$,

\[ \Pr(T \ge d) \le \frac{E[T]}{d} \le \frac{u_1}{d}. \]

25 / 36

slide-35
SLIDE 35

General case

  • For any $k \in \{1, \dots, K\}$,

\[ \Pr(T \ge d) = \Pr(T^k \ge d^k) \le \frac{E[T^k]}{d^k} \le \frac{u_k}{d^k} \]

  • (“0-th” moment)

\[ \Pr(T \ge d) \le 1 = \frac{E[T^0]}{d^0} \]

26 / 36

slide-36
SLIDE 36

Concentration inequality we used

\[ \Pr(T \ge d) \le \min_{k = 0, \dots, K} \frac{u_k}{d^k} \]

where

  • $d > 0$
  • T is a nonnegative random variable
  • $(E[T], \dots, E[T^K]) \le (u_1, \dots, u_K)$
  • $u_0 = 1$

Moreover, this gives the “optimal” upper bound under the above conditions.
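The inequality is straightforward to evaluate once the moment bounds are known. The sketch below is a minimal illustration; the function name `tail_bound` is hypothetical, and the input values are the upper bounds on E[T], ..., E[T^5] reported for the coupon collector example in the experimental results.

```python
def tail_bound(moment_bounds, d):
    """Return the minimum over k = 0..K of u_k / d**k, with u_0 = 1.

    moment_bounds[k - 1] is an upper bound u_k on E[T**k].
    """
    if d <= 0:
        raise ValueError("deadline d must be positive")
    candidates = [1.0]  # k = 0: the trivial bound Pr(T >= d) <= 1
    candidates += [u / d**k for k, u in enumerate(moment_bounds, start=1)]
    return min(candidates)

# Upper bounds on E[T], ..., E[T^5] from the coupon collector experiment.
u = [68, 3124, 171932, 12049876, 1048131068]
bound_100 = tail_bound(u, 100)  # the k = 5 term is the smallest here
bound_20 = tail_bound(u, 20)    # every k >= 1 term exceeds 1, so k = 0 wins
```

With these inputs the bound at d = 100 is about 0.105, attained at k = 5, while at d = 20 only the trivial bound 1 is available. This is why higher moments pay off mainly for larger deadlines.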

27 / 36

slide-38
SLIDE 38

Synthesis (linear template)

Based on [Chakarov & Sankaranarayanan, CAV’13]

  • Input: a pCFG with initial config $(l_{\mathrm{init}}, x_{\mathrm{init}})$
  • Output: an upper bound of $E[T^K](l_{\mathrm{init}}, x_{\mathrm{init}})$

Assume that $\eta = (\eta_1, \dots, \eta_K)$ is linear:

\[ \eta_k(l, x) = a_{k,l} \cdot x + b_{k,l} \quad (k = 1, \dots, K) \]

Determine $a_{k,l}, b_{k,l}$ by solving the LP problem:

  • minimize: $\eta_K(l_{\mathrm{init}}, x_{\mathrm{init}})$
  • subject to: ranking supermartingale condition (using Farkas’ lemma)

Then we have

\[ E[T^K](l_{\mathrm{init}}, x_{\mathrm{init}}) \le \min \eta_K(l_{\mathrm{init}}, x_{\mathrm{init}}) \]
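As a hand-worked illustration of what the LP encodes for K = 1, consider the random-walk program from the "Randomized program" slide. The derivation below is a simplified sketch and not the actual encoding via Farkas’ lemma: it assumes a template only at the loop head, $\eta(l_2, x) = ax + b$, the invariant that x is a nonnegative integer there, $\eta(l_6, x) = 0$ at the terminal location, and that one loop iteration consists of three transitions (so the expected decrease over an iteration must be at least 3).

```latex
\begin{align*}
\text{exit } (x = 0):\quad
  & 0 \le \eta(l_2, 0) - 1
    \iff b \ge 1 \\
\text{one iteration } (x > 0):\quad
  & 0.4\,\bigl(a(x + 1) + b\bigr) + 0.6\,\bigl(a(x - 1) + b\bigr) \le ax + b - 3 \\
  & \iff ax + b - 0.2a \le ax + b - 3 \\
  & \iff a \ge 15 \\
\text{objective:}\quad
  & \eta(l_1, \cdot) = \eta(l_2, 5) + 1 = 5a + b + 1
    \;\text{ is minimized at } a = 15,\ b = 1,\ \text{giving } 77.
\end{align*}
```

Under these assumptions the resulting bound is E[T] ≤ 77 transitions from the initial configuration. In the general method the constraints must hold for all valuations satisfying the guards, which is where Farkas’ lemma turns them into linear constraints on the coefficients.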

29 / 36

slide-39
SLIDE 39

Synthesis (polynomial template)

Based on [Chatterjee et al., CAV’16]

  • Input: a pCFG with initial config $(l_{\mathrm{init}}, x_{\mathrm{init}})$
  • Output: an upper bound of $E[T^K](l_{\mathrm{init}}, x_{\mathrm{init}})$

Assume that $\eta = (\eta_1, \dots, \eta_K)$ is polynomial. Determine coefficients by solving the SDP problem:

  • minimize: $\eta_K(l_{\mathrm{init}}, x_{\mathrm{init}})$
  • subject to: ranking supermartingale condition (using Positivstellensatz)

Then we have

\[ E[T^K](l_{\mathrm{init}}, x_{\mathrm{init}}) \le \min \eta_K(l_{\mathrm{init}}, x_{\mathrm{init}}) \]

30 / 36

slide-40
SLIDE 40

Experiments

  • Implementation based on linear/polynomial templates
  • Tested 7 example programs
      • 2 coupon collector’s problems
      • 5 random walks (some of them include nondeterminism)
  • (degree of polynomial template) ≤ 3

31 / 36

slide-41
SLIDE 41

Experimental result (1)

A coupon collector’s problem (linear template)

upper bound            | execution time
E[T] ≤ 68              | 0.024 s
E[T^2] ≤ 3124          | 0.054 s
E[T^3] ≤ 171932        | 0.089 s
E[T^4] ≤ 12049876      | 0.126 s
E[T^5] ≤ 1048131068    | 0.191 s

32 / 36

slide-42
SLIDE 42

Experimental result (1)

\[ \Pr(T \ge d) \le \min_{k = 0, \dots, K} \frac{u_k}{d^k} \]

[Figure: upper bound of the tail probability (vertical axis, 0 to 1) against deadline d (horizontal axis, 20 to 140), with one curve for each of k = 1, ..., 5]

33 / 36

slide-49
SLIDE 49

Experimental result (2)

A random walk with nondeterminism

  • Linear template

    upper bound          | execution time
    E[T] ≤ 96            | 0.020 s
    E[T^2]: infeasible   | 0.029 s

  • Polynomial template

    upper bound          | execution time
    E[T] ≤ 95.95         | 157.748 s
    E[T^2] ≤ 10944.0     | 361.957 s

34 / 36

slide-50
SLIDE 50

Experimental result (2)

[Figure: upper bound of the tail probability (vertical axis, 0 to 1) against deadline d (horizontal axis, 100 to 400), with curves for k = 1 and k = 2]

35 / 36

slide-51
SLIDE 51

Conclusion & Future work

Conclusion

  • New supermartingale for higher moments of runtime
  • Applied to obtain upper bounds of tail probabilities
  • Tested our method experimentally

Future work

  • Improved treatment of nondeterminism
  • Compositional reasoning (cf. [Kaminski et al., ESOP’16])
  • Improve implementation (numerical error of SDP solver)

36 / 36