
SLIDE 1

Construction of Lyapunov functions via relative entropy with application to caching

Nicolas Gast (Inria)

ACM MAMA 2016, Antibes, France

SLIDE 2

Outline

1. Why?
2. How to make the fixed point method work (sufficient condition)
3. What: application to caching policy
4. Conclusion

SLIDE 4

State space explosion and mean-field method

$3^{13} \approx 10^6$ states. We need to keep track of $P(X_1(t) = i_1, \dots, X_n(t) = i_n)$.

The decoupling assumption is

$$P(X_1(t) = i_1, \dots, X_n(t) = i_n) \approx P(X_1(t) = i_1) \cdots P(X_n(t) = i_n).$$

Problem: is this valid?
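
To make the state-space explosion concrete, here is a small sketch of the count. It assumes the slide's $3^{13}$ means 13 objects with 3 states each; the variable names are illustrative only.

```python
# Assumption: 13 objects, each with 3 possible states (reading of the slide's 3^13).
n_objects, n_states = 13, 3

# Exact description: one probability per joint state (i_1, ..., i_n).
joint_size = n_states ** n_objects

# Decoupled description: one marginal probability per (object, state) pair.
marginal_size = n_objects * n_states

print(joint_size, marginal_size)
```

The joint description grows exponentially with the number of objects, while the decoupled one grows linearly, which is why the validity of the decoupling assumption matters.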

SLIDE 6

Decoupling assumption: (always) valid in transient regime

Theorem (Kurtz (70's), Benaïm, Le Boudec (08), ...)

For many systems and any fixed $t$, if $x \mapsto xQ(x)$ is Lipschitz-continuous then, as the number of objects $N$ goes to infinity,

$$\lim_{N \to \infty} P(X_k(t) = i) = x_{k,i}(t),$$

where $x$ satisfies $\dot{x} = xQ(x)$.

[Figure: probability in cache versus number of requests (0 to 10000), for 1 list (200) and 4 lists (50/50/50/50). Panels: simulation; mean-field approximation $\dot{x} = xQ(x)$; simulation overlaid with the ODE approximation.]

SLIDE 7

The fixed point method

We know that $x_i(t) \approx P(X(t) = i)$ satisfies $\dot{x} = xQ(x)$. Does the stationary $P(X = i)$ satisfy $xQ(x) = 0$?

This method was used in many papers: Bianchi 00 [2], Ramaiyan et al. 08 [3], Kwak et al. 05 [4], Kumar et al. 08 [5].

[2] Performance analysis of the IEEE 802.11 distributed coordination function. G. Bianchi. IEEE J. Select. Areas Commun., 2000.
[3] Fixed point analysis of single cell IEEE 802.11e WLANs: Uniqueness, multistability. V. Ramaiyan, A. Kumar, and E. Altman. ACM/IEEE Trans. Networking, Oct. 2008.
[4] Performance analysis of exponential backoff. B.-J. Kwak, N.-O. Song, and L. Miller. ACM/IEEE Trans. Networking, 2005.
[5] New insights from a fixed-point analysis of single cell IEEE 802.11 WLANs. A. Kumar, E. Altman, D. Miorandi, and M. Goyal. ACM/IEEE Trans. Networking, 2007.

SLIDE 10

It does not always work [6] [7]

[Diagram: three states S, I, R with transition rates $1 + \frac{10 I}{S + a}$, $5$ and $10 S + 10^{-3}$.]

The Markov chain is irreducible. The fixed-point equation $xQ(x) = 0$ has a unique solution.

            Fixed point xQ(x) = 0    Stationary measure (N = 1000)
            x_S      x_I             π_S      π_I
a = 0.3     0.209    0.234           0.209    0.234
a = 0.1     0.078    0.126           0.11     0.13

[6] Benaïm, Le Boudec 08.
[7] Cho, Le Boudec, Jiang. On the Asymptotic Validity of the Decoupling Assumption for Analyzing 802.11 MAC Protocol. 2010.

SLIDE 11

It does not always work

[Figure: simplex with corners S, I and R, showing the fixed point, the true stationary distribution and the limit cycle.]
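
The fixed point of the S, I, R example can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's code: it reads the three rates in the diagram as $1 + 10I/(S+a)$, $5$ and $10S + 10^{-3}$, and assumes they label the transitions S to I, I to R and R to S respectively. The function names are hypothetical.

```python
def sir_drift(s, i, a):
    """Mean-field drift (ds/dt, di/dt) for the assumed S -> I -> R -> S rates."""
    r = 1.0 - s - i
    s_to_i = s * (1.0 + 10.0 * i / (s + a))  # assumed rate of S -> I
    i_to_r = 5.0 * i                         # assumed rate of I -> R
    r_to_s = r * (10.0 * s + 1e-3)           # assumed rate of R -> S
    return r_to_s - s_to_i, s_to_i - i_to_r


def sir_fixed_point(a, tol=1e-12):
    """Solve xQ(x) = 0 by bisection on s, after eliminating i."""
    def infected(s):
        # Balance s_to_i == i_to_r gives i = s (s + a) / (5 (a - s)), for 0 <= s < a.
        return s * (s + a) / (5.0 * (a - s))

    def residual(s):
        return sir_drift(s, infected(s), a)[0]

    lo, hi = 0.0, a * (1.0 - 1e-9)  # residual(lo) > 0 and residual(hi) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    return s, infected(s)


s, i = sir_fixed_point(0.3)
print(round(s, 3), round(i, 3))
```

Under these assumed rates, the value found for $a = 0.3$ is close to the fixed point reported in the table ($x_S \approx 0.209$, $x_I \approx 0.234$). The sketch only locates the fixed point; it says nothing about whether trajectories converge to it, which is exactly what fails when a limit cycle appears.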

SLIDE 13

Outline

1. Why?
2. How to make the fixed point method work (sufficient condition)
3. What: application to caching policy
4. Conclusion

SLIDE 14

Link between the decoupling assumption and ˙ x = xQ(x)

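$$P(X_1(t) = i_1, \dots, X_n(t) = i_n) \approx \underbrace{P(X_1(t) = i_1)}_{= x_{1,i_1}(t)} \cdots \underbrace{P(X_n(t) = i_n)}_{= x_{n,i_n}(t)}$$

When we zoom on one object:

$$P(X_1(t + dt) = j \mid X_1(t) = i) \approx E\left[P(X_1(t + dt) = j \mid X_1 = i \wedge X_2 \dots X_n)\right] \approx Q^{(1)}_{i,j}(x),$$

where

$$Q^{(1)}_{i,j}(x) := \sum_{i_2 \dots i_n} K_{(i, i_2 \dots i_n) \to (j, j_2 \dots j_n)} \, x_{2,i_2} \cdots x_{n,i_n}.$$

We then get:

$$\frac{d}{dt} x_{1,j}(t) \approx \sum_i x_{1,i} \, Q^{(1)}_{i,j}(x).$$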
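
The resulting equation $\frac{d}{dt} x_j = \sum_i x_i Q_{i,j}(x)$ can be integrated numerically. The following is a minimal sketch on a toy two-state object whose generator is purely hypothetical (it is not taken from the talk); it uses an explicit Euler scheme.

```python
def generator(x):
    """Toy state-dependent generator Q(x) for one object with states 0 and 1."""
    up = 1.0 + x[1]  # hypothetical rate 0 -> 1, increasing with the mass in state 1
    down = 2.0       # hypothetical rate 1 -> 0
    return [[-up, up], [down, -down]]


def drift(x):
    """Right-hand side of dx/dt = x Q(x), i.e. (xQ)_j = sum_i x_i Q_ij(x)."""
    Q = generator(x)
    n = len(x)
    return [sum(x[i] * Q[i][j] for i in range(n)) for j in range(n)]


def integrate(x, dt=0.01, steps=2000):
    """Explicit Euler integration of the mean-field ODE."""
    for _ in range(steps):
        dx = drift(x)
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return x


x_final = integrate([1.0, 0.0])
print(x_final)
```

For this toy generator the drift of $x_1$ is $1 - x_1^2 - 2 x_1$, so the trajectory settles at the fixed point $x_1 = \sqrt{2} - 1$ of $xQ(x) = 0$.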

SLIDE 18

Exchangeability of limits

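[Diagram: a square of limits.]

◮ Markov chain: transient regime $\dot{p} = pK$; as $t \to \infty$, stationary regime $\pi K = 0$.
◮ Mean-field: transient regime $\dot{x} = xQ(x)$; as $t \to \infty$, fixed points $xQ(x) = 0$.
◮ $N \to \infty$ takes $\dot{p} = pK$ to $\dot{x} = xQ(x)$. Does $N \to \infty$ also take $\pi K = 0$ to $xQ(x) = 0$? If yes for the limit $t \to \infty$ of the mean-field model, then yes.

Theorem ((i) Benaïm, Le Boudec 08, (ii) Le Boudec 12)

The stationary distribution $\pi^N$ concentrates on the fixed points if:
(i) all trajectories of the ODE converge to the fixed points, or
(ii) the Markov chain is reversible.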

SLIDE 20

Lyapunov functions

A solution of $\frac{d}{dt} x(t) = x(t) Q(x(t))$ converges to the fixed points of $xQ(x) = 0$ if there exists a Lyapunov function $f$, that is:

◮ Lower bounded: $\inf_x f(x) > -\infty$.
◮ Decreasing along trajectories: $\frac{d}{dt} f(x(t)) < 0$ whenever $x(t) Q(x(t)) \neq 0$.

How to find a Lyapunov function?

Energy? Distance? Entropy? Luck?

SLIDE 21

The relative entropy is a Lyapunov function for Markov chains

Let $Q$ be the generator of an irreducible Markov chain and $\pi$ be its stationary distribution. Let $P(t)$ be the solution of $\frac{d}{dt} P(t) = P(t) Q$.

Theorem (e.g. Budhiraja et al. 15, Dupuis-Fischer 11)

The relative entropy $R(P \| \pi) = \sum_i P_i \log \frac{P_i}{\pi_i}$ is a Lyapunov function: $\frac{d}{dt} R(P(t) \| \pi) \le 0$, with equality if and only if $P(t) = \pi$.
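
The monotone decrease of the relative entropy can be observed numerically. The sketch below uses a hypothetical 3-state cyclic generator (chosen because its stationary distribution is uniform) and an explicit Euler discretisation of $\frac{d}{dt} P = PQ$; none of these choices come from the talk.

```python
import math


def relative_entropy(p, pi):
    """R(p || pi) = sum_i p_i log(p_i / pi_i), with the convention 0 log 0 = 0."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, pi) if pk > 0.0)


def euler_step(p, Q, dt):
    """One explicit Euler step of dP/dt = P Q (P is a row vector)."""
    n = len(p)
    return [p[j] + dt * sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]


# Hypothetical irreducible generator: cycle 0 -> 1 -> 2 -> 0 at rate 1.
Q = [[-1.0, 1.0, 0.0],
     [0.0, -1.0, 1.0],
     [1.0, 0.0, -1.0]]
pi = [1.0 / 3, 1.0 / 3, 1.0 / 3]  # stationary distribution: pi Q = 0

p = [0.9, 0.05, 0.05]
entropies = [relative_entropy(p, pi)]
for _ in range(500):
    p = euler_step(p, Q, 0.01)
    entropies.append(relative_entropy(p, pi))

print(entropies[0], entropies[-1])
```

Each Euler step multiplies $P$ by the stochastic matrix $I + dt\,Q$, which also leaves $\pi$ invariant, so the sequence of relative entropies is non-increasing for the discretised dynamics as well.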

SLIDE 23

Relative entropy for mean-field models

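Assume that $Q(x)$ is the generator of an irreducible Markov chain and let $\pi(x)$ be its stationary distribution. Let $P(t)$ be the solution of $\frac{d}{dt} P(t) = P(t) Q(P(t))$, and write $x(t) = P(t)$ and $\pi(t) = \pi(P(t))$. Then

$$\frac{d}{dt} R(P(t) \| \pi(t)) = \underbrace{\frac{d}{dt} P(t) \cdot \frac{\partial}{\partial P} R(P(t), \pi(t))}_{\le 0} + \underbrace{\frac{d}{dt} \pi(t) \cdot \frac{\partial}{\partial \pi} R(P(t), \pi(t))}_{= -\sum_i x_i(t) \frac{d}{dt} \log \pi_i(t)} \le -\sum_i x_i(t) \frac{d}{dt} \log \pi_i(t).$$

Theorem

If there exists a lower bounded function $F(x)$ whose derivative along trajectories compensates the right-hand side, that is $\frac{d}{dt} F(x(t)) = \sum_i x_i(t) \frac{d}{dt} \log \pi_i(t)$, then $x \mapsto R(x \| \pi(x)) + F(x)$ is a Lyapunov function for the mean-field model.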
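
The second term of the derivative can be worked out explicitly. The short derivation below only uses the definition $R(P \| \pi) = \sum_i P_i \log (P_i / \pi_i)$ and the identification $x_i = P_i$.

```latex
\begin{align*}
R(P \| \pi) &= \sum_i P_i \log \frac{P_i}{\pi_i}
  \quad\Longrightarrow\quad
  \frac{\partial R}{\partial \pi_i} = -\frac{P_i}{\pi_i}, \\
\sum_i \frac{d \pi_i}{dt} \, \frac{\partial R}{\partial \pi_i}
  &= -\sum_i P_i \, \frac{1}{\pi_i} \frac{d \pi_i}{dt}
   = -\sum_i P_i \, \frac{d}{dt} \log \pi_i .
\end{align*}
```

This is why the extra term only involves $\frac{d}{dt} \log \pi_i$: the dependence of $\pi$ on the state enters through its logarithmic derivative, weighted by the current distribution.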

SLIDE 24

Outline

1. Why?
2. How to make the fixed point method work (sufficient condition)
3. What: application to caching policy
4. Conclusion

SLIDE 27

I consider a cache (virtually) divided into lists

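[Diagram: an application, a data source, and a cache split into list 1, ..., list j, list j+1, ..., list h, with arrows labelled "hit" and "miss".]

◮ IRM: item $i$ is requested with probability $p_i$.
◮ RAND: upon a hit or a miss, the requested item is exchanged with a random item from the next list.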
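
A small simulator helps fix ideas about the list-based RAND policy under IRM. This is a sketch under assumptions that the slide does not spell out: on a miss the requested item replaces a random item of list 1 (which leaves the cache), on a hit in list j it is swapped with a random item of list j+1, and a hit in the last list changes nothing. The function name, initial cache content and Zipf-like popularities are illustrative choices.

```python
import random


def simulate_rand_lists(popularities, list_sizes, n_requests, seed=0):
    """Simulate a list-based RAND cache under IRM; return (lists, hit_ratio)."""
    rng = random.Random(seed)
    items = list(range(len(popularities)))
    # Arbitrary initial content: fill the lists with items 0, 1, 2, ...
    lists, start = [], 0
    for size in list_sizes:
        lists.append(items[start:start + size])
        start += size
    # position[k] = (list index, slot) if item k is cached, else None.
    position = {k: None for k in items}
    for j, content in enumerate(lists):
        for slot, k in enumerate(content):
            position[k] = (j, slot)
    hits = 0
    for _ in range(n_requests):
        k = rng.choices(items, weights=popularities)[0]  # IRM request
        pos = position[k]
        if pos is None:
            # Miss: k replaces a random item of the first list (assumed eviction rule).
            slot = rng.randrange(list_sizes[0])
            evicted = lists[0][slot]
            lists[0][slot] = k
            position[evicted], position[k] = None, (0, slot)
        else:
            hits += 1
            j, slot = pos
            if j + 1 < len(lists):
                # Hit in list j: swap k with a random item of the next list.
                other_slot = rng.randrange(list_sizes[j + 1])
                other = lists[j + 1][other_slot]
                lists[j][slot], lists[j + 1][other_slot] = other, k
                position[other], position[k] = (j, slot), (j + 1, other_slot)
    return lists, hits / n_requests


pops = [1.0 / (k + 1) for k in range(50)]  # hypothetical Zipf-like popularities
lists, hit_ratio = simulate_rand_lists(pops, [5, 5, 5, 5], 5000, seed=1)
print(hit_ratio)
```

The list sizes never change, only the identity of the items they hold, which is what makes the per-item state "which list am I in" a natural object for the mean-field analysis.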

SLIDE 29

We construct the ODE by assuming independence

Let $H_i(t)$ be the popularity in list $i$. If $x_{k,i}(t)$ is the probability that item $k$ is in list $i$ at time $t$, we obtain an ODE of the type $\dot{x} = xQ(x)$.
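
One way such an ODE can be written down is sketched below. It is a reconstruction under assumptions, not the equation from the slide: list 0 stands for "not in the cache", $H_i = \sum_k p_k x_{k,i}$ is taken as the popularity in list $i$, an item in list $i$ moves up at rate $p_k$, and it is pushed down from list $i+1$ at rate $H_i / m_{i+1}$ (it is the random item chosen for the exchange). All function names are hypothetical.

```python
def rand_cache_drift(x, p, m):
    """Drift of x[k][i] = P(item k is in list i); list 0 means 'not cached'."""
    n, h = len(p), len(m)
    # H[i]: total request rate of the items currently in list i (assumed definition).
    H = [sum(p[k] * x[k][i] for k in range(n)) for i in range(h + 1)]
    dx = [[0.0] * (h + 1) for _ in range(n)]
    for k in range(n):
        for i in range(h):
            up = p[k] * x[k][i]                # k is requested while in list i
            down = H[i] / m[i] * x[k][i + 1]   # k is the random item swapped down
            dx[k][i] += down - up
            dx[k][i + 1] += up - down
    return dx


def integrate_cache_ode(x, p, m, dt, steps):
    """Explicit Euler integration of the mean-field ODE."""
    for _ in range(steps):
        dx = rand_cache_drift(x, p, m)
        x = [[xi + dt * di for xi, di in zip(row, drow)]
             for row, drow in zip(x, dx)]
    return x


n, m = 20, [2, 2, 2]  # 20 items, three lists of size 2 (illustrative)
weights = [1.0 / (k + 1) for k in range(n)]
p = [w / sum(weights) for w in weights]
# Start from a uniform placement: every item is equally likely to be anywhere.
x0 = [[1.0 - sum(m) / n] + [mi / n for mi in m] for _ in range(n)]
x = integrate_cache_ode(x0, p, m, dt=0.1, steps=2000)
print(x[0][-1], x[-1][-1])
```

Writing the drift as flows between adjacent lists keeps each item's probabilities summing to one, and the expected occupancy of each list stays equal to its size. The dependence of the downward rates on $H$ is what makes the generator depend on $x$.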

SLIDE 30

Transient regime: this approximation is accurate

[Figure: probability in cache versus number of requests (0 to 10000), for 1 list (200) and 4 lists (50/50/50/50); simulation compared with the ODE approximation.]

SLIDE 32

Stationary distribution: uniqueness of the fixed point

By simulation: very accurate


SLIDE 33

Relative entropy for the caching model

The stationary measure $\pi_{k,i}$ satisfies

$$\pi_{k,i}(x) = \frac{\prod_{j=1}^{i-1} \lambda_{k,j} / \mu_j(x)}{\sum_{j'=1}^{h} \prod_{j=1}^{j'-1} \lambda_{k,j} / \mu_j(x)}.$$

Very similar to (Fricker, G. 14), (Fricker et al. 12), (Tibi 11).
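
The product form above is the stationary distribution of a birth-death chain, and it is easy to evaluate. The sketch below treats the up rates $\lambda_j$ and down rates $\mu_j$ of a single item as given numbers; the numerical values and the function name are hypothetical.

```python
def birth_death_stationary(lam, mu):
    """Stationary distribution pi_i proportional to prod_{j < i} lam_j / mu_j."""
    weights = [1.0]
    for up_rate, down_rate in zip(lam, mu):
        weights.append(weights[-1] * up_rate / down_rate)
    total = sum(weights)
    return [w / total for w in weights]


lam = [2.0, 1.0]  # hypothetical rates of moving up from state 1 and state 2
mu = [1.0, 4.0]   # hypothetical rates of moving back down to state 1 and state 2
pi = birth_death_stationary(lam, mu)
print(pi)
```

In the caching model the down rates depend on $x$, so $\pi(x)$ has to be recomputed along the trajectory; the sketch only covers the evaluation for one fixed $x$.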

SLIDE 34

Outline

1. Why?
2. How to make the fixed point method work (sufficient condition)
3. What: application to caching policy
4. Conclusion

SLIDE 35

Conclusion

Decoupling assumption: OK in the transient regime.

The fixed point method is not always valid. We need either:

◮ Reversibility
◮ Lyapunov function

To find Lyapunov functions, we need problem-specific ideas:

◮ Physics: energy.
◮ Markov chains: relative entropy (since it decreases along trajectories).

Yet... the method is not robust (e.g. non-IRM, LRU instead of RAND).

SLIDE 36

Thank you!

http://mescal.imag.fr/membres/nicolas.gast
nicolas.gast@inria.fr

G., Van Houdt 15
Transient and Steady-state Regime of a Family of List-based Cache Replacement Algorithms. Gast, Van Houdt. ACM Sigmetrics 2015.

G. 16
Construction of Lyapunov functions via relative entropy with application to caching. N. Gast. ACM MAMA 2016.

Benaïm, Le Boudec 08
A class of mean field interaction models for computer and communication systems. M. Benaïm and J.-Y. Le Boudec. Performance Evaluation, 2008.

Le Boudec 10
The stationary behaviour of fluid limits of reversible processes is concentrated on stationary points. J.-Y. Le Boudec. arXiv:1009.5021, 2010.

Budhiraja et al. 15
Limits of relative entropies associated with weakly interacting particle systems. A. S. Budhiraja, P. Dupuis, M. Fischer, and K. Ramanan. Electronic Journal of Probability, 20, 2015.

Fricker, Gast 14
Incentives and redistribution in homogeneous bike-sharing systems with stations of finite capacity. C. Fricker and N. Gast. EURO Journal on Transportation and Logistics: 1-31, 2014.

Fricker et al. 13
Mean field analysis for inhomogeneous bike sharing systems. Fricker, Gast, Mohamed. Discrete Mathematics and Theoretical Computer Science (DMTCS).

SLIDE 37

Outline

5. On the non-optimality of too many lists

SLIDE 38

Increasing the number of lists is not always better [8]

The scheme seems to sort the items from least popular to most popular:

[Diagram: lists of sizes $m_1, \dots, m_j, m_{j+1}, \dots, m_h$; hits move items from the lists holding less popular items toward the lists holding popular items.]

Which is better? Six lists, $m = (1, 1, 1, 1, 1, 1)$, or three lists, $m = (1, 1, 4)$?

[8] Contrary to the conjecture of O. I. Aven, E. G. Coffman, Jr., and Y. A. Kogan. Stochastic Analysis of Computer Storage. Kluwer Academic Publishers, Norwell, MA, USA, 1987.

SLIDE 39

Increasing the number of lists is not always better

Six lists, $m = (1, 1, 1, 1, 1, 1)$, versus three lists, $m = (1, 1, 4)$: having 3 lists of sizes (1, 1, 4) is better than 6 lists of size 1. The same holds for the mean-field approximation.