Transient and Steady-state Regime of a Family of List-based Cache Replacement Algorithms (PowerPoint presentation)



SLIDE 1

Transient and Steady-state Regime of a Family of List-based Cache Replacement Algorithms

Nicolas Gast¹, Benny Van Houdt²
Sigmetrics 2015, Portland, Oregon

¹Inria  ²University of Antwerp
Nicolas Gast – 1 / 26

SLIDE 2

Caches are everywhere

[Diagram: user/application ↔ cache (fast) ↔ data source (slow)]

Examples: processor, database, CDN. Single cache or hierarchy of caches.

SLIDE 3

In this talk, I focus on a single cache.

The question is: which item to replace?

[Diagram: the application sends requests to the cache, which fetches misses from the data source]

SLIDE 4

In this talk, I focus on a single cache.

The question is: which item to replace?

[Diagram: the application sends requests to the cache; a hit is served directly from the cache]

SLIDE 5

In this talk, I focus on a single cache.

The question is: which item to replace?

[Diagram: the application sends requests to the cache; a hit is served from the cache, a miss replaces one item]

Classical cache replacement policies: RAND, FIFO, LRU, CLIMB. Other approaches: time-to-live (TTL).

SLIDE 6

Our performance metric will be the hit probability

hit probability = (number of items served from the cache) / (total number of items served) = 1 − miss probability
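This definition translates directly into code; a minimal sketch (names are my own):

```python
def hit_probability(hits_from_cache: int, total_served: int) -> float:
    """hit probability = items served from the cache / total items served."""
    if total_served <= 0:
        raise ValueError("no items served")
    return hits_from_cache / total_served

print(hit_probability(30, 100))      # 0.3
print(1 - hit_probability(30, 100))  # miss probability: 0.7
```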

SLIDE 7

Our performance metric will be the hit probability

hit probability = (number of items served from the cache) / (total number of items served) = 1 − miss probability

Theoretical studies: IRM (started with [King 1971, Gelenbe 1973]). Practical studies use trace-based simulations. Approximations: link between TTL and cache replacement policies.

◮ FIFO and LRU: [Dan and Towsley 1990, Martina et al. 14, Fofack et al. 13, Berger et al. 14]

◮ LRU: Che approximation [Che 2002, Fricker et al. 12]

SLIDE 8

Contributions (and Outline of the talk)

We introduce a family of policies for which the cache is (virtually) divided into lists (generalization of FIFO/RANDOM)

[Figure: probability in cache vs. number of requests. Left: simulation, 1 list (200) and 4 lists (50/50/50/50). Right: the corresponding approximations.]

SLIDE 9

Contributions (and Outline of the talk)

We introduce a family of policies for which the cache is (virtually) divided into lists (generalization of FIFO/RANDOM)

1. We can compute the steady-state distribution under the IRM model in polynomial time.
   ◮ Disprove old conjectures.

[Figure: probability in cache vs. number of requests. Left: simulation, 1 list (200) and 4 lists (50/50/50/50). Right: the corresponding approximations.]

SLIDE 10

Contributions (and Outline of the talk)

We introduce a family of policies for which the cache is (virtually) divided into lists (generalization of FIFO/RANDOM)

1. We can compute the steady-state distribution under the IRM model in polynomial time.
   ◮ Disprove old conjectures.
2. We develop a mean-field approximation and show that it is accurate.
   ◮ Fast approximation of the steady-state distribution.
   ◮ We can characterize the transient behavior.

[Figure: probability in cache vs. number of requests. Left: simulation, 1 list (200) and 4 lists (50/50/50/50). Right: ODE approximation of the same configurations.]

SLIDE 11

Contributions (and Outline of the talk)

We introduce a family of policies for which the cache is (virtually) divided into lists (generalization of FIFO/RANDOM)

1. We can compute the steady-state distribution under the IRM model in polynomial time.
   ◮ Disprove old conjectures.
2. We develop a mean-field approximation and show that it is accurate.
   ◮ Fast approximation of the steady-state distribution.
   ◮ We can characterize the transient behavior.

[Figure: probability in cache vs. number of requests. Left: simulation, 1 list (200) and 4 lists (50/50/50/50). Right: ODE approximation of the same configurations.]

3. We provide guidelines on how to tune the parameters by using IRM and trace-based simulation.

SLIDE 12

Outline

1. Cache model and IRM
2. Steady-state performance under the IRM model
3. Fast and accurate mean-field approximation
4. How to choose the size of the lists?
5. Conclusion

SLIDE 13

Outline

1. Cache model and IRM
2. Steady-state performance under the IRM model
3. Fast and accurate mean-field approximation
4. How to choose the size of the lists?
5. Conclusion

SLIDE 14

I consider a cache (virtually) divided into lists

[Diagram: the cache sits between the application and the data source; it is divided into list 1, ..., list j, list j+1, ..., list h]

IRM: at each time step, item i is requested with probability p_i (the IRM assumption³).

³ L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf-like distributions: Evidence and implications. In INFOCOM '99, volume 1, pages 126-134. IEEE, 1999.

SLIDE 15

I consider a cache (virtually) divided into lists

[Diagram: the cache divided into lists 1, ..., h, with a miss arrow entering list 1]

IRM: at each time step, item i is requested with probability p_i (the IRM assumption³).
MISS: if item i is not in the cache, it is exchanged with an item from list 1 (FIFO or RAND).

³ L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf-like distributions: Evidence and implications. In INFOCOM '99, volume 1, pages 126-134. IEEE, 1999.

SLIDE 16

I consider a cache (virtually) divided into lists

[Diagram: the cache divided into lists 1, ..., h, with a miss arrow entering list 1 and hit arrows promoting items]

IRM: at each time step, item i is requested with probability p_i (the IRM assumption³).
MISS: if item i is not in the cache, it is exchanged with an item from list 1 (FIFO or RAND).
HIT: if item i is in list j (with j < h), it is exchanged with an item from list j + 1 (FIFO or RAND).

³ L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf-like distributions: Evidence and implications. In INFOCOM '99, volume 1, pages 126-134. IEEE, 1999.

SLIDE 17

Items in higher lists are (supposedly) more popular.

[Diagram: on a miss, less popular items enter at list 1; on a hit, items move toward list h, so popular items accumulate in the higher lists]

Cache size: m = m1 + · · · + mh. These algorithms are referred to as RAND(m) and FIFO(m).
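The MISS/HIT rules above can be sketched as a small simulator. This is a hedged sketch of the RAND(m) variant only (FIFO(m) would evict the oldest item of the list instead of a random one); the class and method names are my own:

```python
import random

class ListCache:
    """RAND(m)-style cache with h virtual lists of sizes m = (m1, ..., mh).
    MISS: the requested item replaces a random item of list 1.
    HIT in list j < h: the item is exchanged with a random item of list j+1."""

    def __init__(self, sizes, initial_items, seed=0):
        self.rng = random.Random(seed)
        it = iter(initial_items)
        self.lists = [[next(it) for _ in range(m)] for m in sizes]

    def request(self, item):
        """Process one request; return True on a hit, False on a miss."""
        for j, lst in enumerate(self.lists):
            if item in lst:
                if j + 1 < len(self.lists):  # promote into the next list
                    k = self.rng.randrange(len(self.lists[j + 1]))
                    i = lst.index(item)
                    lst[i], self.lists[j + 1][k] = self.lists[j + 1][k], lst[i]
                return True
        # miss: the new item evicts a random item of list 1
        self.lists[0][self.rng.randrange(len(self.lists[0]))] = item
        return False

cache = ListCache(sizes=(2, 2), initial_items=range(4))
print(cache.request(0))   # True: item 0 starts in list 1
print(cache.request(99))  # False: a miss brings 99 into list 1
```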

SLIDE 18

Outline

1. Cache model and IRM
2. Steady-state performance under the IRM model
3. Fast and accurate mean-field approximation
4. How to choose the size of the lists?
5. Conclusion

SLIDE 19

The steady-state is a product-form distribution

Same for RAND and FIFO.

SLIDE 20

The steady-state is a product-form distribution

Same for RAND and FIFO. Example of a cache of size 4 with 3 lists and m = (1, 2, 1): with item i in list 1, items j, k in list 2, and item ℓ in list 3, the probability of state (i, j, k, ℓ) is proportional to p_i (p_j p_k)² (p_ℓ)³.
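For small instances the product form can be enumerated directly. A brute-force sketch (function name and the sorted-tuple encoding of configurations are my own) that normalizes weights of the form p_i (p_j p_k)² (p_ℓ)³:

```python
from itertools import combinations
from math import prod

def product_form(p, m):
    """Steady-state distribution by brute force: an item in list i
    (1-indexed) contributes p_k ** i to the weight of a configuration."""
    def configs(avail, sizes):
        if not sizes:
            yield ()
            return
        for chosen in combinations(sorted(avail), sizes[0]):
            for rest in configs(avail - set(chosen), sizes[1:]):
                yield (chosen,) + rest
    weights = {
        c: prod(prod(p[k] for k in lst) ** (i + 1) for i, lst in enumerate(c))
        for c in configs(set(range(len(p))), list(m))
    }
    Z = sum(weights.values())           # normalizing constant
    return {c: w / Z for c, w in weights.items()}

# cache of size 4 with 3 lists, m = (1, 2, 1), four items
p = [0.4, 0.3, 0.2, 0.1]
dist = product_form(p, (1, 2, 1))
print(sum(dist.values()))  # 1.0 up to rounding
```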

SLIDE 21

We can compute the miss probability by using a dynamic programming approach (generalization of [Fagin, Price]⁵).

We want to compute

M(m) = Σ_{c ∈ C_n(m)} ( Σ_{k ∉ c} p_k ) π(c) = E(m + e₁, n) / E(m, n),

where

E(r, k) = Σ_{c ∈ C_k(r)} Π_{i=1..h} ( Π_{j=1..r_i} p_{c(i,j)} )^i.

We obtain a recursion formula on E(r, k), solvable in O(n × m₁ · · · m_h). The Dan and Towsley⁴ approximation is not needed to achieve polynomial time.
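The E-ratio formula can be checked numerically by brute force. One caveat on this sketch: it enumerates unordered lists rather than the ordered c(i, j) arrangements of the slide, which introduces an (m₁ + 1) factor in the ratio; with that factor the definition M(m) = Σ_c π(c) Σ_{k∉c} p_k matches exactly.

```python
from itertools import combinations
from math import prod

def configurations(items, sizes):
    """All ways to fill lists of the given sizes with distinct items."""
    if not sizes:
        yield ()
        return
    items = set(items)
    for chosen in combinations(sorted(items), sizes[0]):
        for rest in configurations(items - set(chosen), sizes[1:]):
            yield (chosen,) + rest

def weight(p, c):  # an item in list i (1-indexed) contributes p_k ** i
    return prod(prod(p[k] for k in lst) ** (i + 1) for i, lst in enumerate(c))

def E(p, sizes):   # normalizing constant over set-based configurations
    return sum(weight(p, c) for c in configurations(range(len(p)), list(sizes)))

def miss_ratio_formula(p, m):
    m_plus = (m[0] + 1,) + tuple(m[1:])
    return (m[0] + 1) * E(p, m_plus) / E(p, m)

def miss_direct(p, m):
    """Definition: sum over configurations of pi(c) * (mass outside the cache)."""
    Z = E(p, m)
    total = 0.0
    for c in configurations(range(len(p)), list(m)):
        inside = {k for lst in c for k in lst}
        total += weight(p, c) / Z * sum(p[k] for k in range(len(p)) if k not in inside)
    return total

z = [1 / (k + 1) ** 0.8 for k in range(5)]   # Zipf(0.8) popularities, 5 items
p = [x / sum(z) for x in z]
print(abs(miss_ratio_formula(p, (1, 2)) - miss_direct(p, (1, 2))))  # ~0: they agree
```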

⁴ A. Dan and D. Towsley. An approximate analysis of the LRU and FIFO buffer replacement schemes. SIGMETRICS Perform. Eval. Rev., 18(1):143-152, Apr. 1990.
⁵ R. Fagin and T. G. Price. Efficient calculation of expected miss ratios in the independent reference model. SIAM J. Comput., 7:288-296, 1978.

SLIDE 22

A larger cache and more lists (usually) lead to a lower steady-state miss probability.

[Figure: miss probability vs. cache size m (n = 3000, α = 0.8) for h = 2 (m2 = m − 1), h = 3 (m3 = m − 2), h = 5 (m5 = m − 4), h = 10 (m10 = m − 9), h = ∞, and lower bounds. h = ∞ corresponds to LFU.]

SLIDE 23

Is increasing the number of lists always better⁶?

[Diagram: lists m1, ..., mj, mj+1, ..., mh; misses enter at list 1, hits promote less popular items toward the popular end]

Six lists, m = (1, 1, 1, 1, 1, 1), versus three lists, m = (1, 1, 4)?

⁶ Conjectured in O. I. Aven, E. G. Coffman, Jr., and Y. A. Kogan. Stochastic Analysis of Computer Storage. Kluwer Academic Publishers, Norwell, MA, USA, 1987.

slide-24
SLIDE 24

Is increasing the number of lists always better⁶?

Six lists, m = (1, 1, 1, 1, 1, 1), versus three lists, m = (1, 1, 4)?

⁶ Conjectured in O. I. Aven, E. G. Coffman, Jr., and Y. A. Kogan. Stochastic Analysis of Computer Storage. Kluwer Academic Publishers, Norwell, MA, USA, 1987.
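The question can be probed numerically with the exact product form (brute force, so only for tiny item universes). The 8-item Zipf(0.8) popularity vector below is an illustrative choice of mine, not the paper's counterexample parameters:

```python
from itertools import combinations
from math import prod

def configurations(items, sizes):
    if not sizes:
        yield ()
        return
    items = set(items)
    for chosen in combinations(sorted(items), sizes[0]):
        for rest in configurations(items - set(chosen), sizes[1:]):
            yield (chosen,) + rest

def E(p, sizes):
    return sum(
        prod(prod(p[k] for k in lst) ** (i + 1) for i, lst in enumerate(c))
        for c in configurations(range(len(p)), list(sizes)))

def hit_probability(p, m):
    # set-based analogue of M(m) = E(m + e1, n) / E(m, n)
    m_plus = (m[0] + 1,) + tuple(m[1:])
    return 1 - (m[0] + 1) * E(p, m_plus) / E(p, m)

z = [1 / (k + 1) ** 0.8 for k in range(8)]   # Zipf(0.8), 8 items
p = [x / sum(z) for x in z]
six_lists = hit_probability(p, (1, 1, 1, 1, 1, 1))
three_lists = hit_probability(p, (1, 1, 4))  # same total cache size 6
print(six_lists, three_lists)
```

Comparing the two printed values for a given popularity vector shows whether six lists beat three lists in that instance.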

SLIDE 25

Outline

1. Cache model and IRM
2. Steady-state performance under the IRM model
3. Fast and accurate mean-field approximation
4. How to choose the size of the lists?
5. Conclusion

SLIDE 26

We want to study at which speed the cache fills

[Figure: probability in cache vs. number of requests, simulation, 1 list (200) and 4 lists (50/50/50/50). Popularities of objects change every 2000 steps.]
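The experiment of this figure can be reproduced in a few lines. The parameter values below (500 items, Zipf(0.8) weights, dummy cold-start content) are illustrative guesses, not the ones used on the slide:

```python
import random

def transient_hit_curve(sizes, n_items=500, n_requests=6000,
                        shuffle_every=2000, seed=1):
    """Hits per window for a RAND(m)-style list cache under IRM whose item
    popularities are re-shuffled every `shuffle_every` requests."""
    rng = random.Random(seed)
    pop = [1 / (k + 1) ** 0.8 for k in range(n_items)]  # Zipf(0.8) weights
    items = list(range(n_items))
    # start cold: fill the cache with dummy items that are never requested
    lists, nxt = [], n_items
    for m in sizes:
        lists.append(list(range(nxt, nxt + m)))
        nxt += m
    window_hits = []
    for t in range(n_requests):
        if t % shuffle_every == 0:
            rng.shuffle(items)          # popularity change
            window_hits.append(0)
        x = rng.choices(items, weights=pop)[0]
        hit = False
        for j, lst in enumerate(lists):
            if x in lst:
                hit = True
                if j + 1 < len(lists):  # hit: promote into the next list
                    k = rng.randrange(len(lists[j + 1]))
                    i = lst.index(x)
                    lst[i], lists[j + 1][k] = lists[j + 1][k], lst[i]
                break
        if not hit:                      # miss: evict a random item of list 1
            lists[0][rng.randrange(len(lists[0]))] = x
        window_hits[-1] += hit
    return window_hits

curve = transient_hit_curve((50, 50, 50, 50))
print(curve)  # one hit count per 2000-request window
```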

SLIDE 27

We want to study at which speed the cache fills

[Figure: simulation vs. ODE approximation of the probability in cache, 1 list (200) and 4 lists (50/50/50/50). Popularities of objects change every 2000 steps.]

We develop an ODE approximation.

SLIDE 28

We want to study at which speed the cache fills

[Figure: simulation and ODE approximation overlaid, 1 list (200) and 4 lists (50/50/50/50). Popularities of objects change every 2000 steps.]

We develop an ODE approximation and show that it is accurate.

SLIDE 29

We construct an ODE by assuming independence

Let H_i(t) be the popularity in list i.

SLIDE 30

We construct an ODE by assuming independence

Let H_i(t) be the popularity in list i. If x_{k,i}(t) is the probability that item k is in list i at time t, we approximately have an ODE for x_{k,i}(t) [equation not transcribed]. This is similar to a TTL approximation.

SLIDE 31

We show that this approximation is accurate, theoretically and by simulation

[Figure: probability in cache vs. number of requests; simulation and ODE approximation for 1 list (200) and 4 lists (50/50/50/50).]

SLIDE 32

This approximation can also be used to compute the stationary distribution

Very accurate. The map is contracting: computation in O(nh), compared to O(n m₁ · · · m_h) for the exact solution.
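The contracting map itself is not reproduced in this transcript. As a clearly-labeled assumption of mine, here is one plausible Che-like fixed point built on the product-form marginals (x_{k,i} ∝ p_k^i w_i), with each iteration costing O(nh):

```python
def mean_field_fixed_point(p, m, iters=2000):
    """Guess-and-scale fixed point (a hypothetical sketch, not the paper's
    exact map): assume marginals
        x[k][i] = p_k**(i+1) * w[i] / (1 + sum_j p_k**(j+1) * w[j])
    and rescale the weights w so list i holds m[i] items in expectation."""
    h, n = len(m), len(p)
    w = [1.0] * h
    for _ in range(iters):
        x = [[p[k] ** (i + 1) * w[i] /
              (1 + sum(p[k] ** (j + 1) * w[j] for j in range(h)))
              for i in range(h)] for k in range(n)]
        occupancy = [sum(x[k][i] for k in range(n)) for i in range(h)]
        w = [w[i] * m[i] / occupancy[i] for i in range(h)]
    return x

z = [1 / (k + 1) ** 0.8 for k in range(50)]  # Zipf(0.8), 50 items
p = [v / sum(z) for v in z]
x = mean_field_fixed_point(p, (5, 5))
# expected list occupancies should come out close to m = (5, 5)
print(sum(x[k][0] for k in range(50)), sum(x[k][1] for k in range(50)))
```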

SLIDE 33

Outline

1. Cache model and IRM
2. Steady-state performance under the IRM model
3. Fast and accurate mean-field approximation
4. How to choose the size of the lists?
5. Conclusion

SLIDE 34

Under the IRM model, a smaller first list (usually) means a higher hit probability but a longer time to fill the cache

[Figure: hit probability vs. number of requests (10³ to 10⁵), ODE and simulation, for m = 200, m = (100,100), m = (50,150), and m = (20,180).]

SLIDE 35

Under the IRM model, the time to fill the cache mainly depends on the size of the first list.

[Figure: hit probability vs. number of requests (10³ to 10⁵), ODE and simulation, for m = (40,160), m = (40,40,120), and m = (40,40,40,80).]

In a dynamic setting, a good choice seems to be m1 ≥ m2 ≥ · · · ≥ mh with m1 large enough.

SLIDE 36

We verified, on a trace of YouTube videos⁷, that reserving at least 30% of the cache for the first list seems important.

[Figure: hit probability vs. m − m1 (m = 5000) for FIFO(m) and LRU(m) with 2, 3, and 5 lists; plain FIFO and LRU shown as baselines.]

⁷ M. Zink, K. Suh, Y. Gu, and J. Kurose. Characteristics of YouTube network traffic at a campus network-measurements, models, and implications. Comput. Netw., 53(4):501-514, Mar. 2009.

SLIDE 37

Outline

1. Cache model and IRM
2. Steady-state performance under the IRM model
3. Fast and accurate mean-field approximation
4. How to choose the size of the lists?
5. Conclusion

SLIDE 38

Conclusion

◮ Unified framework for studying list-based replacement policies.
◮ Steady-state miss probability in polynomial time.
◮ Accurate ODE approximation.
◮ Guidelines on how to use such a replacement algorithm: the size of the first list is important.

[Diagram: lists m1, ..., mj, mj+1, ..., mh]

Two theoretical interests of this work:

◮ Provides a unified framework and disproves old conjectures.
◮ ODE approximation.

Future work: network of caches.

SLIDE 39

Thank you!

http://mescal.imag.fr/membres/nicolas.gast
nicolas.gast@inria.fr

Transient and Steady-state Regime of a Family of List-based Cache Replacement Algorithms.
