SLIDE 1

On the Sustainability of the Extreme Value Theory for WCET Estimation

Luca Santinelli, Jérôme Morio, Guillaume Dufour, Damien Jacquemart

SLIDE 2

Plan

1 Motivations & Problem Statement

2 EVT Applicability: what are the hypotheses to apply EVT?

3 EVT complexity: what is complex while applying EVT?

4 EVT robustness

2/19

SLIDE 4

Motivations & Problem Statement

Extreme Value Theory (EVT) in combination with measurements: Measurement-Based Probabilistic Timing Analysis (MBPTA).

1 Measurements of system execution times: reproducible & reliable.

2 Rare events (where the worst case should be): Extreme Value Theory.

Statistical analysis in between: stationarity, time series (new; necessary to make MBPTA more formal).

4/19

SLIDE 5

Motivations & Problem Statement

[Figure: probability (log scale, 1e−12 to 1e+00) versus execution time (13600 to 13750); curves: EVT 1−CDF and observed (Obs) 1−CDF.]

4/19

SLIDE 6

Motivations & Problem Statement

Extreme Value Theory in combination with measurements: $\{X_1, X_2, X_3, \dots, X_N\}$ → probabilistic estimation of the Worst-Case Execution Time (pWCET).

The need for it, with systems that are complex to model: it reduces cost and pessimism.

Pros: easy to apply, near-zero modeling (almost no information required).

Cons: reliability (where is the worst case? is the estimate safe?); strict, unrealistic hypotheses.

4/19

SLIDE 7

Motivations & Problem Statement

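Measurements: $C = \begin{pmatrix} C_k \\ p_k = P\{C = C_k\} \end{pmatrix}_{k \in \{1,\dots,K\}}$

probabilistic Worst-Case Execution Time (pWCET): $\overline{C} = \begin{pmatrix} \overline{C}_j \\ p_j = P\{\overline{C} = \overline{C}_j\} \end{pmatrix}_{j \in \{1,\dots,J\}}$

[Diagram with the blocks: Measurements, HP Verification, EVT.]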

4/19

SLIDE 8

Extreme Value Theory

Block Maxima (BM) EVT approach

Peak over Thresholds (PoT) EVT approach

5/19

SLIDE 9

Extreme Value Theory

EVT BM

The maxima of an independent and identically distributed sequence converge to a Generalized Extreme Value (GEV) distribution $G_\xi$ under some general conditions:

$$G_\xi(x) = \begin{cases} \exp(-\exp(-x)), & \text{if } \xi = 0, \\ \exp\left(-(1 + \xi x)^{-1/\xi}\right), & \text{if } \xi \neq 0. \end{cases}$$

GEV distributions are characterized by $\xi = 0$, $\xi > 0$ and $\xi < 0$, corresponding to Gumbel, Fréchet and Weibull, respectively.

Suppose there exist $a_N$ and $b_N$, with $a_N > 0$, such that, for all $y \in \mathbb{R}$,

$$P\left(\frac{X_{(N)} - b_N}{a_N} \le y\right) = G^N(a_N y + b_N) \xrightarrow[N \to \infty]{} G(y),$$

where $G$ is a non-degenerate CDF; then $G$ is a GEV distribution $G_\xi$. In this case, one denotes $G \in \mathrm{MDA}(\xi)$ (MDA = Maximum Domain of Attraction).

Block size b selected arbitrarily: it affects accuracy and safety of the EVT BM estimation
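The BM procedure can be sketched in a few lines. The sketch below is only illustrative: it restricts the fit to the Gumbel case ($\xi = 0$) and uses method-of-moments estimates, which is an assumption made for brevity and not necessarily the estimator behind the slides. The function names (`block_maxima`, `fit_gumbel`, `exceedance`, `pwcet`) and the synthetic Gaussian trace are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def block_maxima(trace, b):
    """Keep the maximum of each consecutive block of b observations."""
    n_blocks = len(trace) // b  # a trailing incomplete block is dropped
    return [max(trace[i * b:(i + 1) * b]) for i in range(n_blocks)]


def fit_gumbel(maxima):
    """Method-of-moments estimates of the Gumbel location and scale (assumes xi = 0)."""
    scale = statistics.stdev(maxima) * math.sqrt(6.0) / math.pi
    loc = statistics.mean(maxima) - EULER_GAMMA * scale
    return loc, scale


def exceedance(x, loc, scale):
    """P{block maximum > x} under the fitted Gumbel law, i.e. 1 - CDF."""
    return -math.expm1(-math.exp(-(x - loc) / scale))


def pwcet(p, loc, scale):
    """Execution time exceeded with probability p (per block)."""
    return loc - scale * math.log(-math.log1p(-p))


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    # Synthetic stand-in for a measured execution-time trace (hypothetical values).
    trace = [rng.gauss(13650.0, 20.0) for _ in range(5000)]
    for b in (10, 100):
        loc, scale = fit_gumbel(block_maxima(trace, b))
        print("b =", b, "estimate at 1e-9:", round(pwcet(1e-9, loc, scale), 1))
```

Running the loop with several block sizes gives a feel for how the choice of b moves the estimate, which is the sensitivity the slide points at.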

5/19

SLIDE 10

Extreme Value Theory

EVT PoT

PoT considers the largest samples $X_i$ to estimate the probability $P\{X > S\}$. Let us assume that the distribution function $G$ of the independent and identically distributed samples $X_1, \dots, X_N$ is continuous. Set $y^* = \sup\{y : G(y) < 1\} = \inf\{y : G(y) = 1\}$. Then, the next two assertions are equivalent:

a) $G \in \mathrm{MDA}(\xi)$, and

b) there exists a positive and measurable function $u \mapsto \beta(u)$ such that

$$\lim_{u \to y^*} \; \sup_{0 < y < y^* - u} \left| G_u(y) - H_{\xi,\beta(u)}(y) \right| = 0.$$

Here $G_u(y) = P\{X - u \le y \mid X > u\}$, and $H_{\xi,\beta(u)}$ is the CDF of a generalized Pareto distribution (GPD) with shape parameter $\xi$ and scale parameter $\beta(u)$.

Threshold u selected arbitrarily: it affects accuracy and safety of the EVT PoT estimation.
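A minimal PoT sketch, under stated assumptions: the threshold u is taken as an empirical quantile of the trace, and the GPD parameters are obtained by the method of moments (valid only for $\xi < 1/2$), which is a simple stand-in rather than the estimator used by the authors. The names `fit_gpd` and `tail_probability` are hypothetical.

```python
import math
import statistics


def fit_gpd(trace, quantile):
    """Fit a GPD to the excesses over the empirical `quantile` threshold.

    Returns (u, xi, beta, rate), where rate is the empirical P{X > u}.
    Method-of-moments estimates: an illustrative choice, not the only one.
    """
    ordered = sorted(trace)
    u = ordered[int(quantile * len(ordered))]  # requires 0 <= quantile < 1
    excesses = [x - u for x in trace if x > u]
    mean = statistics.mean(excesses)
    var = statistics.variance(excesses)  # needs at least two excesses
    ratio = mean * mean / var
    xi = 0.5 * (1.0 - ratio)
    beta = 0.5 * mean * (1.0 + ratio)
    rate = len(excesses) / len(trace)
    return u, xi, beta, rate


def tail_probability(x, u, xi, beta, rate):
    """Estimate P{X > x} for x >= u as P{X > u} * (1 - H(x - u))."""
    y = x - u
    if abs(xi) < 1e-12:  # exponential limit of the GPD
        return rate * math.exp(-y / beta)
    base = 1.0 + xi * y / beta
    if base <= 0.0:  # beyond the finite upper end point (xi < 0)
        return 0.0
    return rate * base ** (-1.0 / xi)
```

Calling `fit_gpd` with several quantiles (for instance 0.95 and 0.98) on the same trace shows how strongly the tail estimate depends on u.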

5/19

SLIDE 11

Extreme Value Theory

BM and PoT comparison:

[Figure: probability (log scale, 1e−12 to 1e+00) versus execution time (13600 to 14000); curves: Obs, BM b=10, BM b=100, PoT u=0.98, PoT u=0.95.]

5/19

SLIDE 13

EVT applicability

Classically: independence and identical distribution (i.i.d.)

7/19

SLIDE 14

EVT applicability

Long range independence EVT

Let $\{X_n\}$ be a stationary sequence such that $M_n = \max\{X_1, \dots, X_n\}$ has a non-degenerate limiting distribution $G$ as in

$$P\{a_n (M_n - b_n) \le x\} \xrightarrow{d} G(x),$$

for some constants $a_n > 0$, $b_n$. Suppose that

$$D(u_n): \; \left| F_{i_1,\dots,i_p,j_1,\dots,j_q}(u_n) - F_{i_1,\dots,i_p}(u_n) \cdot F_{j_1,\dots,j_q}(u_n) \right| \le \alpha_{n,l}, \quad \text{where } \lim_{l \to \infty} \lim_{n \to \infty} \alpha_{n,l} = 0,$$

holds for all sequences $u_n$ given by $u_n = x / a_n + b_n$, $-\infty < x < \infty$. Then $G$ is one of the three classical types: Weibull, Fréchet, Gumbel.

7/19

SLIDE 15

EVT applicability

Extremal independence EVT

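Let $\{X_n\}$ be a stationary sequence with marginal distribution function $F$, let $M_n = \max\{X_1, \dots, X_n\}$, and let $\{u_n\}$ be a sequence of constants such that $D(u_n)$ and $D'(u_n)$ hold. Let $0 \le \tau < \infty$; then

$$P\{M_n \le u_n\} \to \exp(-\tau) \quad \text{iff} \quad n \cdot [1 - F(u_n)] \to \tau.$$

$$D'(u_n): \; \limsup_{n \to \infty} \; n \sum_{j=2}^{\lfloor n/k \rfloor} P\{X_1 > u_n, X_j > u_n\} \to 0 \quad \text{as } k \to \infty,$$

which ensures independence between close-in-time observations.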

7/19

SLIDE 16

EVT applicability

Independence → extremal independence → stationarity

C the pWCET EVT estimation in case of stationarity,

C the pWCET EVT estimation in case of independence.

Supposing execution time measurements follow the same marginal distribution, then C = Cθ, C C.

7/19

SLIDE 17

EVT hypotheses: identical distribution

A collection of random variables is identically distributed (i.d.) if each random variable has the same probability distribution.

Kolmogorov-Smirnov test for comparing two sets of observations. The trace of observations is divided into two sets, which are compared to verify whether they represent the same distribution.
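The split-and-compare check can be sketched as follows. This is an illustrative, dependency-free version: the trace is split into its first and second half (one possible split, assumed here), the two-sample Kolmogorov-Smirnov statistic is computed from the empirical CDFs, and the p-value uses the standard asymptotic Kolmogorov series, which is only an approximation for short traces. The names `ks_statistic`, `ks_pvalue` and `compare_halves` are hypothetical.

```python
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:  # advance past all values <= x
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic p-value of the two-sample KS test (large-sample approximation)."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:  # the series converges slowly here; the p-value is ~1
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * s))


def compare_halves(trace):
    """Split the trace in two halves and test whether they share a distribution."""
    half = len(trace) // 2
    first, second = trace[:half], trace[half:]
    d = ks_statistic(first, second)
    return d, ks_pvalue(d, len(first), len(second))
```

A small p-value (for example below 0.05) would be evidence against the identical-distribution hypothesis for the two halves.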

8/19

SLIDE 18

EVT hypotheses: independence

A collection of random variables is independent (i.) if all the random variables are mutually independent.

Independence is proven or tested for time series, based on autoregression and autocorrelation.

Verifying stationarity (statistical analysis): it gives more information about observation traces and characterizes the system execution behavior, looking for the worst-case execution conditions.

9/19

SLIDE 19

EVT hypotheses: extremal independence

Extremogram ρ: estimates the dependence at the extremes. It is the analogue of the autocorrelation function, depending only on the extreme values in the sequence of observations; it is based on estimators.

Extremal index θ: another tool for measuring the dependence of extreme values (to be compared with the independence case). It is computed with the blocks test; it is based on estimators.
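As an illustration of a block-based estimate of θ, the sketch below uses the simple blocks estimator: the number of blocks containing at least one exceedance of u divided by the total number of exceedances. This is one common estimator and is assumed here for illustration; the slides do not specify which variant is used. For independent observations θ = 1, while clustered extremes give θ < 1. The function name and its parameters are hypothetical.

```python
def extremal_index_blocks(trace, u, block_size):
    """Simple blocks estimator of the extremal index.

    theta_hat = (#blocks with at least one exceedance of u) / (#exceedances of u).
    """
    n_blocks = len(trace) // block_size  # a trailing incomplete block is dropped
    exceedances = 0
    blocks_hit = 0
    for i in range(n_blocks):
        block = trace[i * block_size:(i + 1) * block_size]
        count = sum(1 for v in block if v > u)
        exceedances += count
        if count > 0:
            blocks_hit += 1
    if exceedances == 0:
        raise ValueError("no observation exceeds the threshold u")
    return blocks_hit / exceedances
```

Like the BM and PoT fits, the result depends on the chosen threshold and block size, so it is worth evaluating it for a few values of each.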

10/19

SLIDE 20

EVT hypotheses: stationarity

A process X is stationary if its mean, variance, and autocovariance structure do not change over time. Weak form of stationarity: flat-looking observations, no trend, constant variance over time, and no periodic fluctuations or autocorrelation.

Autocorrelation: similarity between observations as a function of the time lag between them.

Sample Auto Correlation Function (ACF): an important assessment tool for detecting data dependence and fitting models to data. No model is assumed at first; only the observed data $\{X_1, \dots, X_N\}$ are known.
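The sample ACF can be computed directly from the observed trace. The sketch below uses the usual biased estimator (normalization by N at every lag) and the common ±1.96/√N band as a rough 95% reference for uncorrelated data; both are standard choices assumed here, and the function names are hypothetical.

```python
import math


def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n  # lag-0 autocovariance
    acf = []
    for k in range(max_lag + 1):
        ck = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


def confidence_bound(n):
    """Approximate 95% band for the ACF of n uncorrelated observations."""
    return 1.96 / math.sqrt(n)
```

Lags whose sample autocorrelation lies well outside `confidence_bound(len(x))` hint at dependence in the trace.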

11/19

SLIDE 22

EVT complexity

Parameter effect on WCET estimation: single path BM

[Figure: probability (0 to 1) versus execution time (13600 to 13800); curves: Obs and BM with b = 5, 10, 20, 50, 100, 200.]

b selection!

13/19

SLIDE 23

EVT complexity

Parameter effect on WCET estimation: multi-path BM

[Figure: probability (0 to 1) versus execution time (100 to 5000); curves: Obs and BM with b = 5, 10, 20, 50, 100, 200.]

b selection!

13/19

SLIDE 24

EVT complexity

Parameter effect on WCET estimation: single path PoT

[Figure: probability (0 to 1) versus execution time (13600 to 13800); curves: Obs and PoT with u = 0.7, 0.8, 0.9, 0.95, 0.98, 0.9999.]

u selection!

13/19

SLIDE 25

EVT complexity

Parameter effect on WCET estimation: multi-path PoT

[Figure: probability (0 to 1) versus execution time (100 to 5000); curves: Obs and PoT with u = 0.7, 0.8, 0.9, 0.95, 0.98, 0.9999.]

u selection!

13/19

SLIDE 27

EVT robustness

How can we define robustness? Still a fuzzy word/definition for EVT, but something to play with and extend.

Error/confidence: bootstrap & relative error.

Multi-path: all the paths have to be included.

15/19

SLIDE 28

Confidence - relative error

Bootstrapping and looking for independence: relative error with "artificial independence".

[Figure: pWCET (0.2 to 1) versus execution time (1.36 to 1.38, ×10^4).]
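A possible reading of the bootstrap step, sketched under explicit assumptions: the trace is resampled with replacement (which discards the temporal order, hence an "artificial independence"), an estimator is re-evaluated on each resample, and the relative error is taken as the bootstrap standard deviation divided by the bootstrap mean, in percent. That definition of relative error is an assumption, since the slides do not state one; `bootstrap_relative_error` and its parameters are hypothetical names.

```python
import random
import statistics


def bootstrap_relative_error(trace, estimator, n_resamples=200, seed=0):
    """Relative error (in %) of `estimator` over bootstrap resamples of the trace.

    Assumed definition: 100 * std(estimates) / |mean(estimates)|.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Sampling with replacement ignores the order of the observations.
        resample = rng.choices(trace, k=len(trace))
        estimates.append(estimator(resample))
    return 100.0 * statistics.stdev(estimates) / abs(statistics.mean(estimates))
```

Any estimator that maps a trace to a number can be passed in, for instance a function that fits an EVT model and returns the execution time at a target exceedance probability.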

16/19

SLIDE 29

Confidence - relative error

Bootstrapping and looking for independence: relative error with "artificial independence".

[Figure: relative error in % (250 to 1000) versus execution time (1.36 to 1.38, ×10^4).]

16/19

SLIDE 30

Confidence - relative error

Bootstrapping and looking for independence: relative error with "artificial independence".

[Figure: pWCET (0.2 to 1) versus execution time (2000 to 10000).]

16/19

SLIDE 31

Confidence - relative error

Bootstrapping and looking for independence: relative error with "artificial independence".

[Figure: relative error in % (200 to 600) versus execution time (2000 to 10000).]

16/19

SLIDE 32

Path coverage

Block maxima "artificial" multi-path

[Figure: probability (0 to 1) versus execution time (16000 to 22000); curves: Obs, first, first−second, all.]

17/19

SLIDE 33

Path coverage

Peak over threshold "artificial" multi-path

[Figure: probability (0 to 1) versus execution time (16000 to 22000); curves: Obs, first, first−second, all.]

17/19

SLIDE 34

Conclusions & Future Work

A more formal discussion around the Extreme Value Theory for task execution time and pWCET estimation.

Pessimism and safety issues have to be accounted for: this makes EVT applicability more challenging.

No free lunch: a) decisions to be made, b) pessimism-safety to be guaranteed.

For the future:

Continue investigating EVT complexity: certification + verification around parameters & reliability.

Definition of robustness for EVT and EVT limits as well as applicability.

18/19

SLIDE 35

Thank you. Questions?

luca.santinelli@onera.fr

19/19