SIGCOMM '05 1
On the Predictability of Large Transfer TCP Throughput
Qi He Constantine Dovrolis Mostafa Ammar
College of Computing Georgia Institute of Technology
Outline
- TCP throughput prediction: problem statement and motivation
- Formula-based (FB) prediction: a formula-based predictor; types of FB prediction errors; experimental evaluation
- History-based (HB) prediction: typical history-based predictors; dealing with outliers and level shifts; experimental evaluation
- What makes some paths less predictable than others?
Goal: predict the throughput of a bulk TCP transfer on a given path.
Applications: server selection, overlay/multi-homed routing, load balancing, grid computing, P2P downloading.
Formula Based (FB)
  Basis: analytical model for TCP throughput
  Inputs: estimates of the path's RTT and loss rate
  Advantage: no previous transfers required
  Issue: prediction accuracy?

History Based (HB)
  Basis: time series forecasting theory
  Inputs: history of previous TCP transfers on the same path
  Advantage: prediction based on actual TCP transfers
  Issue: prediction accuracy?
FB can be significantly inaccurate, especially for congestion-limited flows.
HB is quite accurate even with simple linear predictors and sporadic previous samples.
Focus on cause-effect relations, rather than black-box evaluation.
Factors examined: load, degree of multiplexing, receiver window, transfer frequency.
Next: Formula-based (FB) prediction
We use the PFTK model by Padhye et al. (SIGCOMM '98):

R = \min\left( \frac{M}{T\sqrt{2bp/3} + T_0\,\min(1,\,3\sqrt{3bp/8})\,p\,(1+32p^2)},\ \frac{W}{T} \right)

T, p: RTT and loss rate experienced during the flow
M: path MTU (Maximum Transmission Unit)
W: TCP maximum congestion window
T0: TCP retransmission timeout
b: segments acknowledged per new ACK
In practice, T' and p' are typically measured before the transfer by periodic probing (e.g., ping), and A' is an available-bandwidth estimate. The FB predictor is then:

\hat{R} = \min\left( \frac{M}{T'\sqrt{2bp'/3} + T_0\,\min(1,\,3\sqrt{3bp'/8})\,p'\,(1+32p'^2)},\ \frac{W}{T'} \right), \quad \text{if } p' > 0

\hat{R} = \min\left( \frac{W}{T'},\ A' \right), \quad \text{if } p' = 0
T’, p’ T, p Underestimate or
Adaptive and bursty TCP sampling vs. non-adaptive periodic sampling Overestimate throughput Additional load of the target flow may increase T, p Effect Issue Temporal: before flow during flow Sampling: periodic probing TCP “sampling”
One measurement epoch:
- Available bandwidth A' (pathload, 20s-60s)
- RTT T' and loss rate p' (ping, 60s)
- TCP throughput R, with RTT Te and loss rate pe measured during the transfer (ping and iperf, 60s)

Tools: iperf for TCP transfers, pathload for available bandwidth, ping (interval: 100 ms, packet size: 41 bytes) for RTT and loss rate.
Each trace consists of 150 consecutive epochs. We used 35 Internet paths, with 7 traces per path; hosts in the US, Europe, and Korea.
SIGCOMM '05 14
Relative prediction error:

E = \frac{\hat{R} - R}{\min(\hat{R}, R)}

This metric treats over- and underestimation symmetrically: \hat{R} = R/w and \hat{R} = wR both give |E| = w - 1; e.g., \hat{R} = R/2 and \hat{R} = 2R both give |E| = 1.

Aggregate error over a trace:

RMSRE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} E_i^2}
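Both metrics are one-liners; a small sketch (function names are illustrative):

```python
from math import sqrt

def relative_error(pred, actual):
    """Symmetric relative prediction error: E = (R_hat - R) / min(R_hat, R)."""
    return (pred - actual) / min(pred, actual)

def rmsre(preds, actuals):
    """Root-mean-square relative error over a trace of predictions."""
    errors = [relative_error(p, a) for p, a in zip(preds, actuals)]
    return sqrt(sum(e * e for e in errors) / len(errors))
```

Note that overestimating by 2x and underestimating by 2x both yield |E| = 1, which is what makes RMSRE a fair aggregate across over- and underestimating predictors.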
SIGCOMM '05 15
Results on lossy paths:
- Dominance of overestimation errors (E > 0)
- Overestimation by more than 100% (E > 1) for 40% of the measurements
- Prevalent occurrences of T' < T and p' < p

[Figure: CDF of the relative prediction error E on lossy paths]
SIGCOMM '05 16
[Figure: CDFs of the RTT increase (ms) and loss rate increase observed during the target flow]
SIGCOMM '05 17
When the prediction instead uses ping RTT and loss rate measured during the target flow, prediction errors are still significant, but overestimation and underestimation are almost symmetric.
[Figure: CDFs of the relative prediction error, with RTT/loss rate measured during vs. prior to the TCP flow]
SIGCOMM '05 18
Large errors are more common on lower-throughput paths. Explanation: in a congested path, a slight load increase causes a large loss rate increase.
SIGCOMM '05 19
[Figure: RMSRE (log scale) per path, for W = 20 KB (window-limited) vs. W = 1 MB (congestion-limited) transfers]
Throughput is more predictable for window-limited TCP flows. Explanation: window-limited flows do not saturate the path's bottleneck.
SIGCOMM '05 20
Next: History-based (HB) prediction
SIGCOMM '05 21
Typical history-based predictors:

Moving Average (MA) over the last k samples:

\hat{X}_{n+1} = \frac{1}{k} \sum_{i=n-k+1}^{n} X_i

Exponentially Weighted Moving Average (EWMA):

\hat{X}_{i+1} = \alpha\,\hat{X}_i + (1-\alpha)\,X_i

Non-seasonal Holt-Winters (HW), an EWMA variation that captures the trend of the time series:

\hat{X}^f_{i+1} = \hat{X}^s_i + \hat{X}^t_i
\hat{X}^s_i = \alpha\,X_i + (1-\alpha)\,\hat{X}^f_i
\hat{X}^t_i = \beta\,(\hat{X}^s_i - \hat{X}^s_{i-1}) + (1-\beta)\,\hat{X}^t_{i-1}
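The three predictors can be sketched in a few lines of Python. Function names and the choice to fold the whole history into a one-step-ahead forecast are illustrative, not from the paper:

```python
def moving_average(history, k):
    """MA: mean of the last k throughput samples."""
    window = history[-k:]
    return sum(window) / len(window)

def ewma(history, alpha):
    """EWMA: X_hat_{i+1} = alpha * X_hat_i + (1 - alpha) * X_i."""
    pred = history[0]
    for x in history:
        pred = alpha * pred + (1 - alpha) * x
    return pred

def holt_winters(history, alpha, beta):
    """Non-seasonal Holt-Winters: smoothed level plus trend."""
    level, trend = history[0], 0.0
    for x in history[1:]:
        prev_level = level
        # update the smoothed level against the previous forecast (level + trend)
        level = alpha * x + (1 - alpha) * (level + trend)
        # update the trend estimate from the change in level
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend  # forecast for the next epoch
```

With a stationary throughput series all three converge to the series mean; HW additionally tracks a linear drift, which MA and EWMA lag behind.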
Why are level shifts (LS) and outliers (OL) undesirable?
They cause large prediction errors and differences among predictors, and they complicate the analysis of HB predictability.
Dealing with LS and OL is more important than choosing among predictors.
Actions: ignore outliers; restart the predictor upon a level shift.
[Figure: TCP throughput (Mbps) per measurement epoch on the UTAH-LULEA path, showing outliers and level shifts]
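The slides do not spell out how outliers and level shifts are detected, so the sketch below uses hypothetical multiplicative thresholds (`ol_factor` and `ls_run` are invented parameters, not from the paper): a single sample deviating strongly from the current prediction is ignored as an outlier, while several consecutive deviating samples are treated as a level shift and restart the predictor.

```python
def predict_with_lso(history, base_predictor, ol_factor=3.0, ls_run=3):
    """Wrap a base predictor with outlier (OL) and level-shift (LS) handling.

    Assumes strictly positive throughput samples. A sample deviating from
    the current prediction by more than ol_factor (in either direction) is
    suspicious: a short run of such samples is ignored as outliers, while
    ls_run consecutive ones signal a level shift and restart the predictor
    from the post-shift samples.
    """
    kept = [history[0]]
    run = []  # consecutive deviating samples (level-shift candidates)
    for x in history[1:]:
        pred = base_predictor(kept)
        if pred > 0 and (x / pred > ol_factor or pred / x > ol_factor):
            run.append(x)
            if len(run) >= ls_run:
                kept = run[:]  # level shift: restart from recent samples
                run = []
            # else: treat as outlier and ignore the sample
        else:
            run = []
            kept.append(x)
    return base_predictor(kept)
```

A spike of one epoch leaves the prediction untouched, while a sustained jump makes the predictor forget the pre-shift history, which is the behavior the slide's "ignore OL, restart upon LS" actions call for.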
HB prediction is much more accurate than FB prediction: 90% of traces have RMSRE < 0.4 (with LS/OL detection).
With LS/OL detection, the choice of predictor and of predictor parameters makes little difference.

[Figure: CDF of RMSRE (log scale) for Holt-Winters variants: 0.8-HW-LSO, 0.8-HW, 0.4-HW]
A longer sampling interval does not degrade accuracy significantly.
Even with a single transfer every 24 minutes, the RMSRE is below 0.4 in 75% of the traces.

[Figure: CDFs of RMSRE (log scale) for sampling intervals of 3, 6, 24, and 45 minutes]
Next: What makes some paths less predictable than others?
Factors examined:
- Link utilization
- Degree of statistical multiplexing

Approach: analyze the Coefficient of Variation (CoV) of the marginal distribution of TCP throughput.

[Figure: RMSRE vs. CoV of TCP throughput]
Model: Processor Sharing server of capacity C with Poisson session arrivals.
Flow arrival rate: λ; average flow size: θ.

Offered load: \rho = \lambda\theta / C
Per-flow rate: r(N) = C / N
Distribution of the number of active sessions: \pi(N) = \rho^N (1 - \rho)

CoV of the per-session throughput r(N):

CoV(r(N)) = \sqrt{ \frac{\rho\, L(2,\rho)}{(1-\rho)\,\log^2(1-\rho)} - 1 }

where L(2,\rho) = \sum_{k \ge 1} \rho^k / k^2 is the dilogarithm.
CoV of the per-session throughput increases with the offered load ρ.
So, the relative prediction error increases with offered load.
[Figure: CoV of the per-session bandwidth share C/N vs. offered load on a 50 Mbps link (Processor Sharing)]
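The CoV expression can be evaluated numerically by summing the dilogarithm series directly; this sketch (function names are illustrative) reproduces the slide's conclusion that CoV grows with offered load:

```python
from math import log

def dilog(rho, terms=200):
    """L(2, rho) = sum_{k>=1} rho^k / k^2 (the dilogarithm), by direct series.

    Converges quickly for 0 < rho < 1, so a few hundred terms suffice.
    """
    return sum(rho ** k / k ** 2 for k in range(1, terms + 1))

def session_cov(rho):
    """CoV of per-session throughput in the Processor Sharing model."""
    return (rho * dilog(rho) / ((1 - rho) * log(1 - rho) ** 2) - 1) ** 0.5
```

For instance, session_cov rises from roughly 0.24 at ρ = 0.2 to roughly 0.46 at ρ = 0.5 and roughly 0.81 at ρ = 0.8, matching the increasing trend in the figure above.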
Consider the available bandwidth A at a non-congested Processor Sharing server of capacity C.
Traffic model: N homogeneous flows with rate limit r; flow arrival rate λ (Poisson); average flow size θ.
Conclusion: provided that utilization remains constant, the CoV of the available bandwidth decreases as the number of flows increases.
So, we expect lower prediction error as the number of flows increases.
Conclusions:
- FB prediction can be significantly inaccurate; the main reason is the loss rate and RTT increase caused by the target flow itself.
- HB prediction is quite accurate, even with very simple predictors and sporadic previous transfers.
- Hardest-to-predict paths: a heavily utilized bottleneck link loaded with just a few flows.