Comparing 2 implementations of the IETF-IPPM One-Way Delay and Loss Metrics



SLIDE 1

Henk Uijterwaal . PAM2000, Hamilton, NZ, February 12, 2008 . http://www.ripe.net/test-traffic

Comparing 2 implementations of the IETF-IPPM One-Way Delay and Loss Metrics

Sunil Kalidindi, Matt Zekauskas

Advanced Network & Services Armonk, NY, USA

Henk Uijterwaal, René Wilhelm

RIPE-NCC Amsterdam, The Netherlands

SLIDE 2

Outline

  • The problem
  • Theory behind one-way delay and loss measurements

  • The two experiments
  • Time-keeping
  • Comparing raw-data
  • Statistical approach to comparing data
  • Effect of packet-sizes on delays
  • Outlook and conclusions
SLIDE 3

The Problem

  • The IETF IPPM WG has defined metrics for (type-P) one-way delay and packet losses
    – RFCs 2330, 2679, 2680
  • It is the goal of the IPPM-WG to turn these metrics into Internet standards
  • This requires 2 independent implementations that are interoperable

  • There are 2 implementations of these metrics
  • So what is the problem then?
SLIDE 4

The Problem (2)

  • One has to show that the implementations are interoperable
  • For metrics, this means that both implementations, measuring along the same path, give the same results
  • The results of individual delay and loss measurements depend on the instantaneous condition of the network

SLIDE 5

The Problem (3)

  • No direct comparison of individual measurements is possible
  • One has to look at distributions instead
    – Distribution of delays and losses over time
    – Patterns of the delays and losses over time
    – Statistical methods
  • This presentation is a first attempt at such a comparison

SLIDE 6

Outline

  • The problem
  • Theory behind one-way delay and loss measurements

  • The two experiments
  • Time-keeping
  • Comparing raw-data
  • Statistical approach to comparing data
  • Effect of packet-sizes on delays
  • Outlook and conclusions
SLIDE 7

One-way delay and loss measurements

[Diagram: ISP A and ISP B, each shown with an Internal Network and a Border Router]

SLIDE 8

One-way delay and loss measurements

[Diagram: ISP A and ISP B, each shown with an Internal Network and a Border Router; two Probes and a GPS Clock added]

SLIDE 9

One-way delay and loss measurements

[Diagram: ISP A and ISP B, each shown with an Internal Network and a Border Router; two Probes and a GPS Clock; Delay and Loss marked]
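
The diagram's idea can be sketched in a few lines: each probe time-stamps test packets against a common clock, the one-way delay is the receive time minus the send time, and a packet that never arrives counts as lost. The function name, the sequence-number keyed dictionaries and the `timeout_s` threshold below are illustrative assumptions, not part of either implementation.

```python
def one_way_metrics(sent, received, timeout_s=10.0):
    """Compute per-packet one-way delays and the loss ratio.

    sent, received: dicts mapping a sequence number to a timestamp in seconds,
    both taken against the same (e.g. GPS-synchronized) time base.
    timeout_s is a hypothetical threshold after which a packet counts as lost.
    """
    delays = {}
    lost = []
    for seq, t_send in sent.items():
        t_recv = received.get(seq)
        if t_recv is None or t_recv - t_send > timeout_s:
            lost.append(seq)          # never arrived, or arrived too late
        else:
            delays[seq] = t_recv - t_send
    loss_ratio = len(lost) / len(sent) if sent else 0.0
    return delays, loss_ratio


# Made-up example: three packets sent, the second one never arrives.
sent = {1: 0.000, 2: 0.500, 3: 1.000}
received = {1: 0.080, 3: 1.079}
delays, loss_ratio = one_way_metrics(sent, received)
```

Without synchronized clocks at both ends the subtraction is meaningless, which is why time-keeping takes up a large part of the talk.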

SLIDE 10

Outline

  • The problem
  • Theory behind one-way delay and loss measurements

  • The two experiments
  • Time-keeping
  • Comparing raw-data
  • Statistical approach to comparing data
  • Effect of packet-sizes on delays
  • Outlook and conclusions
SLIDE 11

The two implementations

  • Advanced Network & Services: Surveyor
    – http://www.advanced.org/surveyor
    – Measurement machine: surveyor box
  • RIPE-NCC: TTM or Test-Traffic Measurements
    – http://www.ripe.net/test-traffic
    – Measurement machines: test-box

SLIDE 12

Common features

  • Active tests of type-P one-way delay and loss
    – Test packets time-stamped with GPS time
    – UDP packets
        • 40 bytes (total), 2/second: Surveyor
        • 100 bytes, 3/minute: TTM
            – Later slide
    – Scheduled according to a Poisson distribution
    – Accuracy:
        • Surveyor: back-to-back calibration: 95% of measurements ± 100 µs → 10 µs “soon” (in-kernel packet timestamping)
        • RIPE-NCC: 10 µs
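
Poisson scheduling means the gaps between test packets are drawn from an exponential distribution whose mean is the inverse of the sending rate. The sketch below illustrates this for the two rates quoted above; the function name, the one-hour duration and the fixed seed are illustrative assumptions, and neither project's actual scheduler is shown here.

```python
import random


def poisson_schedule(rate_per_s, duration_s, seed=None):
    """Return send times (seconds) of a Poisson process with the given mean rate."""
    rng = random.Random(seed)
    t = 0.0
    times = []
    while True:
        # Exponentially distributed gap with mean 1 / rate_per_s
        t += rng.expovariate(rate_per_s)
        if t >= duration_s:
            return times
        times.append(t)


# One hour of send times at the two rates quoted on the slide.
surveyor_times = poisson_schedule(2.0, 3600.0, seed=1)     # 2 packets/second
ttm_times = poisson_schedule(3.0 / 60.0, 3600.0, seed=1)   # 3 packets/minute
```

On average this gives about 7200 Surveyor packets and about 180 TTM packets per hour per path, but the individual send times are unpredictable, so the probes cannot lock onto periodic behaviour in the network.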
SLIDE 13

Common features (2)

  • Concurrent routing measurements
    – Traceroute
    – Only look at the IP-addresses of the intermediate points

  • Measurements centrally managed
  • Reports on the web
SLIDE 14

Common features (3): Measurement machines

Surveyor

  • Dell 400 MHz Pentium Pro
  • 128 MBytes RAM
  • 8 GBytes disk
  • BSDI Unix
  • TrueTime GPS card and antenna (coax)
  • Network Interface (10/100bT, FDDI, OC3 ATM)
  • Special driver for the GPS card

TTM

  • Pentium, Pentium II, 200…466 MHz
  • 32…64 MBytes RAM
  • 4…8 GBytes disk
  • FreeBSD Unix
  • Motorola Oncore GPS receiver and antenna
  • Network Interface: 10/100bT
  • Special kernel for time-keeping

SLIDE 15

Current Surveyor Deployment

  • 71 machines
    – Universities
    – Tele-Immersion Labs
    – National Labs
    – Auckland, NZ
    – …others
  • 2741 paths
    – NASA Ames XP
    – I2 gigaPoPs (some)
    – CA*net2 gigaPoPs
    – APAN sites
    – Abilene router nodes up with NTP, awaiting GPS
  • Measurement machines at campuses and at other interesting places along paths (e.g., gigaPoPs, interconnects)

SLIDE 16

Surveyor locations

SLIDE 17

RIPE-NCC Test-Traffic Measurements

  • 43 machines
    – RIPE-Membership: ISPs, research networks, etc. in Europe and surrounding areas
    – A few sites interested in One-Way Delay measurements outside Europe
    – Common locations with Surveyor:
        • Advanced Network & Services
        • SLAC (Menlo Park, USA)
        • CERN (Geneva, CH)
  • Full mesh with approximately 1600 paths
SLIDE 18

Location of the RIPE-NCC Test-boxes

SLIDE 19

Outline

  • The problem
  • Theory behind one-way delay and loss measurements
  • The two experiments
  • Time-keeping
    – The key issue to make this work
    – Different approaches

  • Comparing raw-data
  • Statistical approach to comparing data
  • Effect of packet-sizes on delays
  • Outlook and conclusions
SLIDE 20

RIPE-NCC approach: Unix timekeeping

  • Hardware oscillator
    – Interrupt every 10 ms
  • Software counter
    – Counts # interrupts since 1/1/70
  • User access to time
    – gettimeofday(), adjtime()
  • Resolution only 10 ms
    – same order of magnitude as typical network delays

SLIDE 21

Unix timekeeping (2): BSD Clock Implementation

  • Second counter
    – Counts at a rate of 1.193 MHz (0.84 µs steps)
    – Provides time inside a 10 ms interval
  • Resolution increases to 1 µs
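
The arithmetic behind these numbers can be checked with a short sketch: a 1.193 MHz counter advances roughly every 0.84 µs, and adding its reading to the coarse 10 ms interrupt count gives a timestamp with about 1 µs resolution. The function and variable names are illustrative assumptions; this is not the BSD kernel code.

```python
TICK_S = 0.010            # interval between clock interrupts (10 ms)
COUNTER_HZ = 1_193_000    # second counter rate quoted on the slide (1.193 MHz)


def counter_step_us():
    """Duration of one counter step in microseconds."""
    return 1e6 / COUNTER_HZ


def timestamp_s(interrupt_count, counter_value):
    """Combine the coarse interrupt count with the fine counter reading.

    interrupt_count: number of 10 ms interrupts since the epoch
    counter_value:   counter steps elapsed since the last interrupt
    """
    return interrupt_count * TICK_S + counter_value / COUNTER_HZ


step_us = counter_step_us()              # about 0.84 µs
steps_per_tick = COUNTER_HZ * TICK_S     # about 11930 counter steps per 10 ms
example = timestamp_s(100, 5965)         # 1 s of interrupts plus half a tick
```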

SLIDE 22

Unix timekeeping (3)

  • A resolution of 1 µs is several orders of magnitude better than the typical delays on the Internet
  • But the clocks on two machines will run completely independently of each other
  • We have to synchronize our clocks
    – Set the clock to the right initial value
    – Tune it to run at the right speed
    – Correct for experimental effects
  • To do that, we need
    – An external time reference source
    – “Flywheel” to keep the clock running at the right speed

SLIDE 23

Flywheel/Phase Locked Loop

  • External time source: GPS
  • PLL
    – Determine the difference between internal and external clock
    – Make the internal clock run faster/slower
    – Correct for variations over time
  • Kernel level code
  • NTP
  • Internal clock synchronized to a few µs
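
The three PLL steps can be illustrated with a toy simulation: once per second the loop measures the offset between the local clock and the reference, nudges the phase, and accumulates a rate correction that cancels the oscillator's frequency error. This is a simplified proportional-integral loop written for illustration only; the gains, the 50 ppm drift and the 5 ms starting offset are assumptions, and it is not the kernel PLL or NTP code used on the test-boxes.

```python
def discipline_clock(freq_error_ppm, initial_offset_s, steps=600, kp=0.3, ki=0.05):
    """Simulate steering a drifting clock towards a once-per-second reference.

    Returns (history, rate_correction): the offset after each second and the
    learned rate correction in seconds per second.
    """
    offset = initial_offset_s     # local clock minus reference, in seconds
    rate_correction = 0.0         # integral term: how much to speed up or slow down
    history = []
    for _ in range(steps):
        error = offset                        # 1. measure the difference
        rate_correction -= ki * error         # 2. make the clock run faster/slower
        # 3. over the next second the clock drifts, minus the applied corrections
        offset += freq_error_ppm * 1e-6 + rate_correction - kp * error
        history.append(offset)
    return history, rate_correction


# A clock that is 5 ms off and runs 50 ppm fast.
history, rate = discipline_clock(freq_error_ppm=50.0, initial_offset_s=0.005)
```

After the loop settles, the remaining offset is far below a microsecond in this idealized model and the learned rate correction is close to -50 ppm. A real loop also has to cope with measurement noise, which is why the slide quotes synchronization to a few µs.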

SLIDE 24

Time-keeping Advanced N&S solution: Hardware

  • Wanted off-the-shelf solution
  • TrueTime PC[I]-SG “bus-level” card

– Bancom/Datum has similar product

  • Synchronize using GPS satellites
  • “Dumb” antenna (receiver on card)
  • Oscillator & time of day clock on-board
  • Claim: within 1 µs of UTC
  • Major disadvantage: cost ($2500 US)
SLIDE 25

Time of Day: Software

  • System clock ignored
  • Must access card for time-of-day
  • Deployed software
    – timestamp at user-level
    – read via ioctl() (implies bus transaction)
    – Calibration error of 10 µs (loose), if there is no other load
    – 100 µs is a loose bound for 80 peers

SLIDE 26

Outline

  • The problem
  • Theory behind one-way delay and loss measurements

  • The two experiments
  • Time-keeping
  • Comparing raw-data
  • Statistical approach to comparing data
  • Effect of packet-sizes on delays
  • Outlook and conclusions
SLIDE 27

Comparing the data

  • RIPE-NCC and Advanced N&S exchanged boxes in October 1998.
  • Boxes are on the same network segments at both sides
  • Data taking since October 1998.
  • Other sites with both a Surveyor and TTM box:
    – CERN (Spring ‘99)
    – SLAC (Fall ‘99)

SLIDE 28

Raw Data

[Plot: raw data over 20 hours, RIPE-NCC and Advanced N&S]

SLIDE 29

Percentile delays over a 2 month period

[Plot: median, 2.5% and 97.5% percentiles; panels: Advanced N&S data and RIPE-NCC data]
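
The three curves in such a plot are percentiles of the delay distribution within each time interval. The sketch below shows one way to compute them with the nearest-rank definition; the choice of definition, the function name and the synthetic input are assumptions, since the slides do not say how the percentiles were calculated.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def delay_summary(delays_ms):
    """Return the (2.5%, median, 97.5%) percentiles of one interval's delays."""
    return (percentile(delays_ms, 2.5),
            percentile(delays_ms, 50),
            percentile(delays_ms, 97.5))


# Synthetic stand-in for one interval of measured delays: 1, 2, ..., 1000 ms.
summary = delay_summary(list(range(1000, 0, -1)))
```

Comparing these three numbers per interval for both systems sidesteps the fact that individual packets from the two boxes are never sent at exactly the same moment.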

SLIDE 30

Outline

  • The problem
  • Theory behind one-way delay and loss measurements

  • The two experiments
  • Time-keeping
  • Comparing raw-data
  • Statistical approach to comparing data
  • Effect of packet-sizes on delays
  • Outlook and conclusions
SLIDE 31

Statistical approach

  • “Maybe we should do some statistical analysis…”

SLIDE 32

Statistical approach

  • “Maybe we should do some statistical analysis…”
  • Les Cottrell and Warren Matthews from SLAC sent us a paper

SLIDE 33

SLAC ⇒ CERN

SLIDE 34

Matching the delays?

  • Vary RIPE-NCC delays in the histograms
  • Find the value where the 2 sets agree best
  • Decrease RIPE-NCC delays by 0.2 ms
  • Why?
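
The matching procedure can be sketched as follows: histogram both delay samples, shift one of them by a trial value, and keep the shift for which the two histograms differ least. The bin width, the sum-of-squares mismatch measure and the synthetic data are assumptions made for illustration; the slides do not state which agreement measure was used.

```python
from collections import Counter


def histogram(delays_ms, bin_ms):
    """Count delays per bin of width bin_ms."""
    return Counter(int(d // bin_ms) for d in delays_ms)


def mismatch(h1, h2):
    """Sum of squared differences between two histograms (an assumed measure)."""
    return sum((h1[b] - h2[b]) ** 2 for b in set(h1) | set(h2))


def best_shift(reference_ms, other_ms, shifts_ms, bin_ms=0.05):
    """Return the shift that, subtracted from other_ms, best matches reference_ms."""
    h_ref = histogram(reference_ms, bin_ms)
    return min(
        shifts_ms,
        key=lambda s: mismatch(h_ref, histogram([d - s for d in other_ms], bin_ms)),
    )


# Synthetic data: the second sample is the first one delayed by 0.2 ms.
reference = [80.02 + 0.05 * (i % 10) for i in range(100)] + [80.17] * 50
other = [d + 0.2 for d in reference]
candidates = [round(0.05 * i, 2) for i in range(9)]   # 0.00 ... 0.40 ms
shift = best_shift(reference, other, candidates)
```

In this synthetic case the scan recovers the 0.2 ms offset that was put in.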
SLIDE 35

Outline

  • The problem
  • Theory behind one-way delay and loss measurements

  • The two experiments
  • Time-keeping
  • Comparing raw-data
  • Statistical approach to comparing data
  • Effect of packet-sizes on delays
  • Outlook and conclusions
SLIDE 36

Effects of the packet-size on delays

  • Obviously, larger packets take longer to transmit
  • But are packets treated differently?
  • 3 experiments:
    – Local network (1999)
    – Transatlantic network
        • Advanced-RIPE (1999)
        • SLAC-CERN (2000)
SLIDE 37

Local Network

  • Similar shapes but shifted in time

[Plot legend: 40, 200, 500, 1000 and 1500 byte packets]

SLIDE 38

Local Network

  • Linear up to MTU, then fragmentation
SLIDE 39

Trans-Atlantic connection

  • Linear up to MTU, larger packets dropped
SLIDE 40

Delays versus packet-size

  • Model: $D = a_0 + a_1 B$, for $B < \mathrm{MTU}$
  • Local: $a_1 = (8.09 \pm 0.10) \cdot 10^{-4}$ ms/byte, throughput $= (1.235 \pm 0.015)$ Mbyte/s
  • Transatlantic connection: $a_1 = (8.47 \pm 0.05) \cdot 10^{-3}$ ms/byte, throughput $= (118 \pm 2)$ kbyte/s
  • Does this explain the difference observed in the CERN-SLAC data?
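
The quoted throughput is simply the inverse of the fitted slope $a_1$: a slope in ms/byte inverts to byte/ms, which is numerically the same as kbyte/s. The sketch below fits the linear model by ordinary least squares and converts the slope. The packet sizes match those used in the local test below the MTU, but the delay values are synthetic (generated from the quoted local slope with a made-up intercept of 0.3 ms), and the function names are illustrative.

```python
def fit_line(sizes_bytes, delays_ms):
    """Ordinary least-squares fit of D = a0 + a1 * B; returns (a0, a1)."""
    n = len(sizes_bytes)
    mean_b = sum(sizes_bytes) / n
    mean_d = sum(delays_ms) / n
    sxx = sum((b - mean_b) ** 2 for b in sizes_bytes)
    sxy = sum((b - mean_b) * (d - mean_d) for b, d in zip(sizes_bytes, delays_ms))
    a1 = sxy / sxx
    a0 = mean_d - a1 * mean_b
    return a0, a1


def throughput_kbyte_per_s(a1_ms_per_byte):
    """Invert the slope: byte/ms is numerically equal to kbyte/s."""
    return 1.0 / a1_ms_per_byte


# Synthetic points on the line D = 0.3 + 8.09e-4 * B (the intercept is made up).
sizes = [40, 200, 500, 1000]
delays = [0.3 + 8.09e-4 * b for b in sizes]
a0, a1 = fit_line(sizes, delays)

local_kbyte_s = throughput_kbyte_per_s(8.09e-4)     # about 1236 kbyte/s
transatl_kbyte_s = throughput_kbyte_per_s(8.47e-3)  # about 118 kbyte/s
```

Both inverted slopes agree with the throughputs on the slide within the quoted errors.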

SLIDE 41

SLAC ⇒ CERN data

  • SLAC ⇒ CERN, March 28, 2000
  • Split data into 2 sub-samples
SLIDE 42

SLAC ⇒ CERN data

  • Extrapolate to 60 bytes difference: 0.14 ms
SLIDE 43

SLAC ⇒ CERN data

  • 0.2 ms difference
  • 0.14 ms can be explained by differences in packet-size
  • Further investigation needed on the remaining 0.06 ms
  • But this is less than 0.1% of the observed delay

  • Experimental errors O(0.02) ms.
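
The numbers on this slide can be tied together with a little arithmetic. The 60 byte difference is the gap between the 100 byte TTM packets and the 40 byte Surveyor packets. The implied slope and the delay bound computed below are derived values rather than figures stated in the talk, and the variable names are illustrative.

```python
observed_shift_ms = 0.2          # best-match shift between the two data sets
size_effect_ms = 0.14            # extrapolated effect of the packet-size difference
size_difference_bytes = 100 - 40

# What remains unexplained after accounting for packet size.
residual_ms = observed_shift_ms - size_effect_ms               # about 0.06 ms

# Slope implied by the extrapolation (derived, not quoted on the slides).
implied_slope_ms_per_byte = size_effect_ms / size_difference_bytes

# "Less than 0.1% of the observed delay" holds for any delay above this bound.
min_delay_for_claim_ms = residual_ms / 0.001                   # about 60 ms
```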
SLIDE 44

Outline

  • The problem
  • Theory behind one-way delay and loss measurements

  • The two experiments
  • Time-keeping
  • Comparing raw-data
  • Statistical approach to comparing data
  • Effect of packet-sizes on delays
  • Outlook and conclusions
SLIDE 45

Conclusion and outlook

  • All tests seem to indicate that the 2 setups measure the same delays and losses
  • Is this sufficient to meet the two independent implementations requirement?
    – Look at more paths, look for more unusual occurrences
    – Any other statistical tests that people consider useful?
  • Look at the effects of different sampling frequencies
  • These slides will be at http://www.ripe.net/test-traffic on Monday April 10
SLIDE 47

Phase Locked Loop

  • A PLL maintains a sense of time over a long period
    – Advantage: small glitches will not immediately affect the clock
    – Disadvantage: it takes a while before the clock is synchronized
  • The time difference between a pair of clocks will drift around a constant
    – Our software has a correction for this effect

SLIDE 48

Implementation

  • NTP
  • Kernel level implementation of the PLL
  • Home-built GPS receiver

– Based on Motorola’s Oncore-VP