SLIDE 1

Outline

  • Introduction: Motivation, Research Questions
  • Methodology
  • Analysis: Paths, Deltas, Destination Pairs, Content Caches
  • Conclusion

Tracing the Path to YouTube: A Quantification of Path Lengths and Latencies towards Content Caches

Accepted for publication in IEEE Communications Magazine (Pre-print: http://in.tum.de/~doan/2018-yt-traces.pdf)

Trinh Viet Doan, Ljubica Pajević, Vaibhav Bajpai, Jörg Ott

Chair of Connected Mobility, Technical University of Munich

RIPE 77, Amsterdam, October 17, 2018


SLIDE 3

Introduction

Motivation

Previous work [2]:

◮ Measuring YouTube performance for popular videos

◮ Performance over IPv6 is worse than over IPv4

◮ Speculation: content caches not dual-stacked?

Figure 1: Difference of YouTube performance metrics over IPv4 and IPv6 (panels: TCP connect times ∆t [ms] for web and for audio/video; prebuffering duration ∆p [ms]; startup delay ∆s [ms]; time axis starting January 2015)

SLIDE 4

Introduction

Research Questions

1. How far are content caches from users?
2. How much benefit do these caches provide?
3. How do these metrics compare quantitatively over IPv4 and IPv6?


SLIDE 6

Methodology

Measurement Setup

Figure 2: Map of SamKnows probes

Figure 3: Example of measurement probe: SamKnows Whitebox 8.0¹

◮ ≈ 100 probes deployed around the world since 2014

◮ Deployed in dual-stacked residential networks, NRENs, business networks, research labs, data centers, IXPs, ...

◮ Active measurement studies from fixed-line networks

¹ https://blog.samknows.com/new-testing-superfast-broadband-27a7abcf1303 [accessed 2018-08-07]

SLIDE 7

Methodology

Targets and Metrics

◮ Hourly traceroute measurements over IPv4 & IPv6

◮ Using scamper [3] for Paris traceroute over ICMP

◮ Targets: YouTube media servers

◮ Media servers identified by youtube test [1] that mimics video streaming from YouTube

◮ DNS resolution for this streaming directly on the probe

  ⇒ Redirected to best/closest cache, determined by YouTube

◮ Identified IP addresses of media servers are passed to scamper for measurements

◮ Time period: since May 2016
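The hand-off from the streaming test to scamper can be sketched as follows. This is a minimal illustration, not the probes' actual tooling: the function name and the target addresses are hypothetical, and it assumes that scamper's `-c` option takes the measurement command, `-i` takes the target address, and `trace -P icmp-paris` selects ICMP-based Paris traceroute. The sketch only builds the command lines and does not execute them.

```python
import shlex

def scamper_command(target_ip):
    """Build one scamper invocation for an ICMP Paris traceroute.

    Assumption: "-c" passes the measurement command, "-i" the target
    address, and "trace -P icmp-paris" selects the probing method.
    """
    return ["scamper", "-c", "trace -P icmp-paris", "-i", target_ip]

# Hypothetical media-server addresses (documentation prefixes, not real caches).
targets = ["192.0.2.10", "2001:db8::10"]

for ip in targets:
    # Print the command line instead of running it.
    print(shlex.join(scamper_command(ip)))
```

Running one such command per address family every hour would correspond to the hourly IPv4 and IPv6 measurements listed above.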


SLIDE 9

Analysis

Paths

Figure 4: CDF of median IP path TTL and RTT (logarithmic x-axis from 1 to 100; series: RTT (IPv6) [ms], RTT (IPv4) [ms], TTL (IPv6), TTL (IPv4))

◮ Comparable number of paths observed

◮ 78% with TTL ≤ 12 (IPv4), ≤ 11 (IPv6)

  → IPv6 paths more often shorter

◮ 74% with RTT ≤ 25 ms over IPv4, 72% over IPv6

  → IPv6 more often slower

SLIDE 10

Analysis

Deltas

However, no direct comparison possible ⇒ look at destination pairs

unit_id | dtime | source | destination | status | ttl | endpoint | rtt
239416 | 2016-06-07 16:45:35 | 2001:67c:_:_:_:_:fef0:d612 | 2a00:1450:400f:f::a | COMPLETED | 9 | 2a00:1450:400f:f::a | 10.522
239416 | 2016-06-07 16:45:36 | 10.0.1.3 | 83.255.235.81 | COMPLETED | 7 | 83.255.235.81 | 13.178

Figure 5: Example for a destination pair

∆TTL = TTL_IPv4 − TTL_IPv6

∆RTT = RTT_IPv4 − RTT_IPv6
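As a worked example, the two definitions can be applied to the destination pair from Figure 5. The sketch below is illustrative only; the dictionary layout and function name are assumptions, while the TTL and RTT values are taken from the example rows.

```python
def pair_deltas(ipv4, ipv6):
    """Return (delta_ttl, delta_rtt), each computed as IPv4 minus IPv6."""
    return ipv4["ttl"] - ipv6["ttl"], ipv4["rtt"] - ipv6["rtt"]

# Values from the example destination pair (Figure 5).
ipv6_trace = {"ttl": 9, "rtt": 10.522}
ipv4_trace = {"ttl": 7, "rtt": 13.178}

delta_ttl, delta_rtt = pair_deltas(ipv4_trace, ipv6_trace)
print(delta_ttl, round(delta_rtt, 3))  # -2 2.656
```

With this sign convention, a negative delta means the IPv4 value is smaller. In the example, the IPv4 path is two hops shorter, yet the IPv6 path is about 2.7 ms faster.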

SLIDE 11

Analysis

Destination Pairs: General

Figure 6: CDF of median destination pair deltas (series: RTT delta [ms], TTL delta; left side: IPv6 slower, right side: IPv6 faster)

◮ TTL:

  ◮ 27% with ∆TTL < 0
  ◮ 33% with ∆TTL = 0
  ◮ 40% with ∆TTL > 0

◮ RTT:

  ◮ ≈ 50% with ∆RTT < 0
  ◮ ≈ 50% with ∆RTT > 0

SLIDE 12

Analysis

Destination Pairs: General

Figure 6: CDF of median destination pair deltas

◮ Overall:

  ◮ TTL: 91% within [-5; +5]
  ◮ RTT: 91% within [-20; +20] ms
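The shares reported for the destination pairs are plain fractions over the per-pair median deltas. A minimal sketch of that computation follows; the function names are hypothetical and the input values are made up for illustration, not taken from the dataset.

```python
def sign_shares(deltas):
    """Return the fractions of deltas that are negative, zero, and positive."""
    n = len(deltas)
    neg = sum(1 for d in deltas if d < 0) / n
    zero = sum(1 for d in deltas if d == 0) / n
    pos = sum(1 for d in deltas if d > 0) / n
    return neg, zero, pos

def share_within(deltas, bound):
    """Return the fraction of deltas that lie in [-bound, +bound]."""
    return sum(1 for d in deltas if -bound <= d <= bound) / len(deltas)

# Hypothetical median TTL deltas for ten destination pairs.
ttl_deltas = [-4, -2, -1, 0, 0, 0, 1, 1, 2, 7]

print(sign_shares(ttl_deltas))      # (0.3, 0.3, 0.4)
print(share_within(ttl_deltas, 5))  # 0.9
```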


SLIDE 14

Analysis

Content Caches

◮ Content caches usually deployed within ISP networks

◮ In close proximity to users to reduce latency

◮ How to identify caches?

  ◮ Matching AS numbers for source and destination

    → src ASN == dst ASN

  ◮ Reverse DNS lookups of destination IP addresses to retrieve human-readable hostnames

    → keywords: cache or ggc

  ◮ Lookups using RIPEstat²

² https://stat.ripe.net/
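The two identification heuristics can be combined into a small predicate. The sketch below assumes that either criterion alone is enough to label a destination as a cache (the slides do not say whether both must hold) and that AS numbers and reverse DNS names have already been looked up, for instance via RIPEstat. The function name, AS numbers, and hostnames are hypothetical.

```python
CACHE_KEYWORDS = ("cache", "ggc")

def is_cache(src_asn, dst_asn, dst_hostname):
    """Heuristic cache check.

    Assumption: a destination counts as a cache if it is in the same AS
    as the probe, or if its reverse DNS name contains a cache keyword.
    """
    same_as = src_asn is not None and src_asn == dst_asn
    name = (dst_hostname or "").lower()
    return same_as or any(keyword in name for keyword in CACHE_KEYWORDS)

# Hypothetical inputs using documentation AS numbers and example domains.
print(is_cache(64496, 64496, None))                  # True (same AS)
print(is_cache(64496, 64500, "ggc-1.example.net"))   # True (keyword match)
print(is_cache(64496, 64500, "server.example.net"))  # False
```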

SLIDE 15

Analysis

Destination Pairs: Caches

Possible scenarios for identification of caches when comparing between different address families.

               | IPv6: Cache | IPv6: No Cache
IPv4: Cache    | both (O)    | IPv4 only (△)
IPv4: No Cache | IPv6 only   | neither (•)
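The four scenarios follow directly from the per-address-family cache labels of a destination pair. A minimal sketch, with a hypothetical function name and the scenario labels used in the table:

```python
def pair_scenario(ipv4_is_cache, ipv6_is_cache):
    """Map the two per-family cache labels of a destination pair to a scenario."""
    if ipv4_is_cache and ipv6_is_cache:
        return "both"
    if ipv4_is_cache:
        return "IPv4 only"
    if ipv6_is_cache:
        return "IPv6 only"
    return "neither"

for v4 in (True, False):
    for v6 in (True, False):
        print(v4, v6, pair_scenario(v4, v6))
```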


SLIDE 19

Analysis

Destination Pairs: Caches

Figure 7: CDF of median destination pair deltas (split) (panels: TTL delta, RTT delta [ms]; series: IPv4 only, IPv6 only, both versions, neither version; left side: IPv6 slower, right side: IPv6 faster)

◮ IPv4 cache only (△): shifted to left side; RTT lower over IPv4 for ≈ 80%

◮ IPv6 cache only: paths shorter to IPv6 caches compared to IPv4 no-cache destinations; yet still higher RTTs in most cases

◮ Both (O): deltas converging towards zero; 60% of the time faster over IPv4, 40% of the time faster over IPv6, however ≈ 80% within [-1, +1] ms

SLIDE 20

Analysis

Content Caches: Distributions

Figure 8: CDF of cache vs no cache path values for all traces (TTL) (series: no cache (IPv4), cache (IPv4), no cache (IPv6), cache (IPv6))

◮ ≈ 100% of ISP caches reachable within 7 IP hops

◮ Cache vs no cache:

  ◮ ≤ 6 IP hops for ≈ 90% of the cache measurements
  ◮ ≤ 12 IP hops for ≈ 89% of the no cache measurements

SLIDE 21

Analysis

Content Caches: Distributions

Figure 9: CDF of cache vs no cache path values for all traces (RTT) (series: no cache (IPv4), cache (IPv4), no cache (IPv6), cache (IPv6))

◮ Majority of caches reachable within 20 ms (87%)

◮ For 80% of the measurements (no cache → cache):

  ◮ IPv4: 25 ms → 17 ms; ≈ 1/3 improvement
  ◮ IPv6: 29 ms → 16 ms; ≈ 1/2 improvement
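The approximate fractions can be checked from the RTT values quoted for 80% of the measurements. A short sketch, with a hypothetical function name, that computes the relative reduction when going from no cache to cache:

```python
def relative_improvement(no_cache_ms, cache_ms):
    """Return the RTT reduction as a fraction of the no-cache RTT."""
    return (no_cache_ms - cache_ms) / no_cache_ms

print(round(relative_improvement(25, 17), 2))  # 0.32, roughly 1/3 (IPv4)
print(round(relative_improvement(29, 16), 2))  # 0.45, roughly 1/2 (IPv6)
```

The absolute reductions are 8 ms over IPv4 and 13 ms over IPv6, so the relative gain from caches is larger over IPv6.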



SLIDE 26

Conclusion

1. Distance of caches?

◮ Caches within 6 IP hops and 20 ms over both IPv4 and IPv6

2. & 3. Benefits of caches? Performance over IPv4 vs IPv6?

◮ IP path length: up to 6 hops lower (i.e. 1/2) for both IPv4 and IPv6

◮ Latency: up to ≈ 10 ms lower; relative improvement of IPv6 caches higher

  ◮ IPv4: up to 8 ms (1/3); IPv6: up to 13 ms (1/2)

◮ Surprise: IPv6 caches higher RTT than IPv4 non-caches despite lower TTL

Takeaways: Room for improvement regarding IPv6 content delivery:

◮ Ensure caches are dual-stacked within ISP networks (see the IPv4-only (△) and IPv6-only cases)

◮ Optimize delivery regarding performance, routing, forwarding, ...

◮ Caches are not the end of the story regarding IPv4 and IPv6 discrepancy

Dataset and code publicly available at: https://github.com/tv-doan/youtube-traceroutes

doan@in.tum.de

SLIDE 27

References

[1] Ahsan, S., Bajpai, V., Ott, J., and Schönwälder, J. Measuring YouTube from Dual-Stacked Hosts. In PAM (2015), vol. 8995 of Lecture Notes in Computer Science, Springer, pp. 249–261. https://doi.org/10.1007/978-3-319-15509-8_19.

[2] Bajpai, V., Ahsan, S., Schönwälder, J., and Ott, J. Measuring YouTube Content Delivery over IPv6. Computer Communication Review 47, 5 (2017), 2–11. http://doi.acm.org/10.1145/3155055.3155057.

[3] Luckie, M. J. Scamper: a Scalable and Extensible Packet Prober for Active Measurement of the Internet. In Internet Measurement Conference (2010), ACM, pp. 239–245. http://doi.acm.org/10.1145/1879141.1879171.

SLIDE 28

Appendix

Backup Slides


SLIDE 29

Appendix

Analysis

Temporal View

Figure 10: Boxplots of path TTL and RTT values, aggregated by month (IPv4 and IPv6; May 2016 to May 2018)

◮ Median TTL across all months: 7 IP hops (both IPv4 and IPv6)

◮ Median RTT across all months: 9.9 ms (IPv4), 10.7 ms (IPv6)

SLIDE 30

Appendix

Analysis

Intermediate IP Hops

Figure 11: CDF of all TTL values by version and AS type³ (series: IPv4 Content AS, IPv6 Content AS, IPv4 Transit AS, IPv6 Transit AS)

TTL ≈ 7 as a separator for both IPv4 and IPv6:

◮ Transit/Access ASes: TTL ≤ 7 for 93%

◮ Content ASes: TTL ≥ 7 for 85%

³ CAIDA AS Classification: https://www.caida.org/data/as-classification/

SLIDE 31

Appendix

Analysis

Intermediate IP Hops

Figure 12: Boxplots of RTT by TTL (panels: IPv4, IPv6; RTT [ms] on a logarithmic axis from 0.1 to 100; TTL from 1 to 22)

◮ Destination reached in TTL < 7 (blue gradient): ISP cache in Transit/Access AS

◮ Destination reached in TTL > 7 (orange gradient): Origin content server in Content AS