Piotr Srebrny: Part I – Multicasting in the Internet, Part II – CacheCast


SLIDE 1

Piotr Srebrny

SLIDE 2

 Part I: Multicasting in the Internet

  • Basis & critique

 Part II: CacheCast

  • Internet redundancy
  • Packet caching systems
  • CacheCast design
  • CacheCast evaluation

    ▪ Efficiency
    ▪ Computational complexity
    ▪ Environmental impact

SLIDE 3
SLIDE 4
SLIDE 5
SLIDE 6
SLIDE 7

Vera Goebel, Daniel Rodríguez Fernández, Ellen Munthe-Kaas, Kjell Åge Bringsrud

SLIDE 8

Hi, I would like to invite you to the presentation on the IP Multicast issues.

SLIDE 9
SLIDE 10

Scalability was like the Holy Grail for the multicast community.

SLIDE 11

DMMS Group

SLIDE 12

DMMS Group

SLIDE 13
SLIDE 14
SLIDE 15
SLIDE 16
SLIDE 17

… and after 10 years of non-random mutations

SLIDE 18
SLIDE 19
SLIDE 20

DVMRP, PIM-DM, PIM-SM, MOSPF, MSDP, MLDv2, IGMPv2, IGMPv3, MBGP, BGP4+, MLDv3, AMT, PIM-SSM, PGM, MAAS, MADCAP, MASC, IGMP Snooping, CGMP, RGMP

SLIDE 21

http://multicasttech.com/status/ (obsolete)

SLIDE 22
SLIDE 23
SLIDE 24

Part II

SLIDE 25

 Internet redundancy
 Packet caching systems
 CacheCast design
 CacheCast evaluation

  • Efficiency
  • Computational complexity
  • Environmental impact

 Related system
 Summary

SLIDE 26

 The Internet is a content distribution network

Ipoque 2009 (http://www.ipoque.com/sites/default/files/mediafiles/documents/internet-study-2008-2009.pdf)

SLIDE 27

 A single source multiple destination transport mechanism becomes fundamental!

  • At present, the Internet does not provide an efficient multi-point transport mechanism

SLIDE 28

 “Datagram routing for internet multicasting”, L. Aguilar, 1984 – explicit list of destinations in the IP header

 “Host groups: A multicast extension for datagram internetworks”, D. Cheriton and S. Deering, 1985 – destination address denotes a group of hosts

 “A case for end system multicast”, Y.-H. Chu et al., 2000 – application layer multicast

SLIDE 29

 A server transmitting the same data to multiple destinations wastes Internet resources

  • The same data traverses the same path multiple times

[Figure: server S sends packets P_A, P_B, P_C, P_D carrying the same payload to destinations A, B, C, D]

SLIDE 30

 Consider two packets A and B that carry the same content and travel the same few hops

[Figure: packets A and B carrying payload P over the same sequence of hops]

SLIDE 31

 Consider two packets A and B that carry the same content and travel the same few hops

[Figure: with per-hop payload caches, packet B's payload need not traverse the shared hops again]

SLIDE 32

 In practice:

  • How to determine whether a packet payload is in the next hop cache?
  • How to compare packet payloads?
  • What size should the cache be?
SLIDE 33

SLIDE 34

Network elements:

 Link

  • Medium transporting packets

 Router

  • Switches data packets between links


SLIDE 35

 Link

  • Logical point-to-point connection
  • Highly robust & very deterministic
  • Throughput limitation per bit [bps]
  • It is beneficial to avoid redundant payload transmissions over a link

SLIDE 36

 Router

  • Switching node
  • Performs three elementary tasks per packet (see the sketch below)
    ▪ TTL update
    ▪ Checksum recalculation
    ▪ Destination IP address lookup
  • Throughput limitation per packet [pps]
  • Forwarding packets with redundant payload does not impact router performance
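The first two of these tasks are cheap enough to sketch in a few lines of C. The snippet below uses the standard incremental checksum update (RFC 1624), essentially the same trick the Linux kernel applies when decrementing TTL; the struct layout is a simplified illustration, not a real header definition.

#include <stdint.h>
#include <arpa/inet.h>

/* Simplified IPv4 header with only the fields relevant here (illustrative layout). */
struct ipv4_hdr {
    uint8_t  ver_ihl;
    uint8_t  tos;
    uint16_t tot_len;
    uint16_t id;
    uint16_t frag_off;
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t check;     /* one's-complement header checksum, network byte order */
    uint32_t saddr;
    uint32_t daddr;
};

/* Decrement TTL and patch the checksum incrementally rather than recomputing it.
 * Lowering TTL by 1 lowers the 16-bit word that holds TTL by 0x0100, so the
 * checksum (the complement of the sum) grows by 0x0100; the end-around carry
 * of one's-complement arithmetic is folded back in. */
static inline uint8_t ttl_update(struct ipv4_hdr *ip)
{
    uint32_t check = ip->check;

    check += htons(0x0100);
    ip->check = (uint16_t)(check + (check >= 0xFFFF));
    return --ip->ttl;
}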

SLIDE 37

 Caching is done on a per-link basis
 The Cache Management Unit (CMU) removes payloads that are already stored at the link exit
 The Cache Store Unit (CSU) restores payloads from a local cache

SLIDE 38

 Link cache processing must be simple

  • ~72ns to process a minimum-size packet on a 10Gbps link
  • Modern memory r/w cycle ~6-20ns

 Link cache size must be minimised

  • At present, a link queue is scaled to 250ms of the link traffic; for a 10Gbps link that is already 315MB
  • Difficult to build!

A source of redundant data must support link caches!

SLIDE 39

  • 1. The server can transmit packets carrying the same data within a minimum time interval
  • 2. The server can mark its redundant traffic
  • 3. The server can provide additional information that simplifies link cache processing

SLIDE 40

 A CacheCast packet carries an extension header describing the packet payload

  • Payload ID
  • Payload size
  • Index

 Only packets with the header are cached
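As an illustration of what such a header could carry, here is a possible C layout; the field names follow the slide, but the field widths are assumptions made for this sketch, not values defined by CacheCast.

#include <stdint.h>

/* Illustrative CacheCast extension header (field widths are assumed,
 * not taken from the CacheCast specification). */
struct cachecast_hdr {
    uint64_t payload_id;    /* identifies the payload carried or referenced by this packet */
    uint16_t payload_size;  /* size of the payload in bytes                                */
    uint16_t index;         /* slot in the link cache where the payload is (to be) stored  */
} __attribute__((packed));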

SLIDE 41

Packet train

  • Only the first packet carries the payload
  • The remaining packets are truncated to the header

SLIDE 42

 Packet train duration time
 It is sufficient to hold a payload in the CSU for the packet train duration time
 What is the maximum packet train duration time?

SLIDE 43

Back-of-the-envelope calculations

  • ~10ms caches are sufficient
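For scale, a rough calculation (my own arithmetic, not from the slides): a cache holding 10 ms of traffic on a 10 Gbps link needs about 12.5 MB, compared with roughly 313 MB for the 250 ms link queue mentioned on SLIDE 38.

\[
10\ \mathrm{Gbit/s} \times 10\ \mathrm{ms} = 10^{8}\ \mathrm{bit} = 12.5\ \mathrm{MB}
\qquad\text{vs.}\qquad
10\ \mathrm{Gbit/s} \times 250\ \mathrm{ms} = 2.5\ \mathrm{Gbit} \approx 313\ \mathrm{MB}
\]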

SLIDE 44

Two components of the CacheCast system

  • Server support
  • Distributed infrastructure of small link caches

[Figure: server S sends a packet train towards destinations A–D; each link is equipped with a CMU at its entry and a CSU at its exit]

SLIDE 45

SLIDE 46

  • Cache miss

[Figure: CMU table and CSU cache store; the arriving payload ID is not in the CMU table, so the full packet is forwarded and the payload is stored at the link exit]

SLIDE 47

  • Cache hit

[Figure: the arriving packet's payload ID matches an entry in the CMU table, so the CMU truncates the packet to its header and the CSU restores the payload from its cache store]

SLIDE 48

  • Cache miss

[Figure: as on SLIDE 46, the payload is not cached, so it is transmitted and stored]

What can go wrong?

SLIDE 49

  • Cache hit

[Figure: the CMU table and the CSU cache store are out of sync, so the CSU would restore the wrong payload]

How to protect against this error?
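The hit/miss handling sketched on SLIDES 46–49 can be summarised in a few lines of C. This is an illustrative sketch built only from what the slides state (a CMU table of payload IDs indexed by cache slot, a CSU cache store, and a header carrying payload ID, size, and index); the names, sizes, and the consistency check are assumptions, not the actual CacheCast implementation.

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 64          /* assumed number of slots per link cache      */
#define MAX_PAYLOAD 9000        /* assumed maximum payload size in bytes       */

struct cachecast_hdr {          /* illustrative header, cf. the sketch above   */
    uint64_t payload_id;
    uint16_t payload_size;
    uint16_t index;
};

/* CMU state (link entry): payload IDs believed to be stored at the link exit. */
static uint64_t cmu_table[CACHE_SLOTS];

/* CSU state (link exit): the payloads actually stored. */
static struct {
    uint64_t id;
    uint16_t size;
    uint8_t  data[MAX_PAYLOAD];
} csu_store[CACHE_SLOTS];

/* Link entry: cache hit -> strip the payload, cache miss -> record the ID and
 * forward the full packet. Returns true when the payload can be removed. */
bool cmu_process(const struct cachecast_hdr *hdr)
{
    if (cmu_table[hdr->index] == hdr->payload_id)
        return true;                         /* hit: send the header only      */
    cmu_table[hdr->index] = hdr->payload_id; /* miss: the CSU will store it    */
    return false;
}

/* Link exit: store the payload on a miss, restore it on a hit. Comparing the
 * stored ID with the header ID guards against the CMU/CSU mismatch shown on
 * SLIDE 49. Returns false if the payload cannot be restored. */
bool csu_process(const struct cachecast_hdr *hdr, bool carries_payload, uint8_t *payload)
{
    if (carries_payload) {
        csu_store[hdr->index].id = hdr->payload_id;
        csu_store[hdr->index].size = hdr->payload_size;
        memcpy(csu_store[hdr->index].data, payload, hdr->payload_size);
        return true;
    }
    if (csu_store[hdr->index].id != hdr->payload_id)
        return false;                        /* stale slot: drop the packet    */
    memcpy(payload, csu_store[hdr->index].data, hdr->payload_size);
    return true;
}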

SLIDE 50

 Tasks:

  • Batch transmissions of the same data to multiple destinations
  • Build the CacheCast headers
  • Transmit packets in the form of a packet train

 One system call to transmit data to all destinations

msend()

SLIDE 51

 msend() system call

  • Implemented in Linux
  • Simple API
  • fds_write – a set of file descriptors representing connections to data clients
  • fds_written – a set of file descriptors representing connections to clients that the data was transmitted to

int msend(fd_set *fds_write, fd_set *fds_written, char *buf, int len)
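A minimal usage sketch of the call, assuming the prototype shown above and already-connected client sockets; the helper function and its reporting behaviour are my own illustration, not code from the thesis.

#include <stdio.h>
#include <sys/select.h>

/* Assumed prototype of the CacheCast server-support system call (from SLIDE 51). */
int msend(fd_set *fds_write, fd_set *fds_written, char *buf, int len);

/* Send one chunk of a stream to all connected clients with a single call.
 * Clients that were not written to are simply reported; like the paraslash
 * example later in the deck, the chunk is not retransmitted to them. */
void send_chunk_to_all(const int *client_fds, int nclients, char *chunk, int len)
{
    fd_set to_write, written;

    FD_ZERO(&to_write);
    FD_ZERO(&written);
    for (int i = 0; i < nclients; i++)
        FD_SET(client_fds[i], &to_write);

    if (msend(&to_write, &written, chunk, len) < 0) {
        perror("msend");
        return;
    }
    for (int i = 0; i < nclients; i++)
        if (!FD_ISSET(client_fds[i], &written))
            fprintf(stderr, "client fd %d missed this chunk\n", client_fds[i]);
}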

SLIDE 52

 OS network stack

  • Connection endpoints represented as sockets
  • Transport layer (e.g. TCP, UDP, or DCCP)
  • Network layer (e.g. IP)
  • Link layer (e.g. Ethernet)
  • Network card driver
SLIDE 53

 msend() execution

SLIDE 54

SLIDE 55

Two aspects of the CacheCast system

 I. Efficiency

  • How much redundancy does CacheCast remove?

 II. Computational complexity

  • Can CacheCast be implemented efficiently with the present technology?

SLIDE 56

 CacheCast and ‘Perfect multicast’

  • ‘Perfect multicast’ – delivers data to multiple destinations without any overhead

 CacheCast overheads

  • I. Unique packet header per destination
  • II. Finite link cache size resulting in payload retransmissions
  • III. Partial deployment

SLIDE 57

The metric expresses the reduction in traffic volume:

\[
\delta_m = 1 - \frac{L_m}{L_u}
\]

where \(L_u\) is the total amount of unicast links and \(L_m\) is the total amount of multicast links.

 Example: \(L_u = 9\), \(L_m = 5\), so \(\delta_m = 1 - \tfrac{5}{9} = \tfrac{4}{9} \approx 44\%\)

SLIDE 58

A CacheCast packet consists of a unicast header part (of size \(s_h\)), which traverses all the unicast links, and a multicast payload part (of size \(s_p\)), which traverses only the multicast links. Thus:

\[
\delta_{CC} = 1 - \frac{s_h L_u + s_p L_m}{(s_h + s_p)\, L_u} = \delta_m \cdot \frac{1}{1+r}, \qquad r = \frac{s_h}{s_p}
\]

E.g. using packets where \(s_p = 1436\,\mathrm{B}\) and \(s_h = 64\,\mathrm{B}\), CacheCast achieves 96% of the ‘perfect multicast’ efficiency.

SLIDE 59

 Single link cache efficiency is related to the amount of redundancy that is removed

 Traffic volumes for a packet train of \(n\) packets:

  • traffic volume without CacheCast: \(V = n (s_p + s_h)\)
  • traffic volume with CacheCast: \(V_c = s_p + n s_h\)

\[
\delta = 1 - \frac{V_c}{V}
\]

SLIDE 60

 Link cache efficiency:

\[
\delta = 1 - \frac{s_p + n s_h}{n s_p + n s_h}
\]

 Thus:

\[
\delta = \left(1 - \frac{1}{n}\right) C, \qquad C = \frac{1}{1+r}, \quad r = \frac{s_h}{s_p}
\]
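A worked example of this formula (my own arithmetic, reusing the packet sizes from SLIDE 58): for a train of \(n = 10\) packets with \(s_h = 64\,\mathrm{B}\) and \(s_p = 1436\,\mathrm{B}\),

\[
C = \frac{1}{1 + 64/1436} \approx 0.957, \qquad
\delta = \left(1 - \tfrac{1}{10}\right) \cdot 0.957 \approx 0.86,
\]

i.e. the link cache removes roughly 86% of the traffic volume that plain unicast would generate.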

SLIDE 61

[Plot: link cache efficiency \(\delta = (1 - \tfrac{1}{n})\,C\) as a function of the number of packets \(n\)]

SLIDE 62

 The more destinations, the higher the efficiency
 E.g.

  • 512Kbps – 8 headers in 10ms, e.g. 12 destinations

 Slow sources transmitting to many destinations cannot achieve the maximum efficiency

[Figure: packet train to destinations A–L in which the payload P must be carried twice]

SLIDE 63

System efficiency δm for 10ms link caches

SLIDE 64

 Step I

SLIDE 65

 Step II

SLIDE 66

 Step III

SLIDE 67

 Step IV
 How could we improve?

SLIDE 68

CMU and CSU deployed partially

[Figure: server S and hops 1–6 along the path]

SLIDE 69

SLIDE 70

 Considering unique packet headers

  • CacheCast can achieve up to 96% of the ‘perfect multicast’ efficiency

 Considering finite cache size

  • 10ms link caches can remove most of the redundancy generated by fast sources

 Considering partial deployment

  • CacheCast deployed over the first five hops from a server already achieves half of the maximum efficiency

SLIDE 71

SLIDE 72

 Computational complexity may render CacheCast inefficient

 Implementations

  • Link cache elements – implemented as processing elements with the Click Modular Router software
  • Server support – a Linux system call and an auxiliary shell command tool

SLIDE 73

 CacheCast can be deployed as a software update

  • Click Modular Router software

 CacheCast router model

SLIDE 74

 Router configuration:

  • CSU – first element
  • CMU – last element

 Packet drops occur at the output link queue, i.e. before CMU processing

SLIDE 75

 Workload

  • Packet trains
  • Payload sizes: 500B, 1500B, 9000B, 16000B
  • Group size: 10, 100, 1000
SLIDE 76

 Due to the CSU and CMU elements, a CacheCast router cannot forward packet trains at line rate

SLIDE 77

 When compared with a standard router, a CacheCast router can forward more data

SLIDE 78

 The efficiency of the server support is related to the efficiency of the msend() system call

 Basic metric:

  • Time to transmit a single packet

 Compare the msend() system call with the standard send() system call

SLIDE 79

 Server transmitting to 100 destinations using

  • A loop of send() sys. calls
  • A single msend() sys. call

 The msend() system call outperforms the standard send() system call when transmitting to multiple destinations

SLIDE 80

 Paraslash audio streaming software (dccp_send.c, http://paraslash.systemlinux.org/)

while (written < len) {
        size_t num = len - written;

        if (num > DCCP_MAX_BYTES_PER_WRITE)
                num = DCCP_MAX_BYTES_PER_WRITE;
        msend(&dss->client_fds, &fdw, buf, num);
        written += num;
}
/* We drop chunks that we don't manage to send */

SLIDE 81

 Setup

  • Paraslash clients located at machines A and B gradually request an audio stream from the server S

SLIDE 82

 The original paraslash server can only handle 74 clients
 The CacheCast paraslash server can handle 1020 clients and more, depending on the chunk size
 Server load is reduced when using large chunks

SLIDE 83

SLIDE 84

 Internet congestion avoidance relies on communicating end-points that adjust their transmission rate to the network conditions

 CacheCast transparently removes redundancy, increasing network capacity

 It is not obvious how congestion control algorithms behave in the presence of CacheCast

SLIDE 85

 CacheCast implemented in ns-2
 Simulation setup:

  • Bottleneck link topology
  • 100 TCP flows and 100 TFRC flows
  • Link cache operating on a bottleneck link

SLIDE 86

 TCP flows consume the spare capacity
 TFRC flows increase end-to-end throughput

  • CacheCast preserves the Internet ‘fairness’

SLIDE 87

 “Packet caches on routers: the implications of universal redundant traffic elimination”, Ashok Anand, Archit Gupta, Aditya Akella, Srinivasan Seshan, and Scott Shenker, SIGCOMM’08

  • Fine-grain redundancy detection
    ▪ 10-50% removed redundancy
  • New redundancy-aware routing protocol
    ▪ Further 10-25% removed redundancy
  • Large caches
    ▪ Caching 10s of traffic traversing a link

SLIDE 88

 IP Multicast

  • Based on the host group model to achieve great scalability
  • Breaks the end-to-end model of Internet communication
  • Operates only in ‘walled gardens’

 CacheCast

  • Only removes redundant payload transmissions and preserves end-to-end connections
  • Can achieve near-multicast bandwidth savings
  • Is incrementally deployable
  • Preserves fairness in the Internet
  • Requires server support
SLIDE 89

 P. Srebrny, T. Plagemann, V. Goebel, and A. Mauthe, “CacheCast: Eliminating Redundant Link Traffic for Single Source Multiple Destination Transfer,” ICDCS 2010.

 A. Anand, A. Gupta, A. Akella, S. Seshan, and S. Shenker, “Packet caches on routers: the implications of universal redundant traffic elimination,” SIGCOMM 2008.

 J. Santos and D. Wetherall, “Increasing effective link bandwidth by suppressing replicated data,” USENIX 1998.