Piotr Srebrny: Part I – Multicasting in the Internet, Part II – CacheCast


SLIDE 1

Piotr Srebrny

SLIDE 2

 Part I: Multicasting in the Internet

  • Basis & critique

 Part II: CacheCast

  • Internet redundancy
  • Packet caching systems
  • CacheCast design
  • CacheCast evaluation

    ▪ Efficiency
    ▪ Computational complexity
    ▪ Environmental impact

SLIDE 3
SLIDE 4
SLIDE 5
SLIDE 6
SLIDE 7

Vera Goebel, Daniel Rodríguez Fernández, Ellen Munthe-Kaas, Kjell Åge Bringsrud

SLIDE 8

Hi, I would like to invite you to the presentation on the IP Multicast issues.

SLIDE 9
SLIDE 10

Scalability was like the Holy Grail for the multicast community.

SLIDE 11

DMMS Group

SLIDE 12

DMMS Group

SLIDE 13
SLIDE 14
SLIDE 15
SLIDE 16
SLIDE 17

… and after 10 years of non-random mutations

SLIDE 18
SLIDE 19
SLIDE 20

DVMRP, PIM-DM, PIM-SM, MOSPF, MSDP, MLDv2, IGMPv2, IGMPv3, MBGP, BGP4+, MLDv3, AMT, PIM-SSM, PGM, MAAS, MADCAP, MASC, IGMP Snooping, CGMP, RGMP

SLIDE 21

http://multicasttech.com/status/ (obsolete)

SLIDE 22
SLIDE 23
SLIDE 24

Part II

SLIDE 25

 Internet redundancy
 Packet caching systems
 CacheCast design
 CacheCast evaluation

  • Efficiency
  • Computational complexity
  • Environmental impact

 Related system
 Summary

SLIDE 26

 The Internet is a content distribution network

Ipoque 2009 (http://www.ipoque.com/sites/default/files/mediafiles/documents/internet-study-2008-2009.pdf)

SLIDE 27

 A single source multiple destination transport mechanism becomes fundamental!

  • At present, the Internet does not provide an efficient multi-point transport mechanism

SLIDE 28

 “Datagram routing for internet multicasting”, L. Aguilar, 1984 – explicit list of destinations in the IP header

 “Host groups: A multicast extension for datagram internetworks”, D. Cheriton and S. Deering, 1985 – destination address denotes a group of hosts

 “A case for end system multicast”, Y.-H. Chu et al., 2000 – application layer multicast

SLIDE 29

 A server transmitting the same data to multiple destinations wastes Internet resources

  • The same data traverses the same path multiple times

[Figure: server S sends packets P_A, P_B, P_C, P_D carrying the same payload to destinations A, B, C, D]

SLIDE 30

 Consider two packets A and B that carry the same content and travel the same few hops

[Figure: packets A and B carrying payload P over the same sequence of hops]

SLIDE 31

 Consider two packets A and B that carry the same content and travel the same few hops

[Figure: with per-hop payload caches, packet B's payload need not traverse the shared hops again]

SLIDE 32

 In practice:

  • How to determine whether a packet payload is in the next hop cache?
  • How to compare packet payloads?
  • What size should the cache be?
SLIDE 33

SLIDE 34

Network elements:

 Link

  • Medium transporting packets

 Router

  • Switches data packets between links


SLIDE 35

 Link

  • Logical point-to-point connection
  • Highly robust & very deterministic
  • Throughput limitation per bit [bps]
  • It is beneficial to avoid redundant payload transmissions over a link

SLIDE 36

 Router

  • Switching node
  • Performs three elementary tasks per packet (see the sketch below)
    ▪ TTL update
    ▪ Checksum recalculation
    ▪ Destination IP address lookup
  • Throughput limitation per packet [pps]
  • Forwarding packets with redundant payload does not impact router performance
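The first two of these tasks are cheap enough to sketch in a few lines of C. The snippet below uses the standard incremental checksum update (RFC 1624), essentially the same trick the Linux kernel applies when decrementing TTL; the struct layout is a simplified illustration, not a real header definition.

#include <stdint.h>
#include <arpa/inet.h>

/* Simplified IPv4 header with only the fields relevant here (illustrative layout). */
struct ipv4_hdr {
    uint8_t  ver_ihl;
    uint8_t  tos;
    uint16_t tot_len;
    uint16_t id;
    uint16_t frag_off;
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t check;     /* one's-complement header checksum, network byte order */
    uint32_t saddr;
    uint32_t daddr;
};

/* Decrement TTL and patch the checksum incrementally rather than recomputing it.
 * Lowering TTL by 1 lowers the 16-bit word that holds TTL by 0x0100, so the
 * checksum (the complement of the sum) grows by 0x0100; the end-around carry
 * of one's-complement arithmetic is folded back in. */
static inline uint8_t ttl_update(struct ipv4_hdr *ip)
{
    uint32_t check = ip->check;

    check += htons(0x0100);
    ip->check = (uint16_t)(check + (check >= 0xFFFF));
    return --ip->ttl;
}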

SLIDE 37

 Caching is done on a per-link basis
 The Cache Management Unit (CMU) removes payloads that are already stored at the link exit
 The Cache Store Unit (CSU) restores payloads from a local cache

SLIDE 38

 Link cache processing must be simple

  • ~72ns to process a minimum-size packet on a 10Gbps link
  • Modern memory r/w cycle ~6-20ns

 Link cache size must be minimised

  • At present, a link queue is scaled to 250ms of the link traffic; for a 10Gbps link that is already 315MB
  • Difficult to build!

A source of redundant data must support link caches!

SLIDE 39

  • 1. The server can transmit packets carrying the same data within a minimum time interval
  • 2. The server can mark its redundant traffic
  • 3. The server can provide additional information that simplifies link cache processing

SLIDE 40

 A CacheCast packet carries an extension header describing the packet payload

  • Payload ID
  • Payload size
  • Index

 Only packets with the header are cached
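As an illustration of what such a header could carry, here is a possible C layout; the field names follow the slide, but the field widths are assumptions made for this sketch, not values defined by CacheCast.

#include <stdint.h>

/* Illustrative CacheCast extension header (field widths are assumed,
 * not taken from the CacheCast specification). */
struct cachecast_hdr {
    uint64_t payload_id;    /* identifies the payload carried or referenced by this packet */
    uint16_t payload_size;  /* size of the payload in bytes                                */
    uint16_t index;         /* slot in the link cache where the payload is (to be) stored  */
} __attribute__((packed));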

SLIDE 41

Packet train

  • Only the first packet carries the payload
  • The remaining packets are truncated to the header

SLIDE 42

 Packet train duration time
 It is sufficient to hold a payload in the CSU for the packet train duration time
 What is the maximum packet train duration time?

SLIDE 43

Back-of-the-envelope calculations

  • ~10ms caches are sufficient
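For scale, a rough calculation (my own arithmetic, not from the slides): a cache holding 10 ms of traffic on a 10 Gbps link needs about 12.5 MB, compared with roughly 313 MB for the 250 ms link queue mentioned on SLIDE 38.

\[
10\ \mathrm{Gbit/s} \times 10\ \mathrm{ms} = 10^{8}\ \mathrm{bit} = 12.5\ \mathrm{MB}
\qquad\text{vs.}\qquad
10\ \mathrm{Gbit/s} \times 250\ \mathrm{ms} = 2.5\ \mathrm{Gbit} \approx 313\ \mathrm{MB}
\]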

SLIDE 44

Two components of the CacheCast system

  • Server support
  • Distributed infrastructure of small link caches

[Figure: server S sends a packet train towards destinations A–D; each link is equipped with a CMU at its entry and a CSU at its exit]

SLIDE 45

SLIDE 46

  • Cache miss

[Figure: CMU table and CSU cache store; the arriving payload ID is not in the CMU table, so the full packet is forwarded and the payload is stored at the link exit]

SLIDE 47

  • Cache hit

[Figure: the arriving packet's payload ID matches an entry in the CMU table, so the CMU truncates the packet to its header and the CSU restores the payload from its cache store]

SLIDE 48

  • Cache miss

[Figure: as on SLIDE 46, the payload is not cached, so it is transmitted and stored]

What can go wrong?

SLIDE 49

  • Cache hit

[Figure: the CMU table and the CSU cache store are out of sync, so the CSU would restore the wrong payload]

How to protect against this error?
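The hit/miss handling sketched on SLIDES 46–49 can be summarised in a few lines of C. This is an illustrative sketch built only from what the slides state (a CMU table of payload IDs indexed by cache slot, a CSU cache store, and a header carrying payload ID, size, and index); the names, sizes, and the consistency check are assumptions, not the actual CacheCast implementation.

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 64          /* assumed number of slots per link cache      */
#define MAX_PAYLOAD 9000        /* assumed maximum payload size in bytes       */

struct cachecast_hdr {          /* illustrative header, cf. the sketch above   */
    uint64_t payload_id;
    uint16_t payload_size;
    uint16_t index;
};

/* CMU state (link entry): payload IDs believed to be stored at the link exit. */
static uint64_t cmu_table[CACHE_SLOTS];

/* CSU state (link exit): the payloads actually stored. */
static struct {
    uint64_t id;
    uint16_t size;
    uint8_t  data[MAX_PAYLOAD];
} csu_store[CACHE_SLOTS];

/* Link entry: cache hit -> strip the payload, cache miss -> record the ID and
 * forward the full packet. Returns true when the payload can be removed. */
bool cmu_process(const struct cachecast_hdr *hdr)
{
    if (cmu_table[hdr->index] == hdr->payload_id)
        return true;                         /* hit: send the header only      */
    cmu_table[hdr->index] = hdr->payload_id; /* miss: the CSU will store it    */
    return false;
}

/* Link exit: store the payload on a miss, restore it on a hit. Comparing the
 * stored ID with the header ID guards against the CMU/CSU mismatch shown on
 * SLIDE 49. Returns false if the payload cannot be restored. */
bool csu_process(const struct cachecast_hdr *hdr, bool carries_payload, uint8_t *payload)
{
    if (carries_payload) {
        csu_store[hdr->index].id = hdr->payload_id;
        csu_store[hdr->index].size = hdr->payload_size;
        memcpy(csu_store[hdr->index].data, payload, hdr->payload_size);
        return true;
    }
    if (csu_store[hdr->index].id != hdr->payload_id)
        return false;                        /* stale slot: drop the packet    */
    memcpy(payload, csu_store[hdr->index].data, hdr->payload_size);
    return true;
}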

SLIDE 50

 Tasks:

  • Batch transmissions of the same data to multiple destinations
  • Build the CacheCast headers
  • Transmit packets in the form of a packet train

 One system call to transmit data to all destinations

msend()

SLIDE 51

 msend() system call

  • Implemented in Linux
  • Simple API
  • fds_write – a set of file descriptors representing connections to data clients
  • fds_written – a set of file descriptors representing connections to clients that the data was transmitted to

int msend(fd_set *fds_write, fd_set *fds_written, char *buf, int len)
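A minimal usage sketch of the call, assuming the prototype shown above and already-connected client sockets; the helper function and its reporting behaviour are my own illustration, not code from the thesis.

#include <stdio.h>
#include <sys/select.h>

/* Assumed prototype of the CacheCast server-support system call (from SLIDE 51). */
int msend(fd_set *fds_write, fd_set *fds_written, char *buf, int len);

/* Send one chunk of a stream to all connected clients with a single call.
 * Clients that were not written to are simply reported; like the paraslash
 * example later in the deck, the chunk is not retransmitted to them. */
void send_chunk_to_all(const int *client_fds, int nclients, char *chunk, int len)
{
    fd_set to_write, written;

    FD_ZERO(&to_write);
    FD_ZERO(&written);
    for (int i = 0; i < nclients; i++)
        FD_SET(client_fds[i], &to_write);

    if (msend(&to_write, &written, chunk, len) < 0) {
        perror("msend");
        return;
    }
    for (int i = 0; i < nclients; i++)
        if (!FD_ISSET(client_fds[i], &written))
            fprintf(stderr, "client fd %d missed this chunk\n", client_fds[i]);
}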

SLIDE 52

 OS network stack

  • Connection endpoints represented as sockets
  • Transport layer (e.g. TCP, UDP, or DCCP)
  • Network layer (e.g. IP)
  • Link layer (e.g. Ethernet)
  • Network card driver
SLIDE 53

 msend() execution

SLIDE 54

SLIDE 55

Two aspects of the CacheCast system

 I. Efficiency

  • How much redundancy does CacheCast remove?

 II. Computational complexity

  • Can CacheCast be implemented efficiently with the present technology?

SLIDE 56

 CacheCast and ‘Perfect multicast’

  • ‘Perfect multicast’ – delivers data to multiple destinations without any overhead

 CacheCast overheads

  • I. Unique packet header per destination
  • II. Finite link cache size resulting in payload retransmissions
  • III. Partial deployment

SLIDE 57

The metric expresses the reduction in traffic volume:

\[
\delta_m = 1 - \frac{L_m}{L_u}
\]

where \(L_u\) is the total amount of unicast links and \(L_m\) is the total amount of multicast links.

 Example: \(L_u = 9\), \(L_m = 5\), so \(\delta_m = 1 - \tfrac{5}{9} = \tfrac{4}{9} \approx 44\%\)

SLIDE 58

A CacheCast packet consists of a unicast header part (of size \(s_h\)), which traverses all the unicast links, and a multicast payload part (of size \(s_p\)), which traverses only the multicast links. Thus:

\[
\delta_{CC} = 1 - \frac{s_h L_u + s_p L_m}{(s_h + s_p)\, L_u} = \delta_m \cdot \frac{1}{1+r}, \qquad r = \frac{s_h}{s_p}
\]

E.g. using packets where \(s_p = 1436\,\mathrm{B}\) and \(s_h = 64\,\mathrm{B}\), CacheCast achieves 96% of the ‘perfect multicast’ efficiency.

SLIDE 59

 Single link cache efficiency is related to the amount of redundancy that is removed

 Traffic volumes for a packet train of \(n\) packets:

  • traffic volume without CacheCast: \(V = n (s_p + s_h)\)
  • traffic volume with CacheCast: \(V_c = s_p + n s_h\)

\[
\delta = 1 - \frac{V_c}{V}
\]

SLIDE 60

 Link cache efficiency:

\[
\delta = 1 - \frac{s_p + n s_h}{n s_p + n s_h}
\]

 Thus:

\[
\delta = \left(1 - \frac{1}{n}\right) C, \qquad C = \frac{1}{1+r}, \quad r = \frac{s_h}{s_p}
\]
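A worked example of this formula (my own arithmetic, reusing the packet sizes from SLIDE 58): for a train of \(n = 10\) packets with \(s_h = 64\,\mathrm{B}\) and \(s_p = 1436\,\mathrm{B}\),

\[
C = \frac{1}{1 + 64/1436} \approx 0.957, \qquad
\delta = \left(1 - \tfrac{1}{10}\right) \cdot 0.957 \approx 0.86,
\]

i.e. the link cache removes roughly 86% of the traffic volume that plain unicast would generate.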

SLIDE 61

[Plot: link cache efficiency \(\delta = (1 - \tfrac{1}{n})\,C\) as a function of the number of packets \(n\)]

SLIDE 62

 The more destinations, the higher the efficiency
 E.g.

  • 512Kbps – 8 headers in 10ms, e.g. 12 destinations

 Slow sources transmitting to many destinations cannot achieve the maximum efficiency

[Figure: packet train to destinations A–L in which the payload P must be carried twice]

SLIDE 63

System efficiency δm for 10ms link caches

SLIDE 64

 Step I

SLIDE 65

 Step II

SLIDE 66

 Step III

SLIDE 67

 Step IV
 How could we improve?

SLIDE 68

CMU and CSU deployed partially

[Figure: server S and hops 1–6 along the path]

SLIDE 69

SLIDE 70

 Considering unique packet headers

  • CacheCast can achieve up to 96% of the ‘perfect multicast’ efficiency

 Considering finite cache size

  • 10ms link caches can remove most of the redundancy generated by fast sources

 Considering partial deployment

  • CacheCast deployed over the first five hops from a server already achieves half of the maximum efficiency

SLIDE 71

SLIDE 72

 Computational complexity may render CacheCast inefficient

 Implementations

  • Link cache elements – implemented as processing elements with the Click Modular Router software
  • Server support – a Linux system call and an auxiliary shell command tool

SLIDE 73

 CacheCast can be deployed as a software update

  • Click Modular Router software

 CacheCast router model

SLIDE 74

 Router configuration:

  • CSU – first element
  • CMU – last element

 Packet drops occur at the output link queue, i.e. before CMU processing

SLIDE 75

 Workload

  • Packet trains
  • Payload sizes: 500B, 1500B, 9000B, 16000B
  • Group size: 10, 100, 1000
SLIDE 76

 Due to the CSU and CMU elements, a CacheCast router cannot forward packet trains at line rate

SLIDE 77

 When compared with a standard router, a CacheCast router can forward more data

SLIDE 78

 The efficiency of the server support is related to the efficiency of the msend() system call

 Basic metric:

  • Time to transmit a single packet

 Compare the msend() system call with the standard send() system call

SLIDE 79

 Server transmitting to 100 destinations using

  • A loop of send() sys. calls
  • A single msend() sys. call

 The msend() system call outperforms the standard send() system call when transmitting to multiple destinations

SLIDE 80

 Paraslash audio streaming software (dccp_send.c, http://paraslash.systemlinux.org/)

while (written < len) {
        size_t num = len - written;

        if (num > DCCP_MAX_BYTES_PER_WRITE)
                num = DCCP_MAX_BYTES_PER_WRITE;
        msend(&dss->client_fds, &fdw, buf, num);
        written += num;
}
/* We drop chunks that we don't manage to send */

SLIDE 81

 Setup

  • Paraslash clients located at machines A and B gradually request an audio stream from the server S

SLIDE 82

 The original paraslash server can only handle 74 clients
 The CacheCast paraslash server can handle 1020 clients and more, depending on the chunk size
 Server load is reduced when using large chunks

SLIDE 83

SLIDE 84

 Internet congestion avoidance relies on communicating end-points that adjust their transmission rate to the network conditions

 CacheCast transparently removes redundancy, increasing network capacity

 It is not obvious how congestion control algorithms behave in the presence of CacheCast

SLIDE 85

 CacheCast implemented in ns-2
 Simulation setup:

  • Bottleneck link topology
  • 100 TCP flows and 100 TFRC flows
  • Link cache operating on a bottleneck link

SLIDE 86

 TCP flows consume the spare capacity
 TFRC flows increase end-to-end throughput

  • CacheCast preserves the Internet ‘fairness’

SLIDE 87

 “Packet caches on routers: the implications of universal redundant traffic elimination”, Ashok Anand, Archit Gupta, Aditya Akella, Srinivasan Seshan, and Scott Shenker, SIGCOMM’08

  • Fine-grain redundancy detection
    ▪ 10-50% removed redundancy
  • New redundancy-aware routing protocol
    ▪ Further 10-25% removed redundancy
  • Large caches
    ▪ Caching 10s of traffic traversing a link

SLIDE 88

 IP Multicast

  • Based on the host group model to achieve great scalability
  • Breaks the end-to-end model of Internet communication
  • Operates only in ‘walled gardens’

 CacheCast

  • Only removes redundant payload transmissions and preserves end-to-end connections
  • Can achieve near-multicast bandwidth savings
  • Is incrementally deployable
  • Preserves fairness in the Internet
  • Requires server support
SLIDE 89

 P. Srebrny, T. Plagemann, V. Goebel, and A. Mauthe, “CacheCast: Eliminating Redundant Link Traffic for Single Source Multiple Destination Transfer,” ICDCS 2010.

 A. Anand, A. Gupta, A. Akella, S. Seshan, and S. Shenker, “Packet caches on routers: the implications of universal redundant traffic elimination,” SIGCOMM 2008.

 J. Santos and D. Wetherall, “Increasing effective link bandwidth by suppressing replicated data,” USENIX 1998.