Invited Talk:
Epidemic Protocols for Extreme-scale Computing
- Dr. Giuseppe Di Fatta
G.DiFatta@reading.ac.uk
Wednesday, September 24, 2014
Global Knowledge without Global Communication
Outline

Extreme-scale Computing
Extreme-scale systems:
– exascale supercomputing (HPC)
– Tianhe-2 (MilkyWay-2): National Supercomputer Center, Sun Yat-sen University, Guangzhou, China, Top500 N.1 since June 2013, 34/55 Pflop/s, 3.12M cores
– Ubiquitous Computing, Crowd Sensing, P2P Overlay Networks
– Internet of Things (50 to 100 trillion objects)
– Decentralised Online Social Networks
– Large-scale Wireless Sensor Networks
– Mobile ad-hoc Networks (MANET)
– Vehicular Ad-Hoc Networks (VANET)

Scale in terms of:
– number of data objects
– dimensionality of data objects
– number of processing elements

Requirements:
– Scalability of the communication cost
– Decentralisation
– Robustness and fault-tolerance
– Adaptiveness: ability to cope with dynamic environments
Communication-bound
– high scalability
– probabilistic guarantees on convergence speed and accuracy
– robustness, fault-tolerance, high stability under disruption
Figure from: "A preliminary estimation of the reproduction ratio for new influenza A(H1N1) from the outbreak in Mexico, March-April 2009", P. Y. Boëlle, P. Bernillon, J. C. Desenclos, Eurosurveillance, Volume 14, Issue 19, 14 May 2009
An outbreak is an epidemic when cases exceed a "normal" expectation of propagation (a contained propagation).
– The disease spreads person-to-person: the affected individuals become independent reservoirs leading to further exposures.
– In uncontrolled outbreaks there is an exponential growth of the infected cases.
Figure from: “Controlling infectious disease outbreaks: Lessons from mathematical modelling”, T Déirdre Hollingsworth, Journal of Public Health Policy 30, 328-341, Sept. 2009
Disease outbreak → Epidemic communication for extreme-scale computing
Peers are selected uniformly at random, which allows the epidemic protocol to function in a practical way.
Active thread (cycle-based):
– wait some T
– choose a random peer
– send local state

Passive thread (event-based):
– receive remote state
– if remote state == infected, then local state = infected
Time to complete "infection": O(log N)
[Chart: expected # protocol cycles vs. # peers]
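The O(log N) behaviour above can be illustrated with a small simulation of the SI push protocol (a hypothetical sketch, not the speaker's code; `si_push_rounds` and its parameters are assumed names):

```python
import random

def si_push_rounds(n, seed=0):
    """Simulate the SI push protocol on n peers: in each cycle every
    infected peer sends its state to one peer chosen uniformly at
    random; return the number of cycles until all peers are infected."""
    rng = random.Random(seed)
    infected = {0}                      # a single initially infected peer
    cycles = 0
    while len(infected) < n:
        # each infected peer picks a uniform random target
        targets = {rng.randrange(n) for _ in range(len(infected))}
        infected |= targets
        cycles += 1
    return cycles
```

Since the infected set can at most double per cycle, at least log2(N) cycles are needed; the simulation typically completes in a few tens of cycles for N = 1024, consistent with O(log N).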
Several epidemic protocols and their applications in communication networks and distributed systems have been studied, but some practical aspects have received limited attention.
Common simplifying assumptions:
– simplistic synchronous communication model with a static/reliable network
– unrealistic global knowledge of the networked system
– the initial overlay topology is a random graph
– unlimited or "enough" protocol rounds to reach convergence
These assumptions may have serious implications for convergence speed and, even worse, for the convergence guarantee itself.
Global properties of the system typically are not known locally.
fault-tolerant services, such as:
– information dissemination (broadcast, multicast)
– data aggregation: values of aggregate functions are more important than individual data (sum, average, sampling, percentiles, etc.)
– DB replica synchronisation and maintenance
– Network management and monitoring
– Failure detection
– HPC algorithms and services, e.g., QR factorization and power-capping
– Epidemic Knowledge Discovery and Data Mining
K-Means with All-Reduce (data are intrinsically distributed across processes P0 … P3):
– initialisation: generate centroids for the first iteration (Broadcast)
– each process computes local clusters: partial sums
– All-Reduce combines the partial sums into the centroids for the next iteration
– repeat until convergence
Global communication is not a feasible approach for extreme-scale systems
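The iteration described above can be sketched with 1-D data and a plain Python sum standing in for the All-Reduce step (a hypothetical sketch; `kmeans_allreduce` and its parameters are assumed names):

```python
def kmeans_allreduce(partitions, centroids, iters=10):
    """Parallel K-Means sketch: each process computes partial sums of
    its local 1-D data per cluster; a global reduction (a plain sum
    here, standing in for All-Reduce) yields the next centroids."""
    k = len(centroids)
    for _ in range(iters):
        partials = []
        for data in partitions:          # one iteration per "process"
            sums, counts = [0.0] * k, [0] * k
            for x in data:
                c = min(range(k), key=lambda c: (x - centroids[c]) ** 2)
                sums[c] += x
                counts[c] += 1
            partials.append((sums, counts))
        # All-Reduce: combine partial sums and counts from all processes
        tot_s = [sum(p[0][c] for p in partials) for c in range(k)]
        tot_n = [sum(p[1][c] for p in partials) for c in range(k)]
        centroids = [tot_s[c] / tot_n[c] if tot_n[c] else centroids[c]
                     for c in range(k)]
    return centroids
```

Only the small per-cluster partial sums and counts cross process boundaries, never the raw data; the cost of the global reduction itself is what the epidemic variant later removes.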
Epidemic K-Means (data are intrinsically distributed across processes P0 … P3):
– initialisation: every process generates the centroids for the first iteration (Epidemic broadcast)
– each process computes local clusters: partial sums
– Epidemic Aggregation of sums, counts and errors yields the centroids for the next iteration
– repeat until convergence
(or static list of seeds for multiple executions)
dependent allocation (d).
[Charts: clustering accuracy and clustering error (average and standard deviation) vs. cluster distribution (Jain index), under skewed and uniform data distributions, comparing the epidemic, random P2P and local P2P approaches.]
– sum, average, max, min, random samples, quantiles, etc.
A sequential computation of a global aggregate, e.g. a chain of additions, must be serialized: O(N).
binary tree: the number of communication steps is reduced from O(N) to O(log(N)).
The global average: x̄ = (1/N) Σ_{i=1..N} x_i
[Diagram: binary-tree reduction over the values x0 x1 x2 x3 x4 x5 x6 x7.]
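The tree reduction can be sketched in a few lines (a hypothetical helper; `tree_reduce` is an assumed name):

```python
def tree_reduce(xs, op=lambda a, b: a + b):
    """Binary-tree reduction: pairs of values are combined level by
    level, so N values need only O(log N) communication steps instead
    of O(N) serialized additions."""
    level = list(xs)
    steps = 0
    while len(level) > 1:
        # combine adjacent pairs; an odd element is carried up unchanged
        level = [op(level[i], level[i + 1]) if i + 1 < len(level)
                 else level[i]
                 for i in range(0, len(level), 2)]
        steps += 1
    return level[0], steps
```

For the eight leaves x0 … x7 of the diagram, the sum is obtained in 3 = log2(8) steps.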
Any global function which can be approximated well using linear combinations.
Node failure
– Node failures must be tolerated.
(membership protocol)
Active thread (cycle-based):
– wait some T
– choose a random peer
– send local state

Passive thread (event-based):
– receive remote state
– [reply with local state]
– merge remote and local state
– Worst case analysis: peak distribution, i.e. information originated at one node
[Chart: node estimates starting from a very high value, a higher value and a lower value all converging to the target value (0.01% error).]
Push: each peer sends its state to another member.
Pull: each peer requests the state from another member; the expected # rounds is the same.
Push-Pull: push and pull in one exchange; reduces the # rounds at an increased communication cost.
Asymmetric Gossiping vs. Symmetric Gossiping
The number x of messages received by a node in a round follows a binomial distribution.
– Initialisation: node i sends the pair <x_i, w_{0,i}> to itself.
[Diagram: in cycle t, node i receives <½s_{t,j}, ½w_{t,j}> from node j and <½s_{t,z}, ½w_{t,z}> from node z, and keeps its own <½s_{t,i}, ½w_{t,i}>.]

s_{t+1,i} = ½s_{t,j} + ½s_{t,i} + ½s_{t,z}
w_{t+1,i} = ½w_{t,j} + ½w_{t,i} + ½w_{t,z}

(variance reduction step)
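A minimal synchronous simulation of the Push-Sum update above (a sketch with assumed function and parameter names; real deployments are asynchronous):

```python
import random

def push_sum(values, cycles=80, seed=1):
    """Synchronous Push-Sum sketch: each node keeps half of its <s, w>
    pair and pushes the other half to a peer chosen uniformly at
    random; the local estimate s/w converges to the global average."""
    rng = random.Random(seed)
    n = len(values)
    s = list(map(float, values))
    w = [1.0] * n
    for _ in range(cycles):
        nxt = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            hs, hw = s[i] / 2, w[i] / 2
            nxt[i][0] += hs          # keep one half locally
            nxt[i][1] += hw
            j = rng.randrange(n)     # uniform random peer
            nxt[j][0] += hs          # push the other half
            nxt[j][1] += hw
        s = [p[0] for p in nxt]
        w = [p[1] for p in nxt]
    return [s[i] / w[i] for i in range(n)]
```

Mass conservation holds by construction: every half of every pair lands in exactly one node's next state, so the sums of s and of w over the network never change.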
– The number of protocol iterations such that the value at a node is diffused through the network, i.e., a peak distribution is transformed into a uniform distribution.
– The diffusion speed is typically given as the complexity of the number of iteration steps as a function of the network size, the maximum error and the maximum probability that the approximation at a node is larger than the maximum error.
With probability at least 1 − δ, the approximation of the global aggregate is within ε in at most O(log(N) + log(1/ε) + log(1/δ)) cycles, where ε and δ are arbitrarily small positive constants.
The local values are exchanged and averaged:
– Node i selects a random node j to exchange their local values.
– Each node computes the average and updates the local pair.
– The exchange must be performed atomically: if not, the conservation of mass in the system is not guaranteed and the protocol does not converge to the true global aggregate.
v_{t+1,i} = ½(v_{t,j} + v_{t,i})
v_{t+1,j} = ½(v_{t,j} + v_{t,i})

(variance reduction step)
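The atomic pairwise averaging step can be sketched as a sequence of random exchanges (a hypothetical sketch; `push_pull_average` and its parameters are assumed names):

```python
import random

def push_pull_average(values, cycles=50, seed=2):
    """Gossip averaging: in each atomic exchange a random pair of nodes
    replaces both local values with their average (variance reduction).
    The sum of all values (mass) is conserved by construction."""
    rng = random.Random(seed)
    v = list(map(float, values))
    n = len(v)
    for _ in range(cycles * n):          # n exchanges per "cycle"
        i, j = rng.randrange(n), rng.randrange(n)
        v[i] = v[j] = (v[i] + v[j]) / 2  # atomic pairwise average
    return v
```

Every exchange lowers (or preserves) the variance of the local values while keeping their sum fixed, so all nodes converge to the global average.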
– no atomic operation is required.
[Diagram: symmetric exchange - node i sends <½s_{t,i}, ½w_{t,i}> to node j and node j sends <½s_{t,j}, ½w_{t,j}> to node i; each node keeps the other half of its own pair.]
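A sketch of the symmetric exchange (SPSP), using the <s, w> pair representation so that mass is conserved even though neither node performs an atomic read-modify-write of the other's state (function names are assumed):

```python
import random

def spsp_cycle(s, w, rng):
    """One SPSP cycle sketch: each node i, in random order, keeps half
    of its <s, w> pair and exchanges the other half with a random peer
    j, which symmetrically replies with half of its own pair."""
    n = len(s)
    order = list(range(n))
    rng.shuffle(order)
    for i in order:
        j = rng.randrange(n)
        if j == i:
            continue
        half_i = (s[i] / 2, w[i] / 2)
        half_j = (s[j] / 2, w[j] / 2)
        # each node = its own kept half + the half received from the peer
        s[i], w[i] = half_i[0] + half_j[0], half_i[1] + half_j[1]
        s[j], w[j] = half_j[0] + half_i[0], half_j[1] + half_i[1]

def spsp(values, cycles=60, seed=3):
    rng = random.Random(seed)
    s = list(map(float, values))
    w = [1.0] * len(values)
    for _ in range(cycles):
        spsp_cycle(s, w, rng)
    return [si / wi for si, wi in zip(s, w)]   # local estimates s/w
```

Because each half of each pair is accounted for exactly once, the totals of s and w are preserved regardless of how the exchanges interleave.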
– Percentage of operations with atomicity violation (AVP): 0.3% and 90%
– Internet-like topologies, 5000 nodes
– PPG and SPSP convergence speed is similar w.r.t. AVP
[Chart: convergence speed of PPG, PSP and SPSP.]
– different AVP levels (from 0.3% to 90%)
– averages over 100 different simulations: Internet-like and mesh topologies, 1000-5000 nodes, different data distributions, asynchronous communication
– Only PSP and SPSP converge to the true global aggregate value
[Chart: PPG, PSP and SPSP.]
[Diagram: epidemic protocol stack - Epidemic application(s) / Aggregation Protocol(s) / Membership Protocol (Uniform Gossiping, overlay topology) / Network and Transport Protocols (physical topology).]
Aggregation protocols:
– exchange information with other nodes to achieve some application goals (e.g., information dissemination, data aggregation)

Membership protocol:
– provides the random peer sampling service for the protocols above and is itself based on an epidemic protocol
[Diagram: at node i, the Epidemic Protocol (1) requests a random node from the Membership Protocol, (2) receives a random node j in response, and (3) sends a push message to node j's Epidemic Protocol.]
– Sparse (out-degree): e.g., a fully connected graph is not scalable (global knowledge)
– Robust: no single points of failure; a star topology has optimal propagation time, but it is not scalable and is not robust
– Load balancing (in-degree): there should not be bottlenecks
– Connectivity: a single connected component
– If the overlay topology is partitioned into multiple connected components, it will not heal (*) and the application-layer epidemic protocol will not converge.
– Good propagation/diffusion: random graphs, expanders
– Partial view of the global system: a local cache of (max size) peer IDs is maintained and used to draw a random entry when requested.
– After an exchange, the merged cache is randomly trimmed to max size.
– This is equivalent to multiple random walks: the cache entries quickly converge to a random sample of the peers with uniform distribution.
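One push-pull cache exchange with random trimming can be sketched as follows (`push_pull_shuffle` and `max_size` are assumed names; real membership protocols also track entry ages and handle failures):

```python
import random

def push_pull_shuffle(cache_i, cache_j, i_id, j_id, max_size, rng):
    """One membership push-pull step: each node merges the two caches
    (learning the other's ID), drops its own ID, and randomly trims
    the merged view back to max_size entries."""
    merged_i = (cache_i | cache_j | {j_id}) - {i_id}
    merged_j = (cache_i | cache_j | {i_id}) - {j_id}
    trim = lambda c: set(rng.sample(sorted(c), min(max_size, len(c))))
    return trim(merged_i), trim(merged_j)
```

Repeating this step between random pairs keeps mixing the partial views, which is what makes the local cache behave like a uniform random sample of the network.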
[Diagram: membership protocol push-pull exchange between node i and node j; caches contain peer IDs a, b, j, c, d, e, f, g.]
[Diagram: node i and node j caches after the push-pull exchange and random trimming.]
– The node caches define an overlay network topology: a transient random overlay network.
– The membership protocol keeps changing the overlay topology over time.
– Aim: the random node sampling from the local partial view results in a uniform distribution over the global system (Uniform Gossiping).
Some membership protocols:
– take an input graph and generate a random output graph with similar properties (assuming a simplified synchronous network model)
– generate graphs with equal in-degree (or with low variance): the in-degree can be used as a measure of robustness
[Diagram: possible overlay states starting from the initial condition - robust and strongly connected digraphs; weakly connected digraphs; multiple (2+) connected components.]
Random graphs with good propagation properties are expander graphs, aka 'expanders'.
– The strong connectivity can be quantified by an index of expansion quality.
The membership protocol should maximise the expansion quality of the overlay topology.
– quasi-random peer selection: random search of a push-pull peer that minimizes cache overlap.
The minimum vertex expansion can be evaluated for different sample sizes (typically 0 < s < ½|V|):
[Diagram: the set of network nodes V, a sample S and its boundary nodes ∂(S).]

V: the set of network nodes
S: a sample of nodes, S ⊆ V, |S| = s
∂(S): the boundary of S, i.e. the set of nodes not in S and 1-hop distant from at least one node in S

The minimum vertex expansion index:
h_min(V, s) = min_{S ⊆ V, |S| = s} |∂(S)| / s
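The index above can be computed exhaustively on small graphs (a sketch with assumed names; for real overlays it would be estimated by sampling random sets S rather than enumerating all of them):

```python
import itertools

def boundary(adj, S):
    """∂(S): the nodes not in S that are 1-hop distant from S."""
    S = set(S)
    return {v for u in S for v in adj[u]} - S

def h_min(adj, s):
    """Minimum vertex expansion index h_min(V, s) = min |∂(S)| / s
    over all samples S with |S| = s (exhaustive enumeration)."""
    return min(len(boundary(adj, S)) / s
               for S in itertools.combinations(adj, s))
```

For example, on a 6-node ring the worst size-2 sample is a pair of adjacent nodes, whose boundary has exactly two nodes.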
– Q_x is the local cache of node x and Q_y is the local cache of node y.
– In each iteration node x sends a push message to node y.
– If |Q_x ∩ Q_y| ≤ T_max, then y accepts the push message and replies with a pull message.
[Diagram: (1) node x sends a push message to node y, (2) y accepts, (3) y replies with a pull message.]
– In each iteration node x sends a push message to a randomly selected node y.
– If |Q_x ∩ Q_y| > T_max, then y forwards the push message to another randomly selected node from Q_y, and this step is repeated until the message is accepted.
[Diagram: node y rejects and forwards the push message to node z, which accepts and replies with a pull message.]
– In order to prevent excessive communication overhead and delay, the forwarding procedure is repeated up to H_max times; the message then returns to the visited node with the lowest similarity, which is forced to accept it.
[Diagram: after H_max rejections the push message is force-accepted by the visited node with the lowest similarity.]
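The quasi-random peer selection with the T_max and H_max thresholds described above can be sketched as (a hypothetical sketch; `select_push_target` and its parameter names are assumptions):

```python
import random

def select_push_target(x, caches, t_max, h_max, rng):
    """Quasi-random peer selection sketch: forward the push message
    while the cache overlap |Qx ∩ Qy| exceeds t_max; after h_max
    rejections, force-accept at the visited node with lowest overlap."""
    y = rng.choice(sorted(caches[x]))
    visited = []
    for _ in range(h_max):
        overlap = len(caches[x] & caches[y])
        if overlap <= t_max:
            return y                               # accepted
        visited.append((overlap, y))
        y = rng.choice(sorted(caches[y] - {x}))    # rejected: forward
    visited.append((len(caches[x] & caches[y]), y))
    return min(visited)[1]                         # force-accept
```

Preferring peers with small cache overlap steers the exchanges toward "distant" regions of the overlay, which is what improves its expansion quality.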
Loss of connectivity may occur when:
– the initial condition is particularly poor (e.g., a ring of communities) and
– interleaving is present in push-pull operations.
A recovery mechanism enables recovery from multiple connected components back to a single one.
– Work in progress: only limited experimental verification
– Interleaving causes unwanted duplication of cache entries.
– Some selected entries, rather than being dropped, are stored in a secondary cache.
– When local duplication is detected, entries from the secondary cache are recovered.
[Chart legend: Protocol, Expander Protocol, Ideal Random.]
Initial topologies: circular regular graph; ring of communities.
– init: circular regular graph
– chart: minimum vertex expansion index h_min(G, 5%|V|)
– init: ring of communities
– chart: minimum vertex expansion index h_min(G, 5%|V|)
– init: circular regular graph
– chart: convergence speed
– init: ring of communities
– chart: max number (over several trials) of connected components vs. # cycle
– 10K nodes, peak distribution, 5 aggregation protocols, init: random graph
– Chart: number of nodes (%) locally converged to the global aggregate within a tolerance error, for different accuracy thresholds (stddev)
– local variance is used to detect convergence

Global synchronisation without global communication
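The local-variance criterion can be sketched as follows (a hypothetical sketch; the window length and threshold are assumed parameters):

```python
def locally_converged(history, window=5, eps=1e-6):
    """A node declares local convergence when the variance of its last
    `window` aggregate estimates falls below the threshold eps."""
    if len(history) < window:
        return False
    recent = history[-window:]
    mean = sum(recent) / window
    return sum((x - mean) ** 2 for x in recent) / window < eps
```

Each node applies this test to its own recent estimates only, so no global communication is needed to decide when to stop or to switch protocol phase.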
[Diagram: protocol stack - Membership Protocol; Aggregation Protocols #1, #2, …, #k; Global Synchronisation.]
application requirements.
available in simulations.
transition period.
– Chart: number of cycles at which nodes have detected global convergence, for different methods.
– fully decentralised
– fault-tolerant
– suitable for extreme-scale networked systems
– suitable for asynchronous and dynamic networks
– Symmetric Push-Sum Protocol (SPSP), an aggregation protocol
– the Expander Membership Protocol
– methods of global convergence detection (synchronisation)
– Epidemic K-Means, the first epidemic data mining algorithm
– Refining and extending the Expander Membership Protocol: incorporating a connectivity recovery mechanism
convergence detection and synchronisation
tree induction, recommender systems, etc.
connectivity
strategy
1. "Symmetric Push-Sum Protocol for Decentralised Aggregation", The International Conference on Advances in P2P Systems (AP2PS), Lisbon, Portugal, 2011.
2. "Epidemic K-Means Clustering", IEEE ICDM Workshop on Knowledge Discovery Using Cloud and Distributed Computing Platforms (KDCloud), Vancouver, Canada, 11 Dec. 2011.
3. "Fault tolerant decentralised k-Means clustering for asynchronous large-scale networks", Issue 3, March 2013, Pages 317-329.
4. "Convergence Detection in Epidemic Aggregation", Proc. of Euro-Par 2013 Workshops, Aachen, Germany, Aug. 26-30, 2013, Springer LNCS.
5. "Expansion Quality of Epidemic Protocols", International Symposium on Intelligent Distributed Computing (IDC), Madrid, Spain, Sept. 3-5, 2014, Studies in Computational Intelligence, Springer, Vol. 570, 2015, pp. 291-300.
– Nicholas C. Grassly & Christophe Fraser, "Mathematical models of infectious disease transmission", Nature Reviews Microbiology 6, 477-487 (June 2008).
– maintenance, in: Proceedings of the sixth annual ACM Symposium on Principles of distributed computing, PODC '87, ACM, 1987, pp. 1-12.
– 2000, pp. 565-.
– Eugster, P.T.; Guerraoui, R.; Kermarrec, A.-M.; Massoulie, L., "Epidemic information dissemination in distributed systems", Computer, vol. 37, no. 5, pp. 60-67, May 2004.
– Foundations of Computer Science, 2003, pp. 482-491.
– 2005, pp. 219-252.
– 2006, pp. 2508-2530.
– Advances in P2P Systems (AP2PS), Lisbon, Portugal, Nov. 20-25, 2011.
– Samir Khuller, Yoo-Ah Kim, and Yung-Chun Wan, "On generalized gossiping and broadcasting", Journal of Algorithms, 59, 2, May 2006, 81-106.
– P. Jesus, C. Baquero, and P. Almeida, "Dependability in aggregation by averaging", 1st Symposium on Informatics (INForum 2009), Sept. 2009.
– Rafik Makhloufi, Gregory Bonnet, Guillaume Doyen, and Dominique Gaiti, "Decentralized Aggregation Protocols in Peer-to-Peer Networks: A Survey", The 4th IEEE International Workshop on Modelling Autonomic Communications Environments (MACE), 2009.
– 2009,
– Philip Soltero, Patrick Bridges, Dorian Arnold, and Michael Lang, "A gossip-based approach to exascale system services", Proc. of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers (ROSS '13), ACM, 2013.