SLIDE 1

SAND: A Fault-Tolerant Streaming Architecture for Network Traffic Analytics

Qin Liu¹, John C.S. Lui¹, Cheng He², Lujia Pan², Wei Fan², Yunlong Shi²

¹ The Chinese University of Hong Kong   ² Huawei Noah's Ark Lab

SLIDE 2

Introduction

SLIDE 3

Motivation

Network traffic arrives in a streaming fashion and should be processed in real time. For example:

  • 1. Network traffic classification
  • 2. Anomaly detection
  • 3. Policy and charging control in cellular networks
  • 4. Recommendations based on user behaviors

SLIDE 5

Challenges

  • 1. A stream processing system must sustain high-speed network traffic in cellular core networks
      ◮ existing systems: S4 [Neumeyer'10], Storm¹, ...
      ◮ implemented in Java: heavy processing overheads
      ◮ cannot sustain high-speed network traffic
  • 2. For critical applications, it is necessary to provide correct results after failure recovery
      ◮ high hardware cost
      ◮ cannot provide “correct results” after failure recovery
      ◮ at-least-once vs. exactly-once

¹ http://storm.incubator.apache.org/

SLIDE 6

Contributions

Design and implement SAND in C++:

  • high performance on network traffic
  • a new fault tolerance scheme

SLIDE 7

Background

SLIDE 8

Background

Continuous operator model:

  • Each node runs an operator with in-memory mutable state
  • For each input event, the state is updated and new events are sent out

Mutable state is lost if a node fails (a minimal sketch follows).
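
To make the model concrete, here is a minimal C++ sketch (illustrative, not SAND's code) of a continuous operator: the state lives only in process memory and is mutated per event, which is exactly what a crash destroys.

    #include <cstdint>
    #include <string>
    #include <unordered_map>
    #include <vector>

    // Hypothetical event type: a key plus a byte count.
    struct Event {
        std::string   key;
        std::uint64_t bytes;
    };

    // A continuous operator: for each input event, update in-memory mutable
    // state and emit new events. If the process dies, counts_ is simply gone.
    class CountOperator {
    public:
        std::vector<Event> process(const Event& in) {
            counts_[in.key] += in.bytes;               // mutate the state
            return { Event{in.key, counts_[in.key]} }; // emit the running total
        }
    private:
        std::unordered_map<std::string, std::uint64_t> counts_; // mutable state
    };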

SLIDE 9

Example: AppTracker

  • AppTracker: traffic classification for cellular network traffic
  • Outputs the traffic distribution in real time:

    Application    Distribution
    HTTP           15.60%
    Sina Weibo     4.13%
    QQ             2.56%
    DNS            2.34%
    HTTP in QQ     2.17%

SLIDE 10

Example: AppTracker

Under the continuous operator model:

  • Spout: captures packets from the cellular network
  • Decoder: extracts IP packets from raw packets
  • DPI-Engine: performs deep packet inspection on packets
  • Tracker: tracks the distribution of application-level protocols (HTTP, P2P, Skype, ...); a minimal Tracker sketch follows
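
As an illustration of the last stage, a minimal Tracker might look like the sketch below (hypothetical names and toy inputs; in SAND the DPI-Engine supplies the protocol label for each packet):

    #include <cstdint>
    #include <cstdio>
    #include <map>
    #include <string>

    // Hypothetical Tracker: accumulates bytes per application-level protocol
    // and reports the distribution, as in the AppTracker table shown earlier.
    class Tracker {
    public:
        void onPacket(const std::string& protocol, std::uint64_t bytes) {
            bytes_[protocol] += bytes;
            total_ += bytes;
        }
        void report() const {  // called periodically to emit the distribution
            for (const auto& [proto, n] : bytes_)
                std::printf("%-12s %6.2f%%\n", proto.c_str(), 100.0 * n / total_);
        }
    private:
        std::map<std::string, std::uint64_t> bytes_;
        std::uint64_t total_ = 0;
    };

    int main() {
        Tracker t;
        t.onPacket("HTTP", 1560);  // toy inputs standing in for DPI-Engine output
        t.onPacket("DNS", 234);
        t.report();
    }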

SLIDE 11

System Design

SLIDE 12

Architecture of SAND

One coordinator and multiple workers. Each worker can be seen as an operator.

SLIDE 13

Coordinator

Coordinator is responsible for

  • managing worker executions
  • detecting worker failures
  • relaying control messages among workers
  • monitoring performance statistics

A ZooKeeper cluster provides fault tolerance and a reliable coordination service (a registration sketch follows).
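
One common way to build such failure detection, sketched below under the assumption that workers register ephemeral znodes (the talk does not show SAND's actual znode layout or paths): an ephemeral node vanishes when its session dies, so the coordinator learns of worker failures by watching the registry. Connection-state handling and error checking are omitted.

    #include <cstdio>
    #include <zookeeper/zookeeper.h>

    static void watcher(zhandle_t*, int type, int state, const char* path, void*) {
        std::printf("event type=%d state=%d path=%s\n", type, state, path ? path : "");
    }

    int main() {
        // Connect to the ZooKeeper ensemble (address is an example).
        zhandle_t* zh = zookeeper_init("127.0.0.1:2181", watcher, 30000,
                                       nullptr, nullptr, 0);
        if (!zh) return 1;

        // Ephemeral znode: deleted automatically when this worker's session
        // expires, so a coordinator watching /sand/workers sees the failure.
        int rc = zoo_create(zh, "/sand/workers/worker-1", "alive", 5,
                            &ZOO_OPEN_ACL_UNSAFE, ZOO_EPHEMERAL, nullptr, 0);
        std::printf("zoo_create rc=%d\n", rc);

        zookeeper_close(zh);
    }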

SLIDE 14

Worker

Each worker contains 3 types of processes:

  • The dispatcher decodes streams and distributes them to multiple analyzers (one plausible policy is sketched below)
  • Each analyzer independently processes the assigned streams
  • The collector aggregates the intermediate results from all analyzers

The container daemon:

  • spawns or stops the processes
  • communicates with the coordinator
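
The talk does not detail the dispatcher's distribution policy; one plausible sketch is flow-level hash partitioning, which keeps all packets of a flow on the same analyzer so analyzers stay independent (names below are hypothetical):

    #include <cstddef>
    #include <cstdint>
    #include <functional>
    #include <string>

    // Hypothetical 5-tuple-style flow key.
    struct FlowKey {
        std::string   src_ip, dst_ip;
        std::uint16_t src_port, dst_port;
    };

    // Hash the flow key so that every packet of a flow reaches the same analyzer.
    std::size_t pickAnalyzer(const FlowKey& k, std::size_t num_analyzers) {
        std::size_t h = std::hash<std::string>{}(
            k.src_ip + ":" + std::to_string(k.src_port) + "->" +
            k.dst_ip + ":" + std::to_string(k.dst_port));
        return h % num_analyzers;
    }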

SLIDE 15

Communication Channels

Efficient communication channels:

  • Intra-worker: a lock-free shared-memory ring buffer (technique sketched below)
  • Inter-worker: ZeroMQ, a socket library optimized for clustered products
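
The sketch below shows the general single-producer/single-consumer technique behind such a channel using C++11 atomics; it is illustrative only, and unlike SAND's buffer it lives in ordinary process memory rather than a shared memory segment between processes.

    #include <atomic>
    #include <cstddef>

    // Minimal lock-free SPSC ring buffer. One slot is left empty to
    // distinguish "full" from "empty".
    template <typename T, std::size_t N>
    class RingBuffer {
        static_assert((N & (N - 1)) == 0, "N must be a power of two");
    public:
        bool push(const T& v) {  // called by the single producer only
            std::size_t head = head_.load(std::memory_order_relaxed);
            std::size_t next = (head + 1) & (N - 1);
            if (next == tail_.load(std::memory_order_acquire)) return false; // full
            buf_[head] = v;
            head_.store(next, std::memory_order_release);  // publish the slot
            return true;
        }
        bool pop(T& out) {  // called by the single consumer only
            std::size_t tail = tail_.load(std::memory_order_relaxed);
            if (tail == head_.load(std::memory_order_acquire)) return false; // empty
            out = buf_[tail];
            tail_.store((tail + 1) & (N - 1), std::memory_order_release);
            return true;
        }
    private:
        T buf_[N];
        std::atomic<std::size_t> head_{0}, tail_{0};
    };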

SLIDE 16

Fault-Tolerance

SLIDE 17

Previous Fault-Tolerance Schemes

  • 1. Replication: each operator has a replica operator [Hwang'05, Shah'04, Balazinska'08]
      ◮ Data streams are processed twice by two identical nodes
      ◮ Synchronization protocols ensure the exact ordering of events in both nodes
      ◮ On failure, the system switches over to the replica nodes

Drawback: 2x hardware cost.

SLIDE 18

Previous Fault-Tolerance Schemes

  • 2. Upstream backup with checkpoint [Fernandez'03, Gu'09]:
      ◮ Each node maintains a backup of the forwarded events since the last checkpoint
      ◮ On failure, upstream nodes replay the backup events serially to the failover node to recreate the state

Less hardware cost, but it is hard to provide correct results after recovery.

17

slide-19
SLIDE 19

Why is it hard?

  • Stateful continuous operators tightly integrate “computation” with “mutable state”
  • This makes it harder to define clear boundaries where computation and state can be moved around

SLIDE 20

Checkpointing

  • Need to coordinate the checkpointing operation across workers
  • In 1985, Chandy and Lamport invented an asynchronous snapshot algorithm for distributed systems
  • A variant of this algorithm is implemented in SAND

SLIDE 21

Checkpointing Protocol

  • The coordinator initiates a global checkpoint by sending markers to all source workers
  • For each worker w (the marker-handling logic is sketched below):
      ◮ on receiving a data event E from worker u:
          if the marker from u has already arrived, w buffers E;
          otherwise, w processes E normally
      ◮ on receiving a marker from worker u:
          if markers have arrived from all upstream workers, w starts the checkpointing operation
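
A sketch of this marker-alignment logic follows (names are illustrative, not SAND's API). Events arriving on a channel whose marker has already passed belong to the next epoch, so they are held back until the snapshot begins:

    #include <cstddef>
    #include <functional>
    #include <set>
    #include <utility>
    #include <vector>

    struct Event { /* payload omitted */ };

    struct Worker {
        std::set<int> markerSeen;                   // upstream ids whose marker arrived
        std::size_t numUpstream = 0;                // number of upstream workers
        std::vector<std::pair<int, Event>> pending; // events buffered during alignment
        std::function<void()> startCheckpoint;      // e.g. the fork-based snapshot (next slide)

        void onEvent(int from, const Event& e) {
            if (markerSeen.count(from)) pending.emplace_back(from, e); // buffer
            else process(e);                                           // normal path
        }
        void onMarker(int from) {
            markerSeen.insert(from);
            if (markerSeen.size() == numUpstream) {  // all markers have arrived
                startCheckpoint();                   // snapshot the current state
                markerSeen.clear();
                for (auto& pe : pending) process(pe.second); // drain buffered events
                pending.clear();
            }
        }
        void process(const Event&) { /* application logic */ }
    };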

SLIDE 22

Checkpointing Operation

On each worker:

  • When a checkpoint starts, the worker creates child processes using fork
  • The parent processes then resume normal processing
  • The child processes write the internal state to HDFS, which performs replication for data reliability (a minimal sketch follows)
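
A minimal sketch of this copy-on-write trick (illustrative: a local file stands in for HDFS to keep the example self-contained, and error handling and child reaping are omitted):

    #include <cstdint>
    #include <cstdio>
    #include <unistd.h>

    // Illustrative operator state; SAND's real state layout is not shown in the talk.
    struct State {
        std::uint64_t counter = 0;
        void serialize(std::FILE* f) const { std::fwrite(&counter, sizeof counter, 1, f); }
    };

    void checkpoint(const State& state) {
        pid_t pid = fork();
        if (pid == 0) {
            // Child: sees a frozen copy-on-write snapshot of the parent's memory.
            std::FILE* f = std::fopen("/tmp/checkpoint.bin", "wb");
            if (f) { state.serialize(f); std::fclose(f); }
            _exit(0);  // never return into the parent's logic
        }
        // Parent (pid > 0): resumes processing immediately; kernel copy-on-write
        // keeps the child's view of `state` consistent while the parent mutates it.
    }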

SLIDE 23

Output Buffer

Buffer output events for recovery:

  • Each worker records output data events in its output buffer, so that it can replay them during failure recovery
  • When global checkpoint c is finished, the data recorded in output buffers before checkpoint c can be deleted (sketched below)
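
A sketch of such a buffer (illustrative names; epochs identify the global checkpoint in effect when an event was sent):

    #include <cstdint>
    #include <deque>

    struct Event { /* payload omitted */ };

    class OutputBuffer {
    public:
        void record(std::uint64_t epoch, const Event& e) { buf_.push_back({epoch, e}); }

        // Once global checkpoint c completes, events sent before it can never be
        // needed for replay again, so they are dropped.
        void onCheckpointComplete(std::uint64_t c) {
            while (!buf_.empty() && buf_.front().epoch < c) buf_.pop_front();
        }
    private:
        struct Entry { std::uint64_t epoch; Event event; };
        std::deque<Entry> buf_;
    };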

SLIDE 24

Failure Recovery

[Figure: worker dependency graph partitioned into the failed workers F, their downstream workers DF, and their upstream workers PF]

  • F: failed workers
  • DF: downstream workers of F
  • F ∪ DF: rolled back to the most recent checkpoint c
  • PF: the upstream workers of F ∪ DF
  • Workers in PF replay output events after checkpoint c
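
The recovery sets can be derived from the worker topology; a sketch under the assumption that the dataflow graph is available as adjacency lists keyed by worker id:

    #include <map>
    #include <queue>
    #include <set>
    #include <vector>

    using Graph = std::map<int, std::vector<int>>;

    // F ∪ DF: the failed workers plus everything reachable downstream of them;
    // every worker in this set rolls back to the most recent checkpoint c.
    std::set<int> rollbackSet(const Graph& downstream, const std::set<int>& failed) {
        std::set<int> out(failed.begin(), failed.end());
        std::queue<int> q;
        for (int w : failed) q.push(w);
        while (!q.empty()) {  // BFS over downstream edges
            int w = q.front(); q.pop();
            auto it = downstream.find(w);
            if (it == downstream.end()) continue;
            for (int v : it->second)
                if (out.insert(v).second) q.push(v);
        }
        return out;
    }
    // PF is then every worker outside this set with an edge into it; those
    // workers replay the output events they buffered after checkpoint c.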

SLIDE 25

Evaluation

SLIDE 26

Experiment 1

  • Testbed: one quad-core machine with 4GB RAM
  • Dataset: packet header trace; 331 million packets accounting for 143GB of traffic

  • Application: packet counter

System      Packets/s   Payload Rate   Header Rate
Storm       260K        840Mb/s        81.15Mb/s
Blockmon    2.7M        8.4Gb/s        844.9Mb/s
SAND        9.6M        31.4Gb/s       3031.7Mb/s

  • 3.7X and 37.4X the throughput of Blockmon [Simoncelli'13] and Storm, respectively

SLIDE 27

Experiment 2

  • Testbed: three 16-core machines with 94GB RAM
  • Dataset: a 2-hour network trace (32GB) collected from a commercial GPRS core network in China in 2013

  • Application: AppTracker

SLIDE 28

Experiment 2

[Figure: throughput (Mb/s) vs. number of analyzers, for checkpointing intervals of 2s, 5s, and 10s, and with fault tolerance disabled]

  • Scales out by running parallel workers on multiple servers
  • Fault tolerance adds negligible overhead

SLIDE 29

Experiment 3

[Figure: throughput (Mb/s) over time (seconds), with instants t1-t5 marked, for checkpointing intervals of 5s and 10s]

  • SAND recovers in the order of seconds
  • Recovery time is proportional to the checkpointing interval

SLIDE 30

Conclusion

  • Presented a new distributed stream processing system for network analytics
  • Proposed a novel checkpointing protocol that provides reliable fault tolerance for stream processing systems
  • SAND can operate at the core-router level and can recover from failures in the order of seconds

SLIDE 31

Thank you! Q & A
