Hurricane
Master semester project – IC School, Operating Systems Laboratory
Author: Diego Antognini
Supervisors: Prof. Willy Zwaenepoel, Laurent Bindschaedler
Outline
• Motivation
• Hurricane
• Experiments
• Future work
• Conclusion
Motivation
Original goal of the project
• Implement Chaos on top of HDFS!
• How?
  • Replace the Chaos storage engine with HDFS
• Why?
  • Industry is interested in systems that run on Hadoop
  • Easy cluster handling
  • Distributed file systems
  • Fault tolerance (but at what price?)
Chaos
• Scale-out graph processing from secondary storage
• Maximizes sequential access
• Stripes data across the secondary storage devices in a cluster
• Limited only by:
  • the aggregate bandwidth, and
  • the capacity of all storage devices in the entire cluster
Hadoop Distributed File System
(Diagram: the client obtains block metadata from the Namenode, then reads and writes blocks directly on the Datanodes.)
Experiment: DFSIO
• Measure the aggregate bandwidth of a cluster when writing and reading 100 GB of data split into X files:

  # Files    File size
  1          100 GB
  2          50 GB
  …          …
  4096       25 MB

• Use the DFSIO benchmark
• Each task operates on a distinct block
• Measure disk I/O
Clusters
             DCO
  OS         Ubuntu 14.04.1
  # Cores    16
  Memory     128 GB
  Storage    HDD: 140 MB/s
             SSD: 243 MB/s
  Network    10 Gbit/s
Results: DFSIO – DCO cluster
(Chart: disk I/O while writing 100 GB of data, 8 nodes, no replication. Aggregate bandwidth [MB/s] vs. number of files from 1 to 4096; series: Read, Write, and the Baseline (dd, hdparm) read/write levels.)
Observations: DFSIO
• Somewhat lackluster performance
• Hard to tune!
HDFS doesn't fit the requirements.
Our solution
• Create a standalone distributed storage system based on the Chaos storage engine
• Give it an HDFS-like RPC interface
This is the actual project!
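To make the second point concrete, here is a minimal sketch of what an HDFS-like RPC surface could look like. The opcodes mirror the operations listed on the Features slide later in the deck, but every name and field below is an assumption, not Hurricane's actual wire format:

```cpp
#include <cstdint>
#include <string>

// Hypothetical opcodes for an HDFS-like RPC interface; the operation names
// come from the deck's feature list (create/exists/delete/fill/drain/rewind),
// but the message layout itself is an illustrative guess.
enum class RpcOp : uint8_t {
    Create,  // create a file
    Exists,  // check whether a file exists
    Delete,  // remove a file
    Fill,    // append data records (write path)
    Drain,   // pull data records (read path)
    Rewind   // reset the read cursor to the start of the file
};

struct RpcRequest {
    RpcOp       op;
    std::string file;    // file name
    uint64_t    length;  // payload size for Fill/Drain, 0 otherwise
    // payload bytes would follow the header on the wire
};
```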
Hurricane
Hurricane
• Scalable, decentralized storage system based on Chaos
• Balances I/O load randomly across the available disks
• Saturates the available storage bandwidth
• Targets rack-scale deployments
Real-life scenario
• Chaos using Hurricane as its storage engine
Real-life scenario
• Measuring the emotions of countries during Euro 2016
(Diagram: data flows in per country, e.g. Switzerland, Belgium, Romania, and per-country emotions come out.)
• And much more!
Locality does not matter!
• Remote storage bandwidth = local storage bandwidth
• Clients can read/write to any storage device
• Storage is slower than the network, so the network is not a bottleneck (e.g. a 464 MB/s SSD needs less than 4 Gbit/s, well below a 10 or 40 Gbit/s link)
• Realistic for most clusters at rack scale, or even beyond
Maximizing I/O bandwidth
• Clients pull data records from servers
(Diagram: three clients, each pulling records from three servers.)
• Requests are batched to prevent idle servers (prefetching), as sketched below
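A minimal sketch of the batching idea, assuming a fixed prefetch window and a stubbed pull() RPC (both are illustrative choices, not details from the slides). The blocking futures stand in for what would be fully asynchronous requests in a real client:

```cpp
#include <cstddef>
#include <deque>
#include <future>
#include <vector>

struct Batch { std::vector<char> records; };

// Stand-in for the RPC that pulls one batch of records from a server.
Batch pull(int server_id, std::size_t batch_bytes) {
    (void)server_id;
    return Batch{std::vector<char>(batch_bytes)};  // dummy payload
}

int main() {
    constexpr std::size_t kWindow     = 4;        // outstanding requests per server (assumed)
    constexpr std::size_t kBatchBytes = 1 << 20;  // 1 MiB per batch (assumed)
    constexpr int kServers = 3;
    constexpr int kBatchesPerServer = 16;

    std::deque<std::future<Batch>> inflight;
    std::vector<int> issued(kServers, 0);

    // Prime the pipeline: issue kWindow asynchronous pulls to every server.
    for (int s = 0; s < kServers; ++s)
        for (std::size_t i = 0; i < kWindow; ++i, ++issued[s])
            inflight.push_back(std::async(std::launch::async, pull, s, kBatchBytes));

    // Steady state: as each batch completes, immediately issue the next pull,
    // so servers always have work queued instead of waiting on the client.
    int next = 0;
    while (!inflight.empty()) {
        Batch b = inflight.front().get();  // consume the oldest outstanding batch
        (void)b;
        inflight.pop_front();
        if (issued[next] < kBatchesPerServer) {
            inflight.push_back(std::async(std::launch::async, pull, next, kBatchBytes));
            ++issued[next];
        }
        next = (next + 1) % kServers;
    }
}
```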
Features
• Global file handling (global_*)
  • create, exists, delete, fill, drain, rewind, etc.
• Local file handling (local_*)
  • create, exists, delete, fill, drain, rewind, etc.
• Add storage nodes dynamically
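A hypothetical C++ rendering of this API surface; only the global_*/local_* naming and the operation list come from the slide, so the signatures below are guesses:

```cpp
#include <cstddef>
#include <string>

// Global files: one logical file spread over all storage nodes.
bool global_create(const std::string& name);
bool global_exists(const std::string& name);
bool global_delete(const std::string& name);
std::size_t global_fill (const std::string& name, const void* buf, std::size_t len); // write records
std::size_t global_drain(const std::string& name, void* buf, std::size_t len);       // read records
void global_rewind(const std::string& name);  // restart reading from the beginning

// Local files: the same operations, restricted to a single storage node.
bool local_create(const std::string& name);
bool local_exists(const std::string& name);
bool local_delete(const std::string& name);
std::size_t local_fill (const std::string& name, const void* buf, std::size_t len);
std::size_t local_drain(const std::string& name, void* buf, std::size_t len);
void local_rewind(const std::string& name);
```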
How does it work? – Writing files
(Diagram: clients C1–C3 write chunks of file f; the chunks land on servers S1 and S2.)
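A minimal sketch of the write path the diagram suggests, assuming uniformly random server selection per chunk (consistent with the earlier "balance I/O load randomly" design point); the chunk size and the send_chunk() stub are illustrative:

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

constexpr std::size_t kChunkBytes = 1 << 20;  // 1 MiB chunks (assumed)

// Stand-in for the RPC that appends one chunk of the file on the given server.
void send_chunk(int server_id, const char* data, std::size_t len) {
    (void)server_id; (void)data; (void)len;  // a real client would ship bytes here
}

// Write a global file by scattering its chunks uniformly at random over the
// available servers, which balances I/O load across all of their disks.
void global_write(const std::vector<char>& data, int num_servers) {
    std::mt19937 rng{std::random_device{}()};
    std::uniform_int_distribution<int> pick(0, num_servers - 1);

    for (std::size_t off = 0; off < data.size(); off += kChunkBytes) {
        const std::size_t len = std::min(kChunkBytes, data.size() - off);
        send_chunk(pick(rng), data.data() + off, len);  // random placement
    }
}
```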
How does it work? – Reading files
(Diagram: clients C1–C3 pull the chunks of file f back from servers S1 and S2.)
How does it work? – Join
(Diagram: a new server S3 joins S1 and S2; chunks of files f and g then also land on S3.)
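One plausible reading of the join step, sketched below: the new server is appended to the client's server list, and because chunk placement is random, new data starts landing on it automatically. The slide does not describe the actual membership mechanism, so all of this is an assumption:

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Thread-safe list of storage servers known to a client. When S3 joins, it is
// simply appended; random chunk placement then starts including it, so new
// data spreads over S3 without migrating existing chunks.
// (The announcement/notification mechanism is an assumption.)
class ServerList {
public:
    void join(const std::string& addr) {  // called when a node announces itself
        std::lock_guard<std::mutex> lock(mu_);
        addrs_.push_back(addr);
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mu_);
        return addrs_.size();
    }
private:
    std::vector<std::string> addrs_;
    mutable std::mutex mu_;
};
```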
Experiments
Clusters
             LABOS             DCO               TREX
  OS         Ubuntu 14.04.1    Ubuntu 14.04.1    Ubuntu 14.04.1
  # Cores    32                16                32
  Memory     32 GB             128 GB            128 GB
  Storage    HDD: 474 MB/s     HDD: 140 MB/s     HDD: 414 MB/s
                               SSD: 243 MB/s     SSD: 464 MB/s
  Network    1 Gbit/s          10 Gbit/s         40 Gbit/s
List of experiments
• Weak scaling
• Scalability with 1 client
• Strong scaling
• Case studies
  • Unbounded buffer
  • Compression
Weak scaling
• Each node writes/reads 16 GB of data
• Increasing number of nodes
• N servers, N clients
• Measure the average bandwidth
• Compare the Chaos storage engine, Hurricane, and DFSIO
16 GB per node – 40 Gbit/s network
(Charts: TREX SSD read and write, average bandwidth [MB/s] vs. 1–16 machines, shown against the Baseline (dd, hdparm); series: Chaos storage, Hurricane, DFSIO.)
16 GB per node – 10 Gbit/s network
(Charts: DCO SSD read and write, average bandwidth [MB/s] vs. 1–8 machines, shown against the Baseline (dd, hdparm); series: Chaos storage, Hurricane, DFSIO.)
16 GB per node – 1 Gbit/s network
(Charts: LABOS read and write, average bandwidth [MB/s] vs. 1–8 machines, shown against the Baseline (dd, hdparm); series: Chaos storage, Hurricane, DFSIO.)
Weak scaling – Summary
• Hurricane performs similarly to the Chaos storage engine
• Scalable
• Outperforms HDFS by roughly 1.5x
• Maximizes I/O bandwidth
16 GB per node – 64 nodes
(Charts: DCO SSD read and write, average bandwidth [MB/s] vs. 1–64 machines, shown against the Baseline (dd, hdparm); series: Chaos storage, Hurricane.)
Still scalable, with good I/O bandwidth.
Scalability with 1 client
• The client writes/reads 16 GB of data per server node
• Increasing number of server nodes
• N servers, 1 client
• Measure the aggregate bandwidth
• Only Hurricane is used
40 Gbit/s network
(Charts: TREX SSD read and write, aggregate bandwidth [MB/s] vs. 1–16 machines, shown against the baseline and the actual bandwidth of the network; the slide notes an unknown network problem.)
10 Gbit/s network
(Charts: DCO SSD read and write, aggregate bandwidth [MB/s] vs. 1–8 machines, shown against the baseline.)
Hurricane also scales with only 1 client and uses the I/O bandwidth of all the server nodes.
Strong scaling
• Read/write 128 GB of data in total
• Increasing number of nodes
• N servers, N clients
• Measure the aggregate bandwidth
• Compare the Chaos storage engine, Hurricane, and DFSIO
40 Gbit/s network
(Charts: TREX SSD read and write, aggregate bandwidth [MB/s] vs. 1–16 machines, shown against the baseline; series: Chaos storage, Hurricane, DFSIO.)
1 Gbit/s network
(Charts: LABOS read and write, aggregate bandwidth [MB/s] vs. 1–8 machines, shown against the baseline; series: Chaos storage, Hurricane, DFSIO.)