Asynchronous and Fault-Tolerant Recursive Datalog Evaluation in Shared-Nothing Engines

SLIDE 1

Asynchronous and Fault-Tolerant Recursive Datalog Evaluation in Shared-Nothing Engines

Jingjing Wang, Magdalena Balazinska, Daniel Halperin University of Washington

SLIDE 2

Modern Analytics Requires Iteration

  • Graph applications

– Graph reachability
– Connected components
– Shortest path

  • Machine learning

– Clustering algorithms
– Logistic regression

  • Scientific analytics

– N-body simulation

Jingjing Wang - University of Washington 2

SLIDE 3

Galaxy Evolution: An Iterative Example

[Figure: "A Simulation of the Universe", a timeline running from the Big Bang, through millions of years ago, to the present day; labels: Galaxy.]

Picture from D. H. Stalder et al., arXiv:1208.3444 [astro-ph.CO]

SLIDE 4

Galaxy Evolution: Iterative Lineage Tracing

[Figure: lineage traced from the present day back to millions of years ago; labels: Galaxy, Particle.]

SLIDE 5

Galaxy Evolution: Why It Is Not Easy

  • Large-scale data sizes

– Scalability

  • Iteration is the core

– Support efficient iterative constructs

  • Users are data scientists

– Provide an easy-to-use query interface

  • Shared datasets and resources

– Within a data management system

SLIDE 6

Iterative Analytics: Where to Run It

  • SQL Server

– Single-node, cannot handle huge scale

  • MapReduce

– Rigid programming model
– Writes to disk, expensive iteration

  • In-memory systems such as Spark

– Synchronous operations

  • Graph engines such as GraphLab

– Think like a vertex

SLIDE 7

No Existing System Meets All Requirements

  • Synchronous iterations only

– AsterixDB, HaLoop, Pregel, REX, Spark, PrIter, Glog, …

  • Single-node

– LogicBlox, DatalogFS, …

  • No declarative language

– Stratosphere, Naiad, Grace, GraphLab, …

  • Specialized for graphs

– GraphLab, Grace, …

  • Not a data management system

– SociaLite, …

  • Theory on recursive queries

– DatalogFS, …

SLIDE 8

Outline and Contributions

  • Full-stack solution for iterative processing

– Declarative relational query language

  • A subset of Datalog-with-Aggregation

– Scalable and easily implementable

  • Small extensions to existing shared-nothing systems

– Efficient iterative computation

  • Execution models and optimizations
  • Implementation and empirical evaluation using

SLIDE 9

Outline and Contributions

  • Full-stack solution for iterative processing

– Declarative relational query language

  • A subset of Datalog-with-Aggregation

– Scalable and easily implementable

  • Small extensions to existing shared-nothing systems

– Efficient iterative computation

  • Execution models and optimizations
  • Implementation and empirical evaluation using

SLIDE 10

From Datalog Programs to Asynchronous Query Plans

  • Datalog: a relational query language

– Nicely expresses recursion

  • Two special operators

– IDBController

  • Maintains state of “nonconstant” relations

– TerminationController
– Easy extensions to an existing engine

  • Automatic compilation


DECLARE @id AS INT, @lvl AS INT
SET @id = 3
SET @lvl = 2
;WITH cte (id, parent, child, lvl) AS (
    SELECT id, parent, child, 0
    FROM t
    WHERE id = 1
    UNION ALL
    SELECT E.id, E.parent, E.child, M.lvl+1
    FROM t AS E
    JOIN CTE AS M ON E.parent = M.child
    WHERE lvl < @lvl
)
SELECT * FROM CTE --where lvl=@lvl
--OPTION (MAXRECURSION 10)

CC(x,x) :- Edges(x,_)
CC(y,$Min(v)) :- CC(x,v), Edges(x,y)
:- CC(y,v)
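
The connected-components Datalog program above can be read as a fixpoint computation: seed every vertex with its own id, then repeatedly push the smallest known label along edges until nothing changes. The following is a minimal single-process Python sketch of that reading, not the engine's implementation. The function name `connected_components`, the delta-based loop, and the assumption that `Edges` lists both directions of every undirected edge are illustrative choices made here.

```python
from collections import defaultdict


def connected_components(edges):
    """Evaluate the CC program to a fixpoint (illustrative sketch).

    edges: iterable of (x, y) pairs, i.e. the Edges relation.
    Returns {vertex: smallest label that reaches it}.
    """
    out = defaultdict(list)
    for x, y in edges:
        out[x].append(y)

    # Rule 1: CC(x, x) :- Edges(x, _)
    cc = {x: x for x in out}
    delta = dict(cc)  # tuples that changed in the previous round

    # Rule 2: CC(y, $Min(v)) :- CC(x, v), Edges(x, y)
    while delta:
        new_delta = {}
        for x, v in delta.items():
            for y in out.get(x, ()):
                # The $Min aggregate keeps only the smallest label per vertex.
                if y not in cc or v < cc[y]:
                    cc[y] = v
                    new_delta[y] = v
        delta = new_delta
    return cc


edges = [(1, 2), (2, 1), (2, 3), (3, 2), (4, 5), (5, 4)]
labels = connected_components(edges)
```

Only tuples that changed in the previous round are joined with `Edges` again, which is what keeps the number of intermediate tuples down.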

SLIDE 11

Outline and Contributions

  • Full-stack solution for iterative processing

– Declarative relational query language

  • A subset of Datalog-with-Aggregation

– Scalable and easily implementable

  • Small extensions to existing shared-nothing systems

– Efficient iterative computation

  • Execution models and optimizations
  • Implementation and empirical evaluation using

SLIDE 12

Iterative Computation: How Can We Do Better?

  • Performance impact: # of intermediate tuples

– More tuples, more work, more resources

  • Optimization: recursive execution models

– Synchronous vs. asynchronous

  • Optimization: prioritizing tuples

– For asynchronous model, favor new tuples vs. base tuples

SLIDE 13

Optimization: Recursive Execution Models

  • Synchronous

– Stop at the end of each iteration

  • Asynchronous

– No barrier, propagate updates when ready

  • Galaxy Evolution

– Synchronous

  • Find all galaxies at timestep 1, then 2, …

– Asynchronous

  • Galaxy A is a part of the evolution history
  • A shares particles with galaxy B
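
The difference between the two execution models can be sketched on graph reachability. This is a single-process Python simulation under assumptions made here: the function names `reach_sync` and `reach_async` are hypothetical, and a real shared-nothing engine would spread the work across workers. The synchronous version joins the whole frontier and then waits at a barrier; the asynchronous version propagates each new tuple as soon as it is produced.

```python
from collections import deque


def reach_sync(edges, source):
    """Synchronous: each round processes the whole frontier, then hits a barrier."""
    reached, frontier, rounds = {source}, {source}, 0
    while frontier:
        nxt = {y for x, y in edges if x in frontier} - reached
        reached |= nxt
        frontier = nxt
        rounds += 1  # barrier: round k+1 starts only after round k is complete
    return reached, rounds


def reach_async(edges, source):
    """Asynchronous: every new tuple is propagated as soon as it is ready."""
    reached, pending = {source}, deque([source])
    while pending:
        x = pending.popleft()
        for a, y in edges:
            if a == x and y not in reached:
                reached.add(y)
                pending.append(y)  # no barrier, the update is available at once
    return reached


edges = [(1, 2), (2, 3), (3, 4), (5, 6)]
sync_result, sync_rounds = reach_sync(edges, 1)
async_result = reach_async(edges, 1)
```

Both functions compute the same fixpoint; they differ only in when derived tuples become visible to the rest of the computation.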
SLIDE 14

Galaxy Evolution: Execution Model Does Not Matter Much

[Figure: run time in seconds vs. number of workers (8, 16, 32, 64). Dataset: 80 GB, 27 snapshots; 16 machines.]

SLIDE 15

Another Application: Least Common Ancestor

[Figure: citation graph over papers 1 to 5; nodes are papers, edges are citations; ancestors annotated dist:1, dist:2, dist:3.]
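
A least-common-ancestor query on a citation graph can be sketched as two breadth-first searches followed by a minimum over the shared ancestors. The slide does not spell out the exact ranking, so this Python sketch assumes one plausible definition: the common ancestor with the smallest maximum distance to the two papers, with ties broken by paper id. The names `ancestor_distances` and `least_common_ancestor` and the `cites` mapping are hypothetical.

```python
from collections import deque


def ancestor_distances(cites, paper):
    """BFS over citation edges.

    cites maps a paper to the list of papers it cites.
    Returns {ancestor: distance from `paper`}.
    """
    dist, queue = {}, deque([(paper, 0)])
    while queue:
        p, d = queue.popleft()
        for parent in cites.get(p, ()):
            if parent not in dist:
                dist[parent] = d + 1
                queue.append((parent, d + 1))
    return dist


def least_common_ancestor(cites, p, q):
    dp = ancestor_distances(cites, p)
    dq = ancestor_distances(cites, q)
    common = dp.keys() & dq.keys()
    if not common:
        return None
    # Assumed ranking: smallest max distance, ties broken by paper id.
    return min(common, key=lambda a: (max(dp[a], dq[a]), a))


# Hypothetical citation graph: 4 cites 2, 5 cites 3, 3 cites 2, 2 cites 1.
cites = {4: [2], 5: [3], 3: [2], 2: [1]}
lca = least_common_ancestor(cites, 4, 5)
```

Here paper 2 is at distance 1 from paper 4 and distance 2 from paper 5, so it ranks ahead of paper 1.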

SLIDE 16

LCA: Asynchronous Can Be Much Slower Than Synchronous

[Figure: run time in seconds vs. number of workers (8, 16, 32, 64). Dataset: 2 million papers, 8 million citations.]

SLIDE 17
Optimization: Prioritizing Tuples

  • For asynchronous processing

– Choice: favor new tuples vs. base tuples

  • Example: connected components

[Figure: two panels, each showing vertices 1, 2, 3, 4.]
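
The effect of pull order can be illustrated with a small simulation of asynchronous connected components on one worker. This Python sketch is built on assumptions made here rather than on the engine itself: the function name `async_cc`, the two FIFO queues, and the four-vertex path graph are all hypothetical. Base tuples are the initial `CC(x, x)` facts; new tuples are the ones derived by the recursive rule.

```python
from collections import deque


def async_cc(neighbors, new_first):
    """Asynchronous connected components on one simulated worker.

    neighbors: {vertex: [adjacent vertices]}, both directions listed.
    new_first: if True, pull derived tuples before base tuples whenever
    both inputs have data; otherwise pull base tuples first.
    Returns (labels, number of intermediate tuples generated).
    """
    base = deque((x, x) for x in neighbors)  # CC(x, x) :- Edges(x, _)
    new = deque()                            # tuples derived by the recursive rule
    cc, generated = {}, 0
    while base or new:
        if new_first:
            source = new if new else base
        else:
            source = base if base else new
        y, v = source.popleft()
        if y not in cc or v < cc[y]:         # $Min keeps the smallest label
            cc[y] = v
            for z in neighbors[y]:
                new.append((z, v))
                generated += 1
    return cc, generated


# Hypothetical path graph 1 - 2 - 3 - 4.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels_new, count_new = async_cc(path, new_first=True)
labels_base, count_base = async_cc(path, new_first=False)
```

Both orders reach the same labels, but on this toy graph pulling new tuples first generates 6 intermediate tuples while pulling base tuples first generates 15. This is one small example, not a general ranking of the two policies.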

SLIDE 18

Connected Components: Pull Order Impacts Run Time

[Figure: run time in seconds vs. number of workers (8, 16, 32, 64) for three configurations: Async with new tuples first, Async with base tuples first, and Sync. Dataset: 21 million vertices, 776 million edges.]

SLIDE 19

Conclusion

  • Full-stack solution for iterative big-data analytics

– A declarative language
– Small extensions to existing shared-nothing engines
– Efficient iterative execution
– Failure handling methods
– More details in the paper

  • Empirical evaluation of various models

– No single method outperforms others
– Future work: an adaptive cost-based optimizer
