LARA: A Language of Linear and Relational Algebra for Polystores
Dylan Hutchison advised by Bill Howe, Dan Suciu
- Work in Progress -
Backend systems: Table Store, Graph Engine, Array Store, Key-Value Store
Frontend languages: Matlab, SQL, Spark Streaming, DataFrames
Polystores connect backend systems with frontend languages through a unifying "narrow API," using each system where it performs best.
Goal: Implement algorithms!
Algorithms: Data Cube, Matrix Inverse, Max Flow, PageRank
Many candidate algebras…
Algebra := Objects + (closed) Operations on Objects
Objects: Relations, Matrices, Graphs, Files
Ops: Relational Algebra, BLAS/Linear Algebra, Node/Edge Updates, File Access
Many algebras have optimized execution engines.
Execution Engines:
Relations: PostgreSQL
Matrices: ScaLAPACK
Graphs: Neo4J, Allegro
Files: CSV, HDF5
LARA
Objects: Associative Tables
Ops: ⋈_⊗, ⋈_⊕, map_f, promote_V
Answer: No choice necessary. Use Lara!
⋈_⊗: select equal colliding keys, multiply colliding values
⋈_⊕: group by colliding keys, sum colliding values
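The two operator descriptions above can be sketched in plain Python. This is only an illustration under stated assumptions, not Lara's implementation: the table representation (a pair of key names and a dict from key tuples to one numeric value) and the function names `join` and `union` are hypothetical, and the choice that the join result carries all key attributes while the union result carries only the shared ones is one reading of "colliding keys".

```python
from operator import add, mul

# Hypothetical representation: a table is (key_names, rows), where rows maps
# a tuple of key values to a single numeric value.

def join(a, b, times=mul):
    """Keep row pairs that agree on the shared key attributes; multiply their values."""
    a_keys, a_rows = a
    b_keys, b_rows = b
    shared = [k for k in a_keys if k in b_keys]
    out_keys = tuple(a_keys) + tuple(k for k in b_keys if k not in a_keys)
    out = {}
    for ka, va in a_rows.items():
        ra = dict(zip(a_keys, ka))
        for kb, vb in b_rows.items():
            rb = dict(zip(b_keys, kb))
            if all(ra[k] == rb[k] for k in shared):
                merged = {**ra, **rb}
                out[tuple(merged[k] for k in out_keys)] = times(va, vb)
    return out_keys, out

def union(a, b, plus=add):
    """Group the rows of both tables by the shared key attributes; sum colliding values."""
    out_keys = tuple(k for k in a[0] if k in b[0])
    out = {}
    for keys, rows in (a, b):
        for key, val in rows.items():
            row = dict(zip(keys, key))
            group = tuple(row[k] for k in out_keys)
            out[group] = plus(out[group], val) if group in out else val
    return out_keys, out

# Toy inputs (made up for illustration).
P = (("site", "word"), {("a", "x"): 2, ("a", "y"): 3, ("b", "x"): 4})
S = (("word",), {("x",): 10, ("y",): 100})

joined = join(P, S)
unioned = union(P, S)
print(joined)
print(unioned)
```

Passing a different `times` or `plus` function swaps in another multiply or sum operation, which is the role the ⊗ and ⊕ subscripts play in the operator names.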
Suppose a user enters the search term "green delicious", as in input Q. Database D scores sites by search-term relevance. Table W weighs words by importance.
Goal: Compute ranks of sites in D for search query Q, weighing by W.

Q
word        score
delicious   1
green       1
(others)

D
site             word        score
pizzanow.com     pizza       6
pizzanow.com     delicious   5
allrecipes.com   delicious   2
allrecipes.com   green       2
allrecipes.com   potatoes    5
recycle.org      green       2
(others)

W
word        score
delicious   1
pizza       1
potatoes    3
green       2
(others)

Desired Output
site             score
pizzanow.com     1*5*1 = 5
allrecipes.com   1*2*1+1*2*2 = 6
recycle.org      1*2*2 = 4
(others)

RA: γ_{site, +(score)}(π_{site, word, (score*score') as score}(π_word(Q) ⋈ D ⋈ ρ_{score→score'}(W)))
LA: diag(Q) +.* D +.* W (Matlab)
Hybrid: π_word(Q) ⋈ D +.* W
LARA: (Q ⋈_* D ⋈_* W) ⋈_+ E_site
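As a sanity check on the desired output, the same computation can be written directly in Python. This is a minimal sketch, not how Lara executes the query: the function name `rank_sites` and the dict-based tables are hypothetical, and the "(others)" rows are assumed to contribute nothing, so words missing from Q or W are skipped.

```python
from collections import defaultdict

# Input tables from the example; "(others)" rows are assumed to contribute nothing.
Q = {"delicious": 1, "green": 1}
D = {
    ("pizzanow.com", "pizza"): 6,
    ("pizzanow.com", "delicious"): 5,
    ("allrecipes.com", "delicious"): 2,
    ("allrecipes.com", "green"): 2,
    ("allrecipes.com", "potatoes"): 5,
    ("recycle.org", "green"): 2,
}
W = {"delicious": 1, "pizza": 1, "potatoes": 3, "green": 2}

def rank_sites(q, d, w):
    """Multiply scores where the word matches in all three tables, then sum per site."""
    ranks = defaultdict(int)
    for (site, word), d_score in d.items():
        if word in q and word in w:
            ranks[site] += q[word] * d_score * w[word]
    return dict(ranks)

ranks = rank_sites(Q, D, W)
print(ranks)
```

The loop body corresponds to the multiplying join on `word`, and the accumulation into `ranks[site]` corresponds to the grouping and summing by `site`.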
Executes on both RDBMS and BLAS, depending on cost model.
Many ways to express algorithms. Lara presents an economical algebra preserving
Do you have an application more easily expressed in several algebras? Do you seek multi-system optimizations? Let's discuss! ☺
Script: SQL, SQL, SQL, Matlab, Matlab, Matlab, SQL, SQL, …
RA: ∪ × π_C σ_f ρ ∖ γ
LA: ⊕ ⊗ f ⊕.⊗ T
LARA: ⋈_⊗, ⋈_⊕, map_f, promote_V
Optimize & Schedule
RDBMS, BLAS
Relational Algebra Object: Relation
Linear Algebra Object: N-D Matrix
Associative Tables. Several interpretations:
Lara         RA            LA
⋈_⊗          ⋈, π_⊗, ρ     Tensor product
⋈_⊕          γ_⊕, ∪        Reduce, e-wise sum
map_f        π_f           Apply
promote_V    Re-index      Re-key
Inner Join P ⋈ S
P ⟗ S
(formulas out of date)