LARA: A Language of Linear and Relational Algebra for Polystores



SLIDE 1

LARA: A Language of Linear and Relational Algebra for Polystores

Dylan Hutchison advised by Bill Howe, Dan Suciu

- Work in Progress -
SLIDE 2

Polystores

Backend systems: Table Store, Graph Engine, Array Store, Key-Value Store
Frontend languages: Matlab, SQL, Spark, Streaming, DataFrames

Polystores connect backend systems with frontend languages through a unifying "narrow API," using each system where it performs best.

SLIDE 3

How to choose an algebra?

Goal: Implement algorithms!

Algorithms: Data Cube, Matrix Inverse, Max Flow, PageRank

SLIDE 4

How to choose an algebra?

Many candidate algebras…

Algebra := Objects + (closed) Operations on Objects

Objects      Ops
Relations    Relational Algebra
Matrices     BLAS/Linear Algebra
Graphs       Node/Edge Updates
Files        File Access

SLIDE 5

How to choose an algebra?

Many algebras have optimized execution engines

Objects      Ops                    Execution Engines
Relations    Relational Algebra     PostgreSQL
Matrices     BLAS/Linear Algebra    ScaLAPACK
Graphs       Node/Edge Updates      Neo4J, Allegro
Files        File Access            CSV, HDF5

SLIDE 6

How to choose an algebra?

LARA
Objects: Associative Tables
Ops: ⋈⊗, ⊕, mapf, promoteV

Answer: No choice necessary. Use Lara!

  1. Write algorithm in any/all algebras
  2. Translate to/from Lara common algebra
  3. Use any/all execution engines

SLIDE 7

Operations of Lara

  • ⋈⊗ – Join: horizontally merge columns, select equal colliding keys, multiply colliding values
  • ⊕ – Union: vertically merge columns, group by colliding keys, sum colliding values

  • mapf – Map keys and old values to new values
  • promoteV – Promote values to keys
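
A minimal Python sketch of the four operations listed above, assuming a simplified associative table that stores one numeric value per key. The names Table, join, union, map_values, and promote are hypothetical and do not come from the slides. The promote function makes an arbitrary choice (the remaining value becomes 1) because this model has only a single value column.

```python
from collections import namedtuple

# Hypothetical model: key column names plus a dict {key tuple: single value}.
# Keys missing from `rows` are treated as absent (default-valued) entries.
Table = namedtuple("Table", ["keys", "rows"])

def join(a, b, mul=lambda x, y: x * y):
    """Join: merge key columns, keep rows whose shared keys agree, multiply values."""
    shared = [k for k in a.keys if k in b.keys]
    out_keys = a.keys + tuple(k for k in b.keys if k not in a.keys)
    rows = {}
    for ka, va in a.rows.items():
        ra = dict(zip(a.keys, ka))
        for kb, vb in b.rows.items():
            rb = dict(zip(b.keys, kb))
            if all(ra[k] == rb[k] for k in shared):
                merged = {**rb, **ra}
                rows[tuple(merged[k] for k in out_keys)] = mul(va, vb)
    return Table(out_keys, rows)

def union(a, b, add=lambda x, y: x + y):
    """Union: keep the shared key columns, group by them, sum colliding values."""
    shared = tuple(k for k in a.keys if k in b.keys)
    rows = {}
    for t in (a, b):
        idx = [t.keys.index(k) for k in shared]
        for key, val in t.rows.items():
            group = tuple(key[i] for i in idx)
            rows[group] = add(rows[group], val) if group in rows else val
    return Table(shared, rows)

def map_values(t, f):
    """map_f: compute a new value from each row's keys and old value."""
    return Table(t.keys, {k: f(dict(zip(t.keys, k)), v) for k, v in t.rows.items()})

def promote(t, name):
    """promote_V: move the value into a new key column called `name`.
    Assumption: with a single value column, the remaining value is set to 1."""
    return Table(t.keys + (name,), {k + (v,): 1 for k, v in t.rows.items()})

A = Table(("i", "j"), {(0, 0): 2, (0, 1): 3})
B = Table(("j",), {(0,): 10, (1,): 100})
J = join(A, B)                     # keys ("i", "j"), values multiplied
S = union(J, Table(("i",), {}))    # group onto key "i", summing values
print(J.rows, S.rows)
```

In this sketch, taking the union with an empty table keyed on "i" acts as an aggregation: only the shared key column survives and colliding values are summed.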

SLIDE 8

Example: Ranking a Search

Suppose a user enters the search term "green delicious", as in input Q. Database D scores sites by search-term relevance. Table W weights words by importance.

Goal: Compute ranks of sites in D for search query Q, weighting by W.

Q
word         score
delicious    1
green        1
(others)

D
site              word         score
pizzanow.com      pizza        6
pizzanow.com      delicious    5
allrecipes.com    delicious    2
allrecipes.com    green        2
allrecipes.com    potatoes     5
recycle.org       green        2
(others)

W
word         score
delicious    1
pizza        1
potatoes     3
green        2
(others)

Desired Output
site              score
pizzanow.com      1*5*1 = 5
allrecipes.com    1*2*1 + 1*2*2 = 6
recycle.org       1*2*2 = 4
(others)
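
The desired output can be checked with a few lines of Python. This is only a sketch using plain dictionaries: the "(others)" rows are omitted, and a word missing from Q or W is assumed to score 0.

```python
# Inputs copied from the slide; "(others)" rows are left out.
Q = {"delicious": 1, "green": 1}
D = {
    ("pizzanow.com", "pizza"): 6,
    ("pizzanow.com", "delicious"): 5,
    ("allrecipes.com", "delicious"): 2,
    ("allrecipes.com", "green"): 2,
    ("allrecipes.com", "potatoes"): 5,
    ("recycle.org", "green"): 2,
}
W = {"delicious": 1, "pizza": 1, "potatoes": 3, "green": 2}

# For each (site, word) entry, multiply the query, relevance, and weight
# scores, then sum the products per site.
ranks = {}
for (site, word), relevance in D.items():
    product = Q.get(word, 0) * relevance * W.get(word, 0)
    ranks[site] = ranks.get(site, 0) + product

print(ranks)
```

The result matches the table: 5 for pizzanow.com, 6 for allrecipes.com, and 4 for recycle.org.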

SLIDE 9

Example: Ranking a Search

RA: γ_{site, +(score)}(π_{site, word, (score*score') as score}(π_word(Q) ⋈ D ⋈ ρ_{score→score'}(W)))
LA: diag(Q) +.* D +.* W
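
The LA formulation can be sketched in pure Python by reading "+.*" as a matrix product that reduces with + and combines with *. The helper name semiring_matmul, the word and site orderings, and the dense encoding of Q, D, and W are assumptions made for illustration. The operands are arranged as D, then diag(Q), then W so that the shapes conform when D is stored as a site-by-word matrix.

```python
def semiring_matmul(A, B, add=lambda x, y: x + y, mul=lambda x, y: x * y, zero=0):
    """Matrix product A +.* B: combine entries with `mul`, reduce with `add`."""
    rows, inner, cols = len(A), len(B), len(B[0])
    out = [[zero] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = zero
            for k in range(inner):
                acc = add(acc, mul(A[i][k], B[k][j]))
            out[i][j] = acc
    return out

# Assumed index orderings for the dense encoding.
words = ["delicious", "green", "pizza", "potatoes"]
sites = ["pizzanow.com", "allrecipes.com", "recycle.org"]

q = [1, 1, 0, 0]          # query vector over words
w = [1, 2, 1, 3]          # weight vector over words
D = [                     # site-by-word relevance matrix
    [5, 0, 6, 0],
    [2, 2, 0, 5],
    [0, 2, 0, 0],
]

diag_q = [[q[i] if i == j else 0 for j in range(len(q))] for i in range(len(q))]
w_col = [[x] for x in w]

scores = semiring_matmul(semiring_matmul(D, diag_q), w_col)
ranks = {site: row[0] for site, row in zip(sites, scores)}
print(ranks)
```

Passing other `add` and `mul` functions swaps in a different semiring without changing the loop structure.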

SLIDE 10

Example: Ranking a Search

LA: diag(Q) +.* D +.* W (Matlab)

SLIDE 11

Example: Ranking a Search

Hybrid: π_word(Q) ⋈ D +.* W

SLIDE 12

Example: Ranking a Search

LARA: (Q ⋈_* D ⋈_* W) ⊕_+ E_site

SLIDE 13

Example: Ranking a Search

LARA: (Q ⋈_* D ⋈_* W) ⊕_+ E_site

Executes on both RDBMS and BLAS, depending on cost model.

Many ways to express algorithms. Lara presents an economical algebra preserving

  • LA's familiar math, numerical prowess
  • RA's flexibility, scale-out optimization

SLIDE 14

LARA: A Unifying Algebra

Do you have an application more easily expressed in several algebras? Do you seek multi-system optimizations? Let's discuss! ☺

SLIDE 17

Vision for Polystore Systems

Script: SQL SQL SQL Matlab Matlab Matlab SQL SQL …
RA: ∪ × πC σf ρ ∖ γ
LA: ⊕ ⊗ f ⊕.⊗ T
LARA: ⋈⊗ ⊕ mapf promoteV
Optimize & Schedule
Execution engines: RDBMS, BLAS

SLIDE 18

APIs of RA and LA

Relational Algebra
Object: Relation

  • ∪ – Union
  • × – Cartesian Product
  • πC – (Extended) Projection
  • σf – Select
  • ρ – Rename
  • ∖ – Difference
  • γ – Aggregate

Linear Algebra
Object: N-D Matrix

  • ⊕ – Element-wise add
  • ⊗ – Element-wise multiply
  • ⊕.⊗ – Matrix multiply
  • Reduce – Sum along a dimension
  • Apply function to each element
  • T – Transpose
  • (Construction & De-construction)
SLIDE 19

Objects of Lara

Associative Tables. Several interpretations:

  • Relational table with key columns & value columns with default values
  • Total function from key-space to value-space
  • Sparse tensor
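
The three interpretations can be seen in one small Python sketch. The class name AssociativeTable, its methods, and the single value column named "score" are hypothetical, chosen only for illustration. Only entries that differ from the default are stored (sparse tensor), any key can be looked up (total function), and the stored entries can be listed as rows (relational table).

```python
class AssociativeTable:
    """Hypothetical single-value associative table with a default value."""

    def __init__(self, key_names, entries=None, default=0):
        self.key_names = tuple(key_names)
        self.default = default
        # Sparse storage: entries equal to the default are not kept.
        self.entries = {k: v for k, v in (entries or {}).items() if v != default}

    def __getitem__(self, key):
        # Total function: every key in the key-space maps to some value.
        return self.entries.get(key, self.default)

    def rows(self, value_name="score"):
        # Relational view: one row per stored key, sorted by key.
        return [
            {**dict(zip(self.key_names, k)), value_name: v}
            for k, v in sorted(self.entries.items())
        ]

D = AssociativeTable(
    ("site", "word"),
    {
        ("recycle.org", "green"): 2,
        ("pizzanow.com", "pizza"): 6,
        ("pizzanow.com", "green"): 0,   # equals the default, so not stored
    },
)

print(D[("recycle.org", "green")])      # stored entry
print(D[("recycle.org", "potatoes")])   # falls back to the default
print(D.rows())
```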
SLIDE 20

Lara -> RA & LA

Lara        RA            LA
⋈⊗          ⋈, π⊗, ρ      Tensor product
⊕           γ⊕, ∪         Reduce, e-wise sum
mapf        πf            Apply
promoteV    Re-key ⋈      Re-index
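
As an illustration of the first row of the mapping, the sketch below expresses a multiplying join with three relational steps: rename (ρ), natural join (⋈), and an extended projection that multiplies the two value columns (π⊗). The function names, the list-of-dicts relation encoding, and the column name score2 are assumptions of this sketch, not the slides' definitions.

```python
def rename(rel, old, new):
    """ρ: rename column `old` to `new` in every row."""
    return [{(new if c == old else c): v for c, v in row.items()} for row in rel]

def natural_join(r, s):
    """⋈: pair rows that agree on all shared column names."""
    out = []
    for a in r:
        for b in s:
            shared = a.keys() & b.keys()
            if all(a[c] == b[c] for c in shared):
                out.append({**a, **b})
    return out

def extended_project(rel, cols, new_col, f):
    """π with a computed column: keep `cols` and add `new_col` = f(row)."""
    return [{**{c: row[c] for c in cols}, new_col: f(row)} for row in rel]

# Toy relations; keys are site/word, the value column is score.
D = [
    {"site": "recycle.org", "word": "green", "score": 2},
    {"site": "pizzanow.com", "word": "pizza", "score": 6},
]
W = [
    {"word": "green", "score": 2},
    {"word": "potatoes", "score": 3},
]

# Rename W's value column so it does not collide with D's, join on the
# shared key `word`, then multiply the two value columns.
joined = natural_join(D, rename(W, "score", "score2"))
result = extended_project(
    joined, ["site", "word"], "score", lambda row: row["score"] * row["score2"]
)
print(result)
```

Without the rename, the natural join would also match on the value column score, which is why the mapping lists ρ alongside ⋈ and π⊗.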

SLIDE 21

Example derived operation: Outer Join

Inner Join: P ⋈ S
Outer Join: P ⟗ S

(formulas out of date)