Learning Data Systems Components. Tim Kraska <kraska@mit.edu>. (PowerPoint presentation)



SLIDE 1

Learning Data Systems Components

Tim Kraska <kraska@mit.edu>

[Disclaimer: I am NOT talking on behalf of Google]

Work partially done at

SLIDE 2

HashMaps, Sorting, Joins, Bloom Filter, Tree

“Machine Learning Just Ate Algorithms In One Large Bite….” [Christopher Manning, Professor at Stanford]

Comments on Social Media

SLIDE 3

Disclaimer

HashMaps, Sorting, Joins, Bloom Filter, Tree

SLIDE 4

Fundamental Building Blocks

Sorting, B-Tree, Hash-Map, Scheduling, Join, Priority Queue, Bloom Filter, Caching, Range Filter

SLIDE 5

Databases as an Example:

B-Trees

SLIDE 6

SLIDE 7

SLIDE 8

SLIDE 9

[Figure: B-Tree root node with key ranges B-C, C-G, G-J, K-N, N-R, Q-S, S-U, U-V, V-X, X-@]
SLIDE 10

[Figure: two-level B-Tree; root ranges B-C, C-G, G-J, K-N, N-R; second level A-B, B-C, C-G, …; lookup starts from a Key]
SLIDE 11

[Figure: three-level B-Tree; lookup by Key through ranges B-C, C-G, G-J, K-N, N-R; then A-B, B-C, C-G, …; then AA-AL, AL-AK, AK-AP, …, BA-BE, BI-BL, BL-BR, …]
SLIDE 12

[Figure: full B-Tree, from the second level A-B, B-C, C-G, … down to the leaf pages; lookup by Key]
SLIDE 13

[Figure: B-Tree root node with ranges B-C, C-G, G-J, K-N, N-R]
SLIDE 14

The Librarian

SLIDE 15

Harry Potter, Children's Books, Curious George, O'Reilly Books, Travel Books, DaVinci Code, The Girl on the Train, Bill Bryson, The Source, The Gruffalo, The Gruffalo, A Day in the Life of Marlon Bundo, ML With Python
SLIDE 16

Harry Potter, Children's Books, Curious George, O'Reilly Books, Travel Books, DaVinci Code, The Girl on the Train, Bill Bryson, The Source, The Gruffalo, Make Way for Ducklings, A Day in the Life of Marlon Bundo, ML With Python
SLIDE 17

[Figure: B-Tree root node with ranges B-C, C-G, G-J, K-N, N-R]
SLIDE 18

[Figure: full B-Tree, from root ranges B-C, C-G, G-J, K-N, N-R down to the leaf pages; lookup by Key]
SLIDE 19

[Figure: the B-Tree (ranges B-C, C-G, G-J, K-N, N-R) is replaced by a Model: Key → Model → position]
SLIDE 20

Fundamental Algorithms & Data Structures

Hash-Map, Tree, Sorting, Join, Range-Filter, Priority Queue, Scheduling, Cache Policy, Bloom-Filter, …

SLIDE 21

Not convinced yet?

SLIDE 22

Another Example: Index All Integers from 900 to 800M


[Figure: sorted keys 900, 901, 902, …, 800M under a B-Tree]

B-Tree?

SLIDE 23

A More Concrete Example: Index All Integers from 900 to 800M


[Figure: sorted array of keys 900, 901, 902, …, 800M]

data_array[lookup_key - 900]

SLIDE 24

Goal: Index All Even Integers from 900 to 800M

[Figure: even keys 900, 902, 904, 906, 908, 910, …, 800M stored in a sorted array]

data_array[(lookup_key - 900) / 2]
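In code, this direct-addressing lookup is just arithmetic; a minimal sketch, where the helper name and demo array are illustrative:

```python
# Dense, evenly spaced keys need no index structure at all:
# the position is computed from the key in O(1).

def lookup_even(data_array, lookup_key, first_key=900, step=2):
    """Position of lookup_key in a sorted array of evenly spaced keys."""
    return data_array[(lookup_key - first_key) // step]

# Small-scale demo: the array stores the keys 900, 902, ..., 918 themselves.
keys = list(range(900, 920, 2))
assert lookup_even(keys, 906) == 906
```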

SLIDE 25

Still holds for other data distributions

SLIDE 26

Key Insight

Traditional data structures (typically) make no assumptions about the data

But knowing the data distribution might allow for significant performance gains and might even change the complexity of data structures (e.g., O(log n) → O(1) for lookups, or O(n) → O(1) for storage)

SLIDE 27

Building A System From Scratch For Every Use Case Is Not Economical

SLIDE 28

Conceptually, a B-Tree maps a key to a page.

[Figure: B-Tree: key → page] For simplicity, assume all pages are stored contiguously in main memory.

SLIDE 29

Alternative View: a B-Tree maps a key to a position with a fixed min/max error.

[Figure: B-Tree over a sorted array: key → position in [pos, pos + page-size]; again, assume all pages are stored contiguously in main memory]

  • 1. B-Tree: key → pos
  • 2. Binary search within min/max-error

SLIDE 30

[Figure: a Model over a sorted array: key → position in [pos, pos + page-size]]

A B-Tree Is A Model

SLIDE 31

Finding an item:

  • 1. Any model: key → pos estimate
  • 2. Binary search in [pos - err_min, pos + err_max]; err_min and err_max are known from the training process

[Figure: Model over a sorted array: key → position in [pos, pos + page-size]]

A B-Tree Is A Model
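The two steps can be sketched as follows; the stand-in model and error bounds below are illustrative, not the trained model from the talk:

```python
import bisect

def model_lookup(data, key, predict, err_min, err_max):
    """Step 1: any model predicts a position. Step 2: binary search
    only within [pos - err_min, pos + err_max], which are known
    from the training process."""
    pos = predict(key)
    lo = max(0, pos - err_min)
    hi = min(len(data), pos + err_max + 1)
    i = bisect.bisect_left(data, key, lo, hi)
    if i < len(data) and data[i] == key:
        return i
    return -1

data = [2, 3, 5, 7, 11, 13, 17, 19]
predict = lambda k: k // 3            # a crude stand-in model
# error bounds chosen to cover the worst prediction error on this data
assert model_lookup(data, 11, predict, 3, 3) == 4
```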

SLIDE 32

A B-Tree Is A Model

A form of a regression model: key → pos is equivalent to modeling the CDF of the (observed) key distribution:

pos_estimate = P(X ≤ key) * #keys

[Figure: Model over a sorted array: key → position in [pos, pos + page-size]]

SLIDE 33

A B-Tree Is A Model

pos_estimate = F(key) * #keys, where F is the CDF of the observed key distribution
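A quick empirical check of this formula (the helper name is illustrative): with sorted keys, the empirical CDF scaled by the key count recovers the key's rank.

```python
import bisect

def pos_estimate(sorted_keys, key):
    """pos_estimate = F(key) * #keys, with F the empirical CDF
    F(key) = P(X <= key) computed from the observed keys.
    This estimates the key's position (rank) in the sorted array."""
    n = len(sorted_keys)
    F = bisect.bisect_right(sorted_keys, key) / n   # empirical CDF
    return F * n

keys = [10, 20, 30, 40, 50]
assert pos_estimate(keys, 30) == 3.0   # P(X <= 30) = 3/5, times 5 keys
```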

SLIDE 34

B-Trees Are Regression Trees

[Figure: B-Tree over a sorted array: key → position]

SLIDE 35

What Does This Mean

SLIDE 36

What Does This Mean

Database people were the first to do large-scale machine learning :)

SLIDE 37

Potential Advantages of Learned B-Tree Models

  • Smaller indexes → less (main-memory) storage
  • Faster lookups?
  • More parallelism → sequential if-statements are exchanged for multiplications
  • Hardware accelerators → lower power, better $/compute…
  • Cheaper inserts? → more on that later; for the moment, assume read-only

SLIDE 38

A First Attempt

  • 200M web-server log records, sorted by timestamp
  • 2-layer NN, width 32, ReLU activations
  • Prediction task: timestamp → position within the sorted array

SLIDE 39

A First Attempt

Cache-Optimized B-Tree: ≈250ns vs. ???

SLIDE 40

A First Attempt

Cache-Optimized B-Tree: ≈250ns vs. ≈80,000ns

SLIDE 41

Reasons

Problem I: Tensorflow is designed for large models
Problem II: B-Trees are great for overfitting
Problem III: B-Trees are cache-efficient
Problem IV: Search does not take advantage of the prediction

SLIDE 42

Problem I: The Learning Index Framework (LIF)

  • An index synthesis system
  • Given an index configuration, generate the best possible code
  • Uses ideas from Tupleware [VLDB15]
  • Simple models are trained “on-the-fly”, whereas for complex models we use Tensorflow and extract the weights afterwards (i.e., no Tensorflow at inference time)
  • The best index configuration is found using auto-tuning (e.g., see TuPAQ [SOCC15])

SLIDE 43

Problem II + III: Precision Gain per Node

Index over 100M records (i.e., 1M pages). Page size: 100.

Precision gain: 100M → 1M (min/max error: 1M); precision gain: 1M → 10k; precision gain: 10k → 100

SLIDE 44

The Last Mile Problem

[Figure: predicted vs. actual position (Pos over Key)]
SLIDE 45

Solution: 
 Recursive Model Index (RMI)

SLIDE 46

How Does The Lookup-Code Look Like?

Model on stage 1: f0(key_type key). Models on stage 2: f1[] (e.g., the first model in the second stage is f1[0](key_type key)). Lookup code:

pos_estimate = f1[f0(key)](key)
pos = exp_search(key, pos_estimate, data)

Number of operations with linear regression models:

offset = a + b * key
weights2 = weights_stage2[offset]
pos_estimate = weights2.a + weights2.b * key
pos = exp_search(key, pos_estimate, data)

2x multiplies, 2x additions, 1x array-lookup

SLIDE 47

How Does The Lookup-Code Look Like?

Model on stage 1: f0(key_type key). Models on stage 2: f1[]. Lookup code for a 2-stage RMI:

pos_estimate = f1[f0(key)](key)
pos = exp_search(key, pos_estimate, data)

Operations for a 2-stage RMI with linear regression models:

offset = a + b * key
weights2 = weights_stage2[offset]
pos_estimate = weights2.a + weights2.b * key
pos = exp_search(key, pos_estimate, data)

2x multiplies, 2x additions, 1x array-lookup
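A runnable sketch of a 2-stage RMI with least-squares linear models; the fanout, data, and helper names are illustrative, and each leaf model's min/max error is recorded at build time so the final search is bounded:

```python
import bisect

def fit_linear(xs, ys):
    """Least-squares line: y ≈ a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs) or 1.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return my - b * mx, b

def _bucket(root, key, n, fanout):
    """Stage-1 model picks the stage-2 model index."""
    a, b = root
    return min(fanout - 1, max(0, int((a + b * key) * fanout / n)))

def build_rmi(data, fanout):
    n = len(data)
    root = fit_linear(data, list(range(n)))          # stage 1
    buckets = [[] for _ in range(fanout)]
    for i, k in enumerate(data):
        buckets[_bucket(root, k, n, fanout)].append((k, i))
    leaves, errs = [], []
    for b in buckets:
        if not b:
            leaves.append((0.0, 0.0))
            errs.append(0)
            continue
        a, s = fit_linear([k for k, _ in b], [i for _, i in b])
        leaves.append((a, s))
        # max absolute prediction error, known after training
        errs.append(max(abs(int(round(a + s * k)) - i) for k, i in b))
    return root, leaves, errs

def rmi_lookup(data, rmi, key):
    root, leaves, errs = rmi
    n = len(data)
    j = _bucket(root, key, n, len(leaves))
    a, b = leaves[j]
    est = int(round(a + b * key))
    # "last mile": search only within the recorded error window
    lo = max(0, est - errs[j])
    hi = min(n, est + errs[j] + 1)
    i = bisect.bisect_left(data, key, lo, hi)
    return i if i < n and data[i] == key else -1

data = [k * k for k in range(100)]    # a non-linear key distribution
rmi = build_rmi(data, 10)
assert all(rmi_lookup(data, rmi, k * k) == k for k in range(100))
```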

SLIDE 48

Hybrid RMI

Worst-case performance is that of a B-Tree

SLIDE 49

Problem IV: Min-/Max-Error vs Average Error

[Figure: actual vs. predicted position over N keys, with min- and max-model error]
SLIDE 50

Binary Search

[Figure: binary search around the predicted position: Left, Middle, Right]
SLIDE 51

SLIDE 52

SLIDE 53

Quaternary Search

[Figure: quaternary search around the predicted position: Left, Q2, Right]
SLIDE 54

Quaternary Search

[Figure: quaternary search; Q1 = prediction - 2x std err, Q3 = prediction + 2x std err]
SLIDE 55

SLIDE 56

Exponential Search

[Figure: exponential search outward from the predicted position]
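A sketch of what an `exp_search` like the one in the earlier lookup code might look like; the implementation details below are assumptions, not the talk's actual code. It widens the search window outward from the estimate with doubling steps, so the cost depends on the prediction error rather than on N:

```python
import bisect

def exp_search(data, key, pos_estimate):
    """Search outward from the model's estimate, doubling the step,
    so cost is O(log err) in the actual prediction error, not O(log N)."""
    n = len(data)
    pos = min(max(pos_estimate, 0), n - 1)
    lo, hi, step = pos, pos + 1, 1
    while lo > 0 and data[lo] > key:      # widen left until key is bracketed
        lo = max(0, lo - step)
        step *= 2
    step = 1
    while hi < n and data[hi - 1] < key:  # widen right until key is bracketed
        hi = min(n, hi + step)
        step *= 2
    i = bisect.bisect_left(data, key, lo, hi)
    return i if i < n and data[i] == key else -1

data = list(range(0, 1000, 3))            # keys 0, 3, 6, ...
assert exp_search(data, 300, 95) == 100   # estimate off by 5 positions
assert exp_search(data, 301, 95) == -1    # key absent
```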

SLIDE 57

Does it have to be

SLIDE 58

Does It Work?

Type           Config                  Lookup time  Speedup vs. B-Tree  Size (MB)  Size vs. B-Tree
B-Tree         page size: 128          260 ns       1.0X                12.98 MB   1.0X
Learned index  2nd stage size: 10000   222 ns       1.17X               0.15 MB    0.01X
Learned index  2nd stage size: 50000   162 ns       1.60X               0.76 MB    0.05X
Learned index  2nd stage size: 100000  144 ns       1.67X               1.53 MB    0.12X
Learned index  2nd stage size: 200000  126 ns       2.06X               3.05 MB    0.23X

60% faster at 1/20th the space, or 17% faster at 1/100th the space

200M records of map data (e.g., restaurant locations), indexed on longitude. Intel E5 CPU with 32GB RAM, without GPU/TPUs. No special SIMD optimization (there is a lot of potential).

SLIDE 59

SLIDE 60

You Might Have Seen Certain Blog Posts

SLIDE 61

[Chart: Lookup-Time (ns) vs. Size (MB) for FAST, a fixed-size lookup table, a read-optimized B-Tree with interpolation search, and the Learned Index]

SLIDE 62

My Own Comparison

SLIDE 63

A Comparison To ARTful Indexes (Radix-Tree)

Experimental setup:

  • Dense: continuous keys from 0 to 256M
  • Sparse: 256M keys where each bit is equally likely 0 or 1.

Viktor Leis, Alfons Kemper, Thomas Neumann: The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases. ICDE 2013

SLIDE 64

A Comparison To ARTful Indexes (Radix-Tree)

Experimental setup: continuous keys from 0 to 256M. Reported lookup throughput: 10M/s ≈ 100ns(1). Size: not measured, but the paper reports an overhead of ≈8 bytes per key (dense, best case): 256M * 8 bytes ≈ 1953MB.

(1)Numbers from the paper

Viktor Leis, Alfons Kemper, Thomas Neumann: The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases. ICDE 2013

SLIDE 65

Learned Index

Generate Code: Record lookup(key) { return data[0 + 1 * key]; }

SLIDE 66

Learned Index

Generate Code: Record lookup(key) { return data[key]; }

SLIDE 67

Learned Index

Generated Code: Record lookup(key) { return data[key]; }

Lookup latency: 10ns (learned index) vs 100ns* (ARTful), or one order of magnitude better

Space: 0MB vs 1953MB. Infinitely better :)

SLIDE 68

?

SLIDE 69

What about Updates and Inserts?

SLIDE 70

What about Updates and Inserts?

Alex Galakatos, Michael Markovitch, Carsten Binnig, Rodrigo Fonseca, Tim Kraska: 
 A-Tree: A Bounded Approximate Index Structure https://arxiv.org/abs/1801.10207

SLIDE 71

The Simple Approach: Delta Indexing

[Figure: updates] Training a simple multi-variate regression model can be done in one pass over the data.

SLIDE 72

Leverage the Distribution

SLIDE 73

Leverage the Distribution for Appends

Inserts (e.g., timestamps)

[Figure: new inserts appended over time]

If the learned model can generalize to inserts, insert complexity is O(1), not O(log N).

SLIDE 74

Updates/Inserts

  • Less beneficial, as the data still has to be stored sorted
  • Idea: leave space in the array where more updates/inserts are expected
  • This can also be done with traditional trees
  • But the error of learned indexes should increase with O(…) per node in the RMI, whereas traditional indexes grow with O(…)

SLIDE 75

Still at the Beginning!

  • Can we provide bounds for inserts?
  • When to retrain?
  • How to retrain models on the fly?
SLIDE 76

Fundamental Algorithms & Data Structures

Hash-Map Tree Sorting Join Range-Filter Priority Queue

…..

Scheduling Cache Policy Bloom-Filter

SLIDE 77

Fundamental Algorithms & Data Structures

Tree Sorting Join Range-Filter Priority Queue

…..

Scheduling Cache Policy Bloom-Filter Hash-Map

SLIDE 78

[Figure: a traditional Hash Function maps Key → Hash Map; instead, a Model maps Key → Hash Map]

Goal: Reduce Conflicts
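One way to realize this, sketched below, is to use the empirical CDF of a key sample as the model and scale it by the bucket count: if the model fits the distribution, keys spread evenly over the buckets, reducing conflicts versus a distribution-agnostic hash. All names and data here are illustrative:

```python
import bisect

def make_learned_hash(sample_keys, num_buckets):
    """Learned hash function: bucket = floor(F(key) * num_buckets),
    where F is the empirical CDF of a sorted key sample."""
    sample = sorted(sample_keys)
    n = len(sample)
    def h(key):
        rank = bisect.bisect_left(sample, key)   # empirical-CDF rank
        return min(num_buckets - 1, rank * num_buckets // n)
    return h

# Heavily skewed keys: a hash based on the key range would pile them up.
keys = [x * x * x for x in range(1000)]
h = make_learned_hash(keys, 500)
loads = [0] * 500
for k in keys:
    loads[h(k)] += 1
assert max(loads) == 2   # perfectly even: 1000 keys over 500 buckets
```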

SLIDE 79

Hash Map - Results

25% - 70% Reduction in Hash-Map Conflicts

SLIDE 80

You Might Have Seen Certain Blog Posts

SLIDE 81

Independent of Hash-Map Architecture

SLIDE 82

Hash Map – Example Results

Type                                                               Time (ns)  Utilization
Stanford AVX Cuckoo, 4 byte value                                  31 ns      99%
Stanford AVX Cuckoo, 20 byte record, standard hash                 43 ns      99%
Commercial Cuckoo, 20 byte record, standard hash                   90 ns      95%
In-place chained hash map, 20 byte record, learned hash functions  35 ns      100%

SLIDE 83

Fundamental Algorithms & Data Structures

Hash-Map Tree Sorting Join Range-Filter Priority Queue

…..

Scheduling Cache Policy Bloom-Filter

SLIDE 84

Fundamental Algorithms & Data Structures

Hash-Map Tree Sorting Join Range-Filter Priority Queue

…..

Scheduling Cache Policy Bloom-Filter

SLIDE 85

Bloom Filter - Approach 1

[Figure: a classic Bloom filter answers “Is this key in my set?” with Maybe Yes / No; a Model answers the same question with Maybe Yes / No]

36% Space Improvement over a Bloom Filter at the Same False Positive Rate
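A minimal sketch of this kind of learned filter under the usual construction (a classifier plus a small overflow Bloom filter for the keys the model wrongly rejects, so no false negatives are introduced overall); the model, sizes, and data below are illustrative:

```python
import hashlib

class Bloom:
    """Tiny textbook Bloom filter (one bit per byte for simplicity)."""
    def __init__(self, m, k):
        self.bits, self.m, self.k = bytearray(m), m, k
    def _idx(self, key, i):
        h = hashlib.blake2b(f"{i}:{key}".encode()).digest()
        return int.from_bytes(h[:8], "big") % self.m
    def add(self, key):
        for i in range(self.k):
            self.bits[self._idx(key, i)] = 1
    def __contains__(self, key):
        return all(self.bits[self._idx(key, i)] for i in range(self.k))

def make_learned_filter(keys, model, bloom_bits=256, hashes=3):
    """Approach 1: a model answers 'maybe yes'/'no'; an overflow Bloom
    filter covers the keys the model rejects, so the combined filter
    never produces a false negative."""
    overflow = Bloom(bloom_bits, hashes)
    for k in keys:
        if not model(k):
            overflow.add(k)
    return lambda q: model(q) or q in overflow

# Illustrative 'model': keys in the set are mostly small even numbers.
keys = set(range(0, 200, 2)) | {7, 11}
model = lambda q: q % 2 == 0 and 0 <= q < 200
f = make_learned_filter(keys, model)
assert all(f(k) for k in keys)   # no false negatives
```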

SLIDE 86

Bloom Filter- Approach 2 (Future Work)

[Figure: one of the Bloom filter's hash functions (Hash Function 1) is replaced by a Model of the Key; Hash Function 2 and Hash Function 3 remain]

SLIDE 87

Fundamental Algorithms & Data Structures

Hash-Map Tree Sorting Join Range-Filter Priority Queue

…..

Scheduling Cache Policy Bloom-Filter

SLIDE 88

Future Work: CDF

How Would You Design Your Algorithms/Data Structure If You Have a Model for the Empirical Data Distribution?

SLIDE 89

Future Work

Hash-Map Tree Sorting Join Range-Filter Priority Queue

…..

Scheduling Cache Policy Bloom-Filter

SLIDE 90

Future work: Multi-Dim Indexes

SLIDE 91

Future work: Data Cubes

SLIDE 92

Other Database Components

  • Cardinality Estimation
  • Cost Model
  • Query Scheduling
  • Storage Layout
  • Query Optimizer

How Would You Design Your Algorithms/Data Structure If You Have a Model for the Empirical Data Distribution?

SLIDE 93

SLIDE 94

Related Work

  • Succinct Data Structures: most related, but succinct data structures are usually carefully, manually tuned for each use case
  • B-Trees with interpolation search: arbitrary worst-case performance
  • Perfect Hashing: connected to our hash-map approach, but they usually increase in size with N
  • Mixture of Expert Models: used as part of our solution
  • Adaptive Data Structures / Cracking: orthogonal problem
  • Locality-Sensitive Hashing (LSH) (e.g., learned by NN): has nothing to do with learned structures

SLIDE 95

Locality-Sensitive Hashing (LSH)

Thanks Alkis for the analogy

SLIDE 96

To Summarize: CDF

How Would You Design Your Algorithms/Data Structure If You Have a Model for the Empirical Data Distribution?

SLIDE 97

Adapts To Your Data

SLIDE 98

Big Potential For TPUs/GPUs

SLIDE 99

Can Lower the Complexity Class

[Figure: time or space vs. N for O(N^2), O(N), O(log N), O(1)]

data_array[(lookup_key - 900)]

SLIDE 100

Warning: Not An Almighty Solution

SLIDE 101

Data System for AI Lab DSAIL@CSAIL

(Research areas; systems faculty; ML faculty; founding sponsors)

SLIDE 102

Tim Kraska <kraska@mit.edu>

Special thanks to:

  • A new approach to indexing
  • A framework to rethink many existing data structures/algorithms
  • Under certain conditions, it might allow changing the complexity class of data structures
  • The idea might have implications within and outside of database systems

Technical Report: Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, Neoklis Polyzotis: The Case for Learned Index Structures

Work partially done at