Database Learning: Toward a Database that Becomes Smarter Over Time (PowerPoint PPT Presentation)



SLIDE 1

Database Learning: Toward a Database that Becomes Smarter Over Time

Yongjoo Park Ahmad Shahab Tajik Michael Cafarella Barzan Mozafari

University of Michigan, Ann Arbor

SLIDE 2

Today's databases

Users send a query to the database; the database returns the answer to that query.

After answering queries, THE WORK is GONE.

Our Goal: reuse the work



SLIDE 18

Our high-level approach

Users send a query Q. The AQP engine alone returns an approximate answer A (10% err, 1 sec). Database Learning maintains a query synopsis on top of the AQP engine: the engine's raw answer A (10% err) is refined into an improved answer Â (2% err).

[Plot: Error (%) vs. Time (sec); two curves labeled "AQP engine" and "Database learning".]

SLIDE 19

Technical challenges

[Diagram: a table in which different columns/rows are covered by different past queries.]

Queries use the data in different columns/rows. How can we leverage those queries for future queries?


SLIDE 24

Our idea

Record each past query together with its answer: (Q1, A1), (Q2, A2), and so on, as more queries and answers arrive. Use those pairs to answer a new query (marked "?").


SLIDE 32

Concrete example

[Plots: SUM(count) vs. Week Number (weeks 1 to 100, values 20M to 40M), showing the true data, the ranges observed by past queries, and the model with a 95% confidence interval.]


SLIDE 38

Design goals

select X3, avg(Y1) from t where 5 < X1 < 8;
select sum(Y2) from t where X2 between Apr and May group by X3;

  • 1. Support a wide class of SQL queries
  • 2. No assumptions about the data
  • 3. Lightweight

[Chart: latency of BlinkDB vs. DBL.]


SLIDE 41

Our Approach


SLIDE 44

Problem statement

Problem: Given past queries (q1, . . . , qn), a new query (qn+1), and their approximate answers, find the most likely answer to the new query (qn+1) and its estimated error.

Our result: Under a certain model assumption,

our answer's error bound ≤ original answer's error bound

(in practice, much more accurate), provided the error bounds offer the same probabilistic guarantees.
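In symbols, the statement above might be sketched as follows (illustrative notation, not necessarily the paper's own):

```latex
% Given past queries q_1, ..., q_{n}, a new query q_{n+1}, and the AQP
% engine's approximate answers \hat{a}_i with error bounds \epsilon_i, find
\tilde{a}_{n+1} = \operatorname*{arg\,max}_{a}\;
  \Pr\!\left(\theta_{n+1} = a \,\middle|\, \hat{a}_1, \dots, \hat{a}_{n+1}\right)
\quad \text{with error bound } \tilde{\epsilon}_{n+1} \le \epsilon_{n+1}.
```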


SLIDE 50

Overview of our technique

select count(Y2) from t where 1 < X1 < 2;
select avg(Y2) from t where 6 < X1 < 8;
select sum(Y2) from t where 5 < X1 < 8;

1. Random variables θ1, θ2, θ3 (our uncertainty about the answers)
2. Probability distribution Pr(θ1, θ2, θ3)
3. Estimated answer Pr(θ3 | θ1, θ2)

Two aggregations that involve common values → correlation between their answers.
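The correlation the slide points at can be made concrete with a toy model (my illustration, not the paper's derivation): if each row contributes independent noise with variance sigma2 to a range-sum answer, then two answers are correlated exactly through the rows their predicates share.

```python
def range_sum_cov(range_a, range_b, sigma2):
    """Covariance between two noisy range-sum answers over half-open
    row-index intervals [lo, hi), assuming i.i.d. per-row noise."""
    lo = max(range_a[0], range_b[0])
    hi = min(range_a[1], range_b[1])
    overlap = max(0, hi - lo)      # number of rows both predicates touch
    return sigma2 * overlap

# The second and third queries on the slide overlap on 6 < X1 < 8, so
# their answers are correlated; the first query is disjoint from both.
print(range_sum_cov((6, 8), (5, 8), sigma2=1.0))   # 2.0
print(range_sum_cov((1, 2), (5, 8), sigma2=1.0))   # 0.0
```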


SLIDE 52

How to define random variables

select sum(Y2) from t where 5 < X1 < 8;
(aggregate function: sum(Y2); selection predicate: 5 < X1 < 8)

We define a random variable θ for every combination of aggregate function and selection predicate.

What if your query is complex?

select X3, avg(Y1), sum(Y2) from t where 5 < X1 < 8 and X2 between Apr and May group by X3;
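As a sketch of the "every combination" rule (the query representation below is hypothetical, not Verdict's actual API), the complex query above can be unfolded into one random variable per aggregate-and-group combination:

```python
from itertools import product

# Hypothetical, simplified representation of the slide's complex query:
# two aggregates, a shared predicate, and (assumed) three values of X3.
aggregates = ["avg(Y1)", "sum(Y2)"]
groups = ["X3 = a", "X3 = b", "X3 = c"]
predicate = "5 < X1 < 8 and X2 between Apr and May"

# One random variable theta per (aggregate, group) pair, each carrying
# the shared selection predicate.
thetas = [
    {"agg": agg, "group": grp, "predicate": predicate}
    for agg, grp in product(aggregates, groups)
]
print(len(thetas))   # 6
```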



SLIDE 59

How to determine the probability distribution

The Principle of Maximum Entropy (ME): from the available statistical info about (θ1, θ2, θ3), pick the most likely Pr(θ1, θ2, θ3).

Low amount of info → simple Pr → fast inference, low fidelity.
High amount of info → complex Pr → slow inference, high fidelity.

Our choice: (co)variances between pairs of answers.


SLIDE 65

Most-likely probability distribution

Statistical information: means, variances, and covariances of (θ1, θ2, θ3).

MaxEnt → multivariate normal distribution → fast inference using a closed form.
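The closed form is standard Gaussian conditioning. Here is a minimal numpy sketch (all the numbers are made up for illustration) of inferring θ3 from observed answers to θ1 and θ2:

```python
import numpy as np

# Made-up prior over (theta1, theta2, theta3): means and a covariance
# matrix such as one derived from overlapping predicate ranges.
mu = np.array([10.0, 20.0, 30.0])
Sigma = np.array([[4.0, 1.0, 2.0],
                  [1.0, 3.0, 1.5],
                  [2.0, 1.5, 5.0]])

obs = np.array([11.0, 19.0])   # observed (approximate) answers for theta1, theta2

# Pr(theta3 | theta1, theta2) for a multivariate normal, in closed form:
Saa = Sigma[:2, :2]            # covariance of the observed block
Sba = Sigma[2, :2]             # cross-covariance with theta3
cond_mean = mu[2] + Sba @ np.linalg.solve(Saa, obs - mu[:2])
cond_var = Sigma[2, 2] - Sba @ np.linalg.solve(Saa, Sigma[:2, 2])

# The conditional variance never exceeds the prior variance, which is the
# sense in which past answers tighten the error bound.
assert cond_var <= Sigma[2, 2]
```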


SLIDE 69

Benefits of database learning

Database learning vs. indexing
[Chart: storage vs. database size for indexing and for DBL.]
  • 1. Little storage overhead

Database learning vs. materialized views
  • 2. Without alignment: a materialized view speeds up only the queries that align with it; DBL needs no such alignment
[Chart: overhead over system uptime for view selection and for DBL.]
  • 3. No upfront overhead


SLIDE 74

Experiment

SLIDE 75

Experiment setup

  • 1. Two systems:
  • NoLearn: an approximate query processing engine (the longer the runtime, the more accurate the answer)
  • Verdict: our database learning system (built on top of NoLearn)
  • 2. Datasets:
  • Customer1: 536GB of data and a query log from a customer
  • TPC-H: 100GB TPC-H dataset
  • 3. Environment:
  • 5 Amazon EC2 workers (m4.2xlarge) + 1 master
  • SSD-backed HDFS for Spark's data loading


SLIDE 78

Our experimental claims

  • 1. Verdict supports a large portion of real-world queries
  • 2. Verdict achieves a speedup compared to NoLearn
  • 3. Verdict works with small memory and computational overhead


SLIDE 81

Generality of Verdict

Dataset     # Analyzed   # Supported   Percentage
Customer1   3,342        2,463         73.7%
TPC-H       21           14            66.7%

Unsupported queries:

  • 1. Nested queries (that cannot be flattened)
  • 2. Textual filters: city like '%arbor%'

SLIDE 82

Runtime-error trade-off

Results on the TPC-H dataset (the paper has the Customer1 results); the number of past queries is fixed to 50.

[Plots: Error bound (%) vs. Runtime for NoLearn and Verdict. (a) Data in memory, runtime in seconds; (b) data on SSD, runtime in minutes.]


SLIDE 84

Speedup

Results on the Customer1 dataset (the paper has the TPC-H results).

[Bar charts: Speedup (x) vs. Target Error Bound. (a) Data in memory: 7.7x at a 4% target, 2.5x at 2%. (b) Data on SSD: 23x at 4%, 5.7x at 2%.]


SLIDE 86

Memory and computational overhead

  • 1. Memory overhead:
  • Queries and their answers, plus some matrices and their inverses
  • 23.2 KB per query for the Customer1 dataset
  • 15.8 KB per query for the TPC-H dataset
  • 2. Computational overhead:

           Latency (memory)     Latency (SSD)
NoLearn    2.083 sec            52.50 sec
Verdict    2.093 sec            52.51 sec
Overhead   0.010 sec (0.48%)    0.010 sec (0.02%)
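The overhead percentages in the table follow from simple arithmetic and can be checked directly:

```python
# Reproduce the overhead row of the table from the NoLearn/Verdict latencies.
nolearn = {"memory": 2.083, "ssd": 52.50}
verdict = {"memory": 2.093, "ssd": 52.51}

for setting in nolearn:
    overhead = verdict[setting] - nolearn[setting]   # ~0.010 sec in both cases
    pct = 100 * overhead / nolearn[setting]
    print(f"{setting}: {overhead:.3f} sec ({pct:.2f}%)")
# memory: 0.010 sec (0.48%); ssd: 0.010 sec (0.02%)
```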


SLIDE 90

Thank You!