
CS 591: Data Systems Architectures
Prof. Manos Athanassoulis, mathan@bu.edu, http://manos.athanassoulis.net/classes/CS591


slide-1
SLIDE 1

CS 591: Data Systems Architectures

  • Prof. Manos Athanassoulis

mathan@bu.edu http://manos.athanassoulis.net/classes/CS591

slide-2
SLIDE 2

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid

Figure: table with columns A, B, C, D.

slide-3
SLIDE 3

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core

slide-4
SLIDE 4

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core
  • Indexing: When to use?, UpBit

index or scan
slide-5
SLIDE 5

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core
  • Indexing: When to use?, UpBit

Figure: bitvectors for A=10, A=20, A=30, each with a UB.
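
The labels on this slide (one bitvector per value A=10, A=20, A=30, each with a UB) suggest a bitmap index in which every value bitvector is paired with an update bitvector. Below is a minimal sketch, assuming the current bitvector of a value is the XOR of its value bitvector and its UB. The class and method names are hypothetical, and Python integers stand in for bitvectors.

```python
class UpdatableBitmapIndex:
    """Toy bitmap index: one value bitvector (VB) and one update bitvector (UB) per value."""

    def __init__(self, column):
        self.n = len(column)
        self.vb = {}  # value -> int used as a bitvector; bit i set means row i had that value
        self.ub = {}  # value -> int; bit i set means "flip bit i of the VB when reading"
        for row, value in enumerate(column):
            self.vb[value] = self.vb.get(value, 0) | (1 << row)
            self.ub.setdefault(value, 0)

    def _current(self, value):
        # Assumption from the lead-in: the up-to-date bitvector is VB XOR UB.
        return self.vb.get(value, 0) ^ self.ub.get(value, 0)

    def rows_equal(self, value):
        bits = self._current(value)
        return [row for row in range(self.n) if (bits >> row) & 1]

    def update(self, row, new_value):
        # Find the row's old value, then flip one bit in two UBs; the VBs stay untouched.
        old_value = next(v for v in self.vb if (self._current(v) >> row) & 1)
        if old_value == new_value:
            return
        self.vb.setdefault(new_value, 0)
        self.ub.setdefault(new_value, 0)
        self.ub[old_value] ^= 1 << row
        self.ub[new_value] ^= 1 << row


index = UpdatableBitmapIndex([10, 20, 30, 10, 20, 30, 10, 20])
index.update(1, 30)            # row 1 changes from A=20 to A=30
print(index.rows_equal(20))    # rows still holding A=20
print(index.rows_equal(30))    # rows now holding A=30
```

In this sketch an update never rewrites a value bitvector, it only flips bits in the (initially empty) update bitvectors.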

slide-6
SLIDE 6

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core
  • Indexing: When to use?, UpBit
  • NoSQL Engines: LSM-Trees, Hash-based

Figure labels: memory, storage, buffer, Bloom filters, fence pointers.
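
The labels on this slide (buffer, Bloom filters, fence pointers, memory, storage) match the usual point-lookup path of an LSM-tree. Below is a rough sketch under several assumptions that are not on the slide: each flush produces one sorted run split into fixed-size pages, the Bloom filter is a toy built on `hashlib`, runs are never merged, and all class names are hypothetical.

```python
import bisect
import hashlib


class BloomFilter:
    """Toy Bloom filter: k hash probes into an m-bit array (no false negatives)."""

    def __init__(self, m=256, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, key):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SortedRun:
    """One immutable sorted run on 'storage', with an in-memory filter and fence pointers."""

    def __init__(self, items, page_size=4):
        items = sorted(items)
        self.pages = [items[i:i + page_size] for i in range(0, len(items), page_size)]
        self.fences = [page[0][0] for page in self.pages]  # smallest key of each page
        self.bloom = BloomFilter()
        for key, _ in items:
            self.bloom.add(key)
        self.page_reads = 0

    def get(self, key):
        if not self.bloom.might_contain(key):
            return None                       # skip the run without touching storage
        idx = bisect.bisect_right(self.fences, key) - 1
        if idx < 0:
            return None                       # key is smaller than every key in the run
        self.page_reads += 1                  # only the one candidate page is read
        return dict(self.pages[idx]).get(key)


class TinyLSM:
    def __init__(self, buffer_capacity=4):
        self.buffer, self.capacity, self.runs = {}, buffer_capacity, []

    def put(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) >= self.capacity:
            self.runs.insert(0, SortedRun(self.buffer.items()))  # newest run first
            self.buffer = {}

    def get(self, key):
        if key in self.buffer:
            return self.buffer[key]
        for run in self.runs:                 # newest to oldest
            value = run.get(key)
            if value is not None:
                return value
        return None


db = TinyLSM()
for k in range(10):
    db.put(k, f"v{k}")
print(db.get(3), db.get(9), db.get(42))
```

The Bloom filter decides whether a run is worth visiting at all, and the fence pointers narrow a visit down to a single page.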

slide-7
SLIDE 7

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core
  • Indexing: When to use?, UpBit
  • NoSQL Engines: LSM-Trees, Hash-based

Figure 5: Logical Address Space in
Figure labels: Stable, Read-Only, Mutable; LA = 0 to LA = ∞ (Increasing Logical Address); In-Memory, Disk; Read-Copy-Update, In-Place-Update.

slide-8
SLIDE 8

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core
  • Indexing: When to use?, UpBit
  • NoSQL Engines: LSM-Trees, Hash-based
  • Indexing: Data Skipping, Adaptive Indexing

Figure: four tuples (t1 to t4) with attributes year, grade, course, shown in their original order (2012 A DB; 2011 A AI; 2011 B OS; 2013 C DB), sorted by year (2011 A AI; 2011 B OS; 2012 A DB; 2013 C DB), and as separate year, grade, and course columns.

slide-9
SLIDE 9

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core
  • Indexing: When to use?, UpBit
  • NoSQL Engines: LSM-Trees, Hash-based
  • Indexing: Data Skipping, Adaptive Indexing

Figure: adaptive indexing on an index column. Q0=[13,42) splits the column into pieces < 13, >= 13, < 42, >= 42. Q1=[6,27) refines it into < 6, >= 6, < 13, >= 13, < 27, >= 27, < 42, >= 42. After Q2 ... Qn the index column is sorted.
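
The two query ranges on this slide, Q0=[13,42) and Q1=[6,27), can be replayed in code. Below is a minimal sketch, assuming each range query partitions only the piece of the index column that contains each of its bounds. The `CrackedColumn` class and the sample values are made up for illustration.

```python
import bisect


class CrackedColumn:
    """Copy of a column that gets partitioned a little more by every range query."""

    def __init__(self, values):
        self.data = list(values)
        self.pivots = []     # sorted pivot values seen so far
        self.positions = []  # positions[i]: index of the first element >= pivots[i]

    def _crack(self, pivot):
        """Partition the piece containing `pivot`; return the split position."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]          # already partitioned on this value
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.positions) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger
        split = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, split)
        return split

    def range_query(self, low, high):
        """Return all values v with low <= v < high, partitioning on both bounds."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


column = CrackedColumn([45, 7, 30, 2, 19, 50, 13, 41, 26, 5])
q0 = column.range_query(13, 42)   # Q0 = [13, 42)
q1 = column.range_query(6, 27)    # Q1 = [6, 27)
print(sorted(q0), sorted(q1), column.pivots)
```

After both queries the column is split at 6, 13, 27, and 42, which mirrors the piece boundaries listed on the slide. More queries add more boundaries, so the column moves toward sorted order.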

slide-10
SLIDE 10

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core
  • Indexing: When to use?, UpBit
  • NoSQL Engines: LSM-Trees, Hash-based
  • Indexing: Data Skipping, Adaptive Indexing
  • Scientific Data Management: In-situ Query Processing

Figure labels: Raw Data File, Positional Map, Cache, Adaptive Partitioning; indexes marked BF, BTree, or BF+BTree.

slide-11
SLIDE 11

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core
  • Indexing: When to use?, UpBit
  • NoSQL Engines: LSM-Trees, Hash-based
  • Indexing: Data Skipping, Adaptive Indexing
  • Scientific Data Management: In-situ Query Processing, Today: Array Data

slide-12
SLIDE 12

Today: Array Data Storage Manager

Up to now: uni-dimensional data (integers, real, string)
Array Data: multi-dimensional data
No unique order (cannot sort!)
How to store?
Concepts: multi-dimensional arrays, storage manager, tiles, thread-safe, dense vs. sparse arrays, global cell order, fragments, dense vs. sparse fragments, consolidation
Why is this a challenge?
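
To make "no unique order" concrete: a 2D array can be written to a one-dimensional file in several equally valid ways. Below is a small sketch, assuming a dense 4x4 array and 2x2 tiles (both sizes are arbitrary), that prints three candidate global cell orders. The function names are illustrative and not taken from any particular storage manager.

```python
from itertools import product


def row_major(rows, cols):
    """Visit cells row by row."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def column_major(rows, cols):
    """Visit cells column by column."""
    return [(r, c) for c in range(cols) for r in range(rows)]


def tiled_row_major(rows, cols, tile_rows, tile_cols):
    """Visit tiles in row-major order, and cells inside each tile in row-major order."""
    order = []
    for tr, tc in product(range(0, rows, tile_rows), range(0, cols, tile_cols)):
        for r in range(tr, min(tr + tile_rows, rows)):
            for c in range(tc, min(tc + tile_cols, cols)):
                order.append((r, c))
    return order


for name, order in [
    ("row-major", row_major(4, 4)),
    ("column-major", column_major(4, 4)),
    ("2x2 tiles", tiled_row_major(4, 4, 2, 2)),
]:
    print(f"{name:>12}: {order[:6]} ...")
```

Each order keeps a different set of cells adjacent on disk, so the choice determines which subarray reads are sequential.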

slide-13
SLIDE 13

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core
  • Indexing: When to use?, UpBit
  • NoSQL Engines: LSM-Trees, Hash-based
  • Indexing: Data Skipping, Adaptive Indexing
  • Scientific Data Management: In-situ Query Processing, Today: Array Data

New Paradigms

slide-14
SLIDE 14

CS591 progress bar

  • Storage Layouts: Rows vs Cols vs Hybrid
  • New Hardware: Flash Storage, Multi-core
  • Indexing: When to use?, UpBit
  • NoSQL Engines: LSM-Trees, Hash-based
  • Indexing: Data Skipping, Adaptive Indexing
  • Scientific Data Management: In-situ Query Processing, Today: Array Data
  • Distributed DB: Database Systems at Global Scale
  • MapReduce: Computing at Scale
  • Systems for ML: ML building blocks
  • ML for Systems: Automatic Data System Design
  • Learned Indexes: Learn Data Distributions for Indexing
  • Data Calculator: Synthesize Indexes

New Paradigms

slide-15
SLIDE 15

Do not forget: reviews

You can skip up to 3 reviews
18 classes: 5 long + 10 short + 3 skipped
New rule: you can do extra long reviews, 1 long counts as 3 short
Normally for full marks: 5 long + 10 short
or 6 long + 7 short
or 7 long + 4 short
or 8 long + 1 short
slide-16
SLIDE 16

Do not forget: project

Do not leave your project work for the last minute!
Until Tuesday April 16th: every group in OH to discuss progress
April 30 and May 2: project presentations (problem + approach + results + open questions)
Project presentations will also be peer-evaluated