Big and Fast: Anti-Caching in OLTP Systems, Justin DeBrabant



slide-1
SLIDE 1

Justin DeBrabant

Big and Fast


Anti-Caching in OLTP Systems

slide-2
SLIDE 2

Online Transaction Processing

▸transaction-oriented
▸small footprint
▸write-intensive

slide-3
SLIDE 3

A bit of history…

slide-4
SLIDE 4

OLTP Through the Years

[Timeline figure. Labels as extracted: 1972, relational model, 1993, OLAP, rise of the web, Ingres/System R, 2015, “end of an era”]

slide-5
SLIDE 5

Modern OLTP Requirements

  • 1. web-scale (big)
  • 2. high-throughput (fast)
slide-6
SLIDE 6

Thesis Motivation

▸traditional disk-based architectures aren’t fast enough
▸newer main memory architectures aren’t big enough

slide-7
SLIDE 7

Can we have main-memory performance for larger-than-memory datasets?

slide-8
SLIDE 8

Thesis Overview: Contributions

  • 1. anti-caching architecture
    • larger-than-memory datasets in main memory DBMS
  • 2. anti-caching + persistent memory
    • exploring next-generation hardware and OLTP systems

slide-9
SLIDE 9

Outline

▸Introduction
▸Overview and Motivation
▸Anti-Caching Architecture
▸Memory Optimizations
▸Anti-Caching on NVM
▸Future Work and Conclusions

slide-10
SLIDE 10

Disk-Oriented Architectures

▸assumption: data won’t fit in memory
▸disk-resident data, main memory buffer pool for execution
▸concurrency is a must
  • transaction serialization and locks

slide-11
SLIDE 11

Memory Costs

[Chart: memory price per GB ($), log scale from 1E+00 to 1E+10, by year from 1970 to 2012]

slide-12
SLIDE 12

Now What?

  • 1. DBMS buffer pool
  • 2. distributed cache
  • 3. in-memory DBMS
slide-13
SLIDE 13

Buffer Pool

▸must still…

  • maintain buffer pool
  • lock/latch data
  • maintain ARIES-style recovery logs

▸question: What is the overhead of all these things?


slide-14
SLIDE 14

OLTP Through the Looking Glass, and What We Found There (SIGMOD ’08)

[Chart: breakdown into Buffer Pool, Locking, Recovery, and Real Work; values shown: 12%, 26%, 31%, 31%]

slide-15
SLIDE 15

Now What?

  • 1. DBMS buffer pool
  • 2. distributed cache
  • 3. in-memory DBMS
slide-16
SLIDE 16

[Diagram labels: Cache Layer, Persistence Layer]

slide-17
SLIDE 17

Main Memory Cache

▸ fast and scalable, but…
▸ key-value interface
▸ not ACID (AI, not CD)

slide-18
SLIDE 18

Consistency and Durability

▸ reads are easy, writes are not
▸ multiple copies of data
▸ application’s responsibility
▸ for OLTP, writes are common and consistency is essential

slide-19
SLIDE 19

Now What?

  • 1. DBMS buffer pool
  • 2. distributed cache
  • 3. in-memory DBMS
slide-20
SLIDE 20


slide-21
SLIDE 21

H-Store Architecture

▸partitioned, shared-nothing
▸single-threaded main memory execution
  • no need for locks and latches
▸lightweight recovery
  • snapshots + command log
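The recovery bullet above can be illustrated with a toy partition that logs commands rather than changed data. This is only a sketch under stated assumptions: the `Partition` class and its `put`/`add` commands are hypothetical names, not H-Store's API, and real snapshots and logs would live on durable storage.

```python
import copy

class Partition:
    """Toy single-threaded partition: commands run one at a time, so no locks are taken."""

    def __init__(self):
        self.data = {}
        self.snapshot = {}
        self.command_log = []          # commands issued since the last snapshot

    def execute(self, command, *args):
        # Log the command itself (not the modified data), then run it.
        self.command_log.append((command, args))
        return getattr(self, "_" + command)(*args)

    def _put(self, key, value):
        self.data[key] = value

    def _add(self, key, delta):
        self.data[key] = self.data.get(key, 0) + delta

    def take_snapshot(self):
        self.snapshot = copy.deepcopy(self.data)
        self.command_log.clear()       # older commands are covered by the snapshot

    def recover(self):
        """Rebuild state from the snapshot by replaying the command log in order."""
        self.data = copy.deepcopy(self.snapshot)
        for command, args in self.command_log:
            getattr(self, "_" + command)(*args)
```

Replay is only correct here because the toy commands are deterministic and executed serially, which is the property the single-threaded design provides.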

slide-22
SLIDE 22


slide-23
SLIDE 23

data > memory? virtual memory!

slide-24
SLIDE 24

persistent storage

slide-25
SLIDE 25

Big and Fast

big: disk-oriented
fast: memory-oriented
big and fast: anti-caching

slide-26
SLIDE 26

OLTP workloads are skewed

slide-27
SLIDE 27

Design Principles

▸ asynchronous disk fetches

  • don’t block

▸ maintain ordering of evicted data accesses

  • ensures transactional consistency

▸ single copy of data

  • consistency is free

▸ efficient memory use, no swizzling


slide-28
SLIDE 28

Outline

▸Introduction
▸Overview and Motivation
▸Anti-Caching Architecture
▸Memory Optimizations
▸Anti-Caching on NVM
▸Future Work and Conclusions

slide-29
SLIDE 29

Architectural Overview

▸memory is primary storage, cold data is evicted to disk-based anti-cache
▸reading data from the anti-cache is done in 3 phases
  • avoids blocking, ensures consistency

slide-30
SLIDE 30

Anti-Caching Phases

▸evict
▸pre-pass
▸fetch
▸merge

slide-31
SLIDE 31

Evict

  • 1. data > anti-cache threshold
  • 2. dynamically construct anti-cache blocks of coldest tuples
  • 3. asynchronously write to disk
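A minimal sketch of the eviction steps above, assuming the table is kept in coldest-first order. The `Table` class, `evict_block`, and both size parameters are hypothetical names for illustration, not the actual implementation; the asynchronous disk write is left to the caller.

```python
from collections import OrderedDict

class Table:
    """Toy in-memory table; OrderedDict order stands in for an LRU chain (coldest first)."""

    def __init__(self, threshold_bytes, block_bytes):
        self.threshold_bytes = threshold_bytes   # hypothetical anti-cache threshold
        self.block_bytes = block_bytes           # hypothetical target block size
        self.tuples = OrderedDict()              # key -> bytes, coldest first
        self.evicted = {}                        # key -> block id (tombstone left in memory)
        self.size = 0

    def put(self, key, value):
        if key in self.tuples:
            self.size -= len(self.tuples.pop(key))
        self.tuples[key] = value                 # newest entries go to the hot end
        self.size += len(value)

    def evict_block(self, block_id):
        """Build one block from the coldest tuples if the table exceeds the threshold."""
        if self.size <= self.threshold_bytes:
            return None
        block, used = {}, 0
        while self.tuples and used < self.block_bytes:
            key, value = self.tuples.popitem(last=False)   # take the coldest tuple
            block[key] = value
            used += len(value)
            self.evicted[key] = block_id
        self.size -= used
        return block   # the caller would write this block to disk asynchronously
```

The block is assembled at eviction time from whichever tuples are coldest, rather than from a fixed page layout.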

slide-32
SLIDE 32

Pre-Pass

  • 1. a transaction enters pre-pass when evicted data is accessed
  • 2. continues execution, creating list of evicted blocks
  • 3. abort, queue blocks to be fetched
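The pre-pass steps above can be sketched for a read-only transaction as follows. The function name, the `EvictedAccess` exception, and the dict-based tombstone map are assumptions made for this example only.

```python
class EvictedAccess(Exception):
    """Raised at the end of pre-pass; carries the blocks that must be fetched."""

    def __init__(self, block_ids):
        super().__init__(f"needs blocks {sorted(block_ids)}")
        self.block_ids = block_ids

def run_with_pre_pass(keys, in_memory, evicted):
    """Read `keys`; if any are evicted, finish the pass, then abort with the block list.

    in_memory: dict key -> value; evicted: dict key -> block id (tombstones).
    """
    needed = set()
    results = {}
    for key in keys:
        if key in evicted:
            needed.add(evicted[key])      # pre-pass: record the block and keep going
        elif not needed:
            results[key] = in_memory[key]
        # once in pre-pass, resident reads are skipped because the transaction will abort
    if needed:
        raise EvictedAccess(needed)       # abort; the caller queues these blocks for fetching
    return results
```

Continuing after the first evicted access lets a single abort collect every block the transaction needs, so the blocks can be fetched together.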

slide-33
SLIDE 33

Fetch

  • 1. data is fetched asynchronously from disk
    • avoids blocking
  • 2. moved into merge buffer

slide-34
SLIDE 34

Merge

  • 1. data is moved from in-memory merge buffer to in-memory table
  • 2. previously aborted transaction is restarted
  • 3. transaction executes normally
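A rough sketch of how fetch and merge could fit together, assuming a background thread and a queue that plays the role of the merge buffer. The names `fetch_worker` and `merge`, and the dict standing in for disk, are hypothetical.

```python
import queue
import threading

def fetch_worker(disk, requests, merge_buffer):
    """Background thread: read requested blocks from 'disk' into the merge buffer."""
    while True:
        block_id = requests.get()
        if block_id is None:              # sentinel: stop the worker
            break
        # pop() removes the block from disk: data is moved, not copied
        merge_buffer.put((block_id, disk.pop(block_id)))

def merge(merge_buffer, table, evicted):
    """Move fetched tuples into the in-memory table and clear their tombstones."""
    merged = []
    while not merge_buffer.empty():
        block_id, block = merge_buffer.get()
        for key, value in block.items():
            table[key] = value
            evicted.pop(key, None)
        merged.append(block_id)
    return merged   # transactions waiting on these blocks can now be restarted

# Usage: fetch block 7 in the background, then merge it on the execution thread.
disk = {7: {3: "c"}}
requests, merge_buffer = queue.Queue(), queue.Queue()
worker = threading.Thread(target=fetch_worker, args=(disk, requests, merge_buffer))
worker.start()
requests.put(7)
requests.put(None)
worker.join()
table, evicted = {}, {3: 7}
merged_blocks = merge(merge_buffer, table, evicted)
```

The execution thread never waits on disk in this arrangement; it only drains blocks that have already arrived.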

slide-35
SLIDE 35

[Animated diagram of the anti-caching phases: Evict, Pre-Pass, Fetch, Merge; labeled “anti-cache”]

slide-36
SLIDE 36

Tracking Access Patterns

▸done online, more responsive to changes in workload
▸goal is low CPU and memory overhead
▸approximate ordering is OK

slide-37
SLIDE 37

Approximate LRU (aLRU)

▸maintain LRU chain embedded in tuple headers
▸per-partition
▸transactions that update LRU chain are sampled randomly
  • configurable sample rate
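A minimal sketch of the sampling idea, assuming an `OrderedDict` as a stand-in for the chain embedded in tuple headers. `ApproxLRU` and its methods are hypothetical names; one instance would exist per partition.

```python
import random
from collections import OrderedDict

class ApproxLRU:
    """LRU order for one partition, updated only by a random sample of transactions."""

    def __init__(self, sample_rate, seed=None):
        self.sample_rate = sample_rate      # configurable fraction of transactions sampled
        self.chain = OrderedDict()          # coldest first
        self.rng = random.Random(seed)

    def insert(self, key):
        self.chain[key] = None              # new tuples start at the hot end

    def on_transaction(self, accessed_keys):
        """Update the chain only if this transaction is sampled; return whether it was."""
        if self.rng.random() >= self.sample_rate:
            return False
        for key in accessed_keys:
            if key in self.chain:
                self.chain.move_to_end(key)  # mark as most recently used
        return True

    def coldest(self, n):
        return list(self.chain)[:n]
```

A lower sample rate means fewer chain updates per transaction at the cost of a less exact ordering, which is acceptable because approximate ordering is OK.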

slide-38
SLIDE 38

Anti-Caching vs. Swapping

▸ fine-grained eviction
▸ blocks constructed dynamically
▸ asynchronous batched fetches
▸ possible because of transactions

slide-39
SLIDE 39

Anti-Caching vs. Caching

▸data exists in exactly one location
  • caching architectures have multiple copies, must maintain consistency
  • data is moved, not copied
▸goal is increased data size, not throughput

slide-40
SLIDE 40

Benchmarking

▸YCSB
▸Zipfian skew
▸data > memory
▸read/write mix
▸MySQL, MySQL + memcached
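To make the skew parameter concrete, here is a small Zipfian key sampler. It is a sketch that assumes the plain definition P(rank r) proportional to 1/(r+1)^skew; YCSB ships its own generator, which may differ in detail, and `zipf_sampler` is a hypothetical name.

```python
import bisect
import itertools
import random

def zipf_sampler(n_keys, skew, seed=None):
    """Return a function that draws key ranks 0..n_keys-1 with P(r) proportional to 1/(r+1)**skew."""
    weights = [1.0 / (r + 1) ** skew for r in range(n_keys)]
    cumulative = list(itertools.accumulate(weights))
    rng = random.Random(seed)

    def draw():
        u = rng.random() * cumulative[-1]
        # binary search over the cumulative weights; min() guards against float round-off
        return min(bisect.bisect_right(cumulative, u), n_keys - 1)

    return draw
```

With a skew of 1.5 a handful of keys receive most accesses, while at 0.5 accesses spread much more evenly across the key space.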

slide-41
SLIDE 41

YCSB, read-only, data 8X memory

[Chart: throughput (txn/s; axis ticks 30000 to 120000) vs. workload skew (high to low: 1.5, 1.25, 1, 0.75, 0.5); series: anti-cache, MySQL, MySQL + memcached]

slide-42
SLIDE 42

YCSB, read-heavy, data 8X memory

[Chart: throughput (txn/s; axis ticks 30000 to 120000) vs. workload skew (high to low: 1.5, 1.25, 1, 0.75, 0.5); series: anti-cache, MySQL, MySQL + memcached]

slide-43
SLIDE 43

Tracking Accesses Revisited

▸approximate ordering is OK
▸original implementation
  • aLRU (linked list)
▸compute vs. memory

Can we reduce the memory overhead?

slide-44
SLIDE 44

Timestamp-Based Eviction

▸ use relative timestamps to track accesses
▸ to evict, take subset of tuples and evict based on timestamp age
▸ questions:
  • timestamp granularity
  • sample size (power of two)
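The sampling step can be sketched as below, assuming each tuple carries a relative timestamp where a smaller value means an older access. `pick_victims` and its parameters are hypothetical names.

```python
import random

def pick_victims(last_access, sample_size, n_evict, rng=random):
    """Sample `sample_size` tuples and return the `n_evict` with the oldest timestamps.

    last_access: dict key -> relative timestamp (smaller means older).
    """
    sample_size = min(sample_size, len(last_access))
    sample = rng.sample(list(last_access), sample_size)
    sample.sort(key=lambda k: last_access[k])   # oldest first
    return sample[:n_evict]
```

Only the sampled tuples are compared, so no global ordering structure such as a linked list has to be maintained; the trade-off is that the chosen victims are the oldest in the sample, not necessarily the oldest overall.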

slide-45
SLIDE 45

Timestamp Granularity

▸4-byte timestamps
  • use instruction counter
▸2-byte timestamps
  • use epochs, set the timestamp to the current epoch
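One way the 2-byte epoch variant could work is sketched below. How epochs advance and how wraparound is handled are assumptions made for this example and are not stated on the slide; `EpochClock` and `txns_per_epoch` are hypothetical names.

```python
EPOCH_BITS = 16
EPOCH_MASK = (1 << EPOCH_BITS) - 1       # 2-byte timestamps hold values 0..65535

class EpochClock:
    def __init__(self, txns_per_epoch):
        self.txns_per_epoch = txns_per_epoch   # hypothetical tuning knob
        self.txn_count = 0
        self.epoch = 0

    def tick(self):
        """Call once per transaction; advances the epoch every `txns_per_epoch` calls."""
        self.txn_count += 1
        if self.txn_count % self.txns_per_epoch == 0:
            self.epoch = (self.epoch + 1) & EPOCH_MASK
        return self.epoch                 # the value a tuple would store on access

    def age(self, stamp):
        """Epochs elapsed since `stamp`, correct across a single wraparound (assumption)."""
        return (self.epoch - stamp) & EPOCH_MASK
```

Coarser epochs make timestamps cheaper to store but make tuples accessed within the same epoch indistinguishable by age.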

slide-46
SLIDE 46

YCSB, read-heavy, data 8X

[Chart: throughput (txn/s; axis ticks 22500 to 90000) vs. workload skew (high to low: 1.5, 1.25, 1, 0.75, 0.5); series: aLRU chain, timestamp-low, timestamp-high]

slide-47
SLIDE 47

Key Take-Aways

▸8-17X improvement for skewed workloads at larger-than-memory data sizes
▸disk becomes the bottleneck for lower skew

slide-48
SLIDE 48

Hardware Assumptions are Key

▸heavily influence system architectures
▸many factors
  • capacity
  • latency
  • volatility

slide-49
SLIDE 49

What’s next for OLTP?

slide-50
SLIDE 50

Non-Volatile Memory

slide-51
SLIDE 51

Properties of NVM

▸ non-volatile
▸ random-access
▸ high write endurance
  • except flash
▸ byte-addressable
  • except flash

slide-52
SLIDE 52

The NVM Arms Race

▸FeRAM

  • high write endurance

▸MRAM

  • DRAM-like latency

▸PCM (PRAM)

  • DRAM-like capacity


slide-53
SLIDE 53

Looking Forward…

▸OLTP architectures and NVM

  • anti-cache architecture
  • disk-based architecture

▸open questions

  • Which architecture is best suited for NVM?
  • What adaptations are needed?


slide-54
SLIDE 54

NVM Emulation

▸goal: provide product-independent analysis
▸test wide range of latency profiles
▸automatically add specified latency
▸built by collaborators at Intel
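As a purely software analogy for the idea of adding a specified latency, consider the wrapper below. It is not the emulator referred to above, whose internals the slides do not describe; `LatencyInjectedStore` and `sweep` are hypothetical names.

```python
import time

class LatencyInjectedStore:
    """Wraps a dict-like store and adds a fixed delay to every read and write."""

    def __init__(self, backing, read_latency_s, write_latency_s, sleep=time.sleep):
        self.backing = backing
        self.read_latency_s = read_latency_s
        self.write_latency_s = write_latency_s
        self.sleep = sleep                 # injectable so tests need not really wait

    def read(self, key):
        self.sleep(self.read_latency_s)
        return self.backing[key]

    def write(self, key, value):
        self.sleep(self.write_latency_s)
        self.backing[key] = value

def sweep(latencies_s, workload):
    """Run the same workload under each latency profile; return total injected delay per profile."""
    totals = {}
    for latency in latencies_s:
        delays = []
        # record requested delays instead of sleeping, so the sweep itself runs instantly
        store = LatencyInjectedStore({}, latency, latency, sleep=delays.append)
        workload(store)
        totals[latency] = sum(delays)
    return totals
```

Sweeping one workload over several latency values is what keeps the analysis independent of any particular NVM product.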

slide-55
SLIDE 55

Anti-Caching on NVM

▸replace disk with NVM
▸several adaptations necessary
▸lightweight array-based anti-cache
▸utilizes mmap interface
▸fine-grained block and tuple eviction interface
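A minimal sketch of what an array-based anti-cache over an mmap interface might look like, using Python's standard `mmap` module on an ordinary file. The slot layout (a 4-byte length prefix followed by the payload), the free list, and the `MmapAntiCache` name are assumptions for illustration.

```python
import mmap

class MmapAntiCache:
    """Fixed-size slots in a memory-mapped file; a free list gives constant-time allocation."""

    def __init__(self, path, slot_bytes, n_slots):
        self.slot_bytes = slot_bytes
        self.free = list(range(n_slots - 1, -1, -1))   # slot 0 is handed out first
        self.file = open(path, "w+b")
        self.file.truncate(slot_bytes * n_slots)       # size the file before mapping it
        self.map = mmap.mmap(self.file.fileno(), slot_bytes * n_slots)

    def evict(self, payload):
        """Store one tuple (or block) and return its slot index."""
        if len(payload) > self.slot_bytes - 4:
            raise ValueError("payload too large for slot")
        slot = self.free.pop()
        off = slot * self.slot_bytes
        self.map[off:off + 4] = len(payload).to_bytes(4, "little")
        self.map[off + 4:off + 4 + len(payload)] = payload
        return slot

    def fetch(self, slot):
        """Read the payload back and release the slot (data is moved, not copied)."""
        off = slot * self.slot_bytes
        n = int.from_bytes(self.map[off:off + 4], "little")
        payload = bytes(self.map[off + 4:off + 4 + n])
        self.free.append(slot)
        return payload

    def close(self):
        self.map.close()
        self.file.close()
```

Because a slot is addressed directly by its index, a single tuple can be evicted or fetched without reading a whole block, which is where byte-addressable storage helps.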

slide-56
SLIDE 56

Disk-Oriented Architectures on NVM

▸must adapt both storage and log files to use the NVM mmap interface
▸configure to use fine-grained buffer pool pages

slide-57
SLIDE 57

YCSB, read-only, data 8X

[Chart: throughput (txn/s; axis ticks 45000 to 180000) vs. workload skew (high to low: 1.5, 1.25, 1, 0.75, 0.5); series: anti-caching, MySQL]

slide-58
SLIDE 58

YCSB, read-heavy, data 8X

[Chart: throughput (txn/s; axis ticks 45000 to 180000) vs. workload skew (high to low: 1.5, 1.25, 1, 0.75, 0.5); series: anti-caching, MySQL]

slide-59
SLIDE 59

Future Work

slide-60
SLIDE 60

Multi-Tier Architectures

▸DRAM -> NVM -> Disk/SSD
▸open questions
  • indexing structures
  • synchronous/asynchronous fetches

slide-61
SLIDE 61

Anti-Caching Indexes

▸index size can be significant
▸can cold index ranges be evicted to an anti-cache?
▸open questions
  • how/what to evict
  • execution changes

slide-62
SLIDE 62

Semantic Anti-Caching

▸current implementation makes no assumption about types of skew
▸skew typically has semantic meaning
  • e.g., temporal, spatial
▸can we leverage these domain semantics?

slide-63
SLIDE 63

Conclusions

▸anti-caching architecture outperforms and outscales previous OLTP architectures
▸well-suited for next-generation NVM-based architectures

slide-64
SLIDE 64


slide-65
SLIDE 65

Questions?

debrabant@cs.brown.edu