HYDRAstor: a Scalable Secondary Storage. 7th USENIX Conference on File and Storage Technologies (FAST '09), February 26th, 2009. C. Dubnicki, L. Gryz, L. Heldt, M. Kaczmarczyk, W. Kilian, P. Strzelczak, J. Szczepkowski, M. Welnicki, C. Ungureanu


slide-1
SLIDE 1

HYDRAstor: a Scalable Secondary Storage

  • C. Dubnicki, L. Gryz, L. Heldt,
  • M. Kaczmarczyk, W. Kilian,
  • P. Strzelczak, J. Szczepkowski,
  • M. Welnicki
  • C. Ungureanu

7th USENIX Conference on File and Storage Technologies (FAST '09)

February 26th 2009

slide-2
SLIDE 2

HYDRAstor: a Scalable Secondary Storage. 9LivesData, LLC 2

Scalable secondary storage

Characteristics → Requirements

Huge amount of data

  • Scalability (dynamic)
  • Low cost per TB

Small backup windows

  • Very high write performance

Duplication between backup streams

  • Global deduplication

Reliable, on-line retrieval

  • Failure tolerance
  • High restore performance

Varying value of data

  • Adjust resilience overhead
  • Data deletion
slide-7
SLIDE 7

Challenges

  • High-performance, decentralized global deduplication
    ... in a dynamic, distributed system
    ... with deletion and failures
  • Combination introduces complexity
  • Tension between:
  • Deduplication and dynamic scalability
  • Deduplication and on-demand deletion
  • Failure tolerance and deletion
slide-8
SLIDE 8

  • Satisfies scalable secondary storage requirements
  • Started as a research project at NEC Laboratories America, in Princeton, NJ
  • Successfully commercialized
  • Today: real-world, commercial system
  • Sold by NEC in the US and Japan
  • Development of back-end continues at 9LivesData, LLC in Warsaw, Poland
  • Spinoff from NEC Laboratories
slide-9
SLIDE 9

HYDRAstor functionality

  • Content addressable storage (CAS)
  • Vast data repository
  • Storing and extracting streams of blocks
  • Single system image built of independent nodes
  • Support for standard access methods
  • Filesystem, VTL
  • Dynamic capacity sharing
  • Self-recovery from failures
  • On-demand deletion
slide-14
SLIDE 14

Programming Model

  • Repository of blocks
  • Content-addressed
  • Immutable
  • Variable-sized
  • Exposed pointers to other blocks
  • Trees of blocks
  • DAGs due to deduplication
  • No cycles possible
  • Deletion of whole trees

[Figure: block trees rooted at Root1 and Root2 that share blocks; blocks are labeled by hash (hash=010..1, hash=110..0, hash=011..0) and hold pointers such as 011..0]
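The programming model above can be illustrated with a minimal in-memory sketch. It is not HYDRAstor's API: the names `Block`, `BlockStore`, `write` and `read`, and the use of SHA-256, are assumptions made for illustration only.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Immutable, variable-sized block: payload plus pointers (addresses) to other blocks."""
    data: bytes
    pointers: tuple = ()


def block_address(block: Block) -> str:
    # The address is derived from the content, pointers included.
    # SHA-256 is an assumption; the slides only show "hash=011..0".
    h = hashlib.sha256()
    for p in block.pointers:
        h.update(p.encode())
    h.update(b"|")
    h.update(block.data)
    return h.hexdigest()


class BlockStore:
    """Hypothetical repository of content-addressed blocks."""

    def __init__(self):
        self._blocks = {}

    def write(self, data: bytes, pointers=()) -> str:
        # A block may only point to blocks that already exist, so no cycle can form.
        for p in pointers:
            if p not in self._blocks:
                raise KeyError(f"unknown pointer {p}")
        block = Block(data, tuple(pointers))
        addr = block_address(block)
        self._blocks.setdefault(addr, block)  # identical content is stored once
        return addr

    def read(self, addr: str) -> Block:
        return self._blocks[addr]

    def __len__(self):
        return len(self._blocks)


store = BlockStore()
shared = store.write(b"shared chunk")
root1 = store.write(b"backup 1", [shared])  # tree rooted at root1
root2 = store.write(b"backup 2", [shared])  # second root reuses the block: a DAG
```

Writing `b"shared chunk"` a second time returns the same address and stores nothing new, which is how trees of blocks turn into DAGs under deduplication.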

slide-17
SLIDE 17

Programming Model

  • Deletion of whole trees

[Figure: the tree rooted at Root1 has been deleted; the blocks reachable from Root2 (hash=110..0, hash=011..0) remain]

slide-18
SLIDE 18

Architecture overview

  • Standard server-grade hardware running Linux
  • Scalability on data-center level

[Figure: Access Nodes (front-end, NFS / CIFS) and Storage Nodes (back-end, CAS layer) connected by an internal network]

slide-19
SLIDE 19

Data organization: selected requirements

Requirements on scalable storage → Required internal data services

Failure tolerance

  • Identify data resilience reduction
  • Fast data rebuilding

High performance

  • Preserve locality of data streams
  • Prefetching

Dynamic scalability

  • Decentralized data management
  • Load balancing
  • Fast data transfer to new location

Deduplication

  • Location of potential duplicates
  • Availability & resiliency verification

On-demand deletion

  • Failure-tolerant, distributed deletion
slide-24
SLIDE 24

Failure tolerance: erasure coding

  • Block erasure-coded into N fragments
  • Storage overhead tunable
  • Example: N=8, m=5: any 3 fragments can be lost

[Figure: the original block is encoded into original fragments and redundant fragments; decoding restores the block from the surviving fragments]
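The numbers in the example can be checked with a small calculation. This sketch assumes that m counts the original (data) fragments, which matches "any 3 fragments can be lost" for N=8, m=5, and that overhead is measured as the ratio N/m; the function name is hypothetical.

```python
def erasure_params(n_fragments: int, m_original: int):
    """Return (tolerated_losses, storage_overhead) for a block cut into
    m_original data fragments and encoded into n_fragments in total."""
    if not 0 < m_original <= n_fragments:
        raise ValueError("need 0 < m <= N")
    tolerated_losses = n_fragments - m_original   # redundant fragments
    storage_overhead = n_fragments / m_original   # bytes stored per byte of data
    return tolerated_losses, storage_overhead


# The slide's example: N=8, m=5.
losses, overhead = erasure_params(8, 5)
```

Under these assumptions the example tolerates 3 lost fragments at a storage ratio of 1.6; choosing a different N and m trades overhead against resilience.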
slide-33
SLIDE 33

Scalability with DHT: data placement

  • Block location: DHT with prefix routing
  • Block mapped to hash prefix
  • Prefix components
  • Hosted on SNs
  • N components per prefix
  • Store fragments
  • Distributed consensus
  • Load balancing

[Figure: prefix tree from the empty prefix down to prefixes 00, 01, 10, 11; a block with hash=011..0 maps to prefix 01; with N=4, the components of each prefix are hosted on storage nodes Node 1 to Node 6]
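Prefix routing as described above can be sketched as a lookup of the one prefix that a hash starts with. The functions `route` and `components` are hypothetical; the `prefix:index` naming follows the component labels used on later slides (for example 01:0 to 01:3).

```python
def route(hash_bits: str, prefixes: set) -> str:
    """Return the prefix responsible for a block hash given as a bit string."""
    # Try every leading substring, from the empty prefix upward.
    matches = [hash_bits[:i] for i in range(len(hash_bits) + 1)
               if hash_bits[:i] in prefixes]
    if len(matches) != 1:
        raise ValueError("prefixes must cover the hash space exactly once")
    return matches[0]


def components(prefix: str, n: int) -> list:
    """Names of the N components of a prefix; component i stores fragment i."""
    return [f"{prefix}:{i}" for i in range(n)]


prefixes = {"00", "01", "10", "11"}
owner = route("0110", prefixes)      # hash=011..0 from the slide
peers = components(owner, 4)         # N=4 as in the figure
```

If prefix 01 is later replaced by 010 and 011, the same hash routes to 011 without any central directory being consulted.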
slide-43
SLIDE 43

Data organization: synchrun chains

  • Data stream split to blocks
  • Hashes of blocks computed
  • Routing through DHT
  • Erasure-coded fragments stored by components
  • Grouped into synchruns
  • Containers stored on disks
  • Fragment metadata separately from data
  • Ordered synchrun chains
  • Preserve order & locality
  • Manageable

[Figure: stream blocks A through G with hashes 010…, 101…, 110…, 011…, 000…, 011…, 100…; blocks A, D and F are routed to prefix 01, compressed and erasure-coded, and their fragments are stored by the components of prefix 01 in Synchrun 1, Synchrun 2 and Synchrun 3, which are kept in containers]
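The grouping step can be sketched as follows. The pairing of block letters with hashes is read off the figure under the assumption that the blocks appear in alphabetical order, and the per-synchrun limit of two blocks is an arbitrary illustrative value, not a HYDRAstor parameter.

```python
def synchrun_chain(stream, prefix, max_blocks_per_synchrun=2):
    """Group the blocks routed to `prefix` into an ordered list of synchruns."""
    chain, current = [], []
    for name, hash_bits in stream:          # stream order is preserved
        if not hash_bits.startswith(prefix):
            continue                        # block belongs to another prefix
        current.append(name)
        if len(current) == max_blocks_per_synchrun:
            chain.append(current)
            current = []
    if current:
        chain.append(current)
    return chain


# Blocks and leading hash bits as read off the figure (assumed pairing).
stream = [("A", "010"), ("B", "101"), ("C", "110"), ("D", "011"),
          ("E", "000"), ("F", "011"), ("G", "100")]
chain = synchrun_chain(stream, "01")
```

Because blocks are appended in arrival order, neighbours in the backup stream stay neighbours in the chain, which is the locality that later prefetching relies on.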

slide-44
SLIDE 44

Synchrun chains in a dynamic system

[Figure: synchrun chain of component 01:1]

slide-45
SLIDE 45

System growth: split

[Figure: component 01:1 is split into components 010:1 and 011:1]
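A split can be pictured as partitioning a chain by the next hash bit while keeping the relative order of its entries. This is a sketch under that assumption; `split_chain` and the `(hash_bits, fragment)` entry layout are hypothetical.

```python
def split_chain(prefix, chain):
    """Split a chain of (hash_bits, fragment) entries by the bit after `prefix`."""
    zero, one = [], []
    for hash_bits, fragment in chain:
        if not hash_bits.startswith(prefix) or len(hash_bits) <= len(prefix):
            raise ValueError("entry does not belong to this prefix")
        target = zero if hash_bits[len(prefix)] == "0" else one
        target.append((hash_bits, fragment))   # relative order is kept
    return {prefix + "0": zero, prefix + "1": one}


# Chain held by a component of prefix 01 (illustrative data).
chain_01 = [("0100", "A"), ("0111", "D"), ("0110", "F")]
halves = split_chain("01", chain_01)
```

Each resulting chain can then be handed to a component of the longer prefix, for example 010:1 and 011:1 in the figure.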

slide-48
SLIDE 48

Concatenation

[Figure: synchrun chains of components 01:1 and 010:1]

slide-50
SLIDE 50

Marking blocks to reclaim

[Figure: synchrun chains of components 01:1 and 010:1]

slide-51
SLIDE 51

Space reclamation & Concatenation

[Figure: synchrun chains of components 01:1 and 010:1]

slide-52
SLIDE 52

Data Services: Identification of data resiliency level

Missing fragments

[Figure: synchrun chains of peer components 01:0, 01:1, 01:2 and 01:3]

slide-53
SLIDE 53

Data Services: Identification of data resiliency level

Chain scanning

[Figure: synchrun chains of peer components 01:0, 01:1, 01:2 and 01:3]
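One way to picture chain scanning is to compare what each peer component holds and count the missing fragments per block. The sketch below models each chain as a plain set of block hashes, which is a simplification; `resiliency_report` and its arguments are hypothetical names.

```python
def resiliency_report(chains, m_original):
    """chains: {component_id: set of block hashes whose fragment it holds}.

    For every block, report how many fragments are missing and whether
    at least m_original fragments remain, i.e. the block is still readable.
    """
    all_blocks = set().union(*chains.values())
    report = {}
    for block in sorted(all_blocks):
        present = sum(1 for held in chains.values() if block in held)
        report[block] = {
            "missing": len(chains) - present,
            "readable": present >= m_original,
        }
    return report


# Four peer components of prefix 01; component 01:2 lost its fragment of "b".
chains = {
    "01:0": {"a", "b"},
    "01:1": {"a", "b"},
    "01:2": {"a"},
    "01:3": {"a", "b"},
}
report = resiliency_report(chains, m_original=3)
```

Blocks with a non-zero `missing` count are the candidates for reconstruction.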

slide-57
SLIDE 57

Data services: reconstruction

  • Sequential read/write of entire Containers
  • Erasure decoding and re-encoding

[Figure: synchrun chains of peer components 01:0, 01:1, 01:2 and 01:3]
slide-60
SLIDE 60

Data services: fast data transfer

Location of new node (DHT)

[Figure: peer components 01:0, 01:1, 01:2 and 01:3, plus the old component 01:3]

slide-61
SLIDE 61

Data services: fast data transfer

Data transfer

[Figure: peer components 01:0, 01:1, 01:2 and 01:3, plus the old component 01:3]

slide-65
SLIDE 65

Data services for deduplication

  • Global: duplicates detected in entire system
  • DHT routing based on content
  • Inline deduplication: has to be high-performance
  • Prefetching Containers for streams of duplicates
  • Block hashes stored separately
slide-66
SLIDE 66

Data services for deduplication

Choose complete chain

Completeness: “definitely not a duplicate”

Deletion interaction: wasn't the block scheduled for deletion?

[Figure: block with hash=011.. and peer components 01:0, 01:1, 01:2 and 01:3]

slide-67
SLIDE 67

Data services for deduplication

Query

[Figure: block with hash=011.. and peer components 01:0, 01:1, 01:2 and 01:3]

slide-68
SLIDE 68

Data services for deduplication

Local candidate found

[Figure: block with hash=011.. and peer components 01:0, 01:1, 01:2 and 01:3]

slide-69
SLIDE 69

Data services for deduplication

Candidate verification

Successful dedup

[Figure: block with hash=011.. and peer components 01:0, 01:1, 01:2 and 01:3]
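The query, candidate and verification steps can be sketched as a single decision function. Everything here is an assumption for illustration: the per-component hash indexes are plain sets, `min_fragments` stands in for whatever availability check the system applies, and treating a block scheduled for deletion as "not a duplicate" is a conservative choice of this sketch, not a documented HYDRAstor policy.

```python
def try_dedup(block_hash, peer_indexes, min_fragments,
              scheduled_for_deletion=frozenset()):
    """Return True when the incoming block can be dropped as a duplicate.

    peer_indexes: {component_id: set of block hashes it holds a fragment of}.
    """
    # Query: which peer components already know this hash?
    holders = [cid for cid, index in peer_indexes.items() if block_hash in index]
    if not holders:
        return False          # no candidate: the block must be stored
    # Deletion interaction (assumed policy): do not rely on a doomed block.
    if block_hash in scheduled_for_deletion:
        return False
    # Candidate verification: enough fragments must still be available.
    return len(holders) >= min_fragments


peer_indexes = {
    "01:0": {"h1"},
    "01:1": {"h1"},
    "01:2": {"h1"},
    "01:3": set(),
}
duplicate = try_dedup("h1", peer_indexes, min_fragments=3)
```

A `False` result means the write proceeds as for new data; a `True` result means only a reference to the existing block needs to be returned.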

slide-70
SLIDE 70

On-demand data deletion

  • Distributed garbage collection
  • Per-block reference counter stored per-fragment
  • Failure-tolerant
  • Block reference counter calculated independently
  • n peer Container chains
  • Interference with duplicate elimination:
  • read-only phase for block tree traversal
  • space reclamation in background
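Reference counting over the block DAG can be sketched on a single machine as below. The distributed, failure-tolerant aspects of the real algorithm are not modelled; `reference_counts` and `reclaimable` are hypothetical names, and the block graph is given as a plain dictionary of child pointers.

```python
from collections import Counter


def reference_counts(blocks, live_roots):
    """blocks: {address: list of child addresses}. Count references from live data."""
    counts = Counter({addr: 0 for addr in blocks})
    visited = set()
    stack = []
    for root in live_roots:
        counts[root] += 1            # a retained root counts as one reference
        stack.append(root)
    while stack:                     # traversal of the block trees
        addr = stack.pop()
        if addr in visited:
            continue
        visited.add(addr)
        for child in blocks[addr]:
            counts[child] += 1
            stack.append(child)
    return counts


def reclaimable(blocks, live_roots):
    """Blocks whose counter stayed at zero can be reclaimed in the background."""
    counts = reference_counts(blocks, live_roots)
    return sorted(addr for addr, n in counts.items() if n == 0)


blocks = {
    "root1": ["x", "shared"],
    "root2": ["shared", "y"],
    "x": [], "y": [], "shared": [],
}
garbage = reclaimable(blocks, live_roots=["root2"])   # the root1 tree was deleted
```

The block `shared` survives because `root2` still points to it, which is why deletion has to work on whole trees rather than on individual blocks.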
slide-71
SLIDE 71

Writes during node failure

[Figure: plot labeled Writing and Reconstruction]

slide-72
SLIDE 72

Write scaling: nodes added while writing

slide-74
SLIDE 74

Questions?