SLIDE 1

Dynamo

Saurabh Agarwal

SLIDE 2

What have we looked at so far?

SLIDE 3

SLIDE 4

Assumptions

  • CAP Theorem
  • SQL and NoSQL
  • Hashing
SLIDE 5

Origins of Dynamo

SLIDE 6

The year is 2004.

One Amazon was growing, and the other was shrinking.

SLIDE 7

What led to Dynamo?

SLIDE 8

What led to Dynamo?

  • Amazon was using Oracle Enterprise Edition
  • Despite access to experts at Oracle, the DB just couldn’t handle the load.
SLIDE 9

What did the folks at Amazon do?

SLIDE 10

Query Analysis

90% of operations weren't using the JOIN functionality that is core to a relational database.

SLIDE 11

Goals which Dynamo wanted to achieve

  • Always available
  • Consistent performance
  • Horizontal Scaling
  • Decentralized
SLIDE 12

Goals which Dynamo wanted to achieve

  • Always available
  • Consistent performance
  • Horizontal Scaling
  • Decentralized
SLIDE 13

Major aspects of Dynamo design

  • Interface
  • Data Partitioning
  • Data Replication
  • Load Balancing
  • Eventual Consistency
  • And a lot of other details, which we will hopefully cover.
SLIDE 14

Consistency Model

SLIDE 15

Eventually Consistent

  • Reads can return stale data for some bounded time.
SLIDE 16

Amazon chose the Eventual Consistency Model

  • Applications will work just fine with eventual consistency
  • They needed a scalable DB
SLIDE 17

Let’s finally get to Dynamo!

SLIDE 18

This is Dynamo!

(Figure: nodes A-F arranged in a ring)

SLIDE 19

Origin of this ring?

  • Consistent Hashing!
  • How can we increase or decrease the number of nodes in a distributed cache without re-calculating the full distribution of the hash table?

SLIDE 20

SLIDE 21

  • Each node is assigned a spot in the ring
  • A data point is the responsibility of the first node in the clockwise direction (the coordinator node)
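
As a minimal sketch of the idea in Python (names like `HashRing` are illustrative, not from the paper): every node hashes to a position on the ring, and a key belongs to the first node encountered clockwise from the key's own hash.

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    """Map a string onto the ring (here, a 128-bit MD5 hash space)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        # Each node is assigned a spot in the ring by hashing its name.
        self.positions = sorted((ring_hash(n), n) for n in nodes)

    def coordinator(self, key: str) -> str:
        """The first node clockwise from the key's position owns the key."""
        idx = bisect.bisect(self.positions, (ring_hash(key),))
        return self.positions[idx % len(self.positions)][1]

ring = HashRing(["A", "B", "C", "D", "E", "F"])
print(ring.coordinator("user:42"))  # one of A-F, stable as nodes come and go
```

Adding or removing a node only moves the keys between that node and its ring neighbour; the rest of the table is untouched, which is exactly the property the question above asks for.
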

SLIDE 22

Some issues with Consistent Hashing

  • Random assignment
  • Heterogeneous performance of nodes

SLIDE 23

How does replication work?

  • The coordinator node replicates the data to the next N-1 nodes on the ring.
  • N is the replication factor
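
Building on the `HashRing` sketch above (still illustrative, not the paper's code), the coordinator and its N-1 clockwise successors form the replica set, the paper's "preference list":

```python
import bisect

def preference_list(ring: HashRing, key: str, n: int) -> list[str]:
    """Coordinator plus the next N-1 nodes clockwise hold the replicas."""
    idx = bisect.bisect(ring.positions, (ring_hash(key),))
    size = len(ring.positions)
    return [ring.positions[(idx + i) % size][1] for i in range(min(n, size))]

print(preference_list(ring, "user:42", n=3))  # coordinator + 2 replicas
```
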
SLIDE 24

Data Versioning

  • Eventual Consistency
  • Multiple versions of the same data might exist in the system
  • Enter Vector Clocks
SLIDE 25

Vector Clocks
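
The slide itself is a diagram; roughly, the mechanism works as below (a hedged sketch, not Dynamo's actual code). Each version of an object carries a map from node to update counter; one version supersedes another only if it is at least as new on every counter, otherwise the two are concurrent and must be reconciled.

```python
VectorClock = dict[str, int]  # node name -> update counter

def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if version `a` has seen everything `b` has (a supersedes b)."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def increment(clock: VectorClock, node: str) -> VectorClock:
    """A write coordinated by `node` bumps that node's counter."""
    return {**clock, node: clock.get(node, 0) + 1}

v1 = increment({}, "A")   # {'A': 1}
v2 = increment(v1, "B")   # {'A': 1, 'B': 1}: descends from v1
v3 = increment(v1, "C")   # {'A': 1, 'C': 1}: concurrent with v2
print(descends(v2, v1))                     # True: v2 supersedes v1
print(descends(v2, v3), descends(v3, v2))   # False, False: a conflict
```
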

SLIDE 26

Dynamo deployment

  • Load balancer
  • Partition-aware client library
SLIDE 27

Dynamo query interface

  • get() and put() operations
  • Configurable R and W
  • R = minimum number of nodes to read from before returning
  • W = minimum number of nodes on which data must be written before returning
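
A toy sketch of what those quorum-parameterized operations could look like (hypothetical `Replica` class and function names, not Dynamo's real API):

```python
class Replica:
    """In-memory stand-in for a storage node; `up` simulates reachability."""
    def __init__(self, name: str):
        self.name, self.data, self.up = name, {}, True

def quorum_write(replicas, key, value, w):
    """put(): send the write to every replica, succeed once W acknowledge."""
    acks = 0
    for rep in replicas:
        if rep.up:
            rep.data[key] = value
            acks += 1
    return acks >= w

def quorum_read(replicas, key, r):
    """get(): gather answers until R replicas have responded."""
    answers = [rep.data.get(key) for rep in replicas if rep.up]
    return answers[:r] if len(answers) >= r else None

nodes = [Replica(n) for n in "ABC"]          # N = 3 replicas
nodes[2].up = False                          # one replica is unreachable
print(quorum_write(nodes, "k", "v1", w=2))   # True: 2 of 3 acks suffice
print(quorum_read(nodes, "k", r=2))          # ['v1', 'v1']
```

In the real system, get() also returns version context (the vector clocks above) so the client can reconcile conflicting versions.
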

SLIDE 28

Making Dynamo Consistent

  • If R + W > N
    ○ Dynamo becomes consistent
  • Availability and performance take a hit.
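
The reason this works: with R + W > N, every read quorum must overlap every write quorum in at least one node, so at least one of the R answers has seen the latest write. A quick check for the common N = 3 setup:

```python
N, R, W = 3, 2, 2      # typical quorum configuration
assert R + W > N       # the consistency condition from the slide
overlap = R + W - N    # pigeonhole: nodes shared by any read and write quorum
print(f"any read quorum overlaps any write quorum in >= {overlap} node(s)")
```
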
SLIDE 29

Handling Failures

  • Hinted Handoff
  • Replica Synchronization
SLIDE 30

Hinted Handoff
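
The slide is a diagram; in rough outline (an illustrative sketch reusing the toy `Replica` objects from the quorum example, here assumed to also carry a `hints` list): when a replica in the preference list is unreachable, the write is handed to the next healthy node along with a hint naming the intended owner, and the stand-in delivers the data back once the owner recovers.

```python
def write_with_handoff(preference, fallbacks, key, value):
    """Write to each replica; hand unreachable replicas' copies to a
    healthy fallback node, tagged with a hint naming the real owner."""
    for owner in preference:
        if owner.up:
            owner.data[key] = value
        else:
            stand_in = next(n for n in fallbacks if n.up)
            stand_in.hints.append((owner, key, value))

def deliver_hints(stand_in):
    """Background task: replay hinted writes to owners that have recovered."""
    pending = []
    for owner, key, value in stand_in.hints:
        if owner.up:
            owner.data[key] = value   # hand the data back to its owner
        else:
            pending.append((owner, key, value))
    stand_in.hints = pending
```
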

SLIDE 31

Replica Synchronization

  • Each node maintains a separate Merkle tree for each key range it handles
  • A background job compares the trees to quickly find which set of replicas need to be merged.
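
A minimal sketch of why Merkle trees make that background comparison cheap (illustrative Python, not Dynamo's implementation): if two replicas' root hashes for a key range match, the whole range matches and no data moves; on a mismatch, only the differing subtrees need to be walked.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash pairs of child hashes upward until one root remains."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:               # duplicate the last hash if odd
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

range_a = [b"k1=v1", b"k2=v2", b"k3=v3", b"k4=v4"]
range_b = [b"k1=v1", b"k2=v2", b"k3=STALE", b"k4=v4"]
# Equal roots => the replicas agree on the whole range; unequal roots =>
# recurse into children to pinpoint the divergent keys.
print(merkle_root(range_a) == merkle_root(range_b))  # False: out of sync
```
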

SLIDE 32

Failure Detection

  • If a node is not reachable, the request is routed to the next node.
  • There is no need to explicitly detect failures, since node removal is an explicit operation.
SLIDE 33

Differences between GFS/BigTable and Dynamo

  • No centralized control
  • No locks on data.
SLIDE 34

Optimizations done later

  • Instead of writing to disk, write to an in-memory buffer
  • A separate writer thread flushes the buffer to disk
  • Faster write performance
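
A rough sketch of this buffered-write path (assumed names, not the paper's code); the trade-off is a small durability window, since acknowledged writes sit in memory until the writer flushes them:

```python
import queue
import threading

write_buffer: "queue.Queue[tuple[str, str]]" = queue.Queue()

def put(key: str, value: str) -> None:
    """Fast path: enqueue in memory and return without touching disk."""
    write_buffer.put((key, value))

def writer_loop(path: str) -> None:
    """Single background writer drains the buffer to durable storage."""
    with open(path, "a") as log:
        while True:
            key, value = write_buffer.get()
            log.write(f"{key}\t{value}\n")
            log.flush()

threading.Thread(target=writer_loop, args=("store.log",), daemon=True).start()
put("k1", "v1")  # returns immediately; durability is deferred
```
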
SLIDE 35

Change in key partition strategy

  • The scheme described so far:
    ○ Random token placement
    ○ Hash space partitions are not uniform
  • Problems:
    ○ Copying data during membership changes is difficult
    ○ Merkle trees must be reconstructed

SLIDE 36

New Partition Strategy

  • Divide the hash space equally into Q portions
  • With S nodes, each node holds Q/S tokens
  • A new node randomly picks Q/(S+1) tokens from the existing nodes
  • A removed node randomly distributes its Q/S tokens to the remaining nodes
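
A small sketch of the token bookkeeping (illustrative, with simplified random reassignment): because the Q partitions are fixed, membership changes move whole tokens instead of re-splitting ranges, which keeps data copies and per-partition Merkle trees simple.

```python
import random

Q = 12  # number of fixed, equal-sized partitions of the hash space

def initial_assignment(nodes: list[str]) -> dict[int, str]:
    """Deal the Q tokens out so each of the S nodes gets ~Q/S of them."""
    tokens = list(range(Q))
    random.shuffle(tokens)
    return {t: nodes[i % len(nodes)] for i, t in enumerate(tokens)}

def add_node(assignment: dict[int, str], new_node: str) -> None:
    """A joining node steals ~Q/(S+1) randomly chosen tokens."""
    s_plus_1 = len(set(assignment.values())) + 1
    for t in random.sample(list(assignment), Q // s_plus_1):
        assignment[t] = new_node

owners = initial_assignment(["A", "B", "C"])   # each node owns ~4 tokens
add_node(owners, "D")                          # D takes over ~3 tokens
```
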

SLIDE 37

Impact

  • A lasting impact on the industry: it forced SQL advocates to build distributed SQL DBs
  • Inspired Cassandra and Couchbase
  • Established the scalability of NoSQL databases.
SLIDE 38

Questions

SLIDE 39

Adding a node to the ring

  • The administrator issues a request to one of the nodes in the ring.
  • The node serving the request makes a persistent copy of the membership change and propagates it via a gossip protocol.
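
A toy sketch of gossip propagation (hypothetical names; the real protocol also reconciles membership histories): each node periodically merges its membership view with a random peer's, so a persisted change eventually reaches every node.

```python
import random

class Member:
    def __init__(self, name: str):
        self.name = name
        self.view = {name}          # membership view known to this node

def gossip_round(members):
    """Each node exchanges views with one random peer; both learn the union."""
    for node in members:
        peer = random.choice(members)
        merged = node.view | peer.view
        node.view = peer.view = merged

cluster = [Member(n) for n in "ABCDEF"]
cluster[0].view.add("G")            # admin registers the new node at A
while any(len(m.view) < 7 for m in cluster):   # 6 original nodes + G
    gossip_round(cluster)           # the change spreads epidemically
print(sorted(cluster[-1].view))     # F eventually learns about G
```
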

SLIDE 40

Node on startup