Riak: a distributed, web-inspired database, NoSQLBerlin'09, Martin Scholl



Riak: a distributed, web-inspired database. NoSQLBerlin'09, Martin Scholl <ms@diskware.net>, @zeit_geist. Historical notes: Riak is Basho Inc’s brainchild; Apache 2.0 licensed; first public release 09/08/07.


SLIDE 1

Riak

a distributed, web-inspired database NoSQLBerlin'09 Martin Scholl <ms@diskware.net> @zeit_geist

SLIDE 2

Historical Notes

  • Riak is Basho Inc’s brainchild
  • Apache 2.0 licensed
  • first public release 09/08/07
  • http://riak.basho.com/
  • http://bitbucket.org/justin/riak
  • http://github.com/zeitgeist/riak
SLIDE 3
1. Overture
SLIDE 4

What is Riak?

  • a lot of Twitter fame recently
  • uses a bunch of buzzword technology
  • it’s so NoSQL, MapReduce and that stuff
  • written in Erlang
  • even your mother-in-law loves Riak
  • obvious question: how awesome is it really?
SLIDE 5

Scientific Model of Awesomeness

Feature comparison: Cassandra, CouchDB, Riak (one ✓ per system that has the feature)

  • cool? ✓ ✓ ✓
  • distributed ✓ ✓
  • HTTP/REST ✓ ✓
  • JSON ✓ ✓
  • Erlang ✓ ✓
  • M/R ✓ ✓

SLIDE 6

We have a winner

  • result of a fair and objective competition:

Riak is 100% awesome

[Bar chart: awesomeness % (scale 25 to 100) for Cassandra, CouchDB and Riak]

SLIDE 7
2. The Serious Part

(caffeine will be served in 42 minutes)

SLIDE 8

What Riak really is

  • Distributed Data Storage System (DDSS)
  • BASE
  • Dynamo inspired
  • Erlang implemented
  • MapReduce’ing
  • Textbook style DDSS implementation
SLIDE 9

Data Model

  • Data-Sphere: Bucket x Key x Document
  • Bucket: a named scope of keys and values
  • created implicitly, on demand
  • has constraints
  • Key: choose freely
SLIDE 10

Document Model

  • Documents hold the actual data
  • actual data can be virtually anything
  • internal data format: Erlang-Tuple
  • current gold-standard: JSON objects
  • model the Web’s nature
  • embedded doc-links!
SLIDE 11

2.1 A tour through Riak

We jump off the HTTP/REST cliff and land in Riak’s guts

SLIDE 12

HTTP/REST JSON-API

  • GET /jiak/<bucket>/<key>
  • fetch a document
  • POST /jiak/<bucket>
  • create a new entry, key gets generated
  • PUT /jiak/<bucket>/<key>
  • create / update a doc
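
The three calls above can be sketched as request builders. This is only an illustration: the host, the port, the JSON content type header and the function names are assumptions, not something the slides specify, and nothing is sent over the network.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a node reachable locally; host and port are placeholders.
BASE_URL = "http://localhost:8098"

# Assumption: documents are sent as JSON bodies.
JSON_HEADERS = {"Content-Type": "application/json"}


def fetch_request(bucket, key):
    """GET /jiak/<bucket>/<key>: fetch a document."""
    url = f"{BASE_URL}/jiak/{quote(bucket, safe='')}/{quote(key, safe='')}"
    return Request(url, method="GET")


def create_request(bucket, body):
    """POST /jiak/<bucket>: create a new entry; the key gets generated."""
    url = f"{BASE_URL}/jiak/{quote(bucket, safe='')}"
    return Request(url, data=body, headers=JSON_HEADERS, method="POST")


def store_request(bucket, key, body):
    """PUT /jiak/<bucket>/<key>: create or update a document."""
    url = f"{BASE_URL}/jiak/{quote(bucket, safe='')}/{quote(key, safe='')}"
    return Request(url, data=body, headers=JSON_HEADERS, method="PUT")
```

Passing one of these objects to `urllib.request.urlopen` would perform the call against a running node.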
SLIDE 13

JSON Documents

[Diagram: users A, B, C and D connected by "knows" links]

{ bucket: "users", key: "A",
  object: { name: ... },
  links: [ ["users", "B", "B"],
           ["users", "C", "C"] ] }
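
A minimal sketch of assembling such a document in Python, following the field layout on the slide. The helper names and the value used for `name` are hypothetical (the slide elides the object's contents).

```python
import json


def make_document(bucket, key, obj, links):
    """Assemble a document with the fields shown on the slide."""
    return {
        "bucket": bucket,
        "key": key,
        "object": obj,
        # Each link is a [bucket, key, tag] triple.
        "links": [list(link) for link in links],
    }


def linked_keys(doc):
    """Return the keys this document links to, in order."""
    return [link[1] for link in doc["links"]]


# "Alice" is a made-up value for illustration only.
user_a = make_document(
    "users", "A", {"name": "Alice"},
    [("users", "B", "B"), ("users", "C", "C")],
)
payload = json.dumps(user_a)
```

`payload` is the JSON text that would go into the body of a PUT or POST.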

SLIDE 14

MapReduce Links

  • query Documents via M/R
  • model Graph Structure
  • chain M/R stages
  • Map and Reduce: executed in parallel
  • M/R via HTTP/REST:
  • GET /jiak/<Bucket>/<Key>[/<MR>]+

[Diagram: graph of nodes A, B, C, D]

SLIDE 15

M/R Example

  • Link: [<B>,<K>,<T>]
  • M/R: <B>,<K>,<T>
  • get A’s friends

GET /jiak/users/A/users,_,_

  • get A’s friends’ friends

GET /jiak/users/A/users,_,_/users,_,_

[Diagram: graph of nodes A, B, C, D]
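
The link-walking queries above can be mimicked over an in-memory dictionary. This is a sequential sketch of the matching logic only, not how Riak executes its Map and Reduce stages. A's links follow the JSON slide; the links from B and C to D are assumptions made up for the example, and all function names are hypothetical.

```python
WILDCARD = "_"

# In-memory stand-in for the "users" bucket.
STORE = {
    ("users", "A"): {"links": [["users", "B", "B"], ["users", "C", "C"]]},
    ("users", "B"): {"links": [["users", "D", "D"]]},  # assumed edge
    ("users", "C"): {"links": [["users", "D", "D"]]},  # assumed edge
    ("users", "D"): {"links": []},
}


def parse_steps(path):
    """Split 'users,_,_/users,_,_' into [(bucket, key, tag), ...]."""
    return [tuple(step.split(",")) for step in path.strip("/").split("/") if step]


def matches(link, step):
    """A link matches when every non-wildcard field of the step is equal."""
    return all(want == WILDCARD or want == have for have, want in zip(link, step))


def walk(store, bucket, key, path):
    """Follow one link step after another, starting at (bucket, key)."""
    current = [(bucket, key)]
    for step in parse_steps(path):
        seen, following = set(), []
        for doc_id in current:
            for link in store[doc_id]["links"]:
                target = (link[0], link[1])
                if matches(link, step) and target not in seen:
                    seen.add(target)
                    following.append(target)
        current = following
    return current
```

With this data, `walk(STORE, "users", "A", "users,_,_")` yields A's friends and `walk(STORE, "users", "A", "users,_,_/users,_,_")` yields the friends of those friends.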

SLIDE 16

Request processing

  • REST API is transparent
  • Each request is modelled as an Erlang process
  • different FSMs for Put, Get, Map and Reduce operations

[Diagram: an HTTP/REST request spawns a PUT/GET FSM, which queries three nodes]

SLIDE 17

The Ring

  • Ring: a fixed-size distribution map
  • database for determining the nodes responsible for a key
  • hash: (B x K) -> 160 bit
  • filtered_preflist: (Ring x 160 bit) -> Node
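
A rough sketch of the two mappings above. Everything concrete here is an assumption for illustration: SHA-1 is used only because it yields 160 bits, the ring size and node names are made up, partitions are assigned round-robin, and the `preflist` function is a simplified stand-in rather than Riak's `filtered_preflist`.

```python
import hashlib

KEYSPACE = 2 ** 160                           # size of the 160-bit hash space
RING_SIZE = 64                                # assumed number of partitions
NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names


def hash_key(bucket, key):
    """(B x K) -> 160-bit integer; SHA-1 is an assumption."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def build_ring(nodes, ring_size=RING_SIZE):
    """Fixed-size map from partition index to owning node (round-robin)."""
    return [nodes[i % len(nodes)] for i in range(ring_size)]


def preflist(ring, hashed, n):
    """Walk the ring from the key's partition, collecting n distinct nodes."""
    start = hashed * len(ring) // KEYSPACE
    chosen = []
    for offset in range(len(ring)):
        node = ring[(start + offset) % len(ring)]
        if node not in chosen:
            chosen.append(node)
        if len(chosen) == n:
            break
    return chosen


ring = build_ring(NODES)
owners = preflist(ring, hash_key("users", "A"), 3)
```

Because the ring has a fixed size, adding a node only changes which partitions it owns; the mapping from key to partition stays the same.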

SLIDE 18

Request Distribution

  • eventual consistency
  • N or n_val: # replicas
  • R: min get()s
  • W: min put()s
  • implemented as Erlang gen_fsm processes
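
How N, R and W interact can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Riak's gen_fsm implementation: replicas are plain objects, an integer version stands in for real version tracking, and all names are hypothetical. The check `r + w > n` is the usual Dynamo-style condition under which read and write sets overlap.

```python
class Replica:
    """Toy replica: an in-memory dict plus an availability flag."""

    def __init__(self):
        self.up = True
        self.data = {}


def quorums_overlap(n, r, w):
    """True when every read set of size r meets every write set of size w."""
    return r + w > n


def put(replicas, key, value, version, w):
    """Write to all reachable replicas; succeed if at least w acknowledged."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.data[key] = (version, value)
            acks += 1
    return acks >= w


def get(replicas, key, r):
    """Read from reachable replicas; need at least r replies, newest wins."""
    replies = [rep.data[key] for rep in replicas if rep.up and key in rep.data]
    if len(replies) < r:
        return None
    return max(replies, key=lambda reply: reply[0])[1]
```

With N = 3, choosing R = 2 and W = 2 lets a read still see the latest successful write when one replica missed it, while R = 1 and W = 1 favours availability and can return stale data.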

SLIDE 19

The Big Picture

[Architecture diagram: inside the Erlang VM are the Ring, Ring Gossip, Eventer, Data Storage Engines, VClocks, Put FSM and Get FSM; clients connect via HTTP/REST or the native client]

SLIDE 20

Riak is a DDSS Minix

  • Riak’s kernel: ~3.5k LOC!
  • Riak is more than a Document DB
  • clean and self-documenting codebase
  • extensible in many ways
  • Riak is a perfect fit for building reliable and scalable custom data storage systems!

SLIDE 21

Thank you

Riak is more: http://riak.basho.com/

Don’t hesitate to contact me [to talk about e.g. Riak, distributed systems, Erlang, etc.]

Martin Scholl <ms (at) globalinfinity.de>
global infinity GmbH