
Data consistency in 3D

(It’s the invariants, stupid)

Marc Shapiro, Masoud Saeida Ardekani, Gustavo Petri

[Consistency in 3D]

This talk is about…

Understanding consistency

  • Primitive consistency mechanisms
  • How primitives compose models
  • How models relate / differ
  • What they cost

Understanding invariants

  • Some interesting classes of invariants

Relating consistency to invariants

  • Which primitives guarantee which invariants

Useful intuitions for app. and system designers

Shared database

  • Social, web, e-commerce: shared mutable data
  • Scalability ⇒ replication ⇒ consistency issues

[Figure: operations q.push(e), c.inc(), q.val(), c.val() on a shared database holding q: Queue and c: Counter, with invariant { |q| ≤ c }.]

Geo-replicated database

[Figure: the same operations on three replicas, each holding q: Queue and c: Counter with invariant { |q| ≤ c }; latency between replicas: 5 ms – ∞. Questions: q3 ∈ Queue? q1 = q2? |q1| ≤ c4?]


Consistency

More replicas:

  • Better read availability, responsiveness, performance, etc.
  • More work to keep replicas in sync

Consistent = behavior similar to sequential:

  • Satisfies specs: does q behave like a queue?
  • Replicas agree: is q identical everywhere?
  • Objects agree: is |q| ≤ c?
  • Same flow of time? q1.push() before q2.push()

Consistency: opportunities and costs

Availability (CAP)

  ⟹ Parallelism keeps the hardware busy
  ⟹ More implementation options, scalable

But consistency constrains order of events:

  • Delay delivery
  • Stale reads
  • Waits, synchronisation (mutual wait)

Keeping track of order requires metadata. Significant!

Costs illustrated

[Figure: termination latency of update transactions (ms) for Serrano-SI, P-Store-SER, GMU-US, Jessy2pc-NMSI, RC, Walter-PSI and SDUR-SER; two Disaster Tolerant workloads, with 90% and 70% read-only transactions; annotations 3× and 4.5×. Credit: Masoud Saeida Ardekani]

Strict Serialisability

[Figure: a client and replicas R1, R2, R3 running transactions T1, T2, T3, with the invariant marked.]

Eventual consistency

[Figure: operations Op1 and Op2 on replicas R1, R2, R3.]

Strong vs. weak?

[Figure: models placed between "High performance" and "Low performance", and between "Hard to program" and "Predictable": Strict Serialisability, Serialisability, Snapshot Isolation, PRAM, Eventual Consistency.]

Transactional models (Adya 1999): PL-1, PL-2, Cursor Stability (PL-CS), Monotonic View (PL-2L), Monotonic Snapshot Reads (PL-MSR), Consistent View (PL-2+), Forward Consistent View (PL-FCV), Snapshot Isolation (PL-SI), Update Serializability (PL-3U), Full Serializability (PL-3), Strict Serializability (PL-SS), Repeatable Read (PL-2.99).

Non-transactional models (Viotti & Vukolić 2016): Linearizability, Sequential, Regular, Safe, Eventual, Causal+, Real-time causal, Causal, Read-your-writes (RYW), Monotonic Reads (MR), Writes-follow-reads (WFR), Monotonic Writes (MW), PRAM (FIFO), Fork, Fork*, Fork-join causal, Bounded fork-join causal, Fork sequential, Eventual linearizability, Timed serial & ∆,Γ-atomicity, Processor, Slow memory, Per-record timeline & Coherence, Timed causal, Bounded staleness & Delta, Weak fork-lin., Strong eventual, Quiescent, Weak, k-regular, k-safe, PBS k-staleness, k-atomicity, Release, Weak ordering, Location, Scope, Lazy release, Entry, Per-object causal, Per-key sequential, Prefix linearizable, Prefix sequential, PBS t-visibility, Eventual serializability.

Groups shown: Fork-based models, Per-object models, Synchronized models, Causal models, Staleness-based models, Session models.

Three classes…

…of invariant ↔ …of protocol:

  • Gen1: object value ↔ total order of operations
  • PO: relative ordering of operations ↔ visibility
  • EQ: state equivalence ↔ composition

Three dimensions

  • Gen1 / Total Order
  • EQ / Composition
  • PO / Visibility

Mostly orthogonal (but not all combinations make sense).

[Figure: Eventual Consistency, Causal, Snapshot Isolation, Linearisability, Serialisability, Strict Serialisability and Txnl CC placed in this space, with CAP marked.]

Operation

  • generator: read, compute, generate effector
  • effector: compute, write side-effect

Sequential correctness

Sequential execution:

  • precondition ⟹ invariant
  • each effector individually safe

[Figure: starting from x=4, y=–2, the updates x:=3, y≔–3 and skip, annotated with the conditions x+y≥0, x+y>0 and ¬x+y>0 and with the resulting states.]
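The generator/effector split can be sketched in a few lines. This is only an illustration under stated assumptions: the invariant x + y ≥ 0, the helper names (`assign`, `skip`, `run_sequentially`) and the value –4 are chosen for the example and are not taken from the slides.

```python
from typing import Callable, Dict, Tuple

State = Dict[str, int]
Effector = Callable[[State], State]


def invariant(state: State) -> bool:
    # Assumed invariant for this sketch: x + y >= 0.
    return state["x"] + state["y"] >= 0


def skip(state: State) -> State:
    # Effector with no side effect.
    return dict(state)


def assign(var: str, value: int) -> Callable[[State], Tuple[bool, Effector]]:
    """Build a generator for the update `var := value` (hypothetical helper)."""

    def effector(state: State) -> State:
        # Effector: write the side effect only.
        return {**state, var: value}

    def generator(state: State) -> Tuple[bool, Effector]:
        # Generator: read the state, check the precondition, emit an effector.
        if invariant(effector(state)):
            return True, effector
        return False, skip

    return generator


def run_sequentially(state: State, generators) -> State:
    for generator in generators:
        _, effector = generator(state)
        state = effector(state)
        assert invariant(state)  # each effector is individually safe
    return state


# x := 3 is accepted; y := -4 would give x + y = -1, so it degrades to skip.
final = run_sequentially({"x": 4, "y": -2}, [assign("x", 3), assign("y", -4)])
```

In a sequential execution the precondition checked by the generator still holds when the effector runs, which is why checking it once is enough here.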


Guarantee vs. semantics

Guarantee:

  • Class of invariants that is always true
  • Regardless of application code
  • Assuming sequentially correct

Application can compensate for absence of guarantee

  • e.g. Inv = { c ≥ 0 }, app: c.inc()

Data types

Register

  • Update: assign with constant
  • Not commutative
  • Absorbing

High-level types

  • Counter, ORset, Sequence: effectors commute
  • Stock, Account, Queue: ¬ commute

Composed data

  • + structural invariants
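A small check of what "effectors commute" buys: applying the same effectors in every possible order and collecting the final states. The helper names below are illustrative assumptions, not part of the talk.

```python
from itertools import permutations


def inc(n):
    # Counter effector: add n to the current value.
    return lambda state: state + n


def assign(v):
    # Register effector: overwrite with a constant (absorbing).
    return lambda state: v


def final_states(initial, effectors):
    """Set of final states over all delivery orders of the effectors."""
    results = set()
    for order in permutations(effectors):
        state = initial
        for effector in order:
            state = effector(state)
        results.add(state)
    return results


counter_results = final_states(0, [inc(1), inc(2), inc(3)])
register_results = final_states(0, [assign(1), assign(2)])
```

The counter ends in a single state whatever the order, while the register's final value depends on which assignment is applied last.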

Replicated operation

u: state ⤻ (retval, (state ⤻ state))

  • Read one, write all (ROWA)
  • Deferred-update replication (DUR)

[Figure: a client sends u to the origin replica; the origin replica and two other replicas are shown with the events u?, u!, uPRE, v? and v!.]

Sharded, geo-replicated

[Figure: objects x, y, z with copies x1, y1, z1 and x2, y2, z2 across data centres DC1, DC2 and DC3, with updates such as x1:=0, x2:=0, y1+=1, y2+=1 and conditions such as x1>0 and z2%2=0.]

  • ¬ read my writes
  • arbitrary origin
  • sharded, parallel
  • concurrent updates

Type EQ invariants

  • A = B
  • x.friendOf (y) ⟺ y.friendOf (x)
  • x + y = constant
  • South ⨄ Boat ⨄ North = { sheep, dog, wolf }

Joint update to two objects

Atomicity (all-or-nothing) property of transactions

Protocol: single update message

  • Asynchronous
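To illustrate why a single update message preserves an EQ invariant such as x + y = constant, the sketch below compares delivering two effectors in separate messages with delivering them packaged together. The function names and the amounts are assumptions made for the example.

```python
def debit(amount):
    return lambda s: {**s, "x": s["x"] - amount}


def credit(amount):
    return lambda s: {**s, "y": s["y"] + amount}


def deliver(state, message):
    """Apply all effectors of one message indivisibly."""
    for effector in message:
        state = effector(state)
    return state


def observed_sums(initial, messages):
    """x + y as a reader could observe it between message deliveries."""
    sums = [initial["x"] + initial["y"]]
    state = initial
    for message in messages:
        state = deliver(state, message)
        sums.append(state["x"] + state["y"])
    return sums


initial = {"x": 70, "y": 30}
split = observed_sums(initial, [[debit(10)], [credit(10)]])
joint = observed_sums(initial, [[debit(10), credit(10)]])
```

With two messages a reader can observe the intermediate sum 90; with one message every observable state satisfies the invariant, and no synchronisation between replicas is involved.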

EQ: transactional composition

Airplane reservation

  • Allocate a seat to me
  • Pay for the flight

Two EQ relations:

  • paid = have_seat
  • my $$ + airline $$ = constant

Ad-hoc grouping (This txn also needs TO + snapshot)


EQ/Composition axis

Transaction groups operations

All-or-nothing effects:

  • Deliver effectors indivisibly
  • packaged together
  • + same TOE
  • ≈ 2-phase commit

Snapshot reads:

  • all generators read from same set of effectors
  • maintain versions
  • + same TO, VIS guarantees
  • coordination

[Axis labels: 0 = Independent operations; All-or-nothing effects; + snapshot.]

[Figure: models placed on the EQ/Composition axis: Serialisability, Snapshot Isolation, Trans. Causal, RC, Linearisability, PRAM.]

Type PO invariants

  • employee.manager.salary ≥ employee.salary
  • S1; S2; S3 ≣ S1 ⟸ S2 ⟸ S3
  • dog ∈ S ⟸ sheep ∈ S ∧ wolf ∈ S
  • Referential integrity
  • “inode references disk block”
  • ACL (u, p) ⟸ access (u, p)

Demarcation Protocol:

  • 1. increase LHS by c
  • 2. increase RHS by c' ≤ c

⟹ ordered delivery

No synchronisation: Available
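The two steps of the demarcation protocol can be checked with a short sketch. It assumes an invariant of the form LHS ≥ RHS (for example manager salary ≥ employee salary); the function name and the numbers are illustrative only.

```python
def prefixes_safe(lhs, rhs, updates):
    """Check LHS >= RHS after every prefix of the delivered updates."""
    for side, amount in updates:
        if side == "lhs":
            lhs += amount
        else:
            rhs += amount
        if lhs < rhs:
            return False
    return True


# Step 1: increase LHS by c = 10; step 2: increase RHS by c' = 8 <= c.
in_order = [("lhs", 10), ("rhs", 8)]
reordered = [("rhs", 8), ("lhs", 10)]

safe_in_order = prefixes_safe(100, 95, in_order)
safe_reordered = prefixes_safe(100, 95, reordered)
```

Delivered in order, every intermediate state is safe; if a replica applies the RHS increase first, it passes through a state where 100 < 103, which is why the protocol relies on ordered delivery.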

PO: transitive / causal visibility

x = 100; y = 100

Inv = { x ≥ y }

Ex 1:

  • P1: x += 100
  • P2: if x > y then y += (x–y)/2
  • P3: x ≥ y?

Transitive visibility: vis* ⊆ vis

Ex 2:

  • P1: x += 100; d ≔ 100
  • P2: if d > 0 then y += d/2
  • P3: x ≥ y?

Causal visibility: (vis; po)* ⊆ vis
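Ex 1 can be replayed as a minimal sketch, representing each effector as a (variable, delta) pair; this encoding is an assumption made for the example. It compares what P3 observes when the effector on y arrives without the effector on x that P2 had seen, and when both are visible.

```python
def apply(state, effector):
    var, delta = effector
    return {**state, var: state[var] + delta}


def invariant(state):
    return state["x"] >= state["y"]


initial = {"x": 100, "y": 100}

# P1 emits the effector x += 100.
x_eff = ("x", 100)

# P2 sees P1's effector (x = 200, y = 100), so its generator emits y += 50.
seen_by_p2 = apply(initial, x_eff)
assert seen_by_p2["x"] > seen_by_p2["y"]
y_eff = ("y", (seen_by_p2["x"] - seen_by_p2["y"]) // 2)

# P3 without transitive visibility: y's effector is visible, x's is not.
p3_without = apply(initial, y_eff)

# P3 with transitive visibility: seeing y's effector implies seeing x's.
p3_with = apply(apply(initial, x_eff), y_eff)
```

Without transitive visibility P3 reads x = 100, y = 150 and the check x ≥ y fails; with it, P3 reads x = 200, y = 150.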

[Figure: delivery of the effectors x!, y! and d! to the replicas in the two examples.]

client is part of DB

PO/Visibility axis

[Axis labels: 0 = Rollbacks; Monotonic client; Transitive Visibility; Causal Visibility; Total causal order; External.]

Visibility

  • Which writes visible to reads

Transitive closure property

  • Metadata
  • System-wide

Sender not delayed ⟹ writes available

Stale data ⟹ reads available

Monotonic client

  • Read My Writes
  • Monotonic Reads

Often assumed

  • Buffer

[Figure labels: Eventual Consistency; “Not reasonable”.]

Transitive, causal vis.

  • Effector: metadata identifies set of predecessor effectors
  • Delay delivery after predecessors
  • Read stale data
  • Graph: unbounded
  • Vector clock: 10⁴ to 10⁶ entries × 8 bytes!
  • Approximate VC: stronger order
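As an illustration of "delay delivery after predecessors", here is a minimal sketch of the textbook vector-clock causal delivery rule. The `Replica` class and its methods are hypothetical names for this example; the talk does not prescribe this implementation.

```python
class Replica:
    """Buffers effectors until all their predecessors have been delivered."""

    def __init__(self, n_replicas):
        self.clock = [0] * n_replicas  # one entry per replica
        self.pending = []
        self.delivered = []

    def _ready(self, sender, vc):
        # Next effector from the sender, and nothing it depends on is missing.
        for k in range(len(vc)):
            if k == sender:
                if vc[k] != self.clock[k] + 1:
                    return False
            elif vc[k] > self.clock[k]:
                return False
        return True

    def receive(self, sender, vc, payload):
        self.pending.append((sender, vc, payload))
        progress = True
        while progress:
            progress = False
            for message in list(self.pending):
                s, v, p = message
                if self._ready(s, v):
                    self.pending.remove(message)
                    self.clock[s] = v[s]
                    self.delivered.append(p)
                    progress = True


replica = Replica(2)
replica.receive(1, [1, 1], "y!")   # depends on x! from replica 0: buffered
delivered_early = list(replica.delivered)
replica.receive(0, [1, 0], "x!")   # fills the gap; both are now delivered
```

The clock carries one entry per replica, which is where the metadata cost quoted above comes from.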

[Figure labels: SER, NMSI, PSI; effectors x!, y!, d! delivered to replicas; client is part of DB.]

Total/external causal

Total order extends causal order

Metadata: 1 single scalar

  • but cost of total order

External: real-time clock

[Figure labels: Gentle Rain, Linearisable, SSER.]

Gen1 invariants

Inv = “0 ≤ x”

u! = “x ≔ x–1”

{ Inv ∧ 1 ≤ x } u! { Inv }

Predict that Inv will be true after u!:

  • Sequential: weakest precondition
  • Generalises to bounded concurrency

Unbounded concurrency: no sufficient precondition

  • Invariant is not stable
  • Limit concurrency: escrow
  • No concurrency: order updates
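The sketch below contrasts the sequential precondition 1 ≤ x checked concurrently at several replicas with an escrow-style variant in which each replica may only consume a local share. The class and function names are assumptions for illustration, and real escrow schemes also transfer shares between replicas, which is omitted here.

```python
def unsafe_concurrent_decrements(x, replicas):
    """Every replica checks 1 <= x on the same state, then all the
    effectors x := x - 1 are applied everywhere."""
    effectors = [-1 for _ in range(replicas) if x >= 1]
    return x + sum(effectors)


class EscrowReplica:
    """Holds a local share of the right to decrement x (hypothetical API)."""

    def __init__(self, share):
        self.share = share

    def try_decrement(self):
        # Precondition on the local share only: concurrent decrements at
        # other replicas cannot invalidate it.
        if self.share >= 1:
            self.share -= 1
            return True
        return False


def escrow_concurrent_decrements(x, shares):
    assert sum(shares) <= x  # shares never exceed the available amount
    replicas = [EscrowReplica(share) for share in shares]
    granted = sum(r.try_decrement() for r in replicas)
    return x - granted


unsafe_result = unsafe_concurrent_decrements(1, 2)
escrow_result = escrow_concurrent_decrements(1, [1, 0])
```

With x = 1 and two replicas, the unchecked version ends at –1 and violates 0 ≤ x, while the escrow version grants a single decrement.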

Gen1: total order

Do replicas observe events in the same order?

Pick a unique number

[Axis labels: 0 = Concurrent; Total order, capricious; Gapless TO effectors; TO generators + TO effectors; TO generators = effectors; CAP.]

0 = unordered

No: concurrent

  • Commute ⟹ converge
  • Stable precondition ⟹ Invariant

[Figure: effectors x! and y! delivered to the replicas.]

Capricious TO effectors

Pick a number locally: capricious

Gap: will arrive later?

  • Non-monotonic: rollback
  • Monotonic:
      • Wait for gap to fill (Lamport 78)
      • Lost updates (LWW)
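The LWW option can be illustrated with a last-writer-wins register whose timestamps are picked locally. This is a sketch under assumptions: the (number, replica id) stamp format and the class name are chosen for the example.

```python
class LWWRegister:
    """Keeps the value with the highest stamp; older arrivals are dropped."""

    def __init__(self):
        self.stamp = (0, "")
        self.value = None

    def apply(self, stamp, value):
        # Monotonic: never roll back to an effector with a smaller stamp.
        if stamp > self.stamp:
            self.stamp = stamp
            self.value = value


# Two concurrent writes, each numbered locally at its origin replica.
write_a = ((7, "a"), "A")
write_b = ((10, "b"), "B")

r1, r2 = LWWRegister(), LWWRegister()
for stamp, value in (write_a, write_b):   # r1 receives 7 then 10
    r1.apply(stamp, value)
for stamp, value in (write_b, write_a):   # r2 receives 10 then 7
    r2.apply(stamp, value)
```

Both replicas converge on "B" without waiting for the gap between 7 and 10 to fill, and the write numbered 7 is lost at r2 because it arrives after a higher number.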

[Figure: effectors numbered 10 and 7; models shown: EC, Lamport, LWW.]

Gapless TO effectors

Gapless:

  • No lost updates
  • Consensus, 2PC to uniquely allocate next free number ⟹ not available

[Figure: effectors numbered 8 and 7; models shown: SMR, PSI, NMSI.]

TO generators

TO effectors + TO generators

  • separate from effectors
  • same order as effectors

[Figure labels: SI, LIN, SER, SSER.]

Three dimensions

  • Gen1 / Total Order: 0 = Concurrent; Total order, capricious; Gapless TO effectors; TO generators + TO effectors; TO generators = effectors
  • PO / Visibility: 0 = Rollbacks; Monotonic client; Transitive Visibility; Causal Visibility; Total causal order; External
  • EQ / Composition: 0 = Independent operations; All-or-nothing effects + snapshot

[Figure: successive builds place EC, Lin, SSER, SER, PSI, SI and Txnl CC in this space, with CAP marked.]

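[Table: models by Total Order and Composition. Within each row the models are spread over the Visibility columns Rollbacks, Monotonic, Transitive, Causal, External; the exact column of each model is not recoverable from this transcript. ∅ is reproduced as in the source.]

TOG=TOE

  • All-or-Nothing + Snapshot: SER, SSER
  • All-or-Nothing Effectors: (none listed)
  • Single Operation: SC, LIN

Gapless TOE

  • All-or-Nothing + Snapshot: NMSI, PSI, SSI
  • All-or-Nothing Effectors: (none listed)
  • Single Operation: (none listed)

Capricious TOE

  • All-or-Nothing + Snapshot: Bayou, ∅
  • All-or-Nothing Effectors: ∅
  • Single Operation: LWW, ∅

Concurrent Ops

  • All-or-Nothing + Snapshot: Causal HAT, ∅
  • All-or-Nothing Effectors: RC, ∅
  • Single Operation: EC, PRAM, CC, ∅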


Summary

Distributed, replicated data

  • Improves read availability
  • Parallel updates may violate invariants
  • Guarantee: invariants maintained by system
  • System vs. application cost trade-off
  • Tools needed

3D consistency design space

  • Total order (effectors, generators)
  • Visibility order
  • Transactional Composition

Work in progress

Creative Commons Attribution-ShareAlike 4.0 Intl. License

You are free to:

  • Share: copy and redistribute the material in any medium or format
  • Adapt: remix, transform, and build upon the material

for any purpose, even commercially, under the following terms:

  • Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

4 session guarantees ≣ causal

  • Monotonic reads. Client / No rollback: r3 must include w1
  • Read My Writes. Client / RMW: r2 must include w1
  • Monotonic writes. Global / No rollback: r3 must include w1
  • Writes Follow Reads. Global / WR dependence: w3 must follow w1

[Figures: for each guarantee, the delivery of effectors w1!, w2!, w3! and the reads r2, r3 across replicas.]