SLIDE 1

Programming Distributed Systems

07 Consistency Annette Bieniusa

AG Softech FB Informatik TU Kaiserslautern

Summer Term 2018

Annette Bieniusa Programming Distributed Systems Summer Term 2018 1/ 48

SLIDE 2

Motivation

One of the most important abstractions in distributed computing is shared state. This is problematic:

  • Communication is typically slow and/or unreliable.
  • Strong consistency, low latency, and availability cannot all be achieved at the same time.

All material and graphics in this section are based on material by Sebastian Burckhardt (Microsoft Research)[1].

SLIDE 3

Consistency in Database Systems

The distributed systems and database communities use the same word, consistency, with different meanings.

  • Distributed systems: “consistency” refers to the observable behaviour of a data store.
  • Databases: roughly the same concept is called “isolation”, whereas the term “consistency” refers to the property that application code is sequentially safe (the C in ACID).

SLIDE 4

“Single-Value Register”

Operations: rd() → v and wr(v) → ok

System architecture: (figure not reproduced)

SLIDE 5

Implementation 1: Single-copy Register

  • Single replica of the shared register
  • Forward all read and write requests to this replica
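As a minimal sketch of this scheme, assuming an in-process object stands in for the network, every client forwards its requests to the one replica. The names `SingleCopyServer` and `Client` are illustrative and do not come from the lecture.

```python
class SingleCopyServer:
    """Holds the only copy of the register (hypothetical class name)."""

    def __init__(self):
        self.value = None  # None stands in for the initial 'undef' value

    def handle(self, op, arg=None):
        # Requests are processed one at a time, in arrival order.
        if op == "rd":
            return self.value
        if op == "wr":
            self.value = arg
            return "ok"
        raise ValueError(f"unknown operation: {op}")


class Client:
    """Keeps no local state; forwards every request to the server."""

    def __init__(self, server):
        self.server = server

    def rd(self):
        return self.server.handle("rd")

    def wr(self, v):
        return self.server.handle("wr", v)


server = SingleCopyServer()
alice, bob = Client(server), Client(server)
alice.wr(8)
seen_by_bob = bob.rd()
```

Because there is only one copy, Bob's read observes Alice's completed write.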

SLIDE 6

Implementation 2: Epidemic Register

  • Each replica stores a timestamped value.
  • Reads return this value; writes update this value, stamped with the current time (e.g. from a logical clock).
  • At random times, replicas send their stored timestamped value to random recipients.
  • When receiving a timestamped value, a replica replaces its locally stored value if the incoming timestamp is later.
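The four rules above can be sketched as follows. This is an illustration under assumptions, not the lecture's implementation: timestamps are taken to be Lamport-style pairs (clock, replica id) so that ties are broken deterministically, message delivery is simulated by direct method calls, and the names `EpidemicReplica` and `gossip_round` are hypothetical.

```python
import random


class EpidemicReplica:
    """One replica of the epidemic register (hypothetical class name)."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.clock = 0                # logical clock (assumed Lamport-style)
        self.value = None             # None stands in for 'undef'
        self.stamp = (0, replica_id)  # (clock, replica id); the id breaks ties

    def rd(self):
        return self.value

    def wr(self, v):
        self.clock += 1
        self.value, self.stamp = v, (self.clock, self.replica_id)
        return "ok"

    def receive(self, stamp, value):
        # Advance the clock so that later local writes get later stamps.
        self.clock = max(self.clock, stamp[0])
        if stamp > self.stamp:
            self.stamp, self.value = stamp, value


def gossip_round(replicas, rng):
    """Every replica sends its timestamped value to one random other replica."""
    for sender in replicas:
        recipient = rng.choice([r for r in replicas if r is not sender])
        recipient.receive(sender.stamp, sender.value)


rng = random.Random(0)
replicas = [EpidemicReplica(i) for i in range(3)]
replicas[0].wr("x")
replicas[1].wr("y")              # concurrent with the write of "x"
stale_read = replicas[2].rd()    # nothing has been propagated yet
for _ in range(50):
    gossip_round(replicas, rng)
final_reads = [r.rd() for r in replicas]
```

In this run the third replica first reads the initial value; after enough gossip rounds all replicas hold the value with the latest stamp, here "y" with stamp (1, 1).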

SLIDE 7

Question

Can clients observe a difference between the two implementations (single-copy vs. epidemic)?

Assumptions:

  • Asynchronous communication
  • Fairness of transport
  • “Randomly” generated values

SLIDE 8

Notions of consistency

  • Single-Copy Register: Linearizability
  • Epidemic Register: Sequential Consistency
  • When generalized to a key-value store, the epidemic variant guarantees Eventual Consistency (if sending a randomly selected tuple in each message) or Causal Consistency (if sending all tuples in each message)

SLIDE 9

Consistency model

  • Required for any type of storage (system) that processes more than one operation at a time.
  • Unless the consistency model is linearizability (= single-copy semantics), applications observe non-sequential behaviors, called anomalies.
  • The set of possible behaviors, and conversely of possible anomalies, constitutes the consistency model of the data store.

SLIDE 10

Consistency specifications

SLIDE 11

What is a replicated shared object / service?

Different names and examples: REST service, file system, key-value store, counters, registers, ...

Formally specified by a set of operations Op and either

  • a sequential semantics S, or
  • a concurrent semantics F

SLIDE 12

Sequential semantics

S : Op × Op* → Val

  • Op: operation to be performed
  • Op*: sequence of all prior operations (“current state”)
  • Val: returned value

Example: Register

  • S(rd, ε) = undef (read returns initial value)
  • S(rd, wr(2) · wr(8)) = 8 (read returns last value written)
  • S(wr(3), rd · wr(2) · wr(8)) = ok (write always returns ok)
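The register semantics can be written down directly as a function. In this sketch operations are encoded as tuples such as `("rd",)` and `("wr", 2)`, and `None` stands in for undef; both encodings are assumptions made for illustration.

```python
def register_semantics(op, prior):
    """S(op, prior): value returned by `op` after the sequence `prior`."""
    kind = op[0]
    if kind == "wr":
        return "ok"  # a write always returns ok
    if kind == "rd":
        writes = [o[1] for o in prior if o[0] == "wr"]
        # A read returns the last value written, or undef if there is none.
        return writes[-1] if writes else None
    raise ValueError(f"unknown operation: {op}")


rd, wr2, wr3, wr8 = ("rd",), ("wr", 2), ("wr", 3), ("wr", 8)
initial = register_semantics(rd, [])
last_written = register_semantics(rd, [wr2, wr8])
write_result = register_semantics(wr3, [rd, wr2, wr8])
```

The three calls mirror the three example equations above.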

SLIDE 13

Histories

A history records all the interactions between clients and the system:

  • Operations performed
  • Indication whether an operation completed successfully, and its return value
  • Relative order of concurrent operations
  • Session of an operation (corresponds to client / connection)

SLIDE 14

Classically, histories are represented as sequences of calls and returns[2]. ⇒ Generalize this to event graphs

SLIDE 23

Event graphs

An event graph represents an execution of a system.

  • Vertices: events
  • Attributes: labels for vertices with information on the corresponding event (e.g. which operation, parameters, return values)
  • Relations: orderings or groupings of events

Definition

An event graph G is a tuple (E, d_1, ..., d_n) where E ⊆ Events is a finite or countably infinite set of events, and each d_i is an attribute or relation over E.

SLIDE 24

Histories as event graphs

A history is an event graph (E, op, rval, rb, ss) where

  • op : E → Op associates an operation with each event
  • rval : E → Values ∪ {∇} are the return values (∇ denotes that the operation never returns)
  • rb is the returns-before order
  • ss is the same-session relation
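One way to make this definition concrete is to derive the five components from a timeline of calls. The sketch below assumes each call carries a start time and an end time (`None` if it never returns); the `Call` record and `history_from_timeline` are hypothetical names, and relations are encoded as sets of pairs.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional

NEVER = "∇"  # marker for operations that never return


@dataclass(frozen=True)
class Call:
    event: str
    session: str
    op: tuple
    start: float
    end: Optional[float]  # None if the operation never returns
    rval: object = NEVER


def history_from_timeline(calls):
    """Build (E, op, rval, rb, ss) from calls with start and end times."""
    events = {c.event for c in calls}
    op = {c.event: c.op for c in calls}
    rval = {c.event: c.rval for c in calls}
    # a returns before b if a has ended before b starts
    rb = {(a.event, b.event) for a, b in product(calls, repeat=2)
          if a.end is not None and a.end < b.start}
    # same-session is an equivalence relation on events
    ss = {(a.event, b.event) for a, b in product(calls, repeat=2)
          if a.session == b.session}
    return events, op, rval, rb, ss


calls = [
    Call("a", "alice", ("wr", 1), start=0, end=2, rval="ok"),
    Call("b", "bob", ("rd",), start=1, end=3, rval=1),
    Call("c", "alice", ("rd",), start=4, end=5, rval=1),
]
E, op, rval, rb, ss = history_from_timeline(calls)
```

Events a and b overlap in time, so neither returns before the other; both return before c.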

SLIDE 25

Hands-on: Timeline diagram vs. event graph

SLIDE 27

When is a history valid?

Common approach: Require linearizability

  • Insert linearization points between the begin and end of each operation
  • The semantics of operations must hold with respect to these linearization points
  • Linearization points serve as justification / witness for a history
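For small register histories, the search for such a witness can be sketched by brute force: try every total order of the events that respects returns-before and check the register semantics along it. The sketch assumes all operations complete, encodes operations as tuples, and uses the hypothetical name `find_linearization`; it is exponential and only meant to illustrate the definition.

```python
from itertools import permutations


def register_semantics(op, prior):
    """Sequential semantics of a register; None stands in for undef."""
    if op[0] == "wr":
        return "ok"
    writes = [o[1] for o in prior if o[0] == "wr"]
    return writes[-1] if writes else None


def find_linearization(op, rval, rb):
    """Return a witness order of events, or None if there is none."""
    for order in permutations(op):
        position = {e: i for i, e in enumerate(order)}
        if any(position[a] > position[b] for a, b in rb):
            continue  # contradicts the returns-before order
        if all(register_semantics(op[e], [op[p] for p in order[:i]]) == rval[e]
               for i, e in enumerate(order)):
            return order
    return None


# a: wr(1), overlapping with read b; read c starts after both have returned.
op = {"a": ("wr", 1), "b": ("rd",), "c": ("rd",)}
rb = {("a", "c"), ("b", "c")}
good = find_linearization(op, {"a": "ok", "b": 1, "c": 1}, rb)
bad = find_linearization(op, {"a": "ok", "b": 1, "c": None}, rb)
```

The second history has no witness: c starts after the write has returned, so every admissible order places the write before c, and c cannot return undef.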

Here: Consistency semantics beyond linearizability!

SLIDE 28

Specifying the Consistency Semantics

  • History: defines what client interaction is observable
  • Specification: is a “test” on histories

But how do we specify such a “test” / predicate?

Execution: is an account of what happened when executing the implementation

Operational consistency model

  • Provides an abstract reference implementation whose behaviors provide the specification
  • Well-studied methodology for proving correctness (e.g. simulation relations or refinement)
  • Problem: typically close to a specific concrete implementation technique

SLIDE 29

Specifying the Consistency Semantics

Abstract execution: account of the “essence” of what happened

  • Applicable to many implementations
  • Correctness criterion: a history is valid if it is consistent with an abstract execution satisfying some consistency guarantees

Concrete execution: account of what happened when executing a particular actual implementation

Axiomatic consistency model

  • Uses logical conditions to define valid behaviors
  • Allows combining different aspects (here: consistency guarantees)

SLIDE 30

Decomposing abstract executions

The essence of what happened can be traced back to two basic responsibilities of the underlying protocol:

  1. Update Propagation: All operations must eventually become visible everywhere
  2. Conflict Resolution: Conflicting operations must be arbitrated consistently

SLIDE 31

Visibility

  • Relation that determines the subset of operations “visible” to an operation
  • Captures the relative timing of update propagation and operations
  • $a \xrightarrow{vis} b$: the effect of operation a is visible to the client performing b
  • Updates are concurrent if they are not ordered by visibility (i.e. if they cannot see each other)

SLIDE 32

Arbitration

  • Used for resolution of update conflicts (i.e. concurrent updates that do not commute)
  • $a \xrightarrow{ar} b$: a is arbitrated before b
  • Total order on operations
  • Often solved in practice by using timestamps
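A small sketch of timestamp-based arbitration, assuming each event carries a pair (logical time, replica id) so that ties between equal logical times are broken by the replica id. The function names and the pair encoding are assumptions for illustration; visibility is encoded as a set of pairs.

```python
def concurrent(a, b, vis):
    """Two updates are concurrent if neither is visible to the other."""
    return (a, b) not in vis and (b, a) not in vis


def arbitration_order(stamps):
    """Sort events by (logical time, replica id); the id breaks ties."""
    return sorted(stamps, key=lambda e: stamps[e])


vis = {("a", "c")}
stamps = {"c": (2, "r1"), "b": (1, "r2"), "a": (1, "r1")}
order = arbitration_order(stamps)
a_b_concurrent = concurrent("a", "b", vis)
```

Events a and b are concurrent, yet arbitration still places a before b, which gives every replica the same answer when it has to resolve the conflict.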

SLIDE 33

Abstract Executions

An abstract execution is an event graph (E, op, rval, rb, ss, vis, ar) such that

  • (E, op, rval, rb, ss) is a history
  • vis is acyclic
  • ar is a total order

SLIDE 35

Return Values in Abstract Executions

An abstract execution (E, op, rval, rb, ss, vis, ar) satisfies a sequential semantics S if

$rval(e) = S(op(e),\ vis^{-1}(e).sort(ar))$ for every event $e \in E$

Observed state = visible operations sorted by arbitration
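The condition can be checked mechanically on a finite abstract execution. In this sketch, `vis` is a set of pairs, the arbitration order is given as a list, all operations are assumed to return, and `satisfies_rval` is a hypothetical name.

```python
def register_semantics(op, prior):
    """Sequential semantics of a register; None stands in for undef."""
    if op[0] == "wr":
        return "ok"
    writes = [o[1] for o in prior if o[0] == "wr"]
    return writes[-1] if writes else None


def satisfies_rval(op, rval, vis, ar_order, semantics):
    """Check rval(e) = S(op(e), visible operations sorted by arbitration)."""
    rank = {e: i for i, e in enumerate(ar_order)}
    for e in op:
        visible = sorted((a for a, b in vis if b == e), key=rank.__getitem__)
        if rval[e] != semantics(op[e], [op[a] for a in visible]):
            return False
    return True


op = {"w1": ("wr", 1), "w2": ("wr", 2), "r": ("rd",)}
rval = {"w1": "ok", "w2": "ok", "r": 1}
ar_order = ["w1", "w2", "r"]
sees_w1_only = satisfies_rval(op, rval, {("w1", "r")}, ar_order, register_semantics)
sees_both = satisfies_rval(op, rval, {("w1", "r"), ("w2", "r")}, ar_order,
                           register_semantics)
```

Returning 1 is justified when the read sees only w1; once w2 is also visible, arbitration puts w2 last and the read would have to return 2.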

SLIDE 36

Consistency guarantee

  • A consistency guarantee is a predicate or property of an abstract execution.
  • A consistency model is the collection of all the guarantees needed; histories must be justifiable by an abstract execution that satisfies them all.
  • Ordering guarantees ensure that the order of operations is preserved (under certain conditions).
  • Transactions ensure that operation sequences do not become visible individually.
  • Synchronization operations can enforce ordering selectively.

SLIDE 37

Important consistency models

  • Linearizability = SingleOrder ∧ Realtime ∧ RVal
  • SequentialConsistency = SingleOrder ∧ ReadMyWrites ∧ RVal
  • CausalConsistency = EventualVisibility ∧ Causality ∧ RVal
  • BasicEventualConsistency = EventualVisibility ∧ NoCircularCausality ∧ RVal

RVal refers to ReadValueConsistency

SLIDE 38

Eventual Consistency (Quiescent Consistency)

In any execution where the updates stop at some point (i.e. where there are only finitely many updates), each session eventually (i.e. after some unspecified amount of time) converges to the same state.

  • Often used in replicated data stores
  • In essence: convergence

It says nothing about

  • when the replicas will converge
  • what the state is that they will converge to
  • what is allowed in the meantime
  • the case when there is no phase of quiescence

Very weak guarantee ⇒ Difficult to program against

SLIDE 39

Eventual visibility

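An abstract execution satisfies EventualVisibility if all events eventually become visible:

$\forall e \in E : \left|\{ e' \in E \mid (e \xrightarrow{rb} e') \land \lnot (e \xrightarrow{vis} e') \}\right| < \infty$

That is, each event e may be invisible to only finitely many events that start after e has returned.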

SLIDE 40

Session guarantees

When issuing multiple operations in sequence within a session, we usually expect additional properties (session consistency).

Session order: so = rb ∩ ss

SLIDE 41

Read My Writes

It would be confusing if Alice did not see her own message.

Fix: Require that session order implies visibility

so ⊆ vis
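With relations encoded as sets of pairs (an assumption of this sketch), session order and the Read My Writes condition become one-line set expressions. The function names are illustrative.

```python
def session_order(rb, ss):
    """so = rb ∩ ss"""
    return rb & ss


def read_my_writes(rb, ss, vis):
    """so ⊆ vis"""
    return session_order(rb, ss) <= vis


# a: Alice posts a message; b: Alice reads afterwards in the same session.
rb = {("a", "b")}
ss = {("a", "a"), ("a", "b"), ("b", "a"), ("b", "b")}
without_own_write = read_my_writes(rb, ss, vis=set())
with_own_write = read_my_writes(rb, ss, vis={("a", "b")})
```

The first execution violates the guarantee because Alice's read does not see her earlier post.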

SLIDE 42

Monotonic Reads

It would be confusing if Bob read Alice's message but, when he later read again, no longer saw it.

Fix: Require that visibility is monotonic with respect to session order (if a is visible to b, and b precedes c in session order, then a is visible to c):

vis ◦ so ⊆ vis
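A sketch of this check, assuming relations are sets of pairs and reading the composition left to right. The helper names `compose` and `monotonic_reads` are illustrative.

```python
def compose(r, s):
    """Left-to-right composition: (a, c) if a r b and b s c for some b."""
    return {(a, c) for a, b in r for b2, c in s if b == b2}


def monotonic_reads(vis, so):
    """vis ◦ so ⊆ vis"""
    return compose(vis, so) <= vis


# a: Alice's post; b1, b2: two reads by Bob in the same session.
so = {("b1", "b2")}
forgetful = monotonic_reads({("a", "b1")}, so)
monotonic = monotonic_reads({("a", "b1"), ("a", "b2")}, so)
```

In the first execution the post is visible to Bob's first read but not to his second, so the guarantee fails.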

SLIDE 43

Consistent Prefix

Alice and Bob concurrently post different values, and Bob's write is arbitrated after Alice's update. Charlie reads and sees Bob's message; then later, in the same session, he only sees the “earlier” message of Alice.

Fix: Require that remote operations become visible after all operations that precede them in arbitration order

ar ◦ (vis ∩ ¬ss) ⊆ vis
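The scenario with Alice, Bob and Charlie can be replayed against this condition. The sketch assumes relations are sets of pairs, composition reads left to right, and the arbitration order is derived from a list; all helper names are illustrative.

```python
def compose(r, s):
    """Left-to-right composition: (a, c) if a r b and b s c for some b."""
    return {(a, c) for a, b in r for b2, c in s if b == b2}


def total_order_pairs(order):
    """All pairs (x, y) where x comes before y in the list."""
    return {(order[i], order[j])
            for i in range(len(order)) for j in range(i + 1, len(order))}


def consistent_prefix(ar, vis, ss):
    """ar ◦ (vis ∩ ¬ss) ⊆ vis"""
    remote_vis = {pair for pair in vis if pair not in ss}
    return compose(ar, remote_vis) <= vis


session = {"a": "alice", "b": "bob", "c1": "charlie", "c2": "charlie"}
ss = {(x, y) for x in session for y in session if session[x] == session[y]}
ar = total_order_pairs(["a", "b", "c1", "c2"])  # Alice's write before Bob's

# Charlie's first read sees only Bob's write, the second only Alice's.
anomaly = consistent_prefix(ar, {("b", "c1"), ("a", "c2")}, ss)
fixed = consistent_prefix(
    ar, {("a", "c1"), ("b", "c1"), ("a", "c2"), ("b", "c2")}, ss)
```

The anomaly is rejected because Bob's write is visible to c1 while Alice's write, which precedes it in arbitration order, is not.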

SLIDE 44

Causality Guarantees

Axiomatic definition of the happens-before relation:

hb = ((rb ∩ ss) ∪ vis)+

Captures the transitive closure of session order and visibility.

  • NoCircularCausality: acyclic(hb)
  • CausalVisibility: hb ⊆ vis
  • CausalArbitration: hb ⊆ ar
  • Causality: CausalVisibility ∧ CausalArbitration
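These definitions translate into a few set operations on finite executions. The sketch below assumes relations are sets of pairs and computes the transitive closure by naive iteration; the function names and the question-and-answer scenario are illustrative.

```python
def transitive_closure(rel):
    """Smallest transitive relation containing rel (naive fixpoint)."""
    closure = set(rel)
    while True:
        new_pairs = {(a, d) for a, b in closure
                     for c, d in closure if b == c} - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


def happens_before(rb, ss, vis):
    """hb = ((rb ∩ ss) ∪ vis)+"""
    return transitive_closure((rb & ss) | vis)


def no_circular_causality(hb):
    # A transitive relation is acyclic iff it has no pair (x, x).
    return all(a != b for a, b in hb)


def causal_visibility(hb, vis):
    return hb <= vis


# a: Alice asks a question; b: Bob reads it; c: Bob answers; d: Charlie reads.
rb = {("a", "b"), ("b", "c"), ("c", "d"), ("a", "c"), ("a", "d"), ("b", "d")}
ss = {("b", "c"), ("c", "b")} | {(e, e) for e in "abcd"}
vis = {("a", "b"), ("c", "d")}
hb = happens_before(rb, ss, vis)
answer_without_question = causal_visibility(hb, vis)
```

Charlie sees the answer c but not the question a, although a happens before d; extending `vis` to all of `hb` restores CausalVisibility.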

SLIDE 45

Causal Consistency

  • Strongest model that can be implemented in such a way as to be available even under (network) partitions
  • Causal consistency implies all session guarantees with the exception of Consistent Prefix.
  • CausalConsistency = EventualVisibility ∧ Causality ∧ RVal

SLIDE 46

Strong Models

Ensure a single global order of operations that determines both visibility and arbitration.

SingleOrder: $\exists E' \subseteq rval^{-1}(\nabla) : vis = ar \setminus (E' \times E)$

What it means: Arbitration and visibility are the same except for a subset E′ that represents incomplete operations that are not visible to any other operation.

Assuming the arbitration order corresponds to (global) timestamps, SingleOrder implies that:

  1. An operation can only see operations with earlier timestamps.
  2. An operation must see all complete operations with earlier timestamps.
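On a finite execution, SingleOrder can be checked by trying every candidate set E′ of incomplete operations. The sketch assumes relations are sets of pairs and marks non-returning operations with the string "∇"; `single_order` is a hypothetical name and the search is exponential in the number of incomplete operations.

```python
from itertools import chain, combinations

NEVER = "∇"  # return value of operations that never return


def single_order(rval, vis, ar):
    """Check: exists E' ⊆ rval^-1(∇) with vis = ar \\ (E' × E)."""
    incomplete = [e for e in rval if rval[e] == NEVER]
    candidates = chain.from_iterable(
        combinations(incomplete, k) for k in range(len(incomplete) + 1))
    # Removing E' × E from ar drops every pair whose first event is in E'.
    return any(vis == {(a, b) for a, b in ar if a not in hidden}
               for hidden in candidates)


ar = {("w1", "w2"), ("w1", "r"), ("w2", "r")}
all_complete = {"w1": "ok", "w2": "ok", "r": 2}
w2_incomplete = {"w1": "ok", "w2": NEVER, "r": 1}

sees_everything = single_order(all_complete, ar, ar)
misses_w2 = single_order(all_complete, {("w1", "r")}, ar)
hidden_w2 = single_order(w2_incomplete, {("w1", "w2"), ("w1", "r")}, ar)
```

The read may skip w2 only in the last case, where w2 never returned and can therefore be placed in E′.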

SLIDE 47

Linearizability vs. Sequential Consistency

  • Linearizability requires Realtime: rb ⊆ ar
  • Sequential consistency requires ReadMyWrites (restricted to sessions)
  • To observe the difference between the two, clients must be able to communicate over some “side channel” that allows them to observe real-time ordering.
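The difference can be illustrated on a two-event history: Alice's wr(1) returns, and afterwards Bob, in a different session, reads undef. The sketch evaluates both possible arbitration orders under SingleOrder with vis = ar, assuming all operations complete; the helper `check_single_order` and the tuple encoding of operations are assumptions.

```python
def register_semantics(op, prior):
    """Sequential semantics of a register; None stands in for undef."""
    if op[0] == "wr":
        return "ok"
    writes = [o[1] for o in prior if o[0] == "wr"]
    return writes[-1] if writes else None


def check_single_order(order, op, rval, rb, ss):
    """Evaluate the guarantees for arbitration `order` with vis = ar."""
    ar = {(order[i], order[j])
          for i in range(len(order)) for j in range(i + 1, len(order))}
    vis = ar  # SingleOrder with no incomplete operations
    rval_ok = all(
        rval[e] == register_semantics(op[e], [op[p] for p in order[:i]])
        for i, e in enumerate(order))
    return {"RVal": rval_ok,
            "Realtime": rb <= ar,
            "ReadMyWrites": (rb & ss) <= vis}


op = {"a": ("wr", 1), "b": ("rd",)}
rval = {"a": "ok", "b": None}
rb = {("a", "b")}                 # the write returned before the read started
ss = {("a", "a"), ("b", "b")}     # two different sessions
read_first = check_single_order(["b", "a"], op, rval, rb, ss)
write_first = check_single_order(["a", "b"], op, rval, rb, ss)
```

Ordering the read first justifies the history under sequential consistency but breaks Realtime; ordering the write first respects Realtime but breaks RVal. The history is therefore sequentially consistent yet not linearizable, and only a client who knows the real-time order can tell.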

SLIDE 48

Conclusion

Consistency for single operations

Transactions introduce another dimension

Next lectures: Replicated Datatypes

SLIDE 49

Further reading I

[1] Sebastian Burckhardt. “Principles of Eventual Consistency”. In: Foundations and Trends in Programming Languages 1.1-2 (2014), pp. 1–150. doi: 10.1561/2500000011. url: https://doi.org/10.1561/2500000011.

[2] Maurice Herlihy and Jeannette M. Wing. “Linearizability: A Correctness Condition for Concurrent Objects”. In: ACM Trans. Program. Lang. Syst. 12.3 (1990), pp. 463–492. doi: 10.1145/78969.78972. url: http://doi.acm.org/10.1145/78969.78972.
