CS 764: Topics in Database Management Systems, Lecture 11: Two-Phase Commit (2PC)



SLIDE 1

CS 764: Topics in Database Management Systems
Lecture 11: Two-Phase Commit (2PC)

Xiangyao Yu
10/12/2020

SLIDE 2

Announcement

Submit a 1-page course project proposal by Oct. 21
A list of project ideas has been uploaded to the course website
(http://pages.cs.wisc.edu/~yxy/cs764-f20/CS764-Fall2020-project-ideas.pdf)

SLIDE 3

Today’s Paper: Distributed Transactions in R*

C. Mohan, B. Lindsay, R. Obermarck, Transaction Management in the R* Distributed Database Management System. ACM Trans. Database Syst. 1986.

SLIDE 4

Agenda

Two-phase commit
Presumed abort (PA)
Presumed commit (PC)
Deadlock detection

SLIDE 5

Distributed Transactions

Architectures: shared-nothing vs. shared-disk
Data is partitioned and stored in each server
A distributed transaction accesses data across multiple partitions

[Figure: two panels, Shared Nothing and Shared Disk, each showing three nodes (CPU, Memory, HDD) and a Network]

SLIDE 6

Distributed Transactions

Architectures: shared-nothing vs. shared-disk
Data is partitioned and stored in each server
A distributed transaction accesses data across multiple partitions

Transaction T: write(A) write(B)

[Figure: two panels, Shared Nothing and Shared Disk, each showing three nodes (CPU, Memory, HDD) and a Network, now with tuple A and tuple B marked]

SLIDE 7

Atomic Commit Protocol (ACP)

Atomic commit protocol: applies a set of distinct changes as a single operation

Example: Transaction T: write(A) write(B), on tuple A and tuple B
The two updates must commit or abort atomically

SLIDE 8

The Challenge of Atomic Commit

A naïve approach: all nodes log and commit independently

[Figure: Node 1 and Node 2 hold tuple A and tuple B; Transaction T: write(A) write(B). On Commit, each node logs and commits on its own, then control goes back to the caller]

SLIDE 9

The Challenge of Atomic Commit

A naïve approach: all nodes log and commit independently
Node 2 crashes before logging

  • Transaction T commits in node 1 but not in node 2

[Figure: Node 1 and Node 2 hold tuple A and tuple B; Transaction T: write(A) write(B). On Commit, Node 1 logs and commits while Node 2 crashes before logging]

SLIDE 10

Two-Phase Commit (2PC)

Key idea: let the coordinator log the final commit/abort decision

[Figure: Coordinator, Subordinate 1, and Subordinate 2, with tuple A and tuple B]

SLIDE 11

Two-Phase Commit (2PC)

Key idea: let the coordinator log the final commit/abort decision
Phase 1: prepare phase

[Figure: message timeline. Coordinator sends PREPARE to Subordinate 1 and Subordinate 2; each subordinate writes [log] prepare* and replies VOTE YES. A * marks a forced log write]

SLIDE 12

Two-Phase Commit (2PC)

Key idea: let the coordinator log the final commit/abort decision
Phase 1: prepare phase
Phase 2: commit phase

  • Coordinator logs the decision

[Figure: message timeline. Coordinator sends PREPARE; each subordinate writes [log] prepare* and replies VOTE YES. Coordinator then writes [log] commit* and goes back to the caller]

SLIDE 13

Two-Phase Commit (2PC)

Key idea: let the coordinator log the final commit/abort decision
Phase 1: prepare phase
Phase 2: commit phase

  • Coordinator logs the decision
  • Coordinator sends the decision to subordinates
  • Coordinator forgets the transaction after receiving ACKs

[Figure: message timeline. Coordinator sends PREPARE; each subordinate writes [log] prepare* and replies VOTE YES. Coordinator writes [log] commit*, goes back to the caller, and sends COMMIT; each subordinate writes [log] commit* and replies ACK. Coordinator then writes end and forgets the txn]
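
The two phases above can be sketched in a few lines of Python. This is a minimal illustration, not the R* implementation: the class names `Coordinator` and `Subordinate`, the `can_commit` flag, and the in-memory `log` lists (standing in for forced log writes) are all assumptions made for the example, and failures and timeouts are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Subordinate:
    name: str
    can_commit: bool = True  # hypothetical flag: False means the local work aborted
    log: list = field(default_factory=list)  # stands in for the on-disk log

    def on_prepare(self) -> str:
        if not self.can_commit:
            self.log.append("abort")
            return "NO"
        # The prepare record is written before the vote leaves the node.
        self.log.append("prepare")
        return "YES"

    def on_decision(self, decision: str) -> str:
        # The decision record is written before the ACK is sent.
        self.log.append(decision)
        return "ACK"


@dataclass
class Coordinator:
    subs: list
    log: list = field(default_factory=list)

    def run(self) -> str:
        # Phase 1: send PREPARE and collect votes.
        votes = {s.name: s.on_prepare() for s in self.subs}
        all_yes = all(v == "YES" for v in votes.values())
        decision = "commit" if all_yes else "abort"
        # Phase 2: log the decision first, then notify the YES voters.
        self.log.append(decision)
        acks = [s.on_decision(decision)
                for s in self.subs if votes[s.name] == "YES"]
        assert all(a == "ACK" for a in acks)
        # After all ACKs arrive, write end and forget the transaction.
        self.log.append("end")
        return decision
```

With two subordinates that both vote YES, `Coordinator([...]).run()` returns `"commit"` and each subordinate's log reads `["prepare", "commit"]`; if one subordinate has `can_commit=False`, the result is `"abort"` and only the YES voter receives the decision.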

SLIDE 14

2PC – Abort Example

Subordinate returns VOTE NO if the transaction is aborted

  • Subordinate can release locks and forget the transaction

[Figure: message timeline. Coord sends PREPARE; Subord1 writes abort* and replies VOTE NO; Subord2 writes prepare* and replies VOTE YES]

SLIDE 15

2PC – Abort Example

Subordinate returns VOTE NO if the transaction is aborted

  • Subordinate can release locks and forget the transaction

Skip the commit phase for aborted subordinates

[Figure: message timeline. Subord1 writes abort* and replies VOTE NO; Subord2 writes prepare* and replies VOTE YES. Coord writes abort*, goes back to the caller, and sends ABORT to Subord2, which writes abort* and replies ACK. Coord then writes end and forgets the txn]

SLIDE 16

2PC – All Subordinates Abort

Skip the second phase entirely if the transaction aborts at all the subordinates

[Figure: message timeline. Coord sends PREPARE; Subord1 and Subord2 each write abort* and reply VOTE NO. Coord writes abort* + end, goes back to the caller, and forgets the txn]

SLIDE 17

2PC – Failures

Use timeout to detect failures
Subordinate timeout

  • Waiting for PREPARE: self abort

[Figure: 2PC message timeline between Coord and Subord (PREPARE, VOTE YES/NO, COMMIT/ABORT, ACK), with a time-out marked]

SLIDE 18

2PC – Failures

Use timeout to detect failures
Subordinate timeout

  • Waiting for PREPARE: self abort

Coordinator timeout

  • Waiting for vote: self abort

[Figure: 2PC message timeline between Coord and Subord (PREPARE, VOTE YES/NO, COMMIT/ABORT, ACK), with a time-out marked]

SLIDE 19

2PC – Failures

Use timeout to detect failures
Subordinate timeout

  • Waiting for PREPARE: self abort
  • Waiting for decision: contact coordinator or peer subordinates (may block and wait indefinitely)

Coordinator timeout

  • Waiting for vote: self abort

[Figure: 2PC message timeline between Coord and Subord (PREPARE, VOTE YES/NO, COMMIT/ABORT, ACK), with a time-out marked]

SLIDE 20

2PC – Failures

Use timeout to detect failures
Subordinate timeout

  • Waiting for PREPARE: self abort
  • Waiting for decision: contact coordinator or peer subordinates (may block and wait indefinitely)

Coordinator timeout

  • Waiting for vote: self abort
  • Waiting for ACK: contact subordinates

[Figure: 2PC message timeline between Coord and Subord (PREPARE, VOTE YES/NO, COMMIT/ABORT, ACK), with a time-out marked]
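
The four timeout cases can be written down as a lookup table. This is only a sketch of the rules listed on the slide: the names `TIMEOUT_ACTIONS`, `on_timeout`, and `can_decide_alone` are made up for illustration, and the strings describe the action rather than perform it.

```python
# (role, message being waited for) -> action taken on timeout.
TIMEOUT_ACTIONS = {
    ("subordinate", "PREPARE"): "self abort",
    ("subordinate", "DECISION"): "contact coordinator or peer subordinates",
    ("coordinator", "VOTE"): "self abort",
    ("coordinator", "ACK"): "contact subordinates",
}


def on_timeout(role: str, waiting_for: str) -> str:
    """Look up the action for a node whose wait has timed out."""
    return TIMEOUT_ACTIONS[(role, waiting_for)]


def can_decide_alone(role: str, waiting_for: str) -> bool:
    """True when the node may resolve the transaction without asking anyone."""
    return on_timeout(role, waiting_for) == "self abort"
```

A node can abort on its own only before it has committed itself to an outcome: a subordinate that has not yet voted, or a coordinator that has not yet decided. A subordinate that voted YES and is waiting for the decision cannot decide alone, which is why that case may block.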

SLIDE 21

2PC – Alternative Designs?

Subordinate returns vote to coordinator before logging prepare?

[Figure: 2PC message timeline between Coord and Subord (PREPARE, VOTE YES/NO, COMMIT/ABORT, ACK) in which the subordinate's prepare record is written without a forced write (no *)]

SLIDE 22

2PC – Alternative Designs?

Subordinate returns vote to coordinator before logging prepare?
Problem: subordinate may crash before the log record is written to disk. The log record is thus lost but the coordinator already committed the transaction

[Figure: 2PC message timeline between Coord and Subord (PREPARE, VOTE YES/NO, COMMIT/ABORT, ACK) in which the subordinate's prepare record is written without a forced write (no *)]

SLIDE 23

2PC – Alternative Designs?

Coordinator sends decision to subordinates before logging the decision?

[Figure: 2PC message timeline between Coord and Subord (PREPARE, VOTE YES/NO, COMMIT/ABORT, ACK) in which the coordinator's commit record is written without a forced write (no *)]

SLIDE 24

2PC – Alternative Designs?

Coordinator sends decision to subordinates before logging the decision?
Problem: coordinator crashes before logging the decision and decides to abort after restart, contradicting the decision it already sent

[Figure: 2PC message timeline between Coord and Subord (PREPARE, VOTE YES/NO, COMMIT/ABORT, ACK) in which the coordinator's commit record is written without a forced write (no *)]

SLIDE 25

Optimization 1: Presumed Abort (PA)

Observation: It is safe for a coordinator to “forget” a transaction immediately after it makes the decision to abort it and to write an abort record

SLIDE 26

PA: Aborted Transaction

  • The abort record is not forced in subordinate

[Figure: two message timelines side by side. Standard 2PC: Subord1 writes abort* and replies VOTE NO; Subord2 writes prepare* and replies VOTE YES. Presumed Abort: same messages, but Subord1 writes abort without forcing it]

SLIDE 27

PA: Aborted Transaction

  • The abort record is not forced in subordinate
  • The abort record is not forced in coordinator
  • Coordinator forgets the transaction early
  • No ACK for aborts
  • Behavior of committed transactions unchanged

[Figure: two message timelines side by side. Standard 2PC: Coord writes abort*, goes back to the caller, and sends ABORT to Subord2, which writes abort* and replies ACK; Coord then writes end and forgets the txn. Presumed Abort: Coord, Subord1, and Subord2 write abort without forcing it; Coord sends ABORT to Subord2 and forgets the txn without waiting for an ACK]
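
The name "presumed abort" comes from how the coordinator answers a subordinate that asks about a transaction it no longer remembers. A minimal sketch, assuming a hypothetical `coordinator_state` dictionary that maps transaction ids to the outcome the coordinator still has on record:

```python
def answer_inquiry_pa(coordinator_state: dict, txn_id: str) -> str:
    """Reply to a subordinate's inquiry about txn_id under presumed abort."""
    # No entry means the coordinator has no information about the
    # transaction (for example, it aborted and forgot it early), so the
    # answer is abort by presumption.
    return coordinator_state.get(txn_id, "abort")
```

Because the missing-information answer is always abort, the coordinator does not need a forced abort record or ACKs to answer correctly later; committed transactions still have to be remembered until all ACKs arrive.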

SLIDE 28

PA: Partially Readonly Transactions

Readonly subordinate does not log in prepare phase and skips commit phase

[Figure: two message timelines side by side. Partially readonly: Subord1 replies VOTE READ without logging; Subord2 writes prepare* and replies VOTE YES; Coord writes commit*, goes back to the caller, and sends COMMIT to Subord2, which writes commit* and replies ACK; Coord writes end and forgets the txn. For comparison, the full commit: both subordinates write prepare*, reply VOTE YES, receive COMMIT, write commit*, and reply ACK]

SLIDE 29

PA: Completely Readonly Transactions

Completely readonly transactions skip the commit phase entirely

[Figure: two message timelines side by side. Completely readonly: Coord sends PREPARE, Subord1 and Subord2 both reply VOTE READ, and Coord goes back to the caller and forgets the txn with no log records shown. For comparison, the full commit: both subordinates write prepare*, reply VOTE YES, receive COMMIT, write commit*, and reply ACK]
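
The effect of READ votes on the second phase can be sketched as a small planning function. The function name `plan_phase_two` and the `"none"` marker for "no commit phase" are assumptions made for this example.

```python
def plan_phase_two(votes: dict) -> tuple:
    """Return (decision, subordinates that must be sent the decision).

    votes maps a subordinate name to "YES", "NO", or "READ".
    """
    if any(v == "NO" for v in votes.values()):
        decision = "abort"
    elif all(v == "READ" for v in votes.values()):
        # Completely readonly: there is no commit phase at all.
        return ("none", [])
    else:
        decision = "commit"
    # Only YES voters are prepared and waiting for the outcome;
    # READ and NO voters have already finished with the transaction.
    recipients = sorted(name for name, v in votes.items() if v == "YES")
    return (decision, recipients)
```

For example, `plan_phase_two({"s1": "READ", "s2": "YES"})` sends COMMIT to `s2` only, matching the partially readonly case.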

SLIDE 30

Optimization 2: Presumed Commit (PC)

Since most transactions are expected to commit, can we make commits cheaper by eliminating the ACKs for COMMITs?

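SLIDE 31

PC: Committed Transaction

  • Coordinator must force a collecting log record, because it may fail and abort the transaction; without the record, a later inquiry would wrongly presume commit
  • No need to send ACK for COMMITs

[Figure: two message timelines side by side. Presumed Commit: Coord writes collecting* and sends PREPARE; Subord1 and Subord2 write prepare* and reply VOTE YES; Coord writes commit*, goes back to the caller, and sends COMMIT; each subordinate writes commit without forcing it and sends no ACK. Standard 2PC: both subordinates write prepare*, reply VOTE YES, receive COMMIT, write commit*, and reply ACK; Coord writes end and forgets the txn]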

SLIDE 32

PC: Aborted Transaction

Abort behavior is similar to standard 2PC but requires logging collecting

[Figure: two message timelines side by side. Presumed Commit: Coord writes collecting* and sends PREPARE; Subord1 writes abort* and replies VOTE NO; Subord2 writes prepare* and replies VOTE YES; Coord writes abort*, goes back to the caller, and sends ABORT to Subord2, which writes abort* and replies ACK; Coord writes end and forgets the txn. Standard 2PC: the same sequence without the collecting record]
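
Under presumed commit the missing-information answer flips to commit, which is why the collecting record matters. A minimal sketch, assuming a hypothetical `coordinator_log` dictionary that maps a transaction id to the set of record types the coordinator still holds for it:

```python
def answer_inquiry_pc(coordinator_log: dict, txn_id: str) -> str:
    """Reply to a subordinate's inquiry about txn_id under presumed commit."""
    records = coordinator_log.get(txn_id)
    if records is None:
        # No information at all: presume the transaction committed.
        return "commit"
    if "commit" in records:
        return "commit"
    # A collecting (or abort) record without a commit record means the
    # transaction did not reach a commit decision, so it aborts.
    return "abort"
```

If the coordinator crashed after sending PREPARE but before deciding, the forced collecting record is the only thing that stops an inquiry from being answered with the commit presumption.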

SLIDE 33

Summary

Presumed Abort (PA) is better than standard 2PC (widely used in practice)
Presumed Commit (PC) is worse than PA in most cases
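
As a rough aid for comparing the protocols, the forced log writes (marked * in the timelines) and messages on the commit path can be counted directly from the diagrams. The function below is a sketch under narrow assumptions: every subordinate updates data and votes YES, and the name `commit_cost` is made up for illustration. Abort cases, where PC still forces the collecting record, and readonly cases are not counted, so this does not by itself rank the protocols.

```python
def commit_cost(protocol: str, n_subs: int) -> dict:
    """Count forced log writes and messages for a committing transaction.

    Assumes all n_subs subordinates are update subordinates that vote YES.
    """
    if protocol in ("2PC", "PA"):
        # Coord: commit*. Each subordinate: prepare*, commit*.
        # Messages per subordinate: PREPARE, VOTE, COMMIT, ACK.
        coord_forced, sub_forced, msgs_per_sub = 1, 2, 4
    elif protocol == "PC":
        # Coord: collecting*, commit*. Each subordinate: prepare* only.
        # Messages per subordinate: PREPARE, VOTE, COMMIT (no ACK).
        coord_forced, sub_forced, msgs_per_sub = 2, 1, 3
    else:
        raise ValueError(f"unknown protocol: {protocol}")
    return {
        "forced_writes": coord_forced + sub_forced * n_subs,
        "messages": msgs_per_sub * n_subs,
    }
```

With two subordinates this gives 5 forced writes and 8 messages for standard 2PC and PA, and 4 forced writes and 6 messages for PC.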

SLIDE 34

Distributed Deadlock Detection

Maintains wait-for graph

  • Send wait-for graph to the next node based on lexicographic order of transactions
  • When a cycle is detected, choose the local transaction as the victim
  • In practice, timeout works pretty well
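
The cycle check at the heart of deadlock detection can be sketched with a depth-first search over the wait-for graph. This is a single-node illustration only: the forwarding of graph fragments between nodes by lexicographic order is not modeled, and the names `find_cycle` and `pick_victim` are assumptions made for the example.

```python
def find_cycle(wait_for: dict) -> list:
    """Return one cycle in the wait-for graph as a list of transactions, or [].

    wait_for maps a transaction to the transactions it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {t: WHITE for t in wait_for}
    path = []

    def visit(t) -> list:
        color[t] = GREY
        path.append(t)
        for u in wait_for.get(t, ()):
            state = color.get(u, WHITE)
            if state == GREY:
                # u is already on the current path: a cycle is closed.
                return path[path.index(u):]
            if state == WHITE:
                cycle = visit(u)
                if cycle:
                    return cycle
        path.pop()
        color[t] = BLACK
        return []

    for t in sorted(wait_for):
        if color[t] == WHITE:
            cycle = visit(t)
            if cycle:
                return cycle
    return []


def pick_victim(cycle: list, local_txns: set):
    """Choose the first transaction in the cycle that is local to this node."""
    for t in cycle:
        if t in local_txns:
            return t
    return None
```

For the graph `{"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}`, `find_cycle` returns `["T1", "T2", "T3"]`, and `pick_victim` on a node where only `T2` is local selects `T2`.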

SLIDE 35

Conclusions

Distributed transactions require an atomic commit protocol
Two-phase commit (2PC) is the most widely used atomic commit protocol

  • Standard 2PC
  • Optimization 1: presumed abort (PA), most commonly used in practice
  • Optimization 2: presumed commit (PC)

SLIDE 36

Q/A – Two Phase Commit

Which is more expensive? Forced writes vs. network messages
Architecture for hierarchical 2PC?
Deadlock detection vs. avoidance?
How is the coordinator chosen? A hotspot?
Readonly vs. PA/PC?
What if a whole subordinate went down indefinitely?
What if the coordinator went down and a subordinate is elected as the new coordinator?

SLIDE 37

Before Next Lecture

Look for teammates for the course project
Submit review before next lecture

  • David J. DeWitt, Jim Gray, Parallel Database Systems: The Future of High Performance Database Systems. Comm. ACM 1992.