Low-Latency Network-Scalable Byzantine Fault-Tolerant Replication




SLIDE 1

Low-Latency Network-Scalable Byzantine Fault-Tolerant Replication

12th EuroSys Doctoral Workshop (EuroDW 2018)

Ines Messadi, TU Braunschweig, Germany, 2018-04-23

New PhD student (second month) in the distributed systems group

Research area: Resiliency of distributed systems, Byzantine Fault Tolerance

Advisor: Rüdiger Kapitza

SLIDE 2

Overview

[Figure: clients and four replicas (labeled Replica 1 to Replica 4), one of them affected by a Byzantine fault]

3f + 1 nodes to tolerate f faults
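The sizing rule above can be made concrete with a small calculation. The following is a minimal sketch: the function names are illustrative and do not come from the talk, and the agreement quorum of 2f + 1 is the value commonly used with 3f + 1 replicas, stated here as an assumption.

```python
def replicas_needed(f: int) -> int:
    """Minimum number of replicas needed to tolerate f Byzantine faults (3f + 1)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faults_tolerated(n: int) -> int:
    """Largest f such that 3f + 1 <= n."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1) // 3


def agreement_quorum(f: int) -> int:
    """Quorum size commonly paired with 3f + 1 replicas (assumption: 2f + 1)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1


if __name__ == "__main__":
    for f in range(1, 4):
        print(f"f={f}: replicas={replicas_needed(f)}, quorum={agreement_quorum(f)}")
```

With f = 1 this gives the four-replica configuration: four replicas and a quorum of three.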

2018-04-23 Ines Messadi, TU Braunschweig, Germany Page 1 Low-Latency Network-Scalable Byzantine Fault-Tolerant Replication

SLIDE 7

Overview

[Figure: message flow between client, leader, and replicas. Phases: Byzantine agreement (Pre-prepare, Prepare, Commit), Execution on each replica, Voting at the client]

Problem: Agreement latency overhead and message complexity in BFT

Reason: Multiple communication rounds and slow TCP networking

New trend: Availability of modern hardware technology such as Remote Direct Memory Access (RDMA)

Consequence: A need to redesign current BFT systems

→ How can we build a secure, fast, and scalable RDMA-based BFT system?
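To illustrate why message complexity grows quickly with the number of replicas, the sketch below counts the messages exchanged in one agreement instance. It assumes a PBFT-style pattern, which the slide does not spell out: the leader sends the Pre-prepare to all other replicas, every backup then sends a Prepare to all other replicas, and every replica sends a Commit to all other replicas. The function names are illustrative only.

```python
def messages_per_phase(n: int) -> dict:
    """Message counts for one agreement instance among n replicas.

    Assumption: PBFT-style all-to-all exchange in Prepare and Commit.
    """
    if n < 1:
        raise ValueError("n must be positive")
    return {
        "pre-prepare": n - 1,          # leader -> every other replica
        "prepare": (n - 1) * (n - 1),  # each backup -> every other replica
        "commit": n * (n - 1),         # each replica -> every other replica
    }


def total_messages(n: int) -> int:
    """Total messages across the three agreement phases."""
    return sum(messages_per_phase(n).values())


if __name__ == "__main__":
    for f in (1, 2, 3):
        n = 3 * f + 1
        print(f"f={f}, n={n}: {messages_per_phase(n)} total={total_messages(n)}")
```

Under these assumptions the total grows quadratically in the number of replicas, and each of the three phases adds at least one network delay to the request latency.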


SLIDE 10

Remote Direct Memory Access (RDMA)

Why RDMA?

Zero-copy data transfer

Reduced CPU usage for communication

→ Low latency and CPU efficiency

[Figure: latency (µs) versus payload size (KB) for TCP, RDMA Send/Recv, and RDMA Read/Write]

Challenges

Different communication mechanisms

Inappropriate design ⇒ unexpectedly bad performance

Security issues

→ Applications must be designed explicitly for RDMA
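The difference between the two RDMA communication styles named in the chart legend (Send/Recv and Read/Write) can be sketched with a toy in-memory model. This is not a real RDMA API: `ToyNode`, `send`, and `rdma_write` are hypothetical names used only for illustration. The model assumes the usual semantics: a two-sided send needs a receive buffer posted by the receiver, whereas a one-sided write goes directly into registered memory without notifying the receiving application, and only if remote write access was granted.

```python
from collections import deque


class ToyNode:
    """Toy stand-in for a host with an RDMA-capable NIC (illustrative only)."""

    def __init__(self, name: str):
        self.name = name
        self.recv_buffers = deque()  # buffers posted by the application (two-sided)
        self.completions = deque()   # received messages the application can poll
        self.regions = {}            # key -> (bytearray, remote_write_allowed)

    def post_recv(self, size: int) -> None:
        self.recv_buffers.append(bytearray(size))

    def register_region(self, key: str, size: int, remote_write: bool = False) -> None:
        self.regions[key] = (bytearray(size), remote_write)


def send(dst: ToyNode, payload: bytes) -> None:
    """Two-sided: consumes a receive buffer that the receiver posted beforehand."""
    if not dst.recv_buffers:
        raise RuntimeError("receiver not ready: no receive buffer posted")
    buf = dst.recv_buffers.popleft()
    if len(payload) > len(buf):
        raise ValueError("payload larger than posted buffer")
    buf[:len(payload)] = payload
    dst.completions.append(bytes(buf[:len(payload)]))


def rdma_write(dst: ToyNode, key: str, offset: int, payload: bytes) -> None:
    """One-sided: writes into registered memory; the receiver gets no completion."""
    region, writable = dst.regions[key]
    if not writable:
        raise PermissionError("region not registered for remote write")
    if offset < 0 or offset + len(payload) > len(region):
        raise ValueError("write outside registered region")
    region[offset:offset + len(payload)] = payload


if __name__ == "__main__":
    replica = ToyNode("replica")
    replica.post_recv(64)
    send(replica, b"PREPARE")
    print(replica.completions.popleft())

    replica.register_region("log", 16, remote_write=True)
    rdma_write(replica, "log", 0, b"COMMIT")
    print(bytes(replica.regions["log"][0][:6]), len(replica.completions))
```

In a Byzantine setting the one-sided path deserves particular care: in this model, a peer with write access can change the registered region without the receiving application noticing, which is one way to read the "Security issues" item above.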

Observation

Necessity to redesign the existing BFT protocols for RDMA


SLIDE 13

Towards building RDMA-based BFT

Basis BFT protocol: Hybster [Behl et al., EuroSys’17]

Building an RDMA-tailored BFT protocol

Investigating RDMA communication tradeoffs

Countermeasures for the resilient use of RDMA

Preliminary approach

Build RDMA interfaces similar to those used in TCP programming

⇒ Aim: take full advantage of RDMA

Example applications: Blockchain and coordination services

[Figure: two replicas, each with an RDMA-based selector, connected through RDMA channels]
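The idea of a TCP-like programming interface on top of RDMA can be sketched as a channel plus a selector that reports which channels have data ready. The following is a toy, in-memory illustration only: `ToyChannel` and `ToySelector` are hypothetical names, not the interfaces from the talk, and a real implementation would post RDMA work requests and poll completion queues where this sketch appends to a queue.

```python
from collections import deque


class ToyChannel:
    """One endpoint of a bidirectional channel between two replicas (illustrative)."""

    def __init__(self, name: str):
        self.name = name
        self.peer = None
        self._inbox = deque()

    @staticmethod
    def pair(name_a: str, name_b: str):
        """Create two connected endpoints."""
        a, b = ToyChannel(name_a), ToyChannel(name_b)
        a.peer, b.peer = b, a
        return a, b

    def write(self, message: bytes) -> None:
        # Placeholder for an RDMA send or write to the remote endpoint.
        self.peer._inbox.append(message)

    def read(self):
        """Return the next pending message, or None if nothing is pending."""
        return self._inbox.popleft() if self._inbox else None

    def readable(self) -> bool:
        return bool(self._inbox)


class ToySelector:
    """Reports which registered channels currently have pending messages."""

    def __init__(self):
        self._channels = []

    def register(self, channel: ToyChannel) -> None:
        self._channels.append(channel)

    def select(self) -> list:
        return [c for c in self._channels if c.readable()]


if __name__ == "__main__":
    at_replica1, at_replica2 = ToyChannel.pair("replica1", "replica2")
    selector = ToySelector()
    selector.register(at_replica2)

    at_replica1.write(b"PREPARE")
    for channel in selector.select():
        print(channel.name, channel.read())
```

Keeping the channel and selector shape familiar from TCP programming would let existing replica code switch transports with few changes, although the interface may need RDMA-specific extensions to exploit one-sided operations.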
