Low-Latency Network-Scalable Byzantine Fault-Tolerant Replication
12th EuroSys Doctoral Workshop (EuroDW 2018)
Ines Messadi, TU Braunschweig, Germany, 2018-04-23
New PhD student (second month) in the distributed systems group
Research area:
Overview
[Figure: clients and replicas; one label reads "Byzantine fault"]
3f + 1 nodes to tolerate f faults
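The bound above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the classic PBFT-style setting in which quorums contain 2f + 1 replicas; the function names are illustrative and do not come from the talk.

```python
def min_replicas(f: int) -> int:
    """Smallest group size that tolerates f Byzantine faults (3f + 1)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1) // 3


def quorum_size(f: int) -> int:
    """Assumed PBFT-style quorum: 2f + 1 out of 3f + 1 replicas."""
    return 2 * f + 1


def quorum_overlap(f: int) -> int:
    """Minimum number of replicas shared by any two quorums."""
    return 2 * quorum_size(f) - min_replicas(f)


for f in range(1, 4):
    print(f"f={f}: n={min_replicas(f)}, quorum={quorum_size(f)}, "
          f"overlap={quorum_overlap(f)}")
```

Under this assumption, any two quorums share at least f + 1 replicas, so at least one replica in the intersection is correct.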
2018-04-23 Ines Messadi, TU Braunschweig, Germany Page 1 Low-Latency Network-Scalable Byzantine Fault-Tolerant Replication
Overview
[Figure: protocol message flow. Labels: Client, Leader, Replica, Byzantine agreement (Pre-prepare, Prepare, Commit), Execution, Voting]
Problem: Agreement latency overhead and message complexity in BFT
Reason: Multiple communication rounds and slow TCP networking
New trend: Availability of modern hardware technologies such as Remote Direct Memory Access (RDMA)
Consequence: Current BFT systems need to be redesigned
→ How can we build a secure, fast, and scalable RDMA-based BFT system?
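To make the message-complexity problem concrete, here is a rough count of the messages exchanged for one agreement instance. It is a sketch that assumes the classic PBFT pattern (leader broadcast for pre-prepare, all-to-all exchange for prepare and commit) and ignores client traffic, batching, and checkpoints; Hybster and other protocols differ in the details.

```python
def agreement_messages(n: int) -> dict:
    """Messages per agreement instance in an assumed PBFT-style protocol."""
    if n < 4:
        raise ValueError("PBFT-style agreement needs at least 4 replicas")
    pre_prepare = n - 1            # leader -> every follower
    prepare = (n - 1) * (n - 1)    # every follower -> every other replica
    commit = n * (n - 1)           # every replica -> every other replica
    return {
        "pre_prepare": pre_prepare,
        "prepare": prepare,
        "commit": commit,
        "total": pre_prepare + prepare + commit,
    }


for f in (1, 2, 3):
    n = 3 * f + 1
    print(f"n={n}: {agreement_messages(n)['total']} messages")
```

Under these assumptions the total is 2n(n - 1), so it grows quadratically with the number of replicas: 24 messages for n = 4 and 84 for n = 7.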
Remote Direct Memory Access (RDMA)
Why RDMA?
Zero-copy data transfer
Reduced CPU usage for communication
→ Low latency and CPU efficiency
[Figure: latency (µs) versus payload size (KB) for TCP, RDMA Send/Recv, and RDMA Read/Write]
Challenges
Different communication mechanisms
Inappropriate design ⇒ unexpectedly poor performance
Security issues
→ Applications must be designed explicitly for RDMA
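The "different communication mechanisms" are the two-sided Send/Recv and the one-sided Read/Write operations named in the plot legend. The toy model below illustrates the difference in semantics only. It runs entirely in one process and uses no RDMA library; `ToyNode` and the `rdma_*` functions are hypothetical names chosen for this sketch.

```python
from collections import deque


class ToyNode:
    """In-process stand-in for a host with an RDMA-capable NIC (no real RDMA)."""

    def __init__(self, region_size: int) -> None:
        self.region = bytearray(region_size)  # memory registered for remote access
        self.posted_recvs = deque()           # buffers posted for incoming sends
        self.completions = deque()            # receive completions seen by the CPU

    def post_recv(self, size: int) -> None:
        """Two-sided: the receiver must provide a buffer in advance."""
        self.posted_recvs.append(bytearray(size))


def rdma_send(dst: ToyNode, payload: bytes) -> None:
    """Two-sided transfer: consumes a posted buffer and notifies the receiver."""
    if not dst.posted_recvs:
        raise RuntimeError("receiver not ready: no receive buffer posted")
    buf = dst.posted_recvs.popleft()
    if len(payload) > len(buf):
        raise ValueError("payload larger than posted buffer")
    buf[: len(payload)] = payload
    dst.completions.append(bytes(buf[: len(payload)]))


def rdma_write(dst: ToyNode, offset: int, payload: bytes) -> None:
    """One-sided transfer: writes into registered memory, no receiver action."""
    if offset < 0 or offset + len(payload) > len(dst.region):
        raise ValueError("write outside registered region")
    dst.region[offset : offset + len(payload)] = payload


def rdma_read(src: ToyNode, offset: int, length: int) -> bytes:
    """One-sided transfer: reads registered memory, no remote action."""
    if offset < 0 or offset + length > len(src.region):
        raise ValueError("read outside registered region")
    return bytes(src.region[offset : offset + length])


replica = ToyNode(region_size=64)
replica.post_recv(16)                 # required before the send below
rdma_send(replica, b"prepare")        # receiver gets a completion
rdma_write(replica, 0, b"commit")     # receiver gets no completion
print(replica.completions[0], rdma_read(replica, 0, 6))
```

In this model the one-sided write leaves no completion on the target, so the target has to discover the data by other means such as polling. Trade-offs of this kind are one reason a protocol designed around TCP sockets may not map directly onto RDMA.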
Observation
Existing BFT protocols need to be redesigned for RDMA
Towards building RDMA-based BFT
Basis BFT protocol: Hybster [Behl et al., EuroSys’17]
Building an RDMA-tailored BFT protocol
Investigating RDMA communication tradeoffs
Countermeasures for the resilient use of RDMA
Preliminary approach
Build interfaces on top of RDMA that resemble the TCP programming model
⇒ Aim: take full advantage of RDMA
Example applications: blockchain and coordination services
[Figure: two replicas, each with an RDMA-based selector, connected by RDMA channels]
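The channel-and-selector idea can be sketched as a small interface in the style of selector-based TCP programming: register channels, then ask which ones have data. Everything below is hypothetical. `ToyChannel` and `ToySelector` are invented names, and an in-memory queue stands in for an RDMA connection, so the sketch shows only the shape of the interface, not the talk's implementation.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


class ToyChannel:
    """Hypothetical channel; a queue stands in for an RDMA connection."""

    def __init__(self) -> None:
        self._inbox: Deque[bytes] = deque()
        self.peer: Optional["ToyChannel"] = None

    @staticmethod
    def pair() -> Tuple["ToyChannel", "ToyChannel"]:
        """Create two connected endpoints."""
        a, b = ToyChannel(), ToyChannel()
        a.peer, b.peer = b, a
        return a, b

    def write(self, data: bytes) -> None:
        if self.peer is None:
            raise RuntimeError("channel is not connected")
        self.peer._inbox.append(bytes(data))

    def read(self) -> Optional[bytes]:
        """Non-blocking read: returns None when nothing has arrived."""
        return self._inbox.popleft() if self._inbox else None

    def readable(self) -> bool:
        return bool(self._inbox)


class ToySelector:
    """Hypothetical selector: reports which registered channels have data."""

    def __init__(self) -> None:
        self._channels: Dict[ToyChannel, str] = {}

    def register(self, channel: ToyChannel, name: str) -> None:
        self._channels[channel] = name

    def select(self) -> List[str]:
        return [name for ch, name in self._channels.items() if ch.readable()]


# One replica listens on channels from two peers.
r1_side, from_r1 = ToyChannel.pair()
r3_side, from_r3 = ToyChannel.pair()
selector = ToySelector()
selector.register(from_r1, "replica-1")
selector.register(from_r3, "replica-3")

r1_side.write(b"PREPARE")
ready = selector.select()
message = from_r1.read()
print(ready, message)
```

Keeping the interface close to the TCP one would let protocol code stay largely unchanged, while the channel implementation decides which RDMA operations to use underneath.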