Deadlock-Recovery Support for Fault-tolerant Routing Algorithms in 3D-NoC Architectures


MCSoC-13 National Institute of Informatics, Tokyo, Japan, September 26-28, 2013 Deadlock-Recovery Support for Fault-tolerant Routing Algorithms in 3D-NoC Architectures Akram Ben Ahmed, Achraf Ben Ahmed, Abderazek Ben Abdallah The University of


SLIDE 1

Deadlock-Recovery Support for Fault-tolerant Routing Algorithms in 3D-NoC Architectures

Akram Ben Ahmed, Achraf Ben Ahmed, Abderazek Ben Abdallah
The University of Aizu, Graduate School of Computer Science and Engineering, Adaptive Systems Laboratory, Aizu-Wakamatsu, Japan
Email: d8141104@u-aizu.ac.jp

MCSoC-13 National Institute of Informatics, Tokyo, Japan, September 26-28, 2013

Adaptive systems lab The University of Aizu 1

SLIDE 2

Outline

  • Background
  • Motivation and goal
  • Look-Ahead-Fault-Tolerant routing
  • RAB mechanism for deadlock-recovery
  • Evaluation
  • Conclusion and future work


SLIDE 4

Background: 3D-NoC systems

  • 2D-NoC limitations:
    – Large diameter
    – High power
  • 3D-NoC merits:
    – High scalability
    – Low interconnect power
    – Heterogeneous integration

[Figure: Typical 3D-NoC structure (3D-OASIS-NoC)*, drawn along the X, Y, and Z axes, with dimensions annotated as 200 µm** and 4 mm**]

* A. Ben Ahmed and A. Ben Abdallah, "Architecture and Design of High-throughput, Low-latency, and Fault-Tolerant Routing Algorithm for 3D-Network-on-Chip (3D-NoC)," The Journal of Supercomputing, Apr. 2013, DOI: 10.1007/s11227-013-0940-9.
** S. Lakhani, Y. Wang, A. Milenkovic, and V. Milutinovic, "2-D matrix multiplication on a 3-D systolic array," Microelectron. Journal, vol. 27, no. 1, pp. 11–22, Feb. 1996.

SLIDE 5

Background: Fault Tolerance

  • 3D-NoC systems are complex and susceptible to a variety of faults, which can be caused by different factors:
    – Physical damage
    – Crosstalk
    – Thermal power, etc.
  • Fault types: permanent, transient, and intermittent.
  • Faults can cause information corruption or failure of the entire system.

SLIDE 6

Background: Fault Tolerant Routing Algorithm

[Figure: 3D mesh with nodes addressed 000 to 233, drawn along the X, Y, and Z axes]

  • Fault-tolerant routing algorithms can be used to redirect flits to non-faulty links.
  • Since fault-tolerant routing algorithms are adaptive, the deadlock problem is one of the main concerns.

SLIDE 7

Background: Deadlock (Example)

[Figure: flits moving between routers 200 to 203, 210 to 212, and 222; legend: permanent faulty link, valid link]

SLIDE 8

Background: Deadlock (Example)

[Figure: next step of the same example; legend: permanent faulty link, valid link]

SLIDE 9

Background: Deadlock (Example)

[Figure: final step of the same example; four flits are marked BLOCK and the resulting state is marked DEADLOCK; legend: permanent faulty link, valid link]
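
The deadlock in this example can be read as a ring of blocked flits, each waiting for a buffer held by the next one. The Python sketch below is an illustration under assumptions, not the authors' detection logic: the wait-for relation among routers 201, 211, 212, and 202 is hypothetical (the slide only shows four BLOCK labels), and `find_cycle` is a name introduced here.

```python
def find_cycle(waits_for):
    """Return the routers forming a wait-for cycle, or None.

    waits_for maps each blocked router to the router whose buffer it
    is waiting on (hypothetical relation, at most one per router).
    """
    for start in waits_for:
        path = []
        seen = set()
        node = start
        # Follow the chain of "waits for" edges from this start node.
        while node in waits_for and node not in seen:
            seen.add(node)
            path.append(node)
            node = waits_for[node]
        if node in seen:
            # The chain returned to a visited node: keep the cyclic part.
            return path[path.index(node):]
    return None


# Assumed dependency ring among the four routers of the example.
blocked = {201: 211, 211: 212, 212: 202, 202: 201}
cycle = find_cycle(blocked)

# Removing one dependency (for instance by serving another flit first)
# breaks the ring, which is the idea behind deadlock recovery.
recovered = dict(blocked)
del recovered[212]
no_cycle = find_cycle(recovered)
```

With the assumed ring, `cycle` lists all four routers, and `no_cycle` is `None` once a single dependency is removed.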

SLIDE 10

Background: Virtual Channels

[Figure: routers 201, 211, 212, and 202 with queued flits labeled by destination (Dest 200, 201, 202, 203, 211, 212, 222); legend: permanent faulty link, valid link]

SLIDE 11

Background: Virtual Channels

[Figure: next animation step of the same virtual-channel example]

SLIDE 12

Background: Virtual Channels

[Figure: next animation step of the same virtual-channel example]

SLIDE 13

Background: Virtual Channels

[Figure: next animation step; remaining flits labeled Dest 201, Dest 202, Dest 211, Dest 212]

SLIDE 14

Background: Virtual Channels

[Figure: final animation step; remaining flits labeled Dest 201, Dest 202, Dest 211, Dest 212]


SLIDE 16

Motivation and Goal

  • Previously, we presented a high-throughput fault-tolerant routing algorithm named Look-Ahead-Fault-Tolerant (LAFT).
  • LAFT is an adaptive routing algorithm that takes advantage of look-ahead routing to enhance system performance while guaranteeing fault tolerance.
  • However, LAFT is susceptible to deadlock.

SLIDE 17

Motivation and Goal

  • Virtual Channels (VCs) are used in most systems to solve the deadlock problem, but they:
    – Are expensive to implement
    – Require additional clock cycles for arbitration
  • We present the Random-Access-Buffer (RAB) mechanism to solve the deadlock problem at very low cost.


SLIDE 19

Look-Ahead-Fault-Tolerant: Example

[Figure: mesh showing the source node (S), current node (C), next node (N), and destination node (D), with the current out-port and next out-port marked; legend: faulty link, valid link]

1- The current out-port is read from the flit and the next-node address is computed.

SLIDE 20

Look-Ahead-Fault-Tolerant: Example

2- The three possible directions are calculated: North, East, and Up.

SLIDE 21

Look-Ahead-Fault-Tolerant: Example

3- When the link status of the three directions is verified, two possible directions remain: North and Up (East is faulty).

SLIDE 22

Look-Ahead-Fault-Tolerant: Example

4- When the diversity value of each direction is calculated, North has the highest one: North = 3 (North, East, and Up); Up = 2 (North and East).

SLIDE 23

Look-Ahead-Fault-Tolerant: Example

5- North is selected as the next out-port and is embedded in the flit to be used at the next downstream node.
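
Steps 2 to 5 of the example can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the LAFT hardware: the axis assignment in `MOVES`, the coordinates, and all function names are introduced here, and the diversity value is interpreted as the number of usable minimal directions at the neighbor a direction leads to, which is one reading of the slide.

```python
# Unit moves on a 3D mesh; the axis assignment is an assumption of this sketch.
MOVES = {
    "East": (1, 0, 0), "West": (-1, 0, 0),
    "North": (0, 1, 0), "South": (0, -1, 0),
    "Up": (0, 0, 1), "Down": (0, 0, -1),
}


def step(node, direction):
    """Return the neighbor of `node` reached through `direction`."""
    return tuple(n + m for n, m in zip(node, MOVES[direction]))


def minimal_directions(node, dest):
    """Directions that bring `node` one hop closer to `dest`."""
    result = []
    for name, move in MOVES.items():
        for axis in range(3):
            # A move is minimal if it points toward dest on its own axis.
            if move[axis] != 0 and (dest[axis] - node[axis]) * move[axis] > 0:
                result.append(name)
    return result


def usable_directions(node, dest, faulty):
    """Minimal directions whose outgoing link at `node` is not faulty."""
    return [d for d in minimal_directions(node, dest) if (node, d) not in faulty]


def select_out_port(node, dest, faulty):
    """Pick the usable direction with the highest diversity value."""
    diversity = {
        d: len(usable_directions(step(node, d), dest, faulty))
        for d in usable_directions(node, dest, faulty)
    }
    best = max(diversity, key=diversity.get) if diversity else None
    return best, diversity


node, dest = (0, 0, 0), (2, 2, 1)      # hypothetical coordinates
faulty = {((0, 0, 0), "East")}         # the East link is faulty, as in step 3
port, diversity = select_out_port(node, dest, faulty)
```

With these hypothetical coordinates, the candidates are North, East, and Up, East is filtered out, and the diversity values come out as North = 3 and Up = 2, so `port` is `"North"`, matching the numbers on the slides.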


SLIDE 25

Random-Access-Buffer mechanism: Architecture

[Figure: input-port block diagram with a FIFO manager, a RAB manager (RAB_cntrl), and a Timer around the input buffer; signals shown: data_in, data_out, Next_port, Wr_adr, Rd_adr, RAB_Wr_adr, RAB_Rd_adr, Select_Wr, Select_Rd, head, tail, sw_grnt, deadlock_flag; example buffer contents: P2 South, P1 North, P1 North]

  • FIFO manager: manages the input buffer when no deadlock is detected.
  • Timer: if a flit's request is not served after a period of time, a flag is issued.
  • RAB manager: when it receives the flag, it drops the request of the blocking flit and searches for another one.

SLIDE 26

Random-Access-Buffer mechanism: Example

[Figure: input buffer holding P3 Up, P2 East, P1 North, P1 North; RAB cntrl with Wrt_adr, Rd_adr, data-out, and Next-port signals; sw-gr = 0; Next-port = North; status register = 0 0 0 0]

  • Status-register: used to keep the status of the blocking flits.

SLIDE 27

Random-Access-Buffer mechanism: Example

[Figure: same state as the previous slide; sw-gr = 0; Next-port = North; status register = 0 0 0 0]

  • The Timer signals that the flit being processed did not get the grant and is blocked.

SLIDE 28

Random-Access-Buffer mechanism: Example

[Figure: same buffer contents; sw-gr = 0; Next-port = North; register values shown: 0 0 0 0 and 0 0 1 1]

  • The request is dropped and the status-register is updated for the entire packet.
  • RAB cntrl reads the Next-port of the next packet.

SLIDE 29

Random-Access-Buffer mechanism: Example

[Figure: same buffer contents; sw-gr = 1; Next-port = East; register values shown: 0 0 0 0 and 0 0 1 1]

  • The next flit is checked and served.

SLIDE 30

Random-Access-Buffer mechanism: Example

[Figure: input buffer holding P3 Up, P1 North, P1 North; sw-gr = 1; Next-port = Up; status register = 0 0 1 1]

  • When assigning the Wrt-adr, the RAB cntrl checks the status register and assigns an unoccupied slot to avoid flit overwriting.

SLIDE 31

Random-Access-Buffer mechanism: Example

[Figure: input buffer shown holding P1 North, P1 North, and then P5 West, P4 South, P1 North, P1 North; sw-gr = 1; Next-port = North; register values shown: 0 0 0 0 and 0 0 1 1]

  • Slots are freed and the input buffer can host new incoming flits.

SLIDE 32

Random-Access-Buffer mechanism: Example

[Figure: input buffer shown holding P5 West, P4 South, P1 North, P1 North, and then P5 West, P4 South; sw-gr = 1; Next-port = South; register values shown: 0 0 1 1 and 0 0 0 0]

  • The previously blocking packet is given the grant and served.
  • The Status-register is updated.
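
The behavior walked through in this example can be approximated with a small software model. The Python sketch below is a toy under stated assumptions, not the RTL from the presentation: the class name, the `timeout` parameter, the slot ordering, and the policy of clearing the status register once only blocked flits remain are all introduced here for illustration.

```python
class RandomAccessBuffer:
    """Toy model of an input buffer with a status register and a timer."""

    def __init__(self, depth=4, timeout=3):
        self.slots = [None] * depth   # each slot: (packet_id, out_port) or None
        self.status = [0] * depth     # 1 marks a flit of a blocked packet
        self.timeout = timeout        # cycles to wait before raising the flag
        self.timer = 0

    def write(self, packet_id, out_port):
        """Store a flit in the first unoccupied slot; return False if full."""
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = (packet_id, out_port)
                return True
        return False

    def _candidate(self):
        """Lowest-indexed occupied slot that is not marked as blocked."""
        for i, slot in enumerate(self.slots):
            if slot is not None and self.status[i] == 0:
                return i
        return None

    def cycle(self, granted_ports):
        """Advance one cycle; return the flit that was sent, or None."""
        i = self._candidate()
        if i is None:
            # Only blocked flits remain: clear the marks and retry later
            # (assumed policy, not stated on the slides).
            self.status = [0] * len(self.slots)
            self.timer = 0
            return None
        packet_id, port = self.slots[i]
        if port in granted_ports:
            self.slots[i] = None
            self.timer = 0
            return (packet_id, port)
        self.timer += 1
        if self.timer >= self.timeout:
            # Deadlock flag: drop the request and mark the whole packet.
            for j, slot in enumerate(self.slots):
                if slot is not None and slot[0] == packet_id:
                    self.status[j] = 1
            self.timer = 0
        return None


rab = RandomAccessBuffer(depth=4, timeout=2)
for pid, port in [("P1", "North"), ("P1", "North"), ("P2", "East"), ("P3", "Up")]:
    rab.write(pid, port)

# North is never granted here; East and Up are free.
sent = [rab.cycle({"East", "Up"}) for _ in range(4)]
status_after = list(rab.status)

rab.cycle({"North"})            # nothing unmarked is left: marks are cleared
retry = rab.cycle({"North"})    # the previously blocking packet is served
```

In this run, the two P1 flits are marked after the timeout, P2 and P3 are served around them, and P1 is sent once North is granted. A plain FIFO would have stalled behind P1 for the whole period.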


SLIDE 34

Evaluation: Evaluation methodology

  • We evaluate:
    – Hardware complexity
      • Area (ALUTs)
      • Power
      • Speed
    – System performance
      • Latency/flit
      • Throughput
  • Benchmarks:
    – Transpose
    – Uniform
    – Matrix Multiplication
  • We use:
    – Verilog HDL
    – Quartus II ver. 12.0
    – Target device: Stratix III
    – Modelsim ver. 6.5

SLIDE 35

Evaluation: Evaluation parameters

[Table: Simulation configuration]

SLIDE 36

Evaluation: Hardware complexity

[Figure: Hardware complexity results, annotated with the following comparisons]

  • +7.1% vs. LAFT, +20% vs. XYZ
  • Unchanged vs. LAFT, 6% vs. XYZ
  • +3.5% vs. LAFT, +6.3% vs. XYZ

SLIDE 37

Evaluation: Performance (Latency/flit)

Average latency/flit results

Benchmark   XYZ     LAFT (0%)  LAFT (5%)  LAFT (10%)  LAFT (15%)  LAFT (20%)  LAFT+RAB (20%)
Transpose   9700    6250       8050       9450        11300       X           13124
Uniform     101000  68500      86500      X           X           X           104080
Matrix      980     720        825        1090        X           X           1223

  • LAFT with the proposed RAB deadlock-recovery mechanism has almost the same latency/flit as LAFT.
  • At low fault rates, LAFT provides a latency/flit reduction that can reach 35%.
  • The RAB mechanism managed to recover from deadlock, while the previous system fails starting from a 10% fault rate.
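
The "can reach 35%" figure can be checked against the latency/flit results on this slide. The short Python sketch below computes the relative reduction of LAFT at 0% faults with respect to XYZ; the helper name `reduction_percent` is introduced here, and the formula is the usual relative difference, which is assumed rather than stated in the presentation.

```python
# Latency/flit values copied from the results on this slide.
latency = {
    "Transpose": {"XYZ": 9700, "LAFT (0%)": 6250},
    "Uniform": {"XYZ": 101000, "LAFT (0%)": 68500},
    "Matrix": {"XYZ": 980, "LAFT (0%)": 720},
}


def reduction_percent(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


reductions = {
    name: round(reduction_percent(row["XYZ"], row["LAFT (0%)"]), 1)
    for name, row in latency.items()
}
print(reductions)
```

Under this formula, Transpose gives about 35.6%, Uniform about 32.2%, and Matrix about 26.5%, so the largest reduction is in line with the 35% quoted in the bullet.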

SLIDE 38

Evaluation: Performance (Throughput)

Average throughput results

Benchmark   XYZ     LAFT (0%)  LAFT (5%)  LAFT (10%)  LAFT (15%)  LAFT (20%)  LAFT+RAB (20%)
Transpose   14.05   26.15      19.6       14.85       11.65       X           10.1
Uniform     13.15   22.4       13.5       X           X           X           12.58
Matrix      9.8     16.35      16.3       7.65        X           X           6.02

  • LAFT with the proposed RAB deadlock-recovery mechanism has almost the same throughput as LAFT.
  • At low fault rates, LAFT provides a throughput enhancement that can reach 47%.
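
The throughput enhancement depends on which value the difference is normalized by, and the slides do not say which convention was used. The Python sketch below computes both options for LAFT at 0% faults versus XYZ, using the values on this slide; the function names are introduced here for illustration.

```python
# (XYZ, LAFT at 0% faults) throughput pairs copied from this slide.
throughput = {
    "Transpose": (14.05, 26.15),
    "Uniform": (13.15, 22.4),
    "Matrix": (9.8, 16.35),
}


def gain_over_xyz(xyz, laft):
    """Difference normalized by the XYZ baseline, in percent."""
    return 100.0 * (laft - xyz) / xyz


def gain_relative_to_laft(xyz, laft):
    """Difference normalized by the LAFT value, in percent."""
    return 100.0 * (laft - xyz) / laft


over_xyz = {k: round(gain_over_xyz(*v), 1) for k, v in throughput.items()}
relative_to_laft = {
    k: round(gain_relative_to_laft(*v), 1) for k, v in throughput.items()
}
```

For Transpose, the first convention gives about 86.1% and the second about 46.3%. The second is the closer match to the 47% quoted in the bullet, although this does not establish which convention the authors applied.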


SLIDE 40

Conclusion

  • We proposed an efficient deadlock-recovery mechanism named Random-Access-Buffer (RAB).
  • RAB was implemented with a high-throughput fault-tolerant routing algorithm called Look-Ahead-Fault-Tolerant (LAFT).
  • We evaluated its complexity and performance.

SLIDE 41

Conclusion

  • The proposed mechanism provides an average 27% latency/flit reduction and 35% throughput enhancement compared to the XYZ-based system in the absence of faults.
  • At high fault rates, the RAB mechanism guarantees deadlock freedom, while the previous system fails starting from a 10% fault rate.

SLIDE 42

Conclusion

  • The RAB mechanism incurs only a 7.1% area overhead and a 3% power increase, with almost the same speed, compared to the previous LAFT-based architecture with no deadlock support.

SLIDE 43

Future work

  • Use larger benchmarks to illustrate the real behavior of the proposed mechanism.
  • Optimize the area and power overhead.
  • Extend RAB to detect and recover from different types of faults in the input buffer.

SLIDE 44

Thank you for your attention