
SLIDE 1

A Case Study in Reliability-Aware Design: A Resilient LDPC Decoder Architecture

Norbert Wehn

MPSoC 07 Awaji Island, Hyogo, Japan June 2007

This work was partially sponsored by the BMBF Initiative „Autonome Integrierte Systeme“

Microelectronic System Design Research Group, University Kaiserslautern, www.eit.uni-kl.de/wehn

Motivation

  • Extreme static & dynamic variations will result in unreliable components
  • How to build reliable systems with an unreliable „physical layer“?
  • Resilient architectures tolerating variability and sporadic errors

[Figure labels: α-particle, cosmic neutrons]

SLIDE 2

Case Study: LDPC Decoder

  • Emerging Killer Applications
    – Recognition, Mining, Synthesis (RMS)
    – Probabilistic belief propagation algorithms
  • LDPC decoding representative for RMS algorithms
    – Hot topic in wireless communications (WiMAX, DVB-S2, WiFi, space applications)
    – High throughput, low latency requirements, large flexibility
  • Communication and memory centric architecture
  • Sources of unreliability
    – E.g. timing errors in communication network due to crosstalk and voltage noise
    – E.g. soft errors in memories and communication network

Goal: Increase LDPC decoder reliability for a given system performance with minimum hardware overhead and throughput degradation

Error Resilient LDPC Decoder

  • Large design space for resilient architectures
    – Spatial and time redundancy, e.g. TMR (space), ARQ (time)
    – Error detection/correction codes, e.g. CRC, Hamming codes
  • Application resilience (probabilistic & iterative)

[Figure labels: Application, LDPC Decoding, Algorithm, Architecture, Subblocks, Unreliable physical layer (transistor, circuit)]

SLIDE 3

Algorithm/Architecture/EDC Codesign

  • ALGORITHM: investigation w.r.t. fault-tolerance, error sensitivity, e.g.
    – Single/two phase belief propagation, layered belief propagation algorithms
    – Sum-Product, 3-min, Min-Sum
  • ARCHITECTURE: select robust architecture, e.g.
    – Single-Phase, Two-Phase
    – Sign-magnitude, 2K (two's complement)
    – Critical signals
  • SUBBLOCK: identify „reliability sensitivity“ for each subblock
    – Select appropriate technique for each subblock to increase SYSTEM reliability

All steps are strongly interrelated!
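As a point of reference for the algorithm options listed above, here is a minimal sketch of a Min-Sum check node update, the simplest of the three named variants. It is the textbook formulation, not the decoder's actual implementation; the function name and the use of floating-point messages are illustrative assumptions.

```python
from typing import List


def min_sum_check_node(incoming: List[float]) -> List[float]:
    """Textbook Min-Sum check node update (illustrative, not the UKL design).

    For each edge, the outgoing magnitude is the minimum magnitude over all
    OTHER incoming messages, and the outgoing sign is the product of all
    OTHER incoming signs.
    """
    if len(incoming) < 2:
        raise ValueError("a check node needs at least two incoming messages")

    signs = [-1 if m < 0 else 1 for m in incoming]
    mags = [abs(m) for m in incoming]

    total_sign = 1
    for s in signs:
        total_sign *= s

    # Only the two smallest magnitudes are needed to serve every edge.
    order = sorted(range(len(mags)), key=lambda i: mags[i])
    min1_idx, min1, min2 = order[0], mags[order[0]], mags[order[1]]

    outgoing = []
    for i in range(len(incoming)):
        mag = min2 if i == min1_idx else min1
        # Removing an edge's own sign from the product equals multiplying by it.
        outgoing.append(total_sign * signs[i] * mag)
    return outgoing
```

Note that the result splits cleanly into a sign part and a magnitude part, which is what makes a sign-magnitude data representation a natural fit for the check node.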

UKL LDPC Decoder Implementations

LDPC Code        | DVB-S2                       | WiMax (802.16e) | U-S LDPC (UWB)  | WiFi (802.11n)
Codeword Size    | 64800                        | 576-2304        | 9600            | 648, 1296, 1944
Code Rate        | 1/4-9/10                     | 1/2-5/6         | 3/4             | 1/2-5/6
Parallelism      | 90 / 360                     | 24-96           | 80              | 27-81
Quantization     | 6 bit                        |                 |                 |
Algorithm        | 3-Min                        |                 | MinSum+MSF/Lay. |
Architecture     | 1-phase                      | Combined        | Layered         | 1-phase
Area [mm²]       | 65nm @ 400 MHz               |                 | @ 528 MHz       |
  VNP            | 0.130 / 0.217                | 0.110           | 0.212 (VNP+CNP) | 0.096
  CNP            | 0.328 / 1.200                | 0.470           |                 | 0.395
  Network        | 0.046 / 0.270                | 0.206           | 0.027           | 0.065
  Memory         | 3.357 / 4.428                | 0.551           | 0.265           | 0.467
  Overall Area   | 3.86 / 6.11                  | 1.33            | 0.50            | 1.02
Net Throughput   | 60-708 Mbps / 0.23-2.68 Gbps | 48-333 Mbps     | 1.63 Gbps       | 54-281 Mbps
Infobit/Cycle    | 0.15-1.77 / 0.58-6.70        | 0.12-0.83       | 3.08            | 0.14-0.70
Latency          | 270-82 µs / 69-21 µs         | 6.0-5.7 µs      | 4.4 µs          | 6.0-5.8 µs
Max. Efficiency  | 183 / 430 Mbps/mm²           | 250 Mbps/mm²    | 3.2 Gbps/mm²    | 274 Mbps/mm²

Max. Iterations: 50-15, 25-20, 25-20, 7
PN branch

  • Selected WiMax Standard as case study
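The Max. Efficiency figures appear to be peak net throughput divided by overall area, and Infobit/Cycle appears to be net throughput divided by clock frequency. That relationship is inferred from the numbers, not stated on the slide. The short sketch below checks it against the DVB-S2 (parallelism 90) and WiMax figures; the function names are illustrative.

```python
def area_efficiency(throughput_mbps: float, area_mm2: float) -> float:
    """Throughput per silicon area in Mbps/mm^2."""
    return throughput_mbps / area_mm2


def infobits_per_cycle(throughput_mbps: float, clock_mhz: float) -> float:
    """Information bits delivered per clock cycle."""
    return throughput_mbps / clock_mhz


# Peak net throughput and overall area as quoted for each decoder.
dvbs2_eff = area_efficiency(708, 3.86)       # slide lists 183 Mbps/mm^2
wimax_eff = area_efficiency(333, 1.33)       # slide lists 250 Mbps/mm^2
dvbs2_ipc = infobits_per_cycle(708, 400)     # slide lists 1.77 at 400 MHz

print(f"{dvbs2_eff:.0f} {wimax_eff:.0f} {dvbs2_ipc:.2f}")
```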

SLIDE 4

Single-Phase 3-Min Algorithm

[Block diagram labels: Controller, Channel RAM, MSG RAM, Sum RAM 1, Sum RAM 2, VFUs, CFUs, Permutation Network Π, Permutation Network Π-1]

Soft Errors in Memories

[Block diagram labels: Controller, Channel RAM, MSG RAM, Sum RAM 1, Sum RAM 2, ENC2, ED/PUN(0), VFUs, CFUs, Permutation Network Π, Permutation Network Π-1]

  • Encoding (ENC2): MSB of channel values doubled
  • Error detection: Comparison of the MSB
  • Error correction: Puncturing, i.e. channel values are set to 0 (algorithmic fault tol.)
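The encode, detect and puncture steps for the channel memory can be modelled in a few lines. This is a minimal sketch under stated assumptions: a 6-bit word whose MSB is the sign bit, stored as a 7-bit word with the MSB copy on top. The bit layout and the names `enc2` and `ed_pun0` are illustrative, not taken from the slides.

```python
WIDTH = 6  # assumed channel value width in bits


def enc2(word: int) -> int:
    """Store a WIDTH-bit word with its MSB duplicated as an extra top bit."""
    assert 0 <= word < (1 << WIDTH)
    msb = (word >> (WIDTH - 1)) & 1
    return (msb << WIDTH) | word


def ed_pun0(stored: int) -> int:
    """Error detection plus puncturing on read.

    If the two MSB copies disagree, the value is punctured: replaced by 0,
    which the decoder treats as "no channel information" for that bit.
    """
    copy = (stored >> WIDTH) & 1
    msb = (stored >> (WIDTH - 1)) & 1
    if copy != msb:
        return 0  # PUN(0)
    return stored & ((1 << WIDTH) - 1)


stored = enc2(0b101101)
clean = ed_pun0(stored)                         # no upset: value returned
hit_msb = ed_pun0(stored ^ (1 << (WIDTH - 1)))  # upset in MSB: punctured
hit_lsb = ed_pun0(stored ^ 1)                   # upset in LSB: not detected
```

In this model an upset in a lower bit passes through unnoticed. Only the most significant bit is protected, and the remaining errors are left to the inherent tolerance of the iterative algorithm.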

SLIDE 5

Channel RAM

MTBF (bit flipping) = 2 ms = 10^15 FIT
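FIT counts failures per 10^9 operating hours, so the conversion from the 2 ms MTBF quoted above is a one-liner. The sketch below works it out; the function name is illustrative.

```python
def mtbf_to_fit(mtbf_seconds: float) -> float:
    """Convert a mean time between failures to FIT (failures per 1e9 hours)."""
    mtbf_hours = mtbf_seconds / 3600.0
    return 1e9 / mtbf_hours


fit = mtbf_to_fit(2e-3)  # one bit flip every 2 ms
print(f"{fit:.1e}")
```

The result is about 1.8 × 10^15 FIT, which matches the order of magnitude given on the slide.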

Message RAM

[Block diagram labels: Controller, Channel RAM, MSG RAM, Sum RAM 1, Sum RAM 2, ENC2, ED/PUN(0), VFUs, CFUs, Permutation Network Π, Permutation Network Π-1]

SLIDE 6

Message RAM

  • Inherent fault tolerance of the belief propagation algorithm

Permutation Network: Soft and Timing Errors

[Block diagram labels: Controller, Permutation Network Π, Permutation Network Π-1, three VFUs and three CFUs, each listed with ENC2 and ED/PUN(0)]

SLIDE 7

Data Representation

  • K2 (two's complement) versus Sign/Magnitude:
    – Sign/Magnitude reduces power and noise
    – Algorithmic fault tolerance
    – Only sign is important!

10^-4: 90000 bits/iteration => 9 bits/iteration
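The closing figure reads as an expected-value estimate. Assuming 10^-4 is a per-bit error probability (an interpretation, not stated explicitly on the slide), 90000 bits passing through the network per iteration yield about 9 corrupted bits per iteration:

```python
def expected_bit_errors(bits_per_iteration: int, bit_error_rate: float) -> float:
    """Expected number of corrupted bits per decoder iteration."""
    return bits_per_iteration * bit_error_rate


errors = expected_bit_errors(90_000, 1e-4)
print(round(errors))
```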

Permutation Networks

[Block diagram labels: Permutation Network Π, CFU_i, sign_i, magn_i, par_i, rst, XOR, bus widths 5 and 6]

  • Encoding
    – Sign bit doubled, toggle redundant sign every clock cycle (timing errors)
  • Error detection and correction
    – Error in input message: all output messages of this check node are set to 0
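One way to read the encoding above: the redundant sign is XORed with a phase bit that toggles every clock cycle, so a message that arrives one cycle late carries the wrong phase and fails the comparison. The sketch below models that reading. The tuple layout, the `phase` parameter and all function names are assumptions for illustration, and the outgoing messages are passed in precomputed to keep the example short.

```python
from typing import List, Tuple

Message = Tuple[int, int, int]  # (sign, magnitude, redundant sign)


def encode(sign: int, magn: int, phase: int) -> Message:
    """Attach a redundant sign copy, inverted on odd clock phases."""
    return (sign, magn, sign ^ phase)


def sign_ok(msg: Message, phase: int) -> bool:
    """Compare the sign with its redundant copy for the current phase."""
    sign, _, par = msg
    return (par ^ phase) == sign


def check_node_outputs(inputs: List[Message], phase: int,
                       outputs: List[int]) -> List[int]:
    """Zero every output of this check node if any input fails the check."""
    if all(sign_ok(m, phase) for m in inputs):
        return outputs
    return [0] * len(outputs)


phase = 1
good = [encode(0, 5, phase), encode(1, 3, phase)]
# First message was launched in the previous cycle (stale phase).
stale = [encode(0, 5, phase ^ 1), encode(1, 3, phase)]
```

With `good` inputs the outputs pass through unchanged; with `stale` inputs both outputs are forced to 0, which the decoder treats as erased messages.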

SLIDE 8

Permutation Networks

Check Nodes

[Block diagram labels: Controller, three control blocks, VFUs, CFUs, Permutation Network Π, Permutation Network Π-1, magn calc, sign calc, ED, EDC, PUN(0)]

  • Encoding: sign calculation doubled / controller tripled
  • Error correction
    – Message puncturing: reuse the PUN unit that handles errors in the permutation network
    – Controller: 2 out of 3 voter
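A 2-out-of-3 voter over the tripled controller outputs is a bitwise majority function. A minimal sketch, assuming the three controller copies produce equal-width control words (the function name is illustrative):

```python
def vote_2_of_3(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant control words."""
    return (a & b) | (a & c) | (b & c)


# One copy disturbed in two bit positions: the other two outvote it.
voted = vote_2_of_3(0b1010, 0b1010, 0b0110)
```

Each output bit follows whichever value at least two copies agree on, so a fault confined to a single controller copy is masked.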

SLIDE 9

Check Nodes

Putting all together

SLIDE 10

Unit                                               | LDPC Decoder | Resilient LDPC Decoder
Message RAM                                        | 0.25         | 0.25
Channel RAM                                        | 0.07         | 0.09
Sum RAM                                            | 0.21         | 0.25
Permutation Networks                               | 0.21         | 0.25
CFUs                                               | 0.43         | 0.55
VFUs (without RAM)                                 | 0.11         | 0.11
Controller (including address and permutation ROM) | 0.03         | 0.09
Total Area [mm²]                                   | 1.31         | 1.59

Overhead ~20 %

  • WiMAX LDPC code decoder, parallelism degree 96
  • Synthesis with 65 nm standard cell library @ 400 MHz
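The ~20 % overhead can be reproduced from the per-unit areas. The sketch below assumes the values map to the units in the order both are listed on the slide; under that assumption each column sums to the quoted total (1.31 and 1.59 mm²).

```python
# Areas in mm^2, assigned to units in slide order (an assumption).
baseline = {
    "Message RAM": 0.25, "Channel RAM": 0.07, "Sum RAM": 0.21,
    "Permutation Networks": 0.21, "CFUs": 0.43, "VFUs": 0.11,
    "Controller": 0.03,
}
resilient = {
    "Message RAM": 0.25, "Channel RAM": 0.09, "Sum RAM": 0.25,
    "Permutation Networks": 0.25, "CFUs": 0.55, "VFUs": 0.11,
    "Controller": 0.09,
}

total_base = sum(baseline.values())
total_res = sum(resilient.values())
overhead = (total_res - total_base) / total_base

# Per-unit increase, to see where the extra area goes.
delta = {unit: resilient[unit] - baseline[unit] for unit in baseline}

print(f"{total_base:.2f} {total_res:.2f} {overhead:.1%}")
```

Under this assignment the overhead is about 21 %, with the CFUs contributing the largest absolute increase and the controller tripling in size, consistent with the triplicated control logic.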

Conclusion

  • Continuous CMOS scaling
    – Resilient architectures become mandatory
  • Increase of system reliability
    – All levels of abstraction have to be considered
    – Application has to be understood
  • Algorithm/Architecture/EDC Codesign
    – Wireless communication is a good example