Challenges and opportunities in reliability engineering: the big KID - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Challenges and opportunities in reliability engineering: the big KID (Knowledge, Information and Data)

Enrico Zio

Chair on Systems Science and the Energy Challenge, CentraleSupelec, Fondation Electricité de France (EDF), France
Energy Department, Politecnico di Milano, Italy
Aramis Srl, Italy

slide-2
SLIDE 2

2

slide-3
SLIDE 3

Problem statement

[Figure: component condition over time (Normal, Degraded, Failure); failures are prevented by Design for Reliability and by Maintenance]

slide-4
SLIDE 4

7

INDUSTRY

slide-5
SLIDE 5

8

Industry 1-2-3-4

slide-6
SLIDE 6

10

(SMART) Reliability Engineering

slide-7
SLIDE 7

12

The Big KID

slide-8
SLIDE 8

13

Big Knowledge(ID)

slide-9
SLIDE 9

14

Big (K)Information(D)

slide-10
SLIDE 10

15

Big (KI)Data


slide-11
SLIDE 11

16

Application

Can the Big KID become SMART for Reliability Engineering?

slide-12
SLIDE 12

Problem statement

[Figure: component condition over time (Normal, Degraded, Failure); failures are prevented by Design for Reliability and by Maintenance]

slide-13
SLIDE 13

18

SMART Reliability Engineering – component Big KID opportunities

Reliability analysis for Design for Reliability:

From failure modeling to degradation-to-failure modeling

slide-14
SLIDE 14

19

SMART Reliability Engineering – component Big KID opportunities

Reliability analysis for Design for Reliability:

From failure modeling to degradation-to-failure modeling

Integrating physics-of-failure knowledge in reliability models

  • Multi-State Physics-Based Models
slide-15
SLIDE 15

20

Model KID (Knowledge, Information, Data)

Available KID: sufficient failure data, physics knowledge, expert judgment, field data

Models: statistical models of time to failure, stochastic process models, physics-based models, multi-state models

[Figure: reliability versus year for a highly reliable component, reliability axis from 0.994 to 1]

Reliability?

SMART Reliability Engineering – component Challenges

slide-16
SLIDE 16

21

Multi-state physics model of crack development in Alloy 82/182 dissimilar metal weld

Alloy 82/182 dissimilar metal weld of piping in a PWR primary coolant system

Physical laws

SMART Reliability Engineering – component Multi-State Physics-Based Models

slide-17
SLIDE 17

22

[Diagram: multi-state degradation model with states 3, 2, 1 and transition rates λ32, λ21, λ10; labels: Initial state, Failure state, Internal leak]
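A minimal sketch of how such a multi-state chain could be evaluated numerically. It assumes that state 3 is the initial state, that a failure state 0 follows state 1, and that the rates λ32, λ21, λ10 are constant. The function names and the numerical rates in the usage note are hypothetical. In a physics-based model the rates would typically depend on physical variables, which this sketch does not cover.

```python
import math

def state_probabilities(rates, t, steps=20000):
    """Euler integration of the forward equations for the chain 3 -> 2 -> 1 -> 0.

    rates = (lam32, lam21, lam10), assumed constant (an assumption of this sketch).
    Returns [p3, p2, p1, p0] at time t, starting from state 3.
    """
    lam32, lam21, lam10 = rates
    p = [1.0, 0.0, 0.0, 0.0]
    dt = t / steps
    for _ in range(steps):
        d3 = -lam32 * p[0]
        d2 = lam32 * p[0] - lam21 * p[1]
        d1 = lam21 * p[1] - lam10 * p[2]
        d0 = lam10 * p[2]
        p = [p[0] + dt * d3, p[1] + dt * d2, p[2] + dt * d1, p[3] + dt * d0]
    return p

def reliability(rates, t):
    """Closed-form probability of not yet being in state 0 (distinct rates only)."""
    total = 0.0
    for i, li in enumerate(rates):
        coeff = 1.0
        for j, lj in enumerate(rates):
            if j != i:
                coeff *= lj / (lj - li)
        total += coeff * math.exp(-li * t)
    return total
```

For example, `reliability((0.02, 0.05, 0.1), 10.0)` and `1 - state_probabilities((0.02, 0.05, 0.1), 10.0)[3]` should agree closely. The closed form is the survival function of a sum of three exponential sojourn times.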

22

SMART Reliability Engineering – component Opportunities

Degradation process and random shock process

Random shocks; dependencies in degradation processes
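One way to picture a degradation process combined with random shocks is the simulation sketch below. The modelling choices are assumptions made for illustration and are not taken from the slides: linear gradual degradation, shocks arriving roughly as a Poisson process, exponentially distributed shock damage, and a fixed failure threshold. All names and parameters are hypothetical.

```python
import random

def time_to_failure(drift, shock_rate, shock_mean, threshold,
                    dt=1.0, rng=None, t_max=1e6):
    """Simulate one failure time under gradual degradation plus random shocks."""
    rng = rng or random.Random(0)
    level, t = 0.0, 0.0
    while level < threshold and t < t_max:
        t += dt
        level += drift * dt                      # gradual degradation
        # For a short step, the chance of one shock is about shock_rate * dt
        if rng.random() < shock_rate * dt:
            level += rng.expovariate(1.0 / shock_mean)  # shock damage
    return t
```

With `shock_rate=0.0` the failure time reduces to the deterministic value `threshold / drift`. Shocks can only shorten it, because each shock adds non-negative damage.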

slide-18
SLIDE 18

23

23

SMART Reliability Engineering – component Opportunities

Maintenance

Preventive maintenance (a); corrective maintenance (b)

[Figure: degradation process with maintenance actions a and b]

slide-19
SLIDE 19

24

24

SMART Reliability Engineering – component Challenges

Uncertainty

[Diagram: multi-state degradation model with states 3, 2, 1 and transition rates λ32, λ21, λ10; labels: Initial state, Failure state, Internal leak]

Uncertain parameters in degradation models

slide-20
SLIDE 20

25

[Diagram: multi-state degradation model with states 3, 2, 1 and transition rates λ32, λ21, λ10; labels: Initial state, Failure state, Internal leak]

25

SMART Reliability Engineering – component Challenges

Degradation processes

Piecewise-deterministic Markov process (PDMP)

slide-21
SLIDE 21

26

MC simulation; finite-volume scheme

SMART Reliability Engineering – component Challenges

slide-22
SLIDE 22

27

SMART Reliability Engineering – component Big KID opportunities

Reliability analysis for Design for Reliability:

From failure modeling to degradation-to-failure modeling

Integrating physics-of-failure knowledge in reliability models

  • Multi-State Physics-Based Models

And the data?

slide-23
SLIDE 23

31

ADT Procedure

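[Figure: stress versus time, with stress levels s0, s1, s2, times t0, t1, t2 and acceleration models AM 1, AM 2, AM 3]

Acceleration Model: Stress vs. Time

[Figure: performance parameter versus time, showing the performance distribution, the threshold distribution and the life distribution]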

Degradation Model: Degradation vs. Time

Stochastic process or degradation path, Wiener process: $Y(t) = \sigma B(t) + d(S)\,t$

Physical or empirical models, Arrhenius: $d(S) = A\,e^{-E_a/(kS)}$

General testing procedure

  • Theory: assumptions about how things work
  • Design: a blueprint of the procedure
  • Experiment: trial to test the hypothesis
  • Evaluation: assessment of the outcome of the experiment
  • Conclusion: insight about what works, gained from analysis
  • Refinements
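The Wiener degradation model and the Arrhenius relation on this slide can be illustrated with a short simulation sketch. It assumes that the stress S is an absolute temperature in kelvin and that the activation energy is given in eV. The function names and all numerical values are hypothetical.

```python
import math
import random

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant k in eV/K

def arrhenius_rate(A, Ea, S):
    """Degradation rate d(S) = A * exp(-Ea / (k * S)), S in kelvin (assumed)."""
    return A * math.exp(-Ea / (BOLTZMANN_EV * S))

def simulate_path(rate, sigma, dt, n_steps, rng):
    """Sample Y(t) = sigma * B(t) + rate * t on a regular time grid."""
    y, path = 0.0, [0.0]
    for _ in range(n_steps):
        # Brownian increment over dt has standard deviation sqrt(dt)
        y += rate * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(y)
    return path
```

A higher test temperature gives a larger `arrhenius_rate`, which is what makes the test accelerated. Setting `sigma=0.0` recovers the deterministic linear trend.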

slide-24
SLIDE 24

33

Data Analysis

Trend analysis & Accelerability Verification:

[Figure: Degradation vs. Time. Degradation percentage against time (min), linear fit y = 4E-06x + 0.0014, R² = 0.9837]

[Figure: Time vs. Stress. Ln(degradation rate) against Ln(voltage), linear fit y = 13.693x - 37.955, R² = 0.9494]

Degradation Process Model, Degradation Fitting, Parameter Estimation

Degradation process model at stress level $S_l$:

$$Y(t) = \sigma B(t) + d(S_l)\,t + y_0, \qquad d(S_l) = \exp(a + b S_l)$$

Maximum Likelihood:

$$L(a, b, \sigma^2) = \prod_{l=1}^{k}\prod_{i=1}^{n}\prod_{j=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2\,\Delta t_{lij}}} \exp\left(-\frac{\left(\Delta x_{lij} - d(S_l)\,\Delta t_{lij}\right)^2}{2\sigma^2\,\Delta t_{lij}}\right)$$

Degradation fitting: $E[Y(t)] = d(S_l)\,t + y_0$ gives $d(S_l)$ at each stress level.

LSE: $d(S_l) = \exp(a + b S_l) \Rightarrow \ln d(S_l) = a + b S_l$, giving $\hat{a}_{LSE}$ and $\hat{b}_{LSE}$.

$$\hat{\sigma}^2 = \frac{1}{n\,m\,k}\sum_{l=1}^{k}\sum_{i=1}^{n}\sum_{j=1}^{m} \frac{\left(\Delta x_{lij} - \exp(\hat{a} + \hat{b} S_l)\,\Delta t_{lij}\right)^2}{\Delta t_{lij}}$$

Estimated parameters: $a = -36.961$, $b = 13.112$, $\sigma^2 = 8.278 \times 10^{-7}$
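The two estimation steps named on this slide, a least squares fit of the log degradation rate against stress and a variance estimate from the degradation increments, could look roughly as follows. This is a sketch under assumptions: the stress variable is used as it enters the acceleration model (for instance already log-transformed), and the data layout and function names are hypothetical.

```python
import math

def fit_acceleration_lse(stresses, rates):
    """Least squares fit of ln d(S) = a + b * S; returns (a, b)."""
    xs = list(stresses)
    ys = [math.log(r) for r in rates]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def sigma2_estimate(increments, a, b):
    """Diffusion variance estimate from increments given as (S, dx, dt) tuples."""
    total, count = 0.0, 0
    for S, dx, dt in increments:
        expected = math.exp(a + b * S) * dt   # expected increment d(S) * dt
        total += (dx - expected) ** 2 / dt
        count += 1
    return total / count
```

Each increment is normalised by its own time step, so unevenly spaced measurements can be pooled across stress levels.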

Parameter estimation and reliability prediction:

[Figure: predicted reliability R(t) versus time (min), time axis up to 8 × 10^4]
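For a Wiener process with constant drift and a fixed failure threshold, the first-passage time follows an inverse Gaussian distribution, so a reliability curve can be computed in closed form. The sketch below assumes a zero initial level and a deterministic threshold. The slides do not state that the curve shown was produced this way, and the function name and parameters are hypothetical.

```python
import math

def _phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def wiener_reliability(t, drift, sigma, threshold):
    """P(first passage of drift*t + sigma*B(t) above threshold occurs after t)."""
    if t <= 0.0:
        return 1.0
    s = sigma * math.sqrt(t)
    first = _phi((threshold - drift * t) / s)
    tail = _phi(-(threshold + drift * t) / s)
    if tail == 0.0:
        second = 0.0
    else:
        # Work in log space so the large exponential factor cannot overflow
        log_second = 2.0 * drift * threshold / sigma ** 2 + math.log(tail)
        second = math.exp(min(log_second, 0.0))
    return max(0.0, first - second)
```

With `drift=0.01`, `sigma=0.05` and `threshold=1.0` the curve starts at 1 and decreases as t passes the mean crossing time `threshold / drift`.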

slide-25
SLIDE 25

34

Challenges in ADT

  • Degradation trend: the whole trend is defined (linear, exponential, etc.)
  • Aleatory uncertainty: inherent randomness, described by probability
  • Epistemic uncertainty: incomplete knowledge due to limited information, described by intervals, possibility, etc.

Challenges:

  • Traditional methods mainly model the degradation trend and aleatory uncertainty.
  • Failing to consider epistemic uncertainty may cause serious reliability evaluation problems.

Stochastic Process – some revised models:

DBM model; Revised model I*; Revised model II

Degradation model (all three): $Y(t) = d(S)\,t + \sigma_B B(t)$

$d(S) \sim N(\mu, \sigma^2)$; $d(S) \sim N(\mu, \sigma^2)$; $d(S)$: a definite value

Degradation trend, aleatory uncertainty; degradation trend, aleatory uncertainty; degradation trend, aleatory uncertainty, epistemic uncertainty

$\sigma_B B(t)$; $\sigma_B B(t)$ & $d(S)$; $\sigma_B B(t)$ & $d(S)$

slide-26
SLIDE 26

Problem statement

[Figure: component condition over time (Normal, Degraded, Failure); failures are prevented by Maintenance and by Design for Reliability]

slide-27
SLIDE 27

36

SMART Reliability Engineering – component Big KID opportunities

Maintenance: Integrating physics knowledge and data:

  • Prognostics and Health Management (PHM)
slide-28
SLIDE 28

Prognostics and Health Management (PHM)

Maintenance timeline:

  • 1950: Corrective Maintenance
  • 1980: Planned Periodic Maintenance
  • 2000: Condition Based Maintenance (CBM)
  • 2016: Predictive Maintenance (PrM)

PHM is fostered by advancements in:

  • Sensors
  • Algorithms
  • Computation power

slide-29
SLIDE 29

PHM for what?

PHM in support of CBM and PrM

[Figure: sensor measurements from the equipment (e.g. vibration and temperature over time) feed the PHM tasks, whose results go to the decision maker for the maintenance decision (maintenance or no maintenance)]

  • Fault Detection: normal conditions vs. abnormal conditions
  • Fault Diagnostics: anomaly of Type 1, Type 2 or Type 3
  • Fault Prognostics: Remaining Useful Life (RUL)

slide-30
SLIDE 30

41

PHM: why? (Industry)

  • Increase maintainability, availability, safety, operating performance and productivity
  • Reduce downtime, number and severity of failures, and life-time cost

slide-31
SLIDE 31

43

Abnormal Condition

MODEL OF PLANT BEHAVIOR IN NORMAL OPERATION

PHM: how? (Fault detection)

[Figure: real measurements compared with signal reconstructions]

Nominal Range-based, Physics-based, Data-Driven (AAKR, PCA, RNN,…)
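As an illustration of the data-driven route, the sketch below reconstructs a measurement with auto-associative kernel regression (AAKR), one of the methods named on this slide, and uses the distance between measurement and reconstruction as an abnormality indicator. It assumes that the historical normal-operation patterns are stored in memory and that signals are on comparable scales. The function names and the example values are hypothetical.

```python
import math

def aakr_reconstruct(memory, query, bandwidth):
    """Reconstruct `query` as a kernel-weighted average of normal-operation patterns."""
    dist2 = [sum((o - q) ** 2 for o, q in zip(obs, query)) for obs in memory]
    # Shifting by the smallest distance rescales all weights equally and
    # keeps at least one weight equal to 1, which avoids division by zero.
    d_min = min(dist2)
    weights = [math.exp(-(d - d_min) / (2.0 * bandwidth ** 2)) for d in dist2]
    total = sum(weights)
    return [sum(w * obs[k] for w, obs in zip(weights, memory)) / total
            for k in range(len(query))]

def residual_norm(memory, query, bandwidth):
    """Distance between the real measurement and its reconstruction."""
    rec = aakr_reconstruct(memory, query, bandwidth)
    return math.sqrt(sum((q - r) ** 2 for q, r in zip(query, rec)))
```

A measurement consistent with normal operation is reconstructed almost exactly, so its residual is small. A measurement outside the normal patterns is pulled towards them, and the large residual can trigger an alarm once it exceeds a chosen threshold.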

slide-32
SLIDE 32

44

  • Empirical classification methods:
  • Support Vector Machines
  • K-Nearest Neighbours
  • Multilayer Perceptron Neural Networks
  • Supervised clustering algorithms
  • Ensemble of classifiers

  • Signal measurements representative of the fault classes: «x1, x2, …, xn, class»

[Figure: empirical classifier with inputs x1, x2, x3 and output classes C1 = Inner race, C2 = Balls, C3 = Outer race; feature labels: Peak Value, Norm Node 5 Wavelet, Norm Node 14 Wavelet]

PHM: how? (Fault diagnostics)
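Of the empirical classification methods listed on this slide, K-Nearest Neighbours is the simplest to sketch. The example below takes labelled patterns of the form «x1, x2, …, xn, class» and assigns a new pattern to the majority class among its nearest neighbours. The function name and the feature values in the usage note are hypothetical.

```python
from collections import Counter

def knn_classify(training, features, k=3):
    """Classify `features` by majority vote among the k closest training patterns.

    training: list of (feature_vector, class_label) pairs.
    """
    ranked = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], features)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

For instance, with a few labelled patterns per class, `knn_classify(training, (1.05, 0.2, 0.1))` returns the label of the cluster closest to that point. In practice the features would usually be normalised first so that no single signal dominates the distance.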

slide-33
SLIDE 33

45

PHM: how? (Fault prognostics)

Available information:

  • Physics-based model of the degradation process
  • Measurement equation
  • Degrading component: current degradation trajectory, a threshold of failure, external/operational conditions
  • Similar components: degradation trajectories of similar components, life durations of a set of similar components

Model-Based: Particle filter, Kalman Filter, Monte Carlo Simulation

Data-Driven: Hidden Semi-Markov Models, Artificial Neural Networks, Autoregressive (AR) models, Similarity-based methods, Neuro-fuzzy systems
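A very reduced sketch of model-based prognostics: a scalar Kalman filter tracks a constant degradation rate from successive measurements, and the RUL is taken as the time needed to reach the failure threshold at that rate. The linear degradation model, the absence of process noise and all names and parameters are assumptions of this sketch. A realistic application would use the physics-based model and measurement equation of the component.

```python
def kalman_rul(measurements, dt, threshold, meas_var, rate0=0.0, var0=1.0):
    """Estimate the remaining useful life from equally spaced degradation measurements.

    State: degradation rate, assumed constant (no process noise).
    Observation: increment between consecutive measurements divided by dt,
    with variance meas_var.
    """
    rate, var = rate0, var0
    for prev, curr in zip(measurements, measurements[1:]):
        z = (curr - prev) / dt              # observed rate over one step
        gain = var / (var + meas_var)       # Kalman gain
        rate += gain * (z - rate)           # state update
        var *= (1.0 - gain)                 # variance update
    level = measurements[-1]
    if rate <= 0.0:
        return float("inf")                 # no degradation trend detected
    return max(0.0, (threshold - level) / rate)
```

With noise-free measurements growing by 0.1 per step from 0 to 1 and a threshold of 2, the estimate settles close to 10 steps.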

slide-34
SLIDE 34

46

PHM: performance ?

  • Accuracy
slide-35
SLIDE 35

48

  • Accuracy
  • Fault Detection:
      • Low rate of False Alarms
      • Low rate of Missing Alarms

Example: False Alarm Rate = 0.54%, Missing Alarm Rate = 0.98%

Detection Model

Normal Condition

level P …

PHM: performance ? (detection)
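The two detection metrics can be computed from labelled test data as in the sketch below. It assumes that the true condition of every test pattern is known. The function name is hypothetical.

```python
def alarm_rates(is_faulty, alarm_raised):
    """Return (false alarm rate, missing alarm rate) for paired boolean lists."""
    pairs = list(zip(is_faulty, alarm_raised))
    false_alarms = sum(1 for faulty, alarm in pairs if not faulty and alarm)
    missed = sum(1 for faulty, alarm in pairs if faulty and not alarm)
    n_normal = sum(1 for faulty, _ in pairs if not faulty)
    n_faulty = sum(1 for faulty, _ in pairs if faulty)
    far = false_alarms / n_normal if n_normal else 0.0   # alarms on normal data
    mar = missed / n_faulty if n_faulty else 0.0         # faults without alarm
    return far, mar
```

The false alarm rate is normalised by the number of normal patterns and the missing alarm rate by the number of abnormal patterns, so the two rates are independent of how the test set is balanced.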

slide-36
SLIDE 36

49

  • Accuracy
  • Fault diagnostics:
      • Low misclassification rate

[Figure: diagnostic model assigning signals to classes C1, C2, C3; markers distinguish the true class from the diagnostic model output]

Misclassification rate = 2.58%

PHM: performance ? (diagnostics)

slide-37
SLIDE 37

51

  • Accuracy
  • Prognostics

PHM: performance ? (prognostics)

[Figure: RUL(t) versus time (days), comparing the predicted RUL (KF, single model) with the true RUL]

slide-38
SLIDE 38

SMART Reliability Engineering – component Challenges (PHM)

PHM &

1) Context changing
2) Uncertainty management
3) Fleet
4) Return on Investment
5) Safety

slide-39
SLIDE 39

[Figure: environment versus time t, with context changes and the present time marked]

Context changing: concept

slide-40
SLIDE 40

The detection model should be able to follow the process changes:

  • Incremental learning of the new data that gradually become available
  • No need for human intervention in:
      • selecting recent normal operation data
      • building the new model

[Figure: new data arriving in the T-P plane, followed by automatic updating of the model]
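The idea of updating a model of normal operation as new data arrive, without rebuilding it by hand, can be shown with a deliberately simple stand-in: a running mean and standard deviation maintained with Welford's algorithm. This is not the detection model of the slides. The class name and the three-sigma rule are assumptions made for the sketch.

```python
class RunningNormalModel:
    """Incrementally updated mean/std model of a signal in normal operation."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, x):
        """Absorb one new normal-operation value (Welford's update)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        return (self._m2 / (self.n - 1)) ** 0.5 if self.n > 1 else 0.0

    def is_abnormal(self, x, k=3.0):
        """Flag values more than k standard deviations from the current mean."""
        return self.n > 1 and abs(x - self.mean) > k * self.std
```

Each call to `update` costs constant time and needs no stored history, which is what makes automatic updating practical when the process drifts slowly.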

Context changing (fault detection)

Monitoring components of a (e.g. nuclear power) plant

slide-41
SLIDE 41

62

[Figure: degradation indicator versus time t, with the failure threshold and the present time marked]

Context changing (fault prognostics)

slide-42
SLIDE 42

63

[Figure: degradation indicator versus time t, with the failure threshold and the present time marked]

Context changing (prognostics)

slide-43
SLIDE 43

65

65/68 12/02/2015

[Figure: normalized leak flow versus time steps (four hours each), showing target, predicted values, upper bound, lower bound, and the positions of changed and new FVs. New FV: 13; changed FV: 57]

Context changing (fault prognostics)

slide-44
SLIDE 44

SMART Reliability Engineering – component Challenges (PHM)

PHM &

1) Context changing
2) Uncertainty management
3) Fleet
4) Return on Investment
5) Safety

slide-45
SLIDE 45

68

Uncertainty management (prognostics)

Sources of uncertainty:
1) noise on the observations (measurements)

[Figure: seal leakage versus time with the failure threshold, comparing the true leakage with the noisy leakage measurement up to time tp]

slide-46
SLIDE 46

Sources of uncertainty:
1) noise on the observations (measurements)
2) intrinsic stochasticity of the degradation process

[Figure: seal leakage versus time, with the failure threshold and time tp marked]

Uncertainty management (prognostics)

slide-47
SLIDE 47

72

Uncertainty management (prognostics)

Sources of uncertainty:
1) noise on the observations (measurements)
2) intrinsic stochasticity of the degradation process
3) unknown future external/operational conditions
4) modeling errors, i.e. inaccuracy of the prognostic model used to perform the prediction

Uncertainty on the RUL prediction?

[Figure: RUL pdf estimate from the prognostic model compared with the true RUL. With a maximum acceptable failure probability of 5%, the time for maintenance is set so that the probability of a failure between the present time and that time is lower than 5%]
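Given a sample-based estimate of the RUL distribution (for example from a particle filter or Monte Carlo simulation), the latest maintenance time compatible with a maximum acceptable failure probability can be read off as a lower quantile. The sketch below assumes that the RUL uncertainty is represented by equally weighted samples. The function name is hypothetical.

```python
def latest_safe_maintenance_time(rul_samples, max_failure_prob=0.05):
    """Largest sample value t with an empirical P(RUL < t) <= max_failure_prob."""
    ordered = sorted(rul_samples)
    # Number of samples allowed to fail strictly before the chosen time
    allowed = int(max_failure_prob * len(ordered))
    allowed = min(allowed, len(ordered) - 1)
    return ordered[allowed]
```

The returned time is measured from the present time, in the same units as the RUL samples. A wider RUL distribution pushes the time earlier even when the mean RUL is unchanged.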

slide-48
SLIDE 48

SMART Reliability Engineering – component Challenges (PHM)

PHM &

1) Context changing
2) Uncertainty management
3) Fleet
4) Return on Investment
5) Safety

slide-49
SLIDE 49

74

Fleet (fault diagnostics)

  • Can we use data from similar industrial plants of the same fleet to build diagnostic systems?

[Figure: temperature versus time for failures of Class 1, Class 2 and Class 3 at Plant A (near the sea) and Plant B (near a river); for Plant Z (in a very rainy region) the failure classes are marked "?"]

slide-50
SLIDE 50

SMART Reliability Engineering – component Challenges (PHM)

PHM &

1) Context changing
2) Uncertainty management
3) Fleet
4) Return on Investment
5) Safety

slide-51
SLIDE 51

78

Return on Investment (ROI)

  • Most frequently used measure to estimate the economic benefit of PHM:

$$ROI = \frac{\text{Cost avoidance}}{\text{Investment}} - 1$$

Cost avoidance:

  • Cost loss of remaining useful life
  • Cost of repair reduction
  • Cost of reduction in logistics
  • Cost of failures avoided

Investment cost:

  • Costs associated with product manufacturing
  • Development costs
  • Cost of performing necessary analysis
  • Infrastructure costs
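The ROI measure on this slide, cost avoidance divided by investment minus one, is simple enough to state as code. The sketch below sums itemised cost avoidance and investment contributions. The function name, the item names and the figures in the usage note are hypothetical.

```python
def phm_roi(cost_avoidance_items, investment_items):
    """ROI = total cost avoidance / total investment - 1."""
    avoidance = sum(cost_avoidance_items.values())
    investment = sum(investment_items.values())
    if investment <= 0:
        raise ValueError("investment must be positive")
    return avoidance / investment - 1.0
```

For example, a cost avoidance of 150 against an investment of 100 (in any common currency unit) gives an ROI of 0.5. A value below zero means the PHM system avoids less cost than it requires.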

Questions:
1) How can the ROI be reformulated based on these economic benefits so that the ROI framework is general?
2) How will the performance indicators affect the ROI?

slide-52
SLIDE 52

SMART Reliability Engineering – component Challenges (PHM)

PHM &

1) Context changing
2) Uncertainty management
3) Fleet
4) Return on Investment
5) Safety

slide-53
SLIDE 53

80

PHM & safety

Risk: $(p_i, c_i \mid k)_{i=1,\dots,N}$

With PHM: $(p_i^*, c_i^* \mid k^*)_{i=1,\dots,N^*}$

  • Avoided failures thanks to PHM
  • Reduction of unnecessary maintenance interventions (fewer human errors in maintenance)
  • Management of abnormal conditions
  • Missing alarms of the fault detection system
  • Late RUL predictions of the prognostic system
  • Unexpected scenarios

+ PHM System

(Terje Aven, ESRA Webinar, What is Risk, March 17, 2016)

slide-54
SLIDE 54

81

PHM & safety

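[Figure: detail of an accident as an event tree. An initiating event (IE) is followed by System 1 (success S1 or failure F1) and System 2 (success S2 or failure F2), giving the sequences IE*S1*S2, IE*S1*F2 and IE*F1; PHM systems are added to the systems]

Safety?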
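The event tree on this slide can be quantified with a few lines of code. The sketch below assumes that the two systems fail independently and that the effect of a PHM system enters only as a changed failure probability of the system it monitors. Both are simplifying assumptions, and the function name and numbers are hypothetical.

```python
def event_tree(ie_frequency, p_f1, p_f2):
    """Frequencies of the sequences IE*S1*S2, IE*S1*F2 and IE*F1."""
    return {
        "IE*S1*S2": ie_frequency * (1.0 - p_f1) * (1.0 - p_f2),
        "IE*S1*F2": ie_frequency * (1.0 - p_f1) * p_f2,
        "IE*F1": ie_frequency * p_f1,
    }
```

Comparing `event_tree(1e-2, 0.05, 0.1)` with `event_tree(1e-2, 0.02, 0.1)` shows how a lower failure probability for System 1 shifts frequency away from the IE*F1 sequence. The sketch does not capture the possibility, raised on the previous slide, that missing alarms or late RUL predictions of the PHM system add scenarios of their own.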

slide-55
SLIDE 55

82

Conclusions: Big KID and Smart KID

Fuzzy Logic Systems, Optimization Algorithms, FTA, ETA, FMECA, HAZOP, Clustering Algorithms, Graph Theory, Petri Nets, Neural Networks, Bayesian Belief Networks, Complex Network Theory, Monte Carlo Simulation, Process and Stochastic Flowgraphs
slide-56
SLIDE 56

83

Simulation, Modeling, Analysis, Research for Treasuring Knowledge, Information and Data (for Reliability Engineering)

SMART KID

Data Information Knowledge

Conclusions: Smart KID for Reliability Engineering

slide-57
SLIDE 57

84

Conclusions: Smart KID for Reliability Engineering

  • E. Zio, "Some challenges and opportunities in reliability engineering", IEEE Trans on Reliability, 2016

slide-58
SLIDE 58

85

Thanks…

…for your outstanding contributions

slide-59
SLIDE 59

92

Thanks…

…for your attention