Semi-supervised Prediction of Comorbid Rare Conditions Chirag Nagpal - - PowerPoint PPT Presentation

Semi-supervised Prediction of Comorbid Rare Conditions Chirag Nagpal 1 , K Miller 1 , T Pellathy 2 , M Hravnak 2 , G Clermont 2 , M Pinsky 2 , A Dubrawski 1 1 Auton Lab Carnegie Mellon University 2 University of Pittsburgh chiragn@cs.cmu.edu


slide-1
SLIDE 1

Semi-supervised Prediction of Comorbid Rare Conditions

Chirag Nagpal¹, K Miller¹, T Pellathy², M Hravnak², G Clermont², M Pinsky², A Dubrawski¹

¹ Auton Lab, Carnegie Mellon University
² University of Pittsburgh

chiragn@cs.cmu.edu

November 18, 2017

1 / 83

slide-2
SLIDE 2

Overview

1 Background
  • Motivation
  • Prior Work

2 Dataset Description
  • Sources
  • Feature Extraction
  • Ground Truth

3 Approach
  • Baselines
  • PreCoRC: Prediction of Comorbid Rare Conditions

4 Results

5 Future Work

2 / 83

slide-8
SLIDE 8

Motivation

  • Rare conditions are potentially under-reported in EHRs.
  • Prevent Failure-to-Rescue (FTR): the death of a hospitalised patient from a treatable condition.
  • The ability to predict such conditions would allow for proactive healthcare.
  • Challenge: FTR conditions are under-reported, so the available data is too sparse for standard machine learning.

8 / 83

slide-12
SLIDE 12

Motivation

  • Leverage historical EHR data to build an early warning system that identifies patients at risk.
  • Augment scarce ground truth to obtain operationally useful models.
  • Keep the model interpretable by the end user, the medical practitioner.

12 / 83

slide-16
SLIDE 16

Tree Featurization

  • Tree Featurization [Singh et al., 2014]

Explicitly leverage the ICD hierarchy in the feature representation.

Example path up the ICD-9 hierarchy: Influenza (487) → Pneumonia & Influenza (480-488) → Respiratory System (460-519)
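As a rough illustration of the idea (not the implementation from Singh et al.), the sketch below expands each observed ICD-9 code into itself plus its ancestors, so that a record carrying 487 also activates the features for 480-488 and 460-519. The `PARENT` map is a hypothetical, hand-written fragment of the hierarchy that covers only the path shown on the slide.

```python
# Hypothetical fragment of the ICD-9 hierarchy (child -> parent),
# covering only the example path from the slide.
PARENT = {
    "487": "480-488",      # 487 rolls up to Pneumonia & Influenza
    "480-488": "460-519",  # which rolls up to the Respiratory System chapter
}


def with_ancestors(code):
    """Return the code followed by all of its ancestors found in PARENT."""
    path = [code]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def tree_features(codes):
    """Set of active features for a record: every code plus its ancestors."""
    active = set()
    for code in codes:
        active.update(with_ancestors(code))
    return active


features = tree_features(["487"])
print(sorted(features))
```

Two records with different leaf codes under the same chapter then share at least the chapter-level feature, which is what lets a sparse model generalise across rarely seen codes.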

16 / 83

slide-23
SLIDE 23

OoD Embedding Learning

  • Out of Domain Embedding Learning [Liu et al., 2016]

Learn embeddings from external sources for a dense representation of ICD codes.

[Figure: One-Hot Encoding → CBOW → Dense Encoding. External sources: PubMed, PubMed Central Open Access.]

23 / 83

slide-28
SLIDE 28

Feature Extraction

Static Data

1 Age
2 Gender
3 Ethnicity

Admission Data

1 ICD-9 Codes
  • Diagnosis Codes
  • Procedure Codes
  • Admission Codes
2 Diagnosis Related Groups

Aggregated Records

1 $X_{T_n} = \{1, 0, \dots, 0, 1\}$
2 $X'_{T_n} = \sum_{k=1}^{n} X_{T_k}$
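The aggregation on this slide can be read as a running element-wise sum of per-admission indicator vectors. The sketch below assumes that reading; the three-code vocabulary and the admission history are made up for illustration.

```python
from itertools import accumulate

# Hypothetical vocabulary of codes; each admission becomes a multi-hot vector over it.
VOCAB = ["487", "96.04", "415.1"]


def multi_hot(codes):
    """X_{T_n}: 1 where the admission contains the code, else 0."""
    return [1 if c in codes else 0 for c in VOCAB]


def aggregate(history):
    """X'_{T_n} for every n: element-wise running sum of X_{T_1} .. X_{T_n}."""
    vectors = [multi_hot(codes) for codes in history]
    return list(accumulate(vectors, lambda a, b: [x + y for x, y in zip(a, b)]))


# Made-up admission history for one patient, oldest admission first.
history = [{"487"}, {"487", "96.04"}, {"415.1"}]
aggregated = aggregate(history)
print(aggregated)
```

Each aggregated vector summarises everything recorded up to and including that admission, so a code seen on two visits counts twice.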

28 / 83

slide-31
SLIDE 31

Clinical Tasks

  • Intubation & Mechanical Ventilation (Task-IMV)

A treatment scenario occurring in the context of Failure-to-Rescue (FTR) cases. ICD codes: 96.04, 96.71, 96.72, 518.81

  • Venous Thromboembolism (Task-VTE)

Includes patients diagnosed with either pulmonary embolism or deep vein thrombosis; an under-reported, life-threatening condition. ICD codes: 415.1, 451.11, 451.2, 451.81, 453.8
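A minimal sketch of how such ground-truth labels could be derived from coded admissions, assuming an admission counts as positive when it carries any of the listed codes. The matching rule and the sample admissions are assumptions made for illustration, not details stated in the slides.

```python
# ICD-9 code lists from the slide (451.2 written with a decimal point).
TASK_CODES = {
    "IMV": {"96.04", "96.71", "96.72", "518.81"},
    "VTE": {"415.1", "451.11", "451.2", "451.81", "453.8"},
}


def label(admission_codes, task):
    """Return 1 if the admission carries any code defining the task, else 0."""
    return int(bool(TASK_CODES[task] & set(admission_codes)))


# Made-up admissions for illustration.
admissions = [["487", "96.71"], ["415.1"], ["482"]]
imv_labels = [label(a, "IMV") for a in admissions]
vte_labels = [label(a, "VTE") for a in admissions]
print(imv_labels, vte_labels)
```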

31 / 83

slide-36
SLIDE 36

Clinical Tasks

Intubation & Mechanical Ventilation (Task-IMV): 1266 positives ≈ 1.173%
  • Task-IMV-10: uses 10% labelled data
  • Task-IMV-90: uses 90% labelled data

Venous Thromboembolism (Task-VTE): 56 positives ≈ 0.0519%

36 / 83

slide-42
SLIDE 42

Baselines

  • Logistic Regression with ℓ2 Penalty.
  • Random Forest Ensemble
  • Principal Component Analysis
  • Non-Negative Matrix Factorisation

Resulting baseline models: LR, RF, PCA-LR, PCA-RF, NMF-LR, NMF-RF

42 / 83

slide-46
SLIDE 46

PreCoRC Pipeline

[Figure: PreCoRC pipeline. Components shown: Historical Data, Test Data, Binary Classifier, Score, Prior Prediction, ICD-9 Hierarchy, Graph Structure (T-Edges, O-Edges, I-Edges, P-Edges), Label Re-Distribution, Final Prediction.]

46 / 83

slide-50
SLIDE 50

Graph Construction

[Figure: graph with three groups of nodes. Patients: Patient A, Patient B. Records: Record 1 ... Record n for each patient. ICD-9 Ontology: 487 Influenza, 480-488 Pneumonia & Influenza, 460-519 Respiratory Diseases. Edge types: I-Edges, O-Edges, P-Edges, T-Edges.]

50 / 83

slide-57
SLIDE 57

Label Propagation

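Harmonic Energy Minimization [Zhu et al., 2003]

$E(f) = \sum_{i \in L} (y_i - f_i)^2 D_{ii} + \lambda \sum_{i,j} (f_i - f_j)^2 A_{ij}$

Soft Label HEM [Wang et al., 2013]

$E(f) = \sum_{i \in L} (y_i - f_i)^2 D_{ii} + \lambda \left( w_0 \sum_{i \in U} (f_i - \pi_i)^2 D_{ii} + \sum_{i,j} (f_i - f_j)^2 A_{ij} \right)$

  • Prior term: $w_0 \sum_{i \in U} (f_i - \pi_i)^2 D_{ii}$ acts as a Gaussian prior.
  • Hyperparameter: tuned to control the trade-off.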

57 / 83

slide-60
SLIDE 60

Label Propagation

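$E(f) = \sum_{i \in L} (y_i - f_i)^2 D_{ii} + \lambda \sum_{i,j} (f_i - f_j)^2 A_{ij}$

Notice that HEM is quadratic in f, so it has a global minimum.

$E(f) = \sum_{i \in L} (y_i - f_i)^2 D_{ii} + \lambda \left( w_0 \sum_{i \in U} (f_i - \pi_i)^2 D_{ii} + \sum_{i,j} (f_i - f_j)^2 A_{ij} \right)$

Notice that Soft Label HEM is also quadratic in f, so it too has a global minimum.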
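Why the quadratic form leads to a closed-form minimizer can be sketched as follows. This is a reconstruction under stated assumptions rather than a derivation taken from the slides: $A$ is taken to be symmetric with zero diagonal, every node has positive degree, $\lambda > 0$ and $w_0 > 0$, and the pairwise sum counts each unordered pair once. The stacked vector $y$ and the diagonal matrix $S$ are introduced here to combine the labelled and unlabelled terms.

```latex
% Stack targets and per-node weights (labelled block first):
%   y = [y_L ; \pi],  S = diag( (1/\lambda) I_L , w_0 I_U ),  D_{ii} = \sum_j A_{ij}
\begin{align*}
\tfrac{1}{\lambda} E(f)
  &= \sum_{i \in L} \tfrac{1}{\lambda} D_{ii} (y_i - f_i)^2
   + \sum_{i \in U} w_0 D_{ii} (f_i - \pi_i)^2
   + \sum_{\{i,j\}} A_{ij} (f_i - f_j)^2 \\
  &= (f - y)^\top D S (f - y) + f^\top (D - A) f \\
\nabla_f \tfrac{1}{\lambda} E(f)
  &= 2 D S (f - y) + 2 (D - A) f = 0 \\
\Rightarrow\quad [D(I + S) - A]\, f &= D S y
\end{align*}
% The Hessian 2[DS + (D - A)] is positive definite: D - A is a graph
% Laplacian (positive semidefinite) and DS is a positive diagonal matrix,
% so the stationary point is the unique global minimum.
```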

60 / 83

slide-63
SLIDE 63

Label Propagation

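Minimizer: $f = [D(I + S) - A]^{-1} D S y$

where

$S = \begin{bmatrix} \frac{1}{\lambda} I_L & 0 \\ 0 & w_0 I_U \end{bmatrix}, \quad y = \begin{bmatrix} y_L \\ \pi \end{bmatrix}, \quad D_{ii} = \sum_j A_{ij}$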
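A minimal sketch of the closed-form minimizer above, written in plain Python so that it runs without dependencies. The three-node chain graph, the prior of 0.5 and the values of `lam` and `w0` are made up for illustration, and the small Gaussian-elimination helper `solve` is a stand-in for a proper linear-algebra routine. It assumes a symmetric adjacency matrix with no isolated nodes.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def soft_label_propagation(A, labels, prior, lam=1.0, w0=0.1):
    """Return f solving [D(I + S) - A] f = D S y.

    A:      symmetric adjacency matrix (list of lists)
    labels: dict mapping labelled node index -> observed label y_i
    prior:  list of prior scores pi_i, used for the unlabelled nodes
    """
    n = len(A)
    deg = [sum(row) for row in A]                                # D_ii
    s = [1.0 / lam if i in labels else w0 for i in range(n)]     # diagonal of S
    y = [labels[i] if i in labels else prior[i] for i in range(n)]
    M = [[(deg[i] * (1.0 + s[i]) if i == j else 0.0) - A[i][j]
          for j in range(n)] for i in range(n)]
    rhs = [deg[i] * s[i] * y[i] for i in range(n)]
    return solve(M, rhs)


# Toy chain 0 - 1 - 2: node 0 labelled positive, node 2 labelled negative,
# node 1 unlabelled with a made-up prior score of 0.5.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
f = soft_label_propagation(A, labels={0: 1.0, 2: 0.0}, prior=[0.5, 0.5, 0.5])
print(f)
```

On this toy chain the unlabelled middle node ends at 0.5, halfway between its two labelled neighbours, while the labelled endpoints move from 1 and 0 to 0.75 and 0.25 because the labels are fit softly rather than clamped.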

63 / 83

slide-71
SLIDE 71

Evaluation

  • 10-fold cross-validation
  • Paired t-test
  • Hyperparameter tuning: grid search
  • TPR @ FPR: 1e-3, 1e-2, 1e-1
  • TNR @ FNR: 1e-2, 5e-2
  • Area under the ROC curve
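The operating-point metrics listed above can be sketched as follows. This is one plausible way to compute them, not necessarily the procedure used for the reported numbers: `tpr_at_fpr` returns the highest true positive rate reachable while the false positive rate stays at or below a cap, and `tnr_at_fnr` reuses it by swapping the roles of the two classes. The scores and labels are made up, and both classes are assumed to be present.

```python
def tpr_at_fpr(scores, labels, max_fpr):
    """Highest TPR achievable by a score threshold with FPR <= max_fpr."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Sweep the threshold from the highest score downwards.
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        # Examples tied at the same score cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        if fp / n_neg > max_fpr:
            break
        best = tp / n_pos
        i = j
    return best


def tnr_at_fnr(scores, labels, max_fnr):
    """Highest TNR with FNR <= max_fnr: same sweep with the classes swapped."""
    flipped_scores = [-s for s in scores]
    flipped_labels = [1 - l for l in labels]
    return tpr_at_fpr(flipped_scores, flipped_labels, max_fnr)


# Made-up classifier scores and binary labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 0, 0, 0]
print(tpr_at_fpr(scores, labels, 0.25), tnr_at_fnr(scores, labels, 0.0))
```

Low-FPR operating points matter here because positives are rare: with a positive rate near 1%, even a small false positive rate produces many more false alarms than true detections.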

71 / 83

slide-72
SLIDE 72

Task-IMV-90

[Figure: two plots with log-scaled axes. Left: True Positive Rate vs. False Positive Rate. Right: True Negative Rate vs. False Negative Rate. Curves: PreCoRC-LR, PreCoRC-RF, LR, RF, Rnd.]

Model        TPR@FPR=1e-3  TPR@FPR=1e-2  TPR@FPR=1e-1  TNR@FNR=1e-2  TNR@FNR=5%  AUC
RF-Baseline  0.0328        0.1525        0.4449        0.0230        0.1250      0.7410
RF-PreCoRC   0.0369        0.1569        0.4682        0.1246        0.2462      0.7660
LR-Baseline  0.0121        0.0924        0.5104        0.0705        0.2532      0.7947
LR-PreCoRC   0.0144        0.1100        0.5167        0.1478        0.3715      0.8211

72 / 83

slide-73
SLIDE 73

Task-IMV-10

[Figure: two plots with log-scaled axes. Left: True Positive Rate vs. False Positive Rate. Right: True Negative Rate vs. False Negative Rate. Curves: PreCoRC-LR, PreCoRC-RF, LR, RF, Rnd.]

Model        TPR@FPR=1e-3  TPR@FPR=1e-2  TPR@FPR=1e-1  TNR@FNR=1e-2  TNR@FNR=5%  AUC
RF-Baseline  0.0085        0.0454        0.2057        0.0118        0.0588      0.5598
RF-PreCoRC   0.0153        0.0833        0.3650        0.0630        0.2265      0.7277
LR-Baseline  0.0020        0.0356        0.3367        0.0568        0.2405      0.7663
LR-PreCoRC   0.0048        0.0515        0.3951        0.1058        0.2637      0.7624

73 / 83

slide-74
SLIDE 74

Task-VTE

[Figure: two plots with log-scaled axes. Left: True Positive Rate vs. False Positive Rate. Right: True Negative Rate vs. False Negative Rate. Curves: PreCoRC-LR, PreCoRC-RF, LR, RF, Rnd.]

Model        TPR@FPR=1e-3  TPR@FPR=1e-2  TPR@FPR=1e-1  TNR@FNR=1e-2  TNR@FNR=5%  AUC
RF-Baseline  <1e-4         <1e-4         0.1247        0.0103        0.0514      0.5116
RF-PreCoRC   <1e-4         <1e-4         0.2500        0.1133        0.14393     0.6545
LR-Baseline  <1e-4         0.0167        0.2428        0.0184        0.0809      0.6230
LR-PreCoRC   <1e-4         0.0167        0.2500        0.1553        0.2021      0.6663

74 / 83

slide-75
SLIDE 75

Post-hoc Interpretability

  • Task-IMV

[Figure: rank comparison of ICD-9 codes, Baseline Ranks (1-10) vs. Proposed Ranks (1-10). Codes shown: 144, 269, 480, 611, 262, 580, 507, 503, 602, 404. Labels shown: Hypertensive Heart; Disorders of Prostate; Pneumoconiosis; Pneumonitis; Nutritional Deficiencies; Malignant Neoplasm (Mouth); Nephrotic Syndrome; Protein-Calorie Malnutrition; Disorders of Breast; Pneumonia and Influenza.]

Code  Description
482   Bacterial Pneumonia
038   Streptococcal/Pneumococcal Septicemia
359   Muscular Dystrophy; Myopathy
238   Neoplasms; Myelodysplastic syndrome

75 / 83

slide-76
SLIDE 76

Post-hoc Interpretability

  • Task-VTE

[Figure: rank comparison of ICD-9 codes, Baseline Ranks (1-10) vs. Proposed Ranks (1-10). Codes shown: 502, 271, 079, E000, E029, 725, 117, V87, 745. Labels shown: Bulbus Cordis Anomalies; Other Exposures to Health; Other Mycoses; Metabolism Disorder; Pneumoconiosis; Polymyalgia Rheumatica; Other Activity; External Cause (Unspecified); Viral & Chlamydial Infection.]

Code  Description
608   Seminal vesiculitis; Spermatocele; Hematospermia
999   Infection due to central venous catheters
364   Iridocyclitis; Hyphema of iris; Iridoschisis
466   Acute bronchitis

76 / 83

slide-81
SLIDE 81

Future Work

  • Use PreCoRC in a streaming fashion, with each subsequent admission.
  • Jointly optimize the losses corresponding to the classifier and to label propagation.
  • Develop better strategies for constructing the underlying graph representation.

81 / 83

slide-82
SLIDE 82

References

Liu, Y., Stultz, C., Guttag, J., Chuang, K.-T., Liang, F.-W., and Su, H.-J. (2016). Transferring knowledge from text to predict disease onset. In Machine Learning for Healthcare Conference, pages 150–163.

Singh, A., Nadkarni, G., Guttag, J., and Bottinger, E. (2014). Leveraging hierarchy in medical codes for predictive modeling. In Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, pages 96–103. ACM.

Wang, X., Garnett, R., and Schneider, J. (2013). Active search on graphs. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 731–738. ACM.

Zhu, X., Ghahramani, Z., and Lafferty, J. D. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the 20th International Conference on Machine Learning (ICML-03), pages 912–919.

82 / 83

slide-83
SLIDE 83

Thank you! Questions?

http://cs.cmu.edu/~chiragn chiragn@cs.cmu.edu

83 / 83