SEPTIC SHOCK PREDICTION FOR PATIENTS WITH MISSING DATA Joyce C Ho, - - PowerPoint PPT Presentation



slide-1
SLIDE 1

SEPTIC SHOCK PREDICTION FOR PATIENTS WITH MISSING DATA

Joyce C Ho, Cheng Lee, Joydeep Ghosh University of Texas at Austin

slide-2
SLIDE 2

WHAT IS SEPSIS AND SEPTIC SHOCK?

  • Sepsis is a systemic inflammatory response to infection
  • 11th leading cause of death in 2010
  • Estimated $14.6 billion spent on sepsis in 2008
  • Septic shock (sepsis-induced hypotension) has a mortality rate of 45.7%

slide-3
SLIDE 3

IN-HOSPITAL DETECTION

[Diagram: Demographic Information, Vital Signs, Labs, Clinical Notes → Patient Representation → Predictive Model]

slide-4
SLIDE 4

MISSING DATA PROBLEM

  • Clinical studies must deal with large amounts of missing data
  • Measurements are noisy and irregularly sampled
  • Highly accurate measurements require invasive techniques (may not be medically necessary)

slide-5
SLIDE 5

TYPICAL APPROACH

  • Ignore subjects with missing observations
  • Ignore features without complete data
  • Result: Highly curated datasets with limited features and small samples

slide-6
SLIDE 6

OUR SEPTIC SHOCK MODEL

  • Generalization to patients with partially missing observations
  • Simple and accessible approaches
  • Focus on commonly observed, non-invasive measurements

Problem: Given that a patient has sepsis, can we predict complications at least one hour prior to onset of septic shock?

slide-7
SLIDE 7

CLINICAL FEATURES

  • Summary statistics (last measurement, min, mean, and max) in 8-hour window
      • Cardiac: non-invasive blood pressure, heart rate, pulse pressure
      • Other: respiratory rate, SpO2, temperature
  • Last measurement only (fewer observations)
      • White blood cell count
  • Index scores: SOFA, SAPS-I, Shock index
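
A minimal sketch of how the windowed summary statistics and two of the derived features could be computed. The function names, the (time, value) reading format, and the placement of the window (ending one hour before onset) are assumptions for illustration; the slides do not specify them. Pulse pressure is systolic minus diastolic pressure, and shock index is heart rate divided by systolic pressure.

```python
from statistics import mean

def window_summary(readings, onset_hour, lead_hours=1.0, window_hours=8.0):
    """Summarise (time_in_hours, value) readings in the window that ends
    lead_hours before onset (window placement is an assumption).
    Returns None for every statistic when the window holds no readings,
    so missingness is preserved for a later imputation step."""
    end = onset_hour - lead_hours
    start = end - window_hours
    in_window = sorted((t, v) for t, v in readings if start <= t <= end)
    if not in_window:
        return {"last": None, "min": None, "mean": None, "max": None}
    values = [v for _, v in in_window]
    return {
        "last": values[-1],      # most recent reading in the window
        "min": min(values),
        "mean": mean(values),
        "max": max(values),
    }

def pulse_pressure(systolic, diastolic):
    """Pulse pressure: systolic minus diastolic blood pressure."""
    return systolic - diastolic

def shock_index(heart_rate, systolic):
    """Shock index: heart rate divided by systolic blood pressure."""
    return heart_rate / systolic
```

Returning None rather than a default value keeps the missing entries visible, which is what the imputation approaches on the next slide operate on.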
slide-8
SLIDE 8

IMPUTATION APPROACHES

  • Mean / median imputation
  • Matrix factorization techniques
      • Singular value based imputation (SVD)
      • Probabilistic principal component analysis (PPCA)
  • K-nearest neighbors (KNN)
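
To make two of the listed approaches concrete, here is a small sketch of mean imputation and KNN imputation on a list-of-rows matrix where None marks a missing value. This is an illustrative implementation, not the authors' code: the distance (root-mean-square difference over features both rows observe) and the unweighted neighbor average are assumptions, and every column is assumed to have at least one observed value.

```python
import math

def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(sum(observed) / len(observed))
    return [[col_means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]

def _distance(a, b):
    """Root-mean-square difference over features observed in both rows;
    infinite when the rows share no observed feature."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k):
    """Fill each None with the average of that feature over the k nearest
    other rows that have the feature observed."""
    filled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, v in enumerate(row):
            if v is not None:
                continue
            donors = sorted(
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            new_row[j] = sum(val for _, val in donors) / len(donors)
        filled.append(new_row)
    return filled

rows = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [10.0, 100.0]]
mean_filled = mean_impute(rows)     # uses all three observed values
knn_filled = knn_impute(rows, k=2)  # uses only the two closest rows
```

In the toy matrix, the mean fill is pulled upward by the outlying last row, while the KNN fill with k=2 depends only on the two rows closest to the incomplete one.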
slide-9
SLIDE 9

IMPUTATION SELECTION CRITERIA

  • Matrix factorization and neighborhood techniques have a parameter to control resolution or locality of imputation
  • Evaluation metric typically involves randomly removing observations and comparing fit using root mean square error (RMSE) or mean absolute error (MAE)
  • RMSE / MAE may not necessarily translate to improved predictive performance
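
The conventional evaluation described above can be sketched as follows: hide a random fraction of the observed entries, impute, and score the imputed values against the hidden truth. The helper names and the None-for-missing representation are assumptions for illustration.

```python
import math
import random

def mask_observed(rows, fraction, seed=0):
    """Hide a random fraction of the observed entries.
    Returns the masked copy and the hidden (row, col, value) triples."""
    rng = random.Random(seed)
    observed = [(i, j) for i, r in enumerate(rows)
                for j, v in enumerate(r) if v is not None]
    hidden = rng.sample(observed, max(1, int(fraction * len(observed))))
    masked = [list(r) for r in rows]
    truth = []
    for i, j in hidden:
        truth.append((i, j, rows[i][j]))
        masked[i][j] = None
    return masked, truth

def rmse_mae(imputed, truth):
    """RMSE and MAE of the imputed values at the hidden positions."""
    errors = [imputed[i][j] - v for i, j, v in truth]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return rmse, mae
```

Typical use would be to call mask_observed once, run each candidate imputation setting on the masked matrix, and keep the setting with the lowest RMSE or MAE. Both scores measure reconstruction error only and never look at the prediction target.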

slide-10
SLIDE 10

PERFORMANCE-ORIENTED IMPUTATION (POI)

[Diagram: Data, Impute, Random splits, Construct Prediction Model, Build & Evaluate; an "Imputation parameter selection" loop yields the Optimal k used for the final Impute step]
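
One way to read the POI loop is sketched below: for each candidate imputation parameter, impute the data, build a model on random splits, and keep the parameter with the best mean test AUC. This is a sketch under assumptions, not the authors' implementation: the function names, the stratified splits, imputing before splitting, and the toy `impute` and `fit` callables in the demo are all hypothetical stand-ins.

```python
import random
from statistics import mean

def auc(scores, labels):
    """Probability that a random positive outscores a random negative
    (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_split(labels, test_fraction, rng):
    """Random split that keeps at least one example of each class in test."""
    test = []
    for cls in (0, 1):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        test.extend(members[:max(1, int(len(members) * test_fraction))])
    held_out = set(test)
    train = [i for i in range(len(labels)) if i not in held_out]
    return train, test

def select_parameter(rows, labels, candidates, impute, fit,
                     n_splits=5, test_fraction=0.5, seed=0):
    """Return the candidate with the best mean test AUC, plus all mean AUCs.
    impute(rows, k) returns a filled matrix; fit(X, y) returns a scoring
    function that maps one row to a number."""
    rng = random.Random(seed)
    results = {}
    for k in candidates:
        filled = impute(rows, k)
        split_aucs = []
        for _ in range(n_splits):
            train, test = stratified_split(labels, test_fraction, rng)
            model = fit([filled[i] for i in train], [labels[i] for i in train])
            scores = [model(filled[i]) for i in test]
            split_aucs.append(auc(scores, [labels[i] for i in test]))
        results[k] = mean(split_aucs)
    best = max(results, key=results.get)
    return best, results

# Toy demo with hypothetical stand-ins: "imputation" fills with the constant k,
# and the "model" simply scores a row by its first feature.
toy_rows = [[1.0], [2.0], [1.5], [0.5], [None], [None], [None], [9.0]]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
constant_fill = lambda rows, k: [[k if v is None else v for v in r] for r in rows]
first_feature = lambda X, y: (lambda row: row[0])
best_k, mean_aucs = select_parameter(toy_rows, toy_labels, [0.0, 10.0],
                                     constant_fill, first_feature)
```

The point of the design is that the selection score is the downstream predictive metric itself rather than reconstruction error, so any imputation routine and any classifier can be plugged in through the two callables.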

slide-11
SLIDE 11

MIMIC-II DATABASE

  • Extensive, publicly available ICU data resource
  • Data between 2001 and 2007 from Boston’s Beth Israel Deaconess Medical Center ICUs
  • Over 40,000 ICU stays from 30,000+ patients
  • Clinical records with physiological measures, medication records, laboratory tests, free-form text notes, etc.

slide-12
SLIDE 12

IMPORTANCE OF IMPUTATION

[Figure: histogram; x-axis: Number of Missing Features, y-axis: Count]

Feature             30 mins   60 mins
Respiratory rate     0.67%     0.68%
Temperature          1.70%     2.05%
White blood cells   15.30%    14.69%
Blood pressure      23.28%    23.44%

  • Less than 22% of the 1,353 patients have complete data
  • Non-invasive BP is not always available

slide-13
SLIDE 13

DIFFERENCES IN POPULATION

             Missing patients          Complete only
Time         Sepsis (only)   Shock     Sepsis (only)   Shock     P-value
30 mins      749             79        199             110       4.56E-26
60 mins      723             79        196             106       6.99E-24
90 mins      705             79        196             103       4.63E-22
120 mins     685             74        193             103       7.06E-23

Statistically significantly higher ratio of shock patients if you ignore patients with missing data
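
The difference in shock ratio can be checked with a test of independence on the 2x2 counts. The sketch below applies a Pearson chi-squared test (no continuity correction) to the 30-minute row. The slide does not say which test produced its p-values, so the choice of test is an assumption and the result need not match the slide's figure exactly.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-squared test on the table [[a, b], [c, d]].
    Returns (statistic, p_value) for 1 degree of freedom."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# 30-minute row: patients with missing data vs. complete-only patients.
missing_sepsis, missing_shock = 749, 79
complete_sepsis, complete_shock = 199, 110
stat, p_value = chi_square_2x2(missing_sepsis, missing_shock,
                               complete_sepsis, complete_shock)
shock_rate_missing = missing_shock / (missing_sepsis + missing_shock)
shock_rate_complete = complete_shock / (complete_sepsis + complete_shock)
```

For these counts the shock rate is roughly 9.5% among patients with missing data and roughly 35.6% among complete-only patients, which is the gap the slide highlights.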

slide-14
SLIDE 14

PREDICTIVE POWER OF MEAN IMPUTED MODEL

Train Data   Test Data   30 minutes before (AUC)   60 minutes before (AUC)
Complete     Complete    0.796±0.065               0.777±0.050
Complete     Imputed     0.815±0.033               0.800±0.053
Imputed      Imputed     0.834±0.025               0.829±0.030
Imputed      Complete    0.839±0.044               0.828±0.047

Model generalizes to broader population with slightly better predictive performance

slide-15
SLIDE 15

COMPARISON OF SELECTION CRITERIA (SVM)

[Figure: panels SVD, PPCA, KNN; metrics AUC, Lift, F1; x-axis: Selection Criteria (POI, MAE, RMSE), y-axis: Value]

POI is generally better for AUC + F-measure

slide-16
SLIDE 16

COMPARING IMPUTATION APPROACHES (SVD + LOGR)

[Figure: ROC curves in panels 60, 120, 180; x-axis: False Positive Rate, y-axis: True Positive Rate; one curve per selection: POI, MAE, RMSE, Mean]

POI outperforms RMSE, but mean and MAE are generally the best

slide-17
SLIDE 17

COMPARING IMPUTATION APPROACHES (SVD + LOGR)

[Figure: panels 60, 120, 180; x-axis: Selection Criteria (POI, MAE, RMSE), y-axis: K]

RMSE favors the simplest model (k=1), MAE favors most complex (k=25), POI lies in between the two

slide-18
SLIDE 18

COMPARING IMPUTATION APPROACHES (FEATURE RANK)

Feature        Mean    AUC    F1     RMSE
Systolic BP    1.50    1.70   1.70   2.40
SpO2           2.22    3.00   3.22   2.56
Shock Index    4.40    4.40   4.60   3.30
Temp           5.00 5.00 7.50 [one of the four values is missing in the source; column assignment unclear]
Diastolic BP   11.00   8.00   8.25   5.00

Selection criteria influence feature ranking within the same imputation method

slide-19
SLIDE 19

CONCLUSION

  • Generalizes to all ICU patients
  • Focuses on commonly observed, non-invasive clinical measurements
  • Uses simple and accessible approaches for missing data problem

slide-20
SLIDE 20

REFERENCES

Joyce C Ho, Cheng H Lee, and Joydeep Ghosh. Imputation-enhanced prediction of septic shock in ICU patients. In 2012 ACM SIGKDD Workshop on Health Informatics (HI-KDD), 2012.

Joyce C Ho, Cheng H Lee, and Joydeep Ghosh. Septic shock prediction for patients with missing data. ACM Transactions on Management Information Systems (TMIS), 5(1):1:1–1:15, 2014.