A Multitask Multiple Kernel Learning Algorithm for Survival Analysis


SLIDE 1

A Multitask Multiple Kernel Learning Algorithm for Survival Analysis with Application to Cancer Biology

Onur Dereli, Ceyda Oğuz, Mehmet Gönen

Koç University, İstanbul, Turkey

odereli14@ku.edu.tr coguz@ku.edu.tr mehmetgonen@ku.edu.tr

June 12, 2019 / Long Beach

Dereli, O., Oğuz, C., Gönen, M. (KU) ICML 2019 June 12, 2019 / Long Beach 1 / 9

SLIDE 2

Proposed Approach

[Figure: pipeline of the proposed approach, read from top to bottom.]

  • Gene sets 1, ..., P: each is a subset of genes, e.g. gene set 1 = {G1, G3, G6, G17, G19, G28, G42} and gene set P = {G8, G12, G19, G25, G42, G47}
  • Data matrices X_1, ..., X_T, indexed by genes and patients
  • Each X_t restricted to the genes of each gene set: X_{t,1}, ..., X_{t,P}
  • Kernel matrices K_{t,1}, ..., K_{t,P} (patients × patients), one per gene set
  • Kernel weights η_1, ..., η_P, the same for every X_t, combining the kernels into K_{1,η}, ..., K_{T,η} (multitask multiple kernel learning)
  • Survival analysis on each K_{t,η}, giving f_1, ..., f_T from the survival data Y_1, ..., Y_T

Y_1:

Vital status   Days to death   Days to last follow-up
Alive          NA              678
Dead           364             NA
...            ...             ...
Alive          NA              2555
Dead           520             NA

Y_T:

Vital status   Days to death   Days to last follow-up
Dead           456             NA
Dead           3200            NA
...            ...             ...
Alive          NA              2208
Dead           1891            NA
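The kernel-combination step in the diagram can be sketched as follows. This is a minimal illustration, not the authors' R implementation: the Gaussian kernel, the bandwidth `sigma`, the toy expression values, and the helper names `restrict_to_gene_set`, `gaussian_kernel` and `combine_kernels` are all assumptions made for the example. The slide only states that one patients × patients kernel is built per gene set and that the kernels are combined with weights η_1, ..., η_P.

```python
import math

def restrict_to_gene_set(X, gene_names, gene_set):
    """Keep only the columns of X (patients x genes) whose gene is in gene_set."""
    cols = [k for k, g in enumerate(gene_names) if g in gene_set]
    return [[row[k] for k in cols] for row in X]

def gaussian_kernel(X, sigma):
    """Patients x patients Gaussian kernel (an assumed kernel choice)."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return K

def combine_kernels(kernels, eta):
    """Weighted sum K_eta = sum_m eta_m * K_m."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(eta, kernels)) for j in range(n)]
            for i in range(n)]

# Toy data for one task: 3 patients, 4 genes (values are made up).
gene_names = ["G1", "G3", "G8", "G19"]
gene_sets = [{"G1", "G3", "G19"}, {"G8", "G19"}]
X_t = [[0.0, 1.0, 2.0, 0.5],
       [1.0, 0.0, 2.0, 0.5],
       [0.0, 1.0, 0.0, 1.5]]
eta = [0.75, 0.25]  # nonnegative weights summing to 1

kernels = [gaussian_kernel(restrict_to_gene_set(X_t, gene_names, s), sigma=1.0)
           for s in gene_sets]
K_eta = combine_kernels(kernels, eta)
```

In a multitask setting the same `eta` would be reused for every task's list of kernels, which is what ties the tasks together.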


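SLIDE 6

Multitask Survival MKL Formulation

$$
\begin{aligned}
\text{minimize} \quad & \sum_{t=1}^{T} \left[ \frac{1}{2} w_t^\top w_t + C \sum_{i=1}^{N_t} \left( \xi_{ti}^{+} + (1 - \delta_{ti})\, \xi_{ti}^{-} \right) \right] \\
\text{with respect to} \quad & w_t \in \mathbb{R}^{D_t},\; \xi_t^{+} \in \mathbb{R}^{N_t},\; \xi_t^{-} \in \mathbb{R}^{N_t},\; b_t \in \mathbb{R} \\
\text{subject to} \quad & \epsilon + \xi_{ti}^{+} \geq y_{ti} - w_t^\top x_{ti} - b_t && \forall (t, i) \\
& \epsilon + \xi_{ti}^{-} \geq w_t^\top x_{ti} + b_t - y_{ti} && \forall (t, i) \\
& \xi_{ti}^{+} \geq 0 && \forall (t, i) \\
& \xi_{ti}^{-} \geq 0 && \forall (t, i)
\end{aligned}
$$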
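To make the role of δ in the primal objective concrete, the sketch below evaluates the objective for fixed w_t and b_t, using the smallest slacks that satisfy the constraints. It assumes, as the (1 − δ_ti) factor suggests, that δ_ti = 1 marks a censored patient, so predicting a survival time above the last follow-up is not penalized. The function names and toy numbers are hypothetical.

```python
def minimal_slacks(y, f, epsilon):
    """Smallest feasible slacks for targets y and predictions f."""
    xi_plus = [max(0.0, yi - fi - epsilon) for yi, fi in zip(y, f)]   # under-prediction
    xi_minus = [max(0.0, fi - yi - epsilon) for yi, fi in zip(y, f)]  # over-prediction
    return xi_plus, xi_minus

def task_objective(w, b, X, y, delta, C, epsilon):
    """0.5 * w'w + C * sum_i (xi+_i + (1 - delta_i) * xi-_i) for one task."""
    f = [sum(wk * xk for wk, xk in zip(w, x)) + b for x in X]
    xi_plus, xi_minus = minimal_slacks(y, f, epsilon)
    loss = sum(p + (1 - d) * m for p, m, d in zip(xi_plus, xi_minus, delta))
    return 0.5 * sum(wk * wk for wk in w) + C * loss

def multitask_objective(tasks, C, epsilon):
    """Sum of the per-task objectives; each task is a dict with w, b, X, y, delta."""
    return sum(task_objective(t["w"], t["b"], t["X"], t["y"], t["delta"], C, epsilon)
               for t in tasks)

# Toy task: predictions are 2, 5, 5 against targets 4, 3, 3.
task = {
    "w": [1.0, 0.0],
    "b": 0.0,
    "X": [[2.0, 0.0], [5.0, 0.0], [5.0, 0.0]],
    "y": [4.0, 3.0, 3.0],
    "delta": [0, 1, 0],  # second patient assumed censored: over-prediction is free
}
single = task_objective(task["w"], task["b"], task["X"], task["y"], task["delta"],
                        C=2.0, epsilon=0.5)
both = multitask_objective([task, task], C=2.0, epsilon=0.5)
```

The second and third patients have identical predictions and targets, yet only the third contributes to the loss, because the second is treated as censored.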

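SLIDE 7

Multitask Survival MKL Formulation

$$
\begin{aligned}
\text{minimize} \quad & -\sum_{t=1}^{T} \sum_{i=1}^{N_t} y_{ti} \left( \alpha_{ti}^{+} - \alpha_{ti}^{-} \right) + \epsilon \sum_{t=1}^{T} \sum_{i=1}^{N_t} \left( \alpha_{ti}^{+} + \alpha_{ti}^{-} \right) \\
& + \frac{1}{2} \sum_{t=1}^{T} \sum_{i=1}^{N_t} \sum_{j=1}^{N_t} \left( \alpha_{ti}^{+} - \alpha_{ti}^{-} \right) \left( \alpha_{tj}^{+} - \alpha_{tj}^{-} \right) \sum_{m=1}^{P} \eta_m k_m(x_{ti}, x_{tj}) \\
\text{with respect to} \quad & \alpha_t^{+} \in \mathbb{R}^{N_t},\; \alpha_t^{-} \in \mathbb{R}^{N_t},\; \eta \in \mathbb{R}^{P} \\
\text{subject to} \quad & \sum_{i=1}^{N_t} \left( \alpha_{ti}^{+} - \alpha_{ti}^{-} \right) = 0 && \forall t \\
& C \geq \alpha_{ti}^{+} \geq 0 && \forall (t, i) \\
& C (1 - \delta_{ti}) \geq \alpha_{ti}^{-} \geq 0 && \forall (t, i) \\
& \sum_{m=1}^{P} \eta_m = 1 \\
& \eta_m \geq 0 && \forall m
\end{aligned}
$$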
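As a rough illustration of the dual problem, the sketch below evaluates one task's contribution to the dual objective and checks the constraints on the α variables. It does not solve the optimization problem. The function names, the tolerance `tol`, and the toy kernels are assumptions made for the example.

```python
def combined_kernel(kernels, eta):
    """K_eta = sum_m eta_m * K_m for one task."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(eta, kernels)) for j in range(n)]
            for i in range(n)]

def task_dual_objective(alpha_plus, alpha_minus, y, kernels, eta, epsilon):
    """One task's term of the dual objective."""
    a = [p - m for p, m in zip(alpha_plus, alpha_minus)]
    K = combined_kernel(kernels, eta)
    n = len(a)
    linear = -sum(yi * ai for yi, ai in zip(y, a))
    margin = epsilon * sum(p + m for p, m in zip(alpha_plus, alpha_minus))
    quadratic = 0.5 * sum(a[i] * a[j] * K[i][j] for i in range(n) for j in range(n))
    return linear + margin + quadratic

def is_feasible(alpha_plus, alpha_minus, delta, C, tol=1e-9):
    """Check sum(a+ - a-) = 0, 0 <= a+ <= C and 0 <= a- <= C * (1 - delta)."""
    balanced = abs(sum(p - m for p, m in zip(alpha_plus, alpha_minus))) <= tol
    plus_ok = all(-tol <= p <= C + tol for p in alpha_plus)
    minus_ok = all(-tol <= m <= C * (1 - d) + tol
                   for m, d in zip(alpha_minus, delta))
    return balanced and plus_ok and minus_ok

# Toy task with two patients and two kernels (values are made up).
kernels = [[[1.0, 0.0], [0.0, 1.0]],
           [[1.0, 1.0], [1.0, 1.0]]]
eta = [0.5, 0.5]
alpha_plus, alpha_minus = [1.0, 0.0], [0.0, 1.0]
y = [3.0, 1.0]

value = task_dual_objective(alpha_plus, alpha_minus, y, kernels, eta, epsilon=0.5)
ok_uncensored = is_feasible(alpha_plus, alpha_minus, delta=[0, 0], C=1.0)
ok_censored = is_feasible(alpha_plus, alpha_minus, delta=[0, 1], C=1.0)
```

With δ = 1 for the second patient, the upper bound on its α⁻ collapses to zero, so the same α values become infeasible.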

SLIDE 8

Multitask Survival MKL Formulation

$$
\eta_m^{(s+1)} = \frac{\displaystyle \sum_{t=1}^{T} \eta_m^{(s)} \sqrt{\sum_{i=1}^{N_t} \sum_{j=1}^{N_t} \alpha_{ti}^{(s)} \alpha_{tj}^{(s)} k_m(x_{ti}, x_{tj})}}{\displaystyle \sum_{t=1}^{T} \sum_{o=1}^{P} \eta_o^{(s)} \sqrt{\sum_{i=1}^{N_t} \sum_{j=1}^{N_t} \alpha_{ti}^{(s)} \alpha_{tj}^{(s)} k_o(x_{ti}, x_{tj})}} \qquad \forall m
$$
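The kernel-weight update can be sketched as a single normalization step. This is a hedged reading of the slide, not the authors' code: it assumes each kernel's score is η_m times the square root of the quadratic form Σ_i Σ_j α_ti α_tj k_m(x_ti, x_tj), summed over tasks, and that α_ti stands for the difference α⁺_ti − α⁻_ti from the dual. The function names, the clamp at zero, and the zero-total guard are additions for the example.

```python
import math

def quad_form(alpha, K):
    """sum_i sum_j alpha_i * alpha_j * K[i][j]."""
    n = len(alpha)
    return sum(alpha[i] * alpha[j] * K[i][j] for i in range(n) for j in range(n))

def update_eta(eta, alphas, kernels):
    """One update of the kernel weights.

    alphas[t]     : list of alpha_ti for task t
    kernels[t][m] : N_t x N_t kernel matrix of gene set m for task t
    """
    num_tasks, num_kernels = len(alphas), len(eta)
    scores = []
    for m in range(num_kernels):
        score = 0.0
        for t in range(num_tasks):
            # Clamp tiny negative values caused by rounding before the square root.
            q = max(quad_form(alphas[t], kernels[t][m]), 0.0)
            score += eta[m] * math.sqrt(q)
        scores.append(score)
    total = sum(scores)
    if total == 0.0:  # guard added for the sketch: leave the weights unchanged
        return list(eta)
    return [s / total for s in scores]

# Toy example: one task, two patients, two kernels (values are made up).
alphas = [[1.0, -1.0]]
kernels = [[[[1.0, 0.0], [0.0, 1.0]],
            [[1.0, 0.5], [0.5, 1.0]]]]
eta_new = update_eta([0.5, 0.5], alphas, kernels)
```

Because the scores are divided by their sum, the updated weights stay nonnegative and sum to 1, matching the constraints on η in the dual problem. Note also that a weight of zero stays at zero under this update.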

SLIDE 9

Data Sets

  • 20 cancer data sets from TCGA database, with gene expression profiles and survival characteristics:

BLCA (402)   BRCA (1067)   CESC (291)   COAD (433)   ESCA (160)
GBM (152)    HNSC (498)    KIRC (526)   KIRP (285)   LAML (130)
LGG (506)    LIHC (365)    LUAD (500)   LUSC (493)   OV (372)
PAAD (176)   READ (156)    SARC (256)   STAD (348)   UCEC (539)

  • Hallmark Gene Set [1] & PID Pathway [2] Collections

SLIDE 10

[Figure: box plots of C-index, one panel per data set (BLCA, BRCA, CESC, COAD, ESCA, GBM, HNSC, KIRC, KIRP, LAML, LGG, LIHC, LUAD, LUSC, OV, PAAD, READ, SARC, STAD, UCEC). Methods on the horizontal axis: RF, SVM, MKL[P], MTMKL[P], MKL[H], MTMKL[H]. Each panel is annotated with p-values.]
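The results are reported as C-index values. The slides do not say which variant is used, so the sketch below assumes Harrell's concordance index: among pairs of patients whose ordering is known despite censoring, it is the fraction whose predicted survival times are ordered the same way, with tied predictions counting as half. The `censored` flag (1 = alive at last follow-up) and the toy predictions are assumptions; the observed times reuse values from the example survival table.

```python
def concordance_index(times, censored, predictions):
    """Harrell's C-index for predicted survival times.

    A pair (i, j) is comparable when patient i has an observed death and
    times[i] < times[j]; it is concordant when predictions[i] < predictions[j].
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if censored[i]:
            continue  # a censored patient cannot be the earlier event of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if predictions[i] < predictions[j]:
                    concordant += 1.0
                elif predictions[i] == predictions[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Observed time = days to death, or days to last follow-up if alive.
times = [364, 678, 520, 2555]
censored = [0, 1, 0, 1]
predictions = [300.0, 600.0, 700.0, 2000.0]  # hypothetical model outputs

c_index = concordance_index(times, censored, predictions)
```

Here five pairs are comparable and one of them (the patient who died at day 520 versus the one followed up to day 678) is ordered incorrectly by the predictions, giving 4/5. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.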


SLIDE 12

Hallmark Gene Sets

[Figure: bar chart of selection frequency (vertical axis, 0.0 to 1.0) for each Hallmark gene set. Gene sets in the order they appear along the axis:]

GLYCOLYSIS, ANGIOGENESIS, BILE_ACID_METABOLISM, SPERMATOGENESIS, KRAS_SIGNALING_DN, APOPTOSIS, APICAL_SURFACE, ALLOGRAFT_REJECTION, UNFOLDED_PROTEIN_RESPONSE, UV_RESPONSE_UP, WNT_BETA_CATENIN_SIGNALING, ESTROGEN_RESPONSE_EARLY, PEROXISOME, ESTROGEN_RESPONSE_LATE, CHOLESTEROL_HOMEOSTASIS, HYPOXIA, P53_PATHWAY, MTORC1_SIGNALING, XENOBIOTIC_METABOLISM, KRAS_SIGNALING_UP, PANCREAS_BETA_CELLS, FATTY_ACID_METABOLISM, COAGULATION, E2F_TARGETS, HEME_METABOLISM, IL2_STAT5_SIGNALING, G2M_CHECKPOINT, MYOGENESIS, TNFA_SIGNALING_VIA_NFKB, APICAL_JUNCTION, REACTIVE_OXIGEN_SPECIES_PATHWAY, INTERFERON_GAMMA_RESPONSE, PI3K_AKT_MTOR_SIGNALING, ADIPOGENESIS, ANDROGEN_RESPONSE, INTERFERON_ALPHA_RESPONSE, MITOTIC_SPINDLE, TGF_BETA_SIGNALING, IL6_JAK_STAT3_SIGNALING, DNA_REPAIR, NOTCH_SIGNALING, PROTEIN_SECRETION, HEDGEHOG_SIGNALING, COMPLEMENT, MYC_TARGETS_V1, MYC_TARGETS_V2, EPITHELIAL_MESENCHYMAL_TRANSITION, INFLAMMATORY_RESPONSE, OXIDATIVE_PHOSPHORYLATION, UV_RESPONSE_DN

SLIDE 13

Summary

  • A multitask multiple kernel learning algorithm
  • Integration of different data sets
  • Performing survival prediction and knowledge extraction conjointly
  • Understanding the underlying mechanisms of cancer

SLIDE 14

References

[1] A. Liberzon, C. Birger, H. Thorvaldsdottir, M. Ghandi, J. P. Mesirov, and P. Tamayo. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst., 1:417–425, 2015.

[2] C. F. Schaefer, K. Anthony, S. Krupa, J. Buchoff, M. Day, et al. PID: The Pathway Interaction Database. Nucleic Acids Res., 37:D674–D679, 2009.

SLIDE 15

Thank you

R implementations of our work are available at https://github.com/mehmetgonen/path2msurv.

Please visit our poster if you have any questions.

Room: Pacific Ballroom #242
Date: Wed Jun 12th 06:30 - 09:00 PM

This work was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) under Grant EEEAG 117E181 and Ph.D. scholarship (2211).