Learning by Fusing Heterogeneous Data
Marinka Zitnik
Thesis Defense, October 22 2015
Motivation
Large Heterogeneous Data Compendia
Marinka Zitnik - PhD Thesis
Large-scale physics experiments
Social networks, recommender systems: users, movies, reviews, relationships
[Figure: example movie graph: Good Will Hunting, Saving Private Ryan, The Terminal, Schindler’s List; Matt Damon (actor), Tom Hanks (actor), Liam Neeson (actor); Steven Spielberg (director); Drama (genre); War (tag)]
Global navigation satellite systems
Molecular biology
[Figure: terms (Response to stress, Response to external stimulus, Response to biotic stimulus, Response to external biotic stimulus, Response to bacterium, Defense response, Defense response to bacterium) shown against columns T1 to T7 and V1 to V5]
Recipe matrix of A x Backbone matrix of A-B x (Recipe matrix of B)^T = Reconstructed matrix A-B
$\tilde{R}_{AB} = G_A S_{AB} G_B^T$
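The picture of recipe and backbone matrices can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the matrix sizes, the ranks, the random toy data and the helper names `backbone` and `reconstruct` are hypothetical and do not come from the slides. For fixed recipe matrices, the backbone matrix is obtained here by least squares.

```python
import numpy as np

def backbone(R_ab, G_a, G_b):
    """Least-squares backbone matrix S_AB for fixed recipe matrices G_A, G_B."""
    # For full-column-rank G, pinv(G) = (G^T G)^-1 G^T, so this equals
    # (G_A^T G_A)^-1 G_A^T R_AB G_B (G_B^T G_B)^-1.
    return np.linalg.pinv(G_a) @ R_ab @ np.linalg.pinv(G_b).T

def reconstruct(G_a, S_ab, G_b):
    """Reconstructed matrix A-B: G_A S_AB G_B^T."""
    return G_a @ S_ab @ G_b.T

rng = np.random.default_rng(0)
n_a, n_b, k_a, k_b = 30, 20, 4, 3        # hypothetical sizes and ranks
G_a = rng.random((n_a, k_a))             # recipe matrix of A
G_b = rng.random((n_b, k_b))             # recipe matrix of B
S_true = rng.random((k_a, k_b))          # backbone used to generate toy data
R_ab = reconstruct(G_a, S_true, G_b)     # exactly low-rank toy relation matrix

S_ab = backbone(R_ab, G_a, G_b)
error = np.linalg.norm(R_ab - reconstruct(G_a, S_ab, G_b))
```

On this toy input the relation matrix is exactly low-rank, so the recovered backbone matches the one used to generate the data; on real data the reconstruction is only approximate.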
[Figure: object types A, B and C; shared factor]
[Figure: object types A to G; many shared factors]
Given [formula not recovered from the slide] for object types A and B, find latent matrices $G_A$, $G_B$ and $S_{AB}$ that minimize [objective not recovered from the slide].
The problem is non-convex. The global optimum is unknown.
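Input: a set $\mathcal{R}$ of relation matrices $R_{ij}$; constraint matrices $\Theta^{(t)}$ for $t \in \{1, 2, \ldots, \max_i t_i\}$; ranks $k_1, k_2, \ldots, k_r$ ($i, j \in [r]$).
Output: matrix factors $S$ and $G$.
1) Initialize $G_i$ for $i = 1, 2, \ldots, r$.
2) Repeat until convergence:
   Update $S$:
   $S \leftarrow (G^T G)^{-1} G^T R G (G^T G)^{-1}$. (8)
   Set $G_i^{(e)} \leftarrow 0$ for $i = 1, 2, \ldots, r$.
   Set $G_i^{(d)} \leftarrow 0$ for $i = 1, 2, \ldots, r$.
   For each $R_{ij} \in \mathcal{R}$:
   $G_i^{(e)} \mathrel{+}= (R_{ij} G_j S_{ij}^T)^{+} + G_i (S_{ij} G_j^T G_j S_{ij}^T)^{-}$
   $G_i^{(d)} \mathrel{+}= (R_{ij} G_j S_{ij}^T)^{-} + G_i (S_{ij} G_j^T G_j S_{ij}^T)^{+}$
   $G_j^{(e)} \mathrel{+}= (R_{ij}^T G_i S_{ij})^{+} + G_j (S_{ij}^T G_i^T G_i S_{ij})^{-}$
   $G_j^{(d)} \mathrel{+}= (R_{ij}^T G_i S_{ij})^{-} + G_j (S_{ij}^T G_i^T G_i S_{ij})^{+}$ (10)
   For each $t$:
   $G_i^{(e)} \mathrel{+}= [\Theta_i^{(t)}]^{-} G_i$ for $i = 1, 2, \ldots, r$
   $G_i^{(d)} \mathrel{+}= [\Theta_i^{(t)}]^{+} G_i$ for $i = 1, 2, \ldots, r$ (11)
   Update $G$:
   $G \leftarrow G \circ \mathrm{Diag}\left(\sqrt{\frac{G_1^{(e)}}{G_1^{(d)}}}, \sqrt{\frac{G_2^{(e)}}{G_2^{(d)}}}, \ldots, \sqrt{\frac{G_r^{(e)}}{G_r^{(d)}}}\right)$, (12)
where $\circ$ denotes the Hadamard product. The $\sqrt{\cdot}$ and $\frac{\cdot}{\cdot}$ are entry-wise operations.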
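The update rules can be made concrete for the smallest possible case. The sketch below is a simplification, not the full algorithm: it assumes only two object types, a single relation matrix `R12`, and no constraint matrices, so step (11) is omitted. The function names, the toy sizes and the small constant `eps` (a numerical safeguard against division by zero that is not on the slide) are assumptions made for illustration.

```python
import numpy as np

def pos(A):
    """Entry-wise positive part: (|A| + A) / 2."""
    return (np.abs(A) + A) / 2.0

def neg(A):
    """Entry-wise negative part: (|A| - A) / 2."""
    return (np.abs(A) - A) / 2.0

def update_round(R12, G1, G2, eps=1e-12):
    """One round of updates (8), (10), (12) for a single relation R12."""
    # (8) restricted to one relation: least-squares backbone for the current G1, G2.
    S12 = np.linalg.pinv(G1) @ R12 @ np.linalg.pinv(G2).T

    # (10): accumulate the numerator (e) and denominator (d) terms.
    A = R12 @ G2 @ S12.T
    B = S12 @ G2.T @ G2 @ S12.T
    C = R12.T @ G1 @ S12
    D = S12.T @ G1.T @ G1 @ S12
    G1_e = pos(A) + G1 @ neg(B)
    G1_d = neg(A) + G1 @ pos(B)
    G2_e = pos(C) + G2 @ neg(D)
    G2_d = neg(C) + G2 @ pos(D)

    # (12): entry-wise multiplicative update; eps is an added safeguard.
    G1_new = G1 * np.sqrt(G1_e / (G1_d + eps))
    G2_new = G2 * np.sqrt(G2_e / (G2_d + eps))
    return G1_new, G2_new, S12

rng = np.random.default_rng(0)
R12 = rng.random((12, 9))      # hypothetical relation matrix
G1 = rng.random((12, 3))       # non-negative initialization
G2 = rng.random((9, 2))
for _ in range(20):            # fixed round count instead of a convergence test
    G1, G2, S12 = update_round(R12, G1, G2)
```

Because the update is multiplicative and both accumulated terms are non-negative, recipe matrices that start non-negative stay non-negative. The returned `S12` is the backbone computed from the recipe matrices as they were at the start of the round.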
Genetic screen: 50,000 clonal mutants
Genome: 12,000 genes; found: 7 genes; workload: 5 years; estimated: ~200 genes
Gram+ defective: swp1, gpi, nagB1
Gram- defective: clkB, spc3, alyL, nip7
Nasser et al (2013) Curr Biol
A data-driven approach
14 data sources, 4 Gram- seed genes, 9 candidate genes
Žitnik et al. PLoS Comp Bio 2015
[Figure: data fusion graph with nodes 1 to 10. Labels: Gene, Gene Ontology term, Phenotype Ontology term, PubMed identifier, MeSH descriptor, KEGG pathway, Reactome pathway, ABC family (Miranda et al. 2013), Development (Parikh et al. 2010), Bacterial RNA-seq (Nasser et al. 2013). Relation matrices: R1,2, R1,4, R1,5, R1,6, R1,7, R1,8, R1,9, R1,10, R2,3, R2,4, R5,4, R6,4, R6,5. Constraint matrix: Θ1.]
[Figure: latent chain Dicty genes (1), Drugs (2), Diseases (3), with factors G1, S1,2 and S2,3]
Profile matrix (Dicty genes x Diseases) = G1 x S1,2 x S2,3
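The chain product behind the profile matrix can be sketched as follows. This is an illustration only: the sizes, the random factors and the helper name `chain_profile` are hypothetical, and the reading of the slide as "profile matrix = G1 x S1,2 x S2,3" is an interpretation of the figure.

```python
import numpy as np

def chain_profile(G, *backbones):
    """Multiply a recipe matrix by a chain of backbone matrices."""
    profile = G
    for S in backbones:
        profile = profile @ S
    return profile

rng = np.random.default_rng(1)
n_genes, k1, k2, k3 = 50, 6, 5, 4    # hypothetical number of genes and ranks
G1 = rng.random((n_genes, k1))       # recipe matrix of Dicty genes
S12 = rng.random((k1, k2))           # backbone matrix genes-drugs
S23 = rng.random((k2, k3))           # backbone matrix drugs-diseases

# Each row profiles one gene in the latent space of diseases.
profile = chain_profile(G1, S12, S23)
```

A longer chain through the data fusion graph adds one backbone matrix per hop, so different chains give different profiles of the same genes.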
Latent chains
[Figure: gene prioritization. For each chain (i to ix), similarity scoring compares a candidate gene with the seed genes; similarity score aggregation across chains yields a scored candidate gene.]
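A minimal sketch of this scoring scheme follows. The slide does not say which similarity measure or aggregation is used, so cosine similarity and a plain mean across chains are assumptions made here for illustration, as are the function names and the tiny hand-made profile matrices.

```python
import numpy as np

def cosine_to_seeds(profile, seeds):
    """Mean cosine similarity of every gene's profile to the seed genes' profiles."""
    norms = np.linalg.norm(profile, axis=1, keepdims=True)
    unit = profile / np.where(norms == 0, 1.0, norms)   # avoid dividing by zero
    return (unit @ unit[seeds].T).mean(axis=1)

def prioritize(profiles, seeds):
    """Aggregate per-chain similarity scores (assumed: mean) and rank candidates."""
    per_chain = np.stack([cosine_to_seeds(P, seeds) for P in profiles])
    scores = per_chain.mean(axis=0)
    scores[seeds] = -np.inf           # seed genes are not candidates
    ranking = np.argsort(-scores)     # best-scoring candidate first
    return ranking, scores

# Toy profile matrices for two chains; rows are genes 0..3, genes 0 and 1 are seeds.
chain_a = np.array([[1.0, 0.0], [1.0, 0.1], [1.0, 0.05], [0.0, 1.0]])
chain_b = np.array([[0.0, 1.0, 0.0], [0.0, 1.0, 0.2], [0.0, 1.0, 0.1], [1.0, 0.0, 0.0]])
seeds = [0, 1]

ranking, scores = prioritize([chain_a, chain_b], seeds)
```

In this toy input gene 2 resembles the seeds in both chains and gene 3 does not, so gene 2 is ranked ahead of gene 3.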
[Figure: candidate genes: cf50-1, smlA, acbA, pirA, rps10, abpC, tirA, DDB_G0272184, pikB, vps46, pikA, swp1, ggtA, DDB_G0288519, pten, DDB_G0288551, tra2, DDB_G0286429, dscA-1, cinC, udpB, sfbA, modA, DDB_G0287399, prmt5]
Žitnik et al. PLoS Comp Bio 2015
[Figure: # of D. d cells (10 to 10^4) on Day 2 and Day 3 for AX4 and the mutants abpC–, modA–, cf50-1–, tirA–, acbA–, smlA–, pikA–/pikB–, pten–]
8/9 predictions correct!
Žitnik & Zupan IEEE TPAMI 2015
[Figure: gene-centric data fusion graph. Object types: Gene (1), GO Term (2), Experimental Condition (3), PMID (4), MeSH Descriptor (5), KEGG Pathway (6). Relation matrices: R12, R13, R14, R16, R42, R45, R62.]
[Figure: chemical-centric data fusion graph. Object types: Chemical (1), Pharmacologic Action (2), PMID (3), Depositor (4), Substructure Fingerprint (5), Depositor Category (6). Relation matrices: R12, R13, R14, R15, R46. Constraint matrix: Θ1.]

Prediction task               DFMF           MKL            RF             tri-SPMF
                              F1     AUC     F1     AUC     F1     AUC     F1     AUC
100 D. discoideum genes       0.799  0.801   0.781  0.788   0.761  0.785   0.731  0.724
1000 D. discoideum genes      0.826  0.823   0.787  0.798   0.767  0.788   0.756  0.741
Whole D. discoideum genome    0.831  0.849   0.800  0.821   0.782  0.801   0.778  0.787
Pharmacologic actions         0.663  0.834   0.639  0.811   0.643  0.819   0.641  0.810
Mining disease associations
Žitnik et al Scientific Reports 2013
Predicting drug toxicity
Žitnik & Zupan Systems Biomedicine 2014 (CAMDA Award)
Predicting gene functions
Žitnik & Zupan In PSB 2014
Predicting cancer survival
Žitnik & Zupan Systems Biomedicine 2015 (CAMDA Award)
Heterogeneous data domain space
Data view Objects of one type Model parameters
Context jumping in the latent space
threshold value
model parameters
RNA-seq count data: count transcripts mapped to genomic locations
Somatic mutations: no mutation, single base substitution, short indel
Nodes Edges Object weights Object-object interactions
is an object of interest
Objective function Latent factor reuse
Data following distribution Data following distribution
[Figure: neighborhood of an object; axis ticks from 0.0 to 0.5]
[Table: example count data for Sample 1 to Sample 4 (row and column layout not recoverable): 452 872 495 124 482 719 56 2 24 726 198 99 348 2 297 348 982 132 376 872 193 239 29 77 144 287 173 346 928 376 660]
Poisson distribution
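For readers less familiar with count models, the sketch below shows what it means to score count data under a Poisson distribution. It is a generic textbook illustration, not the model from the thesis: the function names are hypothetical, and the four counts are simply the first four values on the slide used as toy input, since the slide's table layout is not recoverable.

```python
import math

def poisson_log_pmf(k, rate):
    """log P(X = k) for X ~ Poisson(rate): k*log(rate) - rate - log(k!)."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def poisson_log_likelihood(counts, rate):
    """Log-likelihood of independent counts under one shared Poisson rate."""
    return sum(poisson_log_pmf(k, rate) for k in counts)

counts = [452, 872, 495, 124]            # toy input taken from the slide
rate_mle = sum(counts) / len(counts)     # the maximum-likelihood rate is the mean

ll_at_mle = poisson_log_likelihood(counts, rate_mle)
ll_elsewhere = poisson_log_likelihood(counts, 0.5 * rate_mle)
```

The log-likelihood is highest at the sample mean, which is why a Poisson model needs only one rate parameter per variable.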
[Figure: comparison of network inference methods against a baseline; axis from 0.0 to 0.8]
FuseNet - our method; LPGM - Allen & Liu 2014; NPN-Copula - Liu et al. 2009; log-GLASSO - Gallopin et al. 2013; GLASSO - Friedman et al. 2007
[Figure: network scores on an axis from 0.0 to 1.0. Mutation & RNA-seq: our method; RNA-seq: Allen & Liu 2014; Mutation: Jalali et al. 2011]
Higher score indicates a more informative network. Data from International Cancer Genome Consortium, BRCA.
Relation Heterogeneity
Markov network inference for mixed data; epistasis network inference; collective pairwise classification for multi-way data
Z & Z. JMLR 2012; Z & Z. Bioinformatics 2014 (in ISMB 2014); Z & Z. Bioinformatics 2015 (in ISMB 2015); Z & Z. In PSB 2016
Dual Heterogeneity
Network guided matrix completion; survival regression by data fusion
Z & Z. Systems Biomedicine 2015; Z & Z. In RECOMB 2014; Z & Z. Journal of Comp Bio 2015
Object Heterogeneity
Latent profile chaining
Z et al. PLOS Comp Bio 2015
Triple Heterogeneity
Collective matrix factorization
Z et al. Scientific Reports 2013; Z & Z. Systems Biomedicine 2014; Z & Z. In PSB 2014; Z & Z. IEEE TPAMI 2015
Exploring Heterogeneity
Sensitivity estimation using Fréchet derivatives
Best poster awards at BC^2 2015 (Basel, Switzerland); RECOMB 2014 (Pittsburgh, PA, USA)
I wonder what's next? All this excitement about data fusion! Gene function prediction, Disease associations, prediction
prioritization, cancer networks, disease progression, drug interactions, pharmacogenomics.
Blaz Zupan
Adam Kuspa, Edward Nam, Chris Dinh, Gad Shaulsky, Rafael Rosengarten, Mariko Kurasawa, Balaji Santhanam, Thomas Helleday, Jordi C. Puigvert, Jure Leskovec, Natasa Przulj, Vuk Janjic, Charles Boone, Mojca M. Usaj, Uroš Petrovic, Petra Kaferle