SLIDE 1

Solo or Ensemble? Choosing a CNN Architecture for Melanoma Classification

Fábio Perez¹, Sandra Avila², Eduardo Valle¹

¹Recod Lab., DCA, FEEC, University of Campinas (UNICAMP)
²Recod Lab., IC, University of Campinas (UNICAMP)

ISIC Workshop @ CVPR 2019

SLIDE 2

Convolutional Neural Networks

- SotA for most computer vision problems, including skin lesion analysis
- Used by all winning submissions in the ISIC Challenges of 2016, 2017, and 2018

SLIDE 3

CNN Architectures

AlexNet

SLIDE 4

CNN Architectures

AlexNet ZFNet VGG GoogLeNet

SLIDE 5

CNN Architectures

AlexNet, ZFNet, VGG, GoogLeNet, Inception, Inception-ResNet, ResNet, DenseNet, MobileNet, SqueezeNet, NASNet, PNASNet, DualPathNet, SE-Net, Xception, ResNeXt

SLIDE 6

CNN Architectures in the ISIC Challenges

2016: ResNet
2017: ResNet, Inception
2018: ResNet, Inception, DenseNet, ResNeXt, PNASNet, DPN, SENet...

SLIDE 7

Transfer Learning

- The most critical factor for model performance
- SotA for most computer vision problems, including skin lesion analysis
- Also used by all ISIC Challenges winners

Valle et al. (2017). Data, Depth, and Design: Learning Reliable Models for Melanoma Screening. https://arxiv.org/abs/1711.00441
Menegola et al. (2017). Knowledge Transfer for Melanoma Screening with Deep Learning. https://arxiv.org/abs/1703.07479

SLIDE 8

Do better ImageNet models transfer better?

Kornblith et al. (2018) https://arxiv.org/abs/1805.08974

Short answer: yes
- For multiple natural datasets
- With fine-tuning, fixed features, and random initialization

SLIDE 9

Do better ImageNet models transfer better?

Kornblith et al. (2018) arxiv.org/abs/1805.08974

SLIDE 10

How to predict model performance?

SLIDE 11

Experimental Design

9 architectures × 5 splits × 3 replicates = 135 experiments

SLIDE 12

Experimental Design

9 architectures × 5 splits × 3 replicates = 135 experiments

DenseNet, Dual Path Nets, Inception-v4, Inception-ResNet-v2, MobileNetV2, PNASNet, ResNet, SENet, Xception

SLIDE 13

Experimental Design

9 architectures × 5 splits × 3 replicates = 135 experiments

ISIC 2017: 1750 train / 500 validation / 500 test images
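The experiment grid above can be enumerated directly. A small sketch (architecture names are from the slide; the dictionary layout is an assumption, not the authors' code):

```python
# Enumerate the 9 x 5 x 3 experiment grid from the slides.
from itertools import product

ARCHITECTURES = [
    "DenseNet", "Dual Path Nets", "Inception-v4", "Inception-ResNet-v2",
    "MobileNetV2", "PNASNet", "ResNet", "SENet", "Xception",
]
SPLITS = range(5)       # 5 train/val/test splits of ISIC 2017
REPLICATES = range(3)   # 3 training runs per (architecture, split)

experiments = [
    {"arch": a, "split": s, "replicate": r}
    for a, s, r in product(ARCHITECTURES, SPLITS, REPLICATES)
]
print(len(experiments))  # 135
```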

SLIDE 14

Explored factors

Validation: loss, AUC, accuracy, sensitivity, specificity
Test: AUC, accuracy, sensitivity, specificity
Architectural: Acc@1 on ImageNet, # of parameters, date of publication
Training: # of epochs
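One way to ask "does this factor predict test performance?" is rank correlation between a factor and test AUC across models. A hedged sketch with made-up placeholder numbers (not the paper's data); the Spearman implementation assumes no ties:

```python
# Spearman rank correlation between a factor (e.g. validation AUC)
# and test AUC. Placeholder data, not results from the paper.
import numpy as np

def spearman(x, y):
    """Spearman correlation = Pearson correlation of ranks (no tie handling)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

val_auc  = np.array([0.80, 0.82, 0.78, 0.85, 0.83])  # placeholder
test_auc = np.array([0.79, 0.81, 0.77, 0.86, 0.82])  # placeholder
print(spearman(val_auc, test_auc))  # 1.0 (identical rank order)
```

In practice `scipy.stats.spearmanr` does the same job with tie correction; the point here is only the shape of the analysis.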

SLIDE 15

Results

SLIDE 16

Results

SLIDE 17

Results

(without MobileNetV2)

SLIDE 18

Kornblith et al. (2018):
➔ Multiple large datasets
➔ One factor: Acc@1
➔ Hyperparameter tuning

Ours:
➔ ISIC 2017 (2750 images)
➔ Multiple factors
➔ “Best-practice” hyperparameters

SLIDE 19

Kornblith et al. (2018):
➔ Multiple large datasets
➔ One factor: Acc@1
➔ Hyperparameter tuning
➔ One split per dataset
➔ No replicates

Ours:
➔ ISIC 2017 (2750 images)
➔ Multiple factors
➔ “Best-practice” hyperparameters
➔ Five splits
➔ Three replicates

SLIDE 20

Ensembles

SLIDE 21

Creating the Ensembles

- 9 architectures × 3 replicates = 27 models per split
- For each split, ensemble 1, 2, …, 27 models
- Two strategies for adding models: in random order, or models with best validation AUC first
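The two ordering strategies can be sketched as follows, assuming the ensemble is a plain average of per-image melanoma probabilities (a common choice; the slide does not specify the fusion rule). All numbers are synthetic:

```python
# Ensemble the first k models from a pool of 27, ordered either randomly
# or best-validation-AUC first. Synthetic data, not the paper's models.
import random

random.seed(0)
N_IMAGES = 4
# Pool of 27 models: (validation AUC, per-image melanoma probabilities)
pool = [(random.uniform(0.75, 0.90),
         [random.random() for _ in range(N_IMAGES)]) for _ in range(27)]

def ensemble(ordered_models, k):
    """Mean probability per image over the first k models in the order."""
    chosen = ordered_models[:k]
    return [sum(m[1][i] for m in chosen) / k for i in range(N_IMAGES)]

# Strategy 1: add models in random order
random_order = list(pool)
random.shuffle(random_order)
# Strategy 2: add models with best validation AUC first
best_first = sorted(pool, key=lambda m: m[0], reverse=True)

print(ensemble(best_first, 3))  # averaged probs from the 3 best models
```

Sweeping k from 1 to 27 under each ordering and scoring the averaged probabilities reproduces the kind of curves shown on the following results slides.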

SLIDE 22

Results

SLIDE 23

Results

(normalized)

SLIDE 24

Conclusions

- For SotA models, performance on ImageNet does not necessarily translate to performance on melanoma detection
- Validation metrics correlate with test metrics much better than validation loss does
- Ensembles are needed for stable SotA performance; large ensembles work well even when built by simply picking models at random from a pool of SotA individual models

SLIDE 25

Acknowledgments

RECOD Lab (reasoning for complex data) · eScience · UNICAMP

SLIDE 26

Thanks!
