SLIDE 1

Solo or Ensemble? Choosing a CNN Architecture for Melanoma Classification

Fábio Perez¹, Sandra Avila², Eduardo Valle¹

¹Recod Lab., DCA, FEEC, University of Campinas (UNICAMP)
²Recod Lab., IC, University of Campinas (UNICAMP)

ISIC Workshop @ CVPR 2019

SLIDE 2

Convolutional Neural Networks

- SotA for most computer vision problems, including skin lesion analysis
- Used by all winning submissions in the ISIC Challenges of 2016, 2017, and 2018

SLIDE 3

CNN Architectures

AlexNet

SLIDE 4

CNN Architectures

AlexNet ZFNet VGG GoogLeNet

SLIDE 5

CNN Architectures

AlexNet, ZFNet, VGG, GoogLeNet, Inception, Inception-ResNet, ResNet, DenseNet, MobileNet, SqueezeNet, NASNet, PNASNet, DualPathNet, SE-Net, Xception, ResNeXt

SLIDE 6

CNN Architectures in the ISIC Challenges

2016: ResNet
2017: ResNet, Inception
2018: ResNet, Inception, DenseNet, ResNeXt, PNASNet, DPN, SENet...

SLIDE 7

Transfer Learning

- The most critical factor for model performance
- SotA for most computer vision problems, including skin lesion analysis
- Also used by all ISIC Challenges winners

Valle et al. (2017). Data, Depth, and Design: Learning Reliable Models for Melanoma Screening. https://arxiv.org/abs/1711.00441
Menegola et al. (2017). Knowledge Transfer for Melanoma Screening with Deep Learning. https://arxiv.org/abs/1703.07479

SLIDE 8

Do better ImageNet models transfer better?

Kornblith et al. (2018) https://arxiv.org/abs/1805.08974

Short answer: yes
- For multiple natural datasets
- With fine-tuning, fixed features, and random initialization

SLIDE 9

Do better ImageNet models transfer better?

Kornblith et al. (2018) arxiv.org/abs/1805.08974

SLIDE 10

How to predict model performance?

SLIDE 11

Experimental Design

9 architectures × 5 splits × 3 replicates = 135 experiments

SLIDE 12

Experimental Design

9 architectures × 5 splits × 3 replicates = 135 experiments

DenseNet, Dual Path Nets, Inception-v4, Inception-ResNet-v2, MobileNetV2, PNASNet, ResNet, SENet, Xception

SLIDE 13

Experimental Design

9 architectures × 5 splits × 3 replicates = 135 experiments

ISIC 2017: 1750 train / 500 validation / 500 test images
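The experiment grid above can be enumerated directly. A small sketch (architecture names are from the slide; the dictionary layout is an assumption, not the authors' code):

```python
# Enumerate the 9 x 5 x 3 experiment grid from the slides.
from itertools import product

ARCHITECTURES = [
    "DenseNet", "Dual Path Nets", "Inception-v4", "Inception-ResNet-v2",
    "MobileNetV2", "PNASNet", "ResNet", "SENet", "Xception",
]
SPLITS = range(5)       # 5 train/val/test splits of ISIC 2017
REPLICATES = range(3)   # 3 training runs per (architecture, split)

experiments = [
    {"arch": a, "split": s, "replicate": r}
    for a, s, r in product(ARCHITECTURES, SPLITS, REPLICATES)
]
print(len(experiments))  # 135
```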

SLIDE 14

Explored factors

Validation: loss, AUC, accuracy, sensitivity, specificity
Test: AUC, accuracy, sensitivity, specificity
Architectural: Acc@1 on ImageNet, # of parameters, date of publication
Training: # of epochs
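One way to ask "does this factor predict test performance?" is rank correlation between a factor and test AUC across models. A hedged sketch with made-up placeholder numbers (not the paper's data); the Spearman implementation assumes no ties:

```python
# Spearman rank correlation between a factor (e.g. validation AUC)
# and test AUC. Placeholder data, not results from the paper.
import numpy as np

def spearman(x, y):
    """Spearman correlation = Pearson correlation of ranks (no tie handling)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

val_auc  = np.array([0.80, 0.82, 0.78, 0.85, 0.83])  # placeholder
test_auc = np.array([0.79, 0.81, 0.77, 0.86, 0.82])  # placeholder
print(spearman(val_auc, test_auc))  # 1.0 (identical rank order)
```

In practice `scipy.stats.spearmanr` does the same job with tie correction; the point here is only the shape of the analysis.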

SLIDE 15

Results

SLIDE 16

Results

SLIDE 17

Results

(without MobileNetV2)

SLIDE 18

Kornblith et al. (2018):
➔ Multiple large datasets
➔ One factor: Acc@1
➔ Hyperparameter tuning

Ours:
➔ ISIC 2017 (2750 images)
➔ Multiple factors
➔ “Best-practice” hyperparameters

SLIDE 19

Kornblith et al. (2018):
➔ Multiple large datasets
➔ One factor: Acc@1
➔ Hyperparameter tuning
➔ One split per dataset
➔ No replicates

Ours:
➔ ISIC 2017 (2750 images)
➔ Multiple factors
➔ “Best-practice” hyperparameters
➔ Five splits
➔ Three replicates

SLIDE 20

Ensembles

SLIDE 21

Creating the Ensembles

- 9 architectures × 3 replicates = 27 models per split
- For each split, ensemble 1, 2, …, 27 models
- Two strategies for adding models: in random order, or models with best validation AUC first
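The two ordering strategies can be sketched as follows, assuming the ensemble is a plain average of per-image melanoma probabilities (a common choice; the slide does not specify the fusion rule). All numbers are synthetic:

```python
# Ensemble the first k models from a pool of 27, ordered either randomly
# or best-validation-AUC first. Synthetic data, not the paper's models.
import random

random.seed(0)
N_IMAGES = 4
# Pool of 27 models: (validation AUC, per-image melanoma probabilities)
pool = [(random.uniform(0.75, 0.90),
         [random.random() for _ in range(N_IMAGES)]) for _ in range(27)]

def ensemble(ordered_models, k):
    """Mean probability per image over the first k models in the order."""
    chosen = ordered_models[:k]
    return [sum(m[1][i] for m in chosen) / k for i in range(N_IMAGES)]

# Strategy 1: add models in random order
random_order = list(pool)
random.shuffle(random_order)
# Strategy 2: add models with best validation AUC first
best_first = sorted(pool, key=lambda m: m[0], reverse=True)

print(ensemble(best_first, 3))  # averaged probs from the 3 best models
```

Sweeping k from 1 to 27 under each ordering and scoring the averaged probabilities reproduces the kind of curves shown on the following results slides.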

SLIDE 22

Results

SLIDE 23

Results

(normalized)

SLIDE 24

Conclusions

- For SotA models, performance on ImageNet does not necessarily translate to performance on melanoma detection
- Validation metrics correlate with test metrics much better than validation loss does
- Ensembles are needed for stable SotA performance; large ensembles work well even when built by simply picking models at random from a pool of SotA individual models

SLIDE 25

Acknowledgments

RECOD Lab (reasoning for complex data) · eScience · UNICAMP

SLIDE 26

Thanks!
