Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning
Tyler R. Scott (1,2), Karl Ridgeway (1,2), Michael C. Mozer (1,3)
(1) University of Colorado, Boulder; (2) Sensory Inc.; (3) Presently at Google Brain
Inductive Transfer
[Diagram: source domain data and target domain data → model; target domain input → model → target domain prediction]
Weight transfer (Yosinski et al., 2014): train on the source domain, then retrain the output and adapt the weights to the target domain.
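A minimal NumPy sketch of the "retrain output, adapt weights" recipe may make the idea concrete. It assumes a one-hidden-layer softmax classifier; the function names (`init_params`, `transfer_params`, `adapt_step`), the layer sizes, and the learning rate are illustrative assumptions, not the authors' setup.

```python
import numpy as np

def init_params(n_in, n_hidden, n_out, rng):
    """Random initial weights for a one-hidden-layer classifier (hypothetical sizes)."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def transfer_params(source_params, n_target_classes, rng):
    """Copy the hidden layer from the source model; re-initialize the output layer."""
    n_hidden = source_params["W1"].shape[1]
    return {
        "W1": source_params["W1"].copy(),
        "b1": source_params["b1"].copy(),
        "W2": rng.normal(0.0, 0.1, size=(n_hidden, n_target_classes)),
        "b2": np.zeros(n_target_classes),
    }

def forward(params, x):
    """Return hidden activations (ReLU) and softmax class probabilities."""
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])
    logits = h @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return h, p / p.sum(axis=1, keepdims=True)

def adapt_step(params, x, y, lr=0.1):
    """One cross-entropy gradient step on target data, updating all weights in place."""
    h, p = forward(params, x)
    n = x.shape[0]
    d_logits = p.copy()
    d_logits[np.arange(n), y] -= 1.0
    d_logits /= n
    grads = {"W2": h.T @ d_logits, "b2": d_logits.sum(axis=0)}
    dh = d_logits @ params["W2"].T
    dh[h <= 0.0] = 0.0  # ReLU gradient
    grads["W1"] = x.T @ dh
    grads["b1"] = dh.sum(axis=0)
    for name, grad in grads.items():
        params[name] -= lr * grad
    return params
```

In this sketch the source model would first be trained on source-domain data; `transfer_params` then swaps in a fresh output layer sized for the target classes, and repeated calls to `adapt_step` on the k labeled target examples per class adapt every layer.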
Deep metric learning (Ustinova & Lempitsky, 2016): train on the source domain to obtain an embedding for both source and target domains.
[Figure: histograms of within-class and between-class distances]
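The within-class versus between-class distance picture can be illustrated with a short NumPy sketch. It splits all pairwise Euclidean distances by whether the two points share a label, then reports how often a between-class pair is at least as close as a within-class pair. This is a plain empirical overlap estimate, assumed here for illustration; it is not the differentiable histogram loss of Ustinova & Lempitsky, and both function names are hypothetical.

```python
import numpy as np

def split_pairwise_distances(embeddings, labels):
    """Return (within-class, between-class) Euclidean distances over all unique pairs."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(labels), k=1)  # each unordered pair once
    same = labels[i] == labels[j]
    return dists[i, j][same], dists[i, j][~same]

def overlap_probability(within, between):
    """Fraction of (within, between) combinations where the between-class pair is no farther."""
    return float(np.mean(between[None, :] <= within[:, None]))
```

A well-separated embedding gives an overlap near 0, meaning the two distance histograms barely intersect; an overlap near 0.5 suggests the embedding carries little class information.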
Few-shot learning (Snell et al., 2017): train on the source domain to obtain an embedding for both source and target domains.
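As a rough sketch of how a prototypical network (Snell et al., 2017) classifies once an embedding is available: each class prototype is the mean of that class's embedded support examples, and a query is assigned to the class with the nearest prototype under squared Euclidean distance. The embedding network itself is omitted, so the inputs below are assumed to be already-embedded vectors, and the function names are hypothetical.

```python
import numpy as np

def class_prototypes(support_emb, support_labels):
    """Mean embedding per class, computed from the labeled support examples."""
    support_emb = np.asarray(support_emb, dtype=float)
    support_labels = np.asarray(support_labels)
    classes = np.unique(support_labels)
    protos = np.stack(
        [support_emb[support_labels == c].mean(axis=0) for c in classes]
    )
    return classes, protos

def classify(query_emb, classes, protos):
    """Assign each query to the class whose prototype is nearest (squared Euclidean)."""
    query_emb = np.asarray(query_emb, dtype=float)
    sq_dists = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[sq_dists.argmin(axis=1)], sq_dists
```

With k support examples for each of n target classes, `class_prototypes` yields n prototypes, and `classify` needs no further training on the target domain.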
Method: # labeled examples per target class (k)
Weight Transfer: > 100
Deep Metric Learning: agnostic
Few-Shot Learning: < 20
[Figure: plot built up over successive slides, adding one method at a time. Axis ticks: 1, 5, 10, 50, 100, 500, 1000 and 0.3 to 1.0. Legend: Baseline, Weight Adaptation, Prototypical Net, Histogram Loss, Adapted Prototypical Net, Adapted Histogram Loss]
[Figure: test accuracy of each method, one panel per data set and setting.
Panels with k on the x-axis: Isolet, n = 5; Isolet, n = 10; tinyImageNet, n = 5; tinyImageNet, n = 10; tinyImageNet, n = 50.
Panels with n on the x-axis: Omniglot, k = 1; Omniglot, k = 5; Omniglot, k = 10.
Legend: Adapted Histogram Loss, Adapted Prototypical Net, Histogram Loss, Prototypical Net, Weight Adaptation, Baseline]