Addressing Inter-Class Similarity in Fine-Grained Visual Classification (PowerPoint PPT Presentation)


SLIDE 1

Addressing Inter-Class Similarity in Fine-Grained Visual Classification

Abhimanyu Dubey

Collaborators: Nikhil Naik, Ryan Farrell, Otkrist Gupta, Pei Guo, Ramesh Raskar

SLIDE 2

Fine-Grained Visual Classification

  • Image classification with target categories that are visually very similar
  • Classification within subcategories of the same larger visual category
  • Examples:
  • Identifying the make and model of a vehicle
  • Identifying species categorizations among flora/fauna

Addressing Inter-Class Similarity in Fine-Grained Visual Classification | Abhimanyu Dubey

SLIDE 3

Fine-Grained Visual Classification

SLIDE 4

How is this different from large-scale classification?

SLIDE 5

How is this different from large-scale classification?

  • Foreground vs. background:
  • Diverse (large-scale) problems: background context can be relevant for foreground classification
  • e.g., we probably won’t come across an image of an airplane in someone’s living room
  • Fine-grained problems: the background usually varies independently of the foreground classification
  • e.g., many bird species can be photographed in the same setting

SLIDE 6

How is this different from large-scale classification?

  • Inter-class and intra-class diversity:
  • In large-scale classification, the average visual diversity between classes is typically much larger than the variation within samples of the same class
  • In fine-grained classification:
  • samples within a class can vary significantly based on background, pose, and lighting
  • samples across classes, on average, exhibit smaller diversity, due to the minute differences between the foreground objects

SLIDE 7

How is this different from large-scale classification?


[Figure: samples from different classes (Norfolk Terrier vs. Cairn Terrier) and samples from the same class (Labrador Retriever)]

SLIDE 8

How is this different from large-scale classification?

  • Data collection is harder:
  • These domains require expert knowledge to label
  • Datasets are smaller on average, often too little to train CNNs directly
  • Data is imbalanced:
  • Large-scale classification typically has a uniform distribution of labels in the training set
  • In FGVC some classes may be harder to photograph, giving a heavier tail in the label distribution

SLIDE 9

Approaches to Fine-Grained Visual Classification

  • Object parts are common across classes:
  • We can utilize object part annotations to remove unwanted context
  • This removes background sensitivity, and part-based pooling introduces pose invariance


[Cui et al, CVPR09]

SLIDE 10

Explicit Part Localization


[Cui et al, CVPR09]

SLIDE 11

Part Alignment via co-segmentation


[Krause et al, CVPR15]

SLIDE 12

Bilinear Pooling


[Lin et al., ICCV15; Cui et al., ICCV17; Gao et al., CVPR16]

SLIDE 13
Our Intuition:

  • Only experts among humans can identify a fine-grained class with certainty
  • Typically, we would expect confusion between classes during training, instead of memorizing each sample with complete confidence
  • For example:

[Figure: predicted distributions p(y|x) over classes dog1 … dogN for two samples]

SLIDE 14
Our Hypothesis:

  • Foreground objects in fine-grained samples do not have enough diversity between classes to enjoy generalization under strongly-discriminative training (minimizing cross-entropy on the training set)
  • Therefore, to reduce training error, such models likely memorize samples based on non-generalizable artefacts (background, distractor objects, occlusions)

SLIDE 15
A Solution: Make Training Less Discriminative [ECCV18]

  • Cross-entropy forces samples from different classes to have predictions very different from each other by the end of training
  • The most obvious fix: can we bring predictions from different classes closer together?

[Figure: predicted distributions p(y|x) for two samples, with a distance d(·, ·) between them]

SLIDE 16
Measuring Divergence Between Predictions

  • KL-divergence: the standard divergence between probability distributions
  • Problem: it is asymmetric
  • Solution: consider Jeffreys’ divergence, KL(p || q) + KL(q || p)
  • New problem: it grows arbitrarily large as predictions concentrate on one class
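As a quick sanity check of these properties, here is a minimal pure-Python sketch (the `eps` smoothing term is an implementation detail, not from the slides) showing that Jeffreys’ divergence is symmetric but grows without bound as two predictions concentrate on different classes:

```python
from math import log

def kl(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; eps guards against log(0)."""
    return sum(pi * log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def jeffreys(p, q):
    """Jeffreys' divergence: the symmetrized KL, KL(p || q) + KL(q || p)."""
    return kl(p, q) + kl(q, p)

# Plain KL is asymmetric; the symmetrized version is not:
p, q = [0.7, 0.2, 0.1], [0.3, 0.4, 0.3]
print(kl(p, q), kl(q, p))                 # two different values
print(abs(jeffreys(p, q) - jeffreys(q, p)) < 1e-9)  # True

# But it grows arbitrarily large as predictions concentrate on one class:
for c in (0.9, 0.99, 0.999):
    print(c, jeffreys([c, 1 - c], [1 - c, c]))
```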

SLIDE 17

Measuring divergence between predictions


SLIDE 18

Alternative: Euclidean Distance


  • Symmetric
  • Easy to compute
  • Well-behaved: bounded (at most √2 between two probability vectors)
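A small illustrative sketch (not from the slides) of why the Euclidean distance is a better-behaved choice, since it stays bounded no matter how confident the two predictions become:

```python
from math import sqrt

def euclidean(p, q):
    """Euclidean distance between two prediction vectors."""
    return sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# Symmetric by construction, and bounded for probability vectors:
# the worst case is two one-hot predictions on different classes,
# giving a distance of sqrt(2) regardless of the number of classes.
print(euclidean([1.0, 0.0], [0.0, 1.0]))          # sqrt(2) ~ 1.414
print(euclidean([0.999, 0.001], [0.001, 0.999]))  # still below sqrt(2)
```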
SLIDE 19

Training Pipeline: Pairwise Confusion

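A minimal sketch of how such a pairwise-confusion objective could look: cross-entropy on each sample of a pair, plus a Euclidean confusion penalty that pulls the two predicted distributions together. The weight `lam` and the exact pairing scheme are illustrative assumptions here, not the paper’s precise formulation:

```python
from math import log

def cross_entropy(p, label, eps=1e-12):
    """Cross-entropy of a predicted distribution p against an integer label."""
    return -log(p[label] + eps)

def confusion(p, q):
    """Squared Euclidean distance between two predicted distributions."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def pairwise_confusion_loss(p1, y1, p2, y2, lam=0.5):
    """Discriminative cross-entropy on both samples, plus a penalty that
    brings the two predictions closer together when the samples come from
    different classes (making training 'less discriminative')."""
    loss = cross_entropy(p1, y1) + cross_entropy(p2, y2)
    if y1 != y2:  # only confuse predictions across classes
        loss += lam * confusion(p1, p2)
    return loss

# A pair from different classes pays an extra penalty for divergent predictions:
p1, p2 = [0.9, 0.05, 0.05], [0.05, 0.9, 0.05]
print(pairwise_confusion_loss(p1, 0, p2, 1))
```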

SLIDE 20
Results: Fine-Grained Classification

  • We take baseline models and train them with our modified objective
  • “Basic” models (ResNets, Inception, DenseNets): average improvement of 4.5% in top-1 accuracy across 6 datasets
  • “Fine-grained” models (Bilinear Pooling, Spatial Transformer Networks): average improvement of 1.9% in top-1 performance across 6 datasets (a 4.5x larger relative improvement)
  • Training time and LR schedule are the same
  • Only minor variations in performance based on the choice of hyperparameter

SLIDE 21
Results: Large-Scale vs. Fine-Grained

  • We want to compare the importance of “low visual diversity” to the performance of weakly-discriminative training
  • We subsampled all the dog classes from ImageNet (116 classes, ~117K points) and compared performance on this subset with performance on a similarly sized random subsample of ImageNet
  • We obtained an average improvement of around 2.7% in top-1 on the dog subset, and only 0.18% on the random subset

SLIDE 22
Results: Robustness to Distractors

  • We compared the overlap between the heatmaps returned by Grad-CAM on our models and the true object annotations

SLIDE 23
A Lot Was Left to Be Desired:

  • Not really “principled”: many different formulations can be derived from the intuition of “weakly” discriminative training
  • How does this objective affect generalization performance?
  • Can we quantify the notion of visual diversity?

SLIDE 24
Entropy and Weakly Discriminative Training [NeurIPS18]

  • We would like the probability distribution learnt during training to have the weakest discriminatory power while still predicting the correct class
  • More formally, we would like the distribution p(y|x) to have the maximum entropy possible while ensuring that MODE(p(y|x)) = training label
  • However, directly enforcing a mode-alignment constraint is non-differentiable, so we relax this constraint and minimize cross-entropy instead
  • Additionally, we wish to maximize the entropy:

[Equation: a cross-entropy term combined with an entropy term]
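A minimal sketch of this maximum-entropy objective, written as cross-entropy minus a weighted entropy bonus; the weight `gamma` is an illustrative hyperparameter, not a value from the talk:

```python
from math import log

def entropy(p, eps=1e-12):
    """Shannon entropy of a predicted distribution."""
    return -sum(pi * log(pi + eps) for pi in p)

def max_entropy_loss(p, label, gamma=0.1, eps=1e-12):
    """Cross-entropy minus gamma * entropy: fit the correct class while
    keeping the predicted distribution as flat (high-entropy) as possible."""
    return -log(p[label] + eps) - gamma * entropy(p)

# With equal confidence on the correct class, the flatter prediction
# gets a lower loss thanks to the entropy bonus:
peaked = [0.5, 0.5, 0.0, 0.0]
flat = [0.5, 0.25, 0.125, 0.125]
print(max_entropy_loss(peaked, 0), max_entropy_loss(flat, 0))
```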

SLIDE 25

Maximum-Entropy and Generalization: Analysis

Preliminaries:

  • Since we are performing a fine-tuning task, we assume the pre-trained feature map ɸ(x) to be a multivariate mixture of m (unknown and possibly very large) Gaussians for any data distribution px:

SLIDE 26

Maximum-Entropy and Generalization: Analysis

Preliminaries:

  • Under this assumption, the variance of the feature space, given by the overall covariance matrix Σ*, characterizes the variation of the features under the data distribution

SLIDE 27

Maximum-Entropy and Generalization: Diversity


SLIDE 28

Maximum-Entropy and Generalization: Diversity

To see how well this measure of diversity characterizes fine-grained problems, we look at the spread of features projected onto the top-2 eigenvectors for the ImageNet training set (red) and the CUB-2011 training set (blue), using GoogLeNet pool5 features.

SLIDE 29

Maximum-Entropy and Generalization

Theorem 1: Under these assumptions, we can derive a weak generalization bound for a 1-layer neural network classifier over C classes with weights w and features ɸ(x). With high probability,

SLIDE 30
Maximum-Entropy and Generalization

  • Our generalization bound is a weak bound: the growth of the RHS is lower-bounded
  • It suggests that additionally minimizing the expected entropy over the data distribution can lead to better generalization performance
  • However, the bound requires the expected entropy over the data distribution, not the empirical entropy over the samples

SLIDE 31

Maximum-Entropy and Generalization

Theorem 2: We can quantify the confidence in the sample entropy (compared to the overall entropy) as a function of diversity. With high probability,

SLIDE 32

Maximum-Entropy: Experimental Results

Under the same setup as Pairwise Confusion, we evaluate the Maximum-Entropy formulation.

  • “Basic” models (ResNets, Inception, DenseNets): average improvement of 4.72% in top-1 accuracy across 5 datasets
  • “Fine-grained” models (Bilinear Pooling, Spatial Transformer Networks): average improvement of 2.04% in top-1 performance across 5 datasets

Similar to the previous setup, the gains are larger in fine-grained settings than in large-scale settings.

SLIDE 33

Maximum-Entropy: Entropy of the Outputs

SLIDE 34
Summary

  • Fine-grained recognition problems differ from conventional large-scale classification: larger intra-class and smaller inter-class diversity
  • We presented two techniques to regularize the inter-class similarities present in fine-grained recognition
  • Weakly discriminative training, through both pairwise and entropy-based penalizations, guarantees better best-case performance and exhibits larger empirical gains as well
  • In addition to improvements in performance, evidence suggests that our networks pay less attention (on average) to distractors

SLIDE 35

Q/A
