Siamese Neural Networks and Similarity Learning


SLIDE 1

Siamese Neural Networks and Similarity Learning

SLIDE 2

What can ML do for us?

  • Classification problem
  • Prof. Leal-Taixé and Prof. Niessner

[Figure: a cat image fed into a Neural Network, which outputs the label CAT]

SLIDE 3

What can ML do for us?

  • Classification problem on ImageNet with thousands of categories

SLIDE 4

What can ML do for us?

  • Performance on ImageNet

– Size of the blobs indicates the number of parameters

  • A. Canziani et al. „An Analysis of Deep Neural Network Models for Practical Applications“. arXiv:1605.07678, 2016

SLIDE 5

What can ML do for us?

  • Regression problem: pose regression

[Architecture: pretrained network → feature extraction y ∈ R^2048 → FC layers → linear regression of p ∈ R^3 and q ∈ R^4]

SLIDE 6

What can ML do for us?

  • Regression problem: bounding box regression
  • D. Held et al. „Learning to Track at 100 FPS with Deep Regression Networks“. ECCV 2016
SLIDE 7

What can ML do for us?

  • Third type of problems

[Figure: face image A classified as person, face, female; face image B classified as person, face, male]

SLIDE 8

What can ML do for us?

  • Third type of problems

[Figure: face images A and B] Is it the same person?

SLIDE 9

What can ML do for us?

  • Third type of problems: Similarity Learning

[Figure: face images A and B]

  • Comparison
  • Ranking
SLIDE 10

Similarity Learning: when and why?

  • Application: unlocking your iPhone with your face

Training

SLIDE 11

Similarity Learning: when and why?

  • Application: unlocking your iPhone with your face

Testing: A vs. B → YES / NO. Can be solved as a classification problem.

SLIDE 12

Similarity Learning: when and why?

  • Application: face recognition system so students can enter the exam room without the need for ID check

Training: Person 1, Person 2, Person 3

SLIDE 13

Similarity Learning: when and why?

  • Application: face recognition system so students can enter the exam room without the need for ID check

What is the problem with this approach? Scalability – we need to retrain our model every time a new student is registered to the course.

SLIDE 14

Similarity Learning: when and why?

  • Application: face recognition system so students can enter the exam room without the need for ID check

Can we train one model and use it every year?

SLIDE 15

Similarity Learning: when and why?

  • Learn a similarity function

[Figure: one pair A, B with a low similarity score; another pair A, B with a high similarity score]

SLIDE 16

Similarity Learning: when and why?

  • Learn a similarity function: testing

d(A, B) > τ → not the same person

SLIDE 17

Similarity Learning: when and why?

  • Learn a similarity function

d(A, B) < τ → same person
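The decision rule on the last two slides is just a threshold test on the embedding distance. A minimal pure-Python sketch; the 3-d embeddings and the threshold τ = 1.0 are illustrative assumptions, not values from the lecture:

```python
def euclidean_sq(fa, fb):
    # Squared L2 distance between two embeddings, d(A, B) = ||f(A) - f(B)||^2
    return sum((a - b) ** 2 for a, b in zip(fa, fb))

def same_person(fa, fb, tau=1.0):
    # Verification rule from the slides: "same person" iff d(A, B) < tau
    return euclidean_sq(fa, fb) < tau

f_A = [0.1, 0.9, 0.2]    # hypothetical 3-d embeddings (DeepFace uses 128 values)
f_B = [0.15, 0.85, 0.25]
f_C = [0.9, 0.1, 0.7]

print(same_person(f_A, f_B))  # close embeddings -> True
print(same_person(f_A, f_C))  # distant embeddings -> False
```

Note that τ is a free parameter: in practice it is tuned on a validation set to trade off false accepts against false rejects.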

SLIDE 18

Similarity learning

  • How do we train a network to learn similarity?

SLIDE 19

Siamese Neural Networks

SLIDE 20

Similarity learning

  • How do we train a network to learn similarity?

Taigman et al. „DeepFace: closing the gap to human level performance“. CVPR 2014

[Architecture: image A → CNN → FC → representation of my face in 128 values]

SLIDE 21

Similarity learning

  • How do we train a network to learn similarity?

Taigman et al. „DeepFace: closing the gap to human level performance“. CVPR 2014

[Figure: images A and B mapped to encodings f(A) and f(B)]

SLIDE 22

Similarity learning

  • Siamese network = shared weights

Taigman et al. „DeepFace: closing the gap to human level performance“. CVPR 2014

[Figure: A and B processed by the same network (shared weights) to produce f(A) and f(B)]

SLIDE 23

Similarity learning

  • Siamese network = shared weights
  • We use the same network to obtain an encoding of the image
  • To be done: compare the encodings

Taigman et al. „DeepFace: closing the gap to human level performance“. CVPR 2014

SLIDE 24

Similarity learning

  • Distance function: d(A, B) = ||f(A) − f(B)||²
  • Training: learn the parameters such that

– If A and B depict the same person, d(A, B) is small
– If A and B depict a different person, d(A, B) is large

Taigman et al. „DeepFace: closing the gap to human level performance“. CVPR 2014

SLIDE 25

Similarity learning

  • Loss function for a positive pair:

– If A and B depict the same person, d(A, B) should be small

L(A, B) = ||f(A) − f(B)||²

SLIDE 26

Similarity learning

  • Loss function for a negative pair:

– If A and B depict a different person, d(A, B) should be large
– Better use a Hinge loss:

L(A, B) = max(0, m² − ||f(A) − f(B)||²)

If two elements are already far away, do not spend energy in pushing them even further apart.

SLIDE 27

Similarity learning

  • Contrastive loss:

L(A, B) = y* ||f(A) − f(B)||² + (1 − y*) max(0, m² − ||f(A) − f(B)||²)

Positive pair (y* = 1): reduce the distance between the elements. Negative pair (y* = 0): push the elements further apart, up to a margin.
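The contrastive loss above fits in a few lines of plain Python. A hedged sketch with toy 2-d embeddings and an illustrative margin m = 2.0 (the lecture does not fix these values):

```python
def dist_sq(fa, fb):
    # Squared L2 distance ||f(A) - f(B)||^2
    return sum((a - b) ** 2 for a, b in zip(fa, fb))

def contrastive_loss(fa, fb, y, m=2.0):
    # y = 1 for a positive pair (same identity), y = 0 for a negative pair
    d = dist_sq(fa, fb)
    return y * d + (1 - y) * max(0.0, m ** 2 - d)

# Positive pair: the loss is the squared distance itself
print(contrastive_loss([0.0, 0.0], [1.0, 0.0], y=1))  # 1.0
# Negative pair closer than the margin: penalized by m^2 - d
print(contrastive_loss([0.0, 0.0], [1.0, 0.0], y=0))  # 3.0
# Negative pair already far away: the hinge gives zero loss
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], y=0))  # 0.0
```

The last case is exactly the hinge behavior from the previous slide: distant negatives contribute nothing, so the network does not waste capacity on them.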

SLIDE 28

Similarity learning

  • Training the siamese networks

– You can update the weights for each channel independently and then average them

  • This loss function allows us to learn to bring positive pairs together and push negative pairs apart

SLIDE 29

Triplet Loss

SLIDE 30

Triplet loss

  • Triplet loss allows us to learn a ranking. We want:

Schroff et al. „FaceNet: a unified embedding for face recognition and clustering“. CVPR 2015

Anchor (A), Positive (P), Negative (N)

||f(A) − f(P)||² < ||f(A) − f(N)||²

SLIDE 31

Triplet loss

  • Triplet loss allows us to learn a ranking

Schroff et al. „FaceNet: a unified embedding for face recognition and clustering“. CVPR 2015

||f(A) − f(P)||² < ||f(A) − f(N)||²
||f(A) − f(P)||² − ||f(A) − f(N)||² < 0
||f(A) − f(P)||² − ||f(A) − f(N)||² + m < 0, with margin m

SLIDE 32

Triplet loss

  • Triplet loss allows us to learn a ranking

Schroff et al. „FaceNet: a unified embedding for face recognition and clustering“. CVPR 2015

||f(A) − f(P)||² < ||f(A) − f(N)||²
||f(A) − f(P)||² − ||f(A) − f(N)||² + m < 0

L(A, P, N) = max(0, ||f(A) − f(P)||² − ||f(A) − f(N)||² + m)
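The final hinge form can be sketched directly in plain Python. The 2-d embeddings and the margin m = 0.2 are illustrative choices, not prescribed by the slides:

```python
def dist_sq(fa, fb):
    # Squared L2 distance ||f(A) - f(B)||^2
    return sum((a - b) ** 2 for a, b in zip(fa, fb))

def triplet_loss(fa, fp, fn, m=0.2):
    # Hinge on the ranking constraint: the positive should be closer
    # to the anchor than the negative, by at least the margin m
    return max(0.0, dist_sq(fa, fp) - dist_sq(fa, fn) + m)

anchor   = [0.0, 0.0]
positive = [1.0, 0.0]   # d(A, P) = 1.0
negative = [0.0, 2.0]   # d(A, N) = 4.0
print(triplet_loss(anchor, positive, negative))  # constraint satisfied -> 0.0
print(triplet_loss(anchor, negative, positive))  # violated -> 4.0 - 1.0 + 0.2
```

Swapping the roles of positive and negative in the second call shows how a violated ranking produces a positive loss that gradient descent can then reduce.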

SLIDE 33

Triplet loss

  • Hard negative mining: training with hard cases
  • Train for a few epochs
  • Choose the hard cases where d(A, P) ≈ d(A, N)
  • Train with those to refine the distance learned

L(A, P, N) = max(0, ||f(A) − f(P)||² − ||f(A) − f(N)||² + m)
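A hedged sketch of the selection step: after a few epochs, keep only the negatives whose distance to the anchor is comparable to d(A, P), i.e. those that still violate the margin. The toy embeddings and margin are illustrative:

```python
def dist_sq(fa, fb):
    # Squared L2 distance ||f(A) - f(B)||^2
    return sum((a - b) ** 2 for a, b in zip(fa, fb))

def hard_negatives(anchor, positive, negatives, m=0.2):
    # Keep only the "hard" negatives: those roughly as close to the anchor
    # as the positive, i.e. the ones that still give a non-zero triplet loss
    d_ap = dist_sq(anchor, positive)
    return [n for n in negatives if d_ap - dist_sq(anchor, n) + m > 0]

anchor, positive = [0.0, 0.0], [1.0, 0.0]          # d(A, P) = 1.0
negatives = [[0.0, 1.0], [5.0, 0.0], [0.9, 0.1]]   # d(A, N) = 1.0, 25.0, ~0.82
print(hard_negatives(anchor, positive, negatives))  # the easy negative [5.0, 0.0] is dropped
```

Only the surviving negatives would be fed back into training to refine the learned distance.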

SLIDE 34

Triplet loss

[Figure: training triplets, each consisting of an Anchor, a Positive and a Negative]

SLIDE 35

Triplet loss: test time

  • Just do nearest neighbor search!
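At test time there is no loss at all. A sketch of the retrieval step; the gallery embeddings and labels (person_1, …) are hypothetical:

```python
def dist_sq(fa, fb):
    # Squared L2 distance ||f(A) - f(B)||^2
    return sum((a - b) ** 2 for a, b in zip(fa, fb))

def nearest_neighbor(query, gallery):
    # gallery: (label, embedding) pairs computed once, offline.
    # At test time we only embed the query and return the closest identity,
    # so enrolling a new person never requires retraining.
    return min(gallery, key=lambda item: dist_sq(query, item[1]))[0]

gallery = [("person_1", [0.0, 1.0]),
           ("person_2", [1.0, 0.0]),
           ("person_3", [0.7, 0.7])]
print(nearest_neighbor([0.9, 0.1], gallery))  # person_2
```

This is exactly why similarity learning solves the scalability problem from the exam-room example: adding a student only adds a row to the gallery.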

SLIDE 36

Triplet Loss Challenges

  • Random sampling does not work: the number of possible triplets is O(n³), so the network would need to be trained for a very long time.
  • Even with hard negative mining, there is the risk of getting stuck in local minima.

SLIDE 37

Several approaches to improve similarity learning

SLIDE 38

Improving similarity learning

  • Loss:

– Contrastive vs. triplet loss

  • Sampling:

– Choosing the best triplets to train with, sample the space wisely = diversity of classes + hard cases

  • Ensembles:

– Why not use several networks, each of them trained with a subset of triplets?

  • Can we use a classification loss for similarity learning?

SLIDE 39

Losses: interesting works

  • Wang et al., Deep metric learning with angular loss, ICCV 2017
  • Yu et al., Correcting the triplet selection bias for triplet loss, ECCV 2018

SLIDE 40

Improving similarity learning

  • Loss:

– Contrastive vs. triplet loss

  • Sampling:

– Choosing the best triplets to train with, sample the space wisely = diversity of classes + hard cases

  • Ensembles:

– Why not use several networks, each of them trained with a subset of triplets?

  • Can we use a classification loss for similarity learning?

SLIDE 41

Sampling: Hierarchical Triplet Loss

  • Build a hierarchical tree where the leaves of the tree represent the image classes. Recursively merge them until you reach the root node.

[Figure: leaves Class 1 … Class 8 merged recursively into a tree]

Ge et al., Deep Metric Learning with Hierarchical Triplet Loss, ECCV 2018

SLIDE 42

HTL: building the tree

  • In order to create the tree, we first define a distance between classes. Intuition: if the distance between two classes is small, they will be merged in the next level of the tree.

[Equation not extracted: the class distance is computed from the deep features of images i and j, normalized by the cardinality of classes p and q (how many samples we have for each class)]

SLIDE 43

HTL: finding the anchors

  • Randomly select l’ nodes at the 0th level

– This is done to preserve class diversity in the mini-batch

[Figure: leaves Class 1 … Class 8]

SLIDE 44

HTL: finding the anchors

  • Randomly select l’ nodes at the 0th level

– This is done to preserve class diversity in the mini-batch

  • m−1 nearest classes at the 0th level are selected for each of the l’ nodes, based on the distance in feature space.

SLIDE 45

HTL: finding the anchors

  • Randomly select l’ nodes at the 0th level

– This is done to preserve class diversity in the mini-batch

  • m−1 nearest classes at the 0th level are selected for each of the l’ nodes, based on the distance in feature space:

– We want to encourage the model to learn discriminative features from the visually similar classes.

SLIDE 46

HTL: finding the anchors

  • Randomly select l’ nodes at the 0th level

– This is done to preserve class diversity in the mini-batch

  • m−1 nearest classes at the 0th level are selected for each of the l’ nodes, based on the distance in feature space:

– We want to encourage the model to learn discriminative features from the visually similar classes.

  • t images per class are randomly collected

→ t · m · l’ images in the mini-batch
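The sampling recipe above can be sketched in a few lines. Everything here (the toy classes, the stand-in class distance, and the values l' = m = t = 2) is illustrative, not from the paper:

```python
import random

def htl_minibatch(features_by_class, class_dist, l_prime=2, m=2, t=2, seed=0):
    # Sketch of the HTL sampling above: pick l' random anchor classes,
    # add the m-1 nearest classes to each, then draw t images per chosen class.
    # features_by_class: {class_id: [images]}, class_dist: f(c1, c2) -> distance.
    rng = random.Random(seed)
    classes = list(features_by_class)
    anchors = rng.sample(classes, l_prime)
    batch = []
    for a in anchors:
        neighbors = sorted((c for c in classes if c != a),
                           key=lambda c: class_dist(a, c))
        for c in [a] + neighbors[: m - 1]:
            batch += rng.sample(features_by_class[c], t)
    return batch  # t * m * l' images

# Hypothetical toy data: class ids 0..3, "images" are just tagged strings
data = {c: [f"img{c}_{k}" for k in range(5)] for c in range(4)}
dist = lambda a, b: abs(a - b)  # stand-in for the feature-space class distance
batch = htl_minibatch(data, dist)
print(len(batch))  # t * m * l' = 2 * 2 * 2 = 8
```

In the real method the class distance comes from the hierarchical tree built on deep features; the toy `dist` only stands in for that.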

SLIDE 47

HTL: loss formulation

The loss is a triplet loss summed over all the triplets, but the margin depends on the distances computed on the hierarchical tree. The idea is that the margin can adapt to class distributions and to differences of the samples within the classes.

SLIDE 48

Sampling: interesting works

  • Manmatha et al., Sampling matters for deep metric learning, ICCV 2017 – original sampling method
  • Xu et al., Deep asymmetric metric learning via rich relationship mining, CVPR 2019
  • Duan et al., Deep embedding learning with discriminative sampling policy, CVPR 2019
  • Wang et al., Ranked list loss for deep metric learning, CVPR 2019
  • Wang et al., Multi-similarity loss with general pair weighting for deep metric learning, CVPR 2019 – best performance

SLIDE 49

Improving similarity learning

  • Loss:

– Contrastive vs. triplet loss

  • Sampling:

– Choosing the best triplets to train with, sample the space wisely = diversity of classes + hard cases

  • Ensembles:

– Why not use several networks, each of them trained with a subset of triplets?

  • Can we use a classification loss for similarity learning?

SLIDE 50

Ensembles

  • Idea: divide the space into K clusters, and have one learner per cluster.

Divide → Conquer

Sanakoyeu et al., Divide and Conquer the Embedding Space for Metric Learning, CVPR 2019

SLIDE 51

Ensembles: Divide and Conquer

1) Cluster the embedding space in K clusters using K-means.
2) Build K independent learners (fully connected layers) on top of the CNN, where each learner corresponds to one cluster – DIVIDE.
3) Until convergence, sample each mini-batch from one random cluster, and update only its corresponding learner.
4) After the network has converged, finetune using all learners at the same time – CONQUER.
5) Go back to (1) and repeat several times.
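A hedged sketch of steps 1–3: partition the samples, then draw each mini-batch from a single random cluster, so only that cluster's learner would be updated. The one-pass "K-means" and the toy embeddings are simplifications of the real method:

```python
import random

def divide_step(embeddings, k, rng):
    # Stand-in for K-means: assign each sample to the nearest of k randomly
    # chosen "centroids" (a single assignment pass, not full K-means).
    centroids = rng.sample(embeddings, k)
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    clusters = [[] for _ in range(k)]
    for i, e in enumerate(embeddings):
        clusters[min(range(k), key=lambda c: dist(e, centroids[c]))].append(i)
    return clusters

def training_step(clusters, rng, batch_size=2):
    # Sample one mini-batch from a single random cluster; in the real method
    # only that cluster's learner (FC head) receives a gradient update.
    c = rng.randrange(len(clusters))
    batch = rng.sample(clusters[c], min(batch_size, len(clusters[c])))
    return c, batch

rng = random.Random(0)
embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]]
clusters = divide_step(embeddings, k=2, rng=rng)
learner_id, batch = training_step(clusters, rng)
print(learner_id, batch)
```

The "conquer" phase would then finetune with all learners jointly, which this sketch omits.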

SLIDE 52

Ensembles: interesting works

  • Opitz et al., BIER – Boosting Independent Embeddings Robustly, ICCV 2017 – train K independent networks.
  • Elezi et al., The Group Loss for Metric Learning, arXiv 2020 – train K independent networks and concatenate their features.
  • Yuan et al., Hard-Aware Deeply Cascaded Embedding, CVPR 2017 – concatenate features from different levels of the network.
  • Wang et al., Ranked list loss for deep metric learning, CVPR 2019 – concatenate features from different levels of the network.
  • Kim et al., Attention-based Ensemble for Deep Metric Learning, ECCV 2018 – use an attention mechanism such that each learner looks at different parts of the object.

SLIDE 53

Improving similarity learning

  • Loss:

– Contrastive vs. triplet loss

  • Sampling:

– Choosing the best triplets to train with, sample the space wisely = diversity of classes + hard cases

  • Ensembles:

– Why not use several networks, each of them trained with a subset of triplets?

  • Can we use a classification loss for similarity learning?

SLIDE 54

Classification loss: interesting works

  • Movshovitz-Attias et al., No Fuss Distance Metric Learning using Proxies, ICCV 2017 – learn “proxy” samples to keep as positives and negatives in the mini-batch.
  • Teh et al., ProxyNCA++: Revisiting and Revitalizing Proxy Neighborhood Component Analysis, arXiv 2020 – a better way of using proxies, some of the best results in the field.
  • Qian et al., SoftTriple Loss: Deep Metric Learning Without Triplet Sampling, ICCV 2019 – using multiple centers per class.
  • Elezi et al., The Group Loss for Deep Metric Learning, arXiv 2020 – refine the softmax probabilities via a dynamical system for a better feature embedding.

SLIDE 55

Some results

Jacob et al., Metric Learning With HORDE: High-Order Regularizer for Deep Embeddings, ICCV 2019

SLIDE 56

Some results

SLIDE 57

So, which model to use?

[Results on CUB and CARS] When trained correctly (and using the same backbone, same embedding space and no extra tricks to boost the results), the difference in accuracy between different models is not that large…

Musgrave et al., A Metric Learning Reality Check, arXiv 2020

SLIDE 58

Tips and tricks

  • Simple baselines (contrastive loss, triplet loss and classification loss) actually perform well when trained correctly.
  • Sampling is as important as the choice of loss function. Every method can be boosted by devising an intelligent sampling strategy.
  • Some tricks may further improve the results (temperature for the softmax, freezing batch-norm layers, using multiple centers per class, etc.).

SLIDE 59

Tips and tricks

  • Even naive ensembles may (significantly) boost performance.
  • Good out-of-the-box choices: Proxy-NCA and SoftTriple Loss → they perform well, do not require a massive hyperparameter search, and have code online!
  • Contrastive loss and triplet loss give a similarity score in addition to the feature embedding.
  • Stronger backbone choices (e.g. DenseNet) further improve the results.

SLIDE 60

Applications in vision

SLIDE 61

Siamese network on MNIST

SLIDE 62

Establishing image correspondences

Image from University of Washington

SLIDE 63

Establishing image correspondences

Image from University of Washington

SLIDE 64

Establishing image correspondences

  • Used in a wide range of Computer Vision applications

– Image stitching or image alignment
– Object recognition
– 3D reconstruction
– Object tracking
– Image retrieval

  • Many of these applications are now targeted directly with Neural Networks, as we will see in the course

SLIDE 65

Establishing image correspondences

  • Classic method pipeline

– Extract manually designed feature descriptors

  • Harris, SIFT, SURF: most are based on image gradients
  • They suffer under extreme illumination or viewpoint changes
  • Slow to extract dense features

– Match descriptors from the two images

  • Many descriptors are similar; one needs to filter out possible double matches and keep only reliable ones.

Sameer Agarwal et al. „Building Rome in a Day“. ICCV 2009

SLIDE 66

Establishing image correspondences

  • End-to-end learning for patch similarity
  • Fast, to allow dense extraction
  • Invariant to a wide array of transformations (illumination, viewpoint)

Siamese network

  • S. Zagoruyko and N. Komodakis. „Learning to Compare Image Patches via Convolutional Neural Networks“. CVPR 2015

SLIDE 67

Establishing image correspondences

  • Classic Siamese architecture

– Shared layers

  • Simulate the feature extraction

– One decision layer

  • Simulates the matching

  • S. Zagoruyko and N. Komodakis. „Learning to Compare Image Patches via Convolutional Neural Networks“. CVPR 2015

SLIDE 68

Image retrieval

Radenovic et al. „Fine-tuning CNN Image Retrieval with No Human Annotation“. TPAMI 2018

SLIDE 69

Unsupervised learning

  • Learning from videos

– Tracking provides the supervision
– Use the tracked patches as positive samples
– Extract random patches as negative samples

Wang and Gupta. „Unsupervised Learning of Visual Representations using Videos“. ICCV 2015

SLIDE 70

Optical flow

  • Input: 2 consecutive images (e.g. from a video)
  • Output: displacement of every pixel from image A to image B
  • Results in the “perceived” 2D motion, not the real motion of the object

SLIDE 71

Optical flow

SLIDE 72

Optical flow

SLIDE 73

Optical flow with CNNs

  • End-to-end supervised learning of optical flow
  • P. Fischer et al. „FlowNet: Learning Optical Flow With Convolutional Networks“. ICCV 2015
SLIDE 74

Optical flow with CNNs

  • P. Fischer et al. „FlowNet: Learning Optical Flow With Convolutional Networks“. ICCV 2015
SLIDE 75

FlowNet: architecture 1

  • Stack both images → input is now 2 × RGB = 6 channels
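A sketch of this input construction with toy nested-list "images" (real implementations stack tensors along the channel axis):

```python
def stack_images(img_a, img_b):
    # FlowNet architecture-1-style input: concatenate the two RGB frames
    # along the channel axis, so each pixel carries 2 x 3 = 6 values.
    assert len(img_a) == len(img_b) and len(img_a[0]) == len(img_b[0])
    return [[pa + pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]

# Hypothetical 1x2 images; each pixel is an [R, G, B] list
frame_t  = [[[255, 0, 0], [0, 255, 0]]]
frame_t1 = [[[250, 5, 0], [0, 250, 5]]]
stacked = stack_images(frame_t, frame_t1)
print(len(stacked[0][0]))  # 6 channels per pixel
```

A single CNN then processes the 6-channel input and must discover on its own how to compare the two frames.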
SLIDE 76

FlowNet: architecture 2

  • Siamese architecture
SLIDE 77

FlowNet: architecture 2

  • Two key design choices

How to combine the information from both images?

SLIDE 78

Correlation layer

  • Multiplies a feature vector with another feature vector

Fixed operation. No learnable weights!
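As the slide says, the correlation of two feature vectors is a fixed dot product with no learnable weights. A minimal sketch with hypothetical feature vectors from the two Siamese branches:

```python
def correlation(f1, f2):
    # Correlation of two feature vectors: a plain dot product.
    # Nothing is learned here; the layer is a fixed operation.
    return sum(a * b for a, b in zip(f1, f2))

feat_a = [0.5, 1.0, -0.5]   # hypothetical features from branch 1
feat_b = [0.5, 1.0, -0.5]   # identical features from branch 2
feat_c = [-0.5, 0.0, 1.0]   # a mismatched location in branch 2
print(correlation(feat_a, feat_b))  # aligned features -> high score (1.5)
print(correlation(feat_a, feat_c))  # mismatched features -> low score (-0.75)
```

In FlowNet this dot product is evaluated between a patch in one feature map and all patches in a neighborhood of the other, producing a matching-cost volume.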

SLIDE 79

Correlation layer

  • The matching score represents how correlated these two feature vectors are

SLIDE 80

Correlation layer

  • Useful for finding image correspondences
  • I. Rocco et al. „Convolutional neural network architecture for geometric matching“. CVPR 2017

Find a transformation from image A to image B

SLIDE 81

Correlation layer

  • I. Rocco et al. „Convolutional neural network architecture for geometric matching“. CVPR 2017
SLIDE 82

Siamese Neural Networks and Similarity Learning

SLIDE 83

Further references

  • Savinov et al. „Quad-networks: unsupervised learning to rank for interest point detection“. CVPR 2017
  • Ristani & Tomasi. „Features for Multi-Target Multi-Camera Tracking and Re-Identification“. CVPR 2018
  • Chen et al. „Beyond triplet loss: a deep quadruplet network for person re-identification“. CVPR 2017