CNN Based Pipeline for Optical Flow - PowerPoint PPT Presentation

CNN Based Pipeline for Optical Flow. Tal Schuster, June 2017. Based on: PatchBatch: a Batch Augmented Loss for Optical Flow (Gadot, Wolf), CVPR 2016; Optical Flow Requires Multiple Strategies (but only one network) (Schuster, Wolf, Gadot), CVPR 2017.


SLIDE 1

CNN Based Pipeline for Optical Flow

Based on: PatchBatch: a Batch Augmented Loss for Optical Flow (Gadot, Wolf), CVPR 2016; Optical Flow Requires Multiple Strategies (but only one network) (Schuster, Wolf, Gadot), CVPR 2017

Tal Schuster, June 2017

SLIDE 2

Overview

Goal: get SOTA results on the main optical flow benchmarks. This was done by:

  • Constructing a Deep Learning based pipeline (modular)
  • Architectures exploration
  • Loss function augmentations
  • Per-batch statistics
  • Learning methods

SLIDE 3

Problem Definition

SLIDE 4

Problem Definition - Optical Flow

Given 2 images, compute a dense optical flow field describing the motion between them (i.e. pure optical flow): 2 × (h, w, 1 or 3) → (h, w, 2), where:

  • h - image height, w - image width
  • (h,w,1 or 3) - a grayscale or RGB image
  • (h,w,2) - a 3D tensor describing, for each point (x,y) in image A, a 2D flow vector: (Δx, Δy)

Accuracy measures:

  • Based on GT (synthetic or physically obtained) - KITTI, MPI-Sintel
  • F_err - % of pixels with Euclidean error > z pixels (usually z=3)
  • Avg_err - mean of Euclidean errors over all pixels (both measures are sketched below)
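
As a concrete reading of these two measures, here is a minimal numpy sketch (function and argument names are mine, not from the slides):

```python
import numpy as np

def flow_errors(flow_pred, flow_gt, valid=None, thresh=3.0):
    """Compute Avg_err and F_err between predicted and ground-truth flow.

    flow_pred, flow_gt: (h, w, 2) arrays of (dx, dy) vectors.
    valid: optional (h, w) boolean mask (e.g. ~50% GT coverage on KITTI).
    thresh: Euclidean-error threshold z in pixels (usually 3).
    """
    err = np.linalg.norm(flow_pred - flow_gt, axis=-1)  # per-pixel Euclidean error
    if valid is not None:
        err = err[valid]                                # score only covered pixels
    avg_err = err.mean()                                # Avg_err
    f_err = 100.0 * (err > thresh).mean()               # F_err, in percent
    return avg_err, f_err
```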

SLIDE 5

DB - KITTI 2012


  • LIDAR based
  • ~50% coverage
SLIDE 6

DB - KITTI 2012

SLIDE 7

DB - KITTI 2015

SLIDE 8

DB - MPI-Sintel


  • Synthetic (computer graphics)
  • ~100% coverage
SLIDE 9

Solutions

Traditional computer vision methods

  • Global constraints (Horn-Schunck, 1981) – brightness constancy + smoothness assumption
  • Local constraints (Lucas-Kanade, 1981)

Main disadvantage – small objects and fast movements

Descriptor-based methods

  • Sparse to dense (Brox-Malik, 2010)

Descriptors: SIFT, SURF, HOG, DAISY, etc. (handcrafted)

CNN methods

  • End-to-end – FlowNet (Fischer et al., 2015)

SLIDE 10

Reference Work – Zbontar & LeCun, 2015

Solving stereo matching vs. optical flow. Classification-based vs. metric learning. To compute the classification, the network needs to observe both patches simultaneously.

SLIDE 11

The PatchBatch pipeline

SLIDE 12

PatchBatch - DNN

  • Siamese DNN - i.e., tied weights due to symmetry
  • Leaky ReLU
  • Should be FAST: matching function = L2, conv only, independent descriptor computation
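
A minimal PyTorch sketch of this design (layer sizes and names are placeholders, not the actual PatchBatch architecture): one conv-only tower is applied to both patches, which ties the weights by construction, and matching is a plain L2 distance between the two descriptors:

```python
import torch
import torch.nn as nn

class SiameseDescriptor(nn.Module):
    """One conv-only tower; applying it to both patches ties the weights."""
    def __init__(self, in_ch=1, dim=64):
        super().__init__()
        self.tower = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3), nn.LeakyReLU(0.1),
            nn.Conv2d(32, 64, 3), nn.LeakyReLU(0.1),
            nn.Conv2d(64, dim, 3),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (batch, dim) descriptor
        )

    def forward(self, patch_a, patch_b):
        da = self.tower(patch_a)                # descriptors are computed
        db = self.tower(patch_b)                # independently per patch
        return torch.norm(da - db, dim=1)       # matching function = L2 distance
```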

SLIDE 13

PatchBatch - Overall Pipeline

PatchMatch - Barnes et al., 2010; EpicFlow - Revaud et al., 2015


(Normalized) Keeping only large connected components

SLIDE 14

PatchBatch - ANN

PatchMatch:

(Descriptors, matching function) → ANN. ANN and not ENN: O(N²) → O(N·log N). 2 iterations are enough.

1. Initialization (random)
2. Propagation: f(x,y) = argmin{ D(f(x,y)), D(f(x-1,y)), D(f(x,y-1)) } (+1 on even iterations)
3. Search: u_i = v_0 + w·α^i·R_i, where R_i ∈ [-1,1] × [-1,1], w - max search radius, α - step ratio (= 1/2)
4. Return to step 2
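
A schematic numpy rendering of the loop above (a sketch under simplifying assumptions - per-pixel descriptors, squared-L2 cost, names mine):

```python
import numpy as np

def patchmatch(desc_a, desc_b, iters=2, w=None, alpha=0.5):
    """Approximate NN field f: pixels of A -> offsets into B, via PatchMatch.

    desc_a, desc_b: (h, w, d) descriptor maps.
    """
    h, wid, _ = desc_a.shape
    w = w or max(h, wid)                       # max search radius
    rng = np.random.default_rng(0)
    f = rng.integers(-8, 9, size=(h, wid, 2))  # 1. random initialization

    def D(y, x, off):                          # matching cost at candidate offset
        ty = np.clip(y + off[0], 0, h - 1)
        tx = np.clip(x + off[1], 0, wid - 1)
        return np.sum((desc_a[y, x] - desc_b[ty, tx]) ** 2)

    for it in range(iters):
        s = 1 if it % 2 == 0 else -1           # scan direction flips each pass
        ys = range(h) if s == 1 else range(h - 1, -1, -1)
        xs = range(wid) if s == 1 else range(wid - 1, -1, -1)
        for y in ys:
            for x in xs:
                # 2. propagation: adopt a neighbor's offset if it is cheaper
                cands = [f[y, x]]
                if 0 <= y - s < h:
                    cands.append(f[y - s, x])
                if 0 <= x - s < wid:
                    cands.append(f[y, x - s])
                f[y, x] = min(cands, key=lambda o: D(y, x, o))
                # 3. random search: u_i = v0 + w * alpha^i * R_i, R_i in [-1,1]^2
                r, v0 = float(w), f[y, x].astype(float)
                while r >= 1:
                    cand = (v0 + r * rng.uniform(-1, 1, 2)).astype(int)
                    if D(y, x, cand) < D(y, x, f[y, x]):
                        f[y, x] = cand
                    r *= alpha                 # shrink the search radius
    return f
```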

SLIDE 15

PatchBatch - Post-Processing

EpicFlow (Edge-Preserving Interpolation of Correspondences)

Sparse → dense: averages the affine transformations of support matches, weighted by geodesic distance computed on top of an edge map (SED algorithm), as sketched below.
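
A toy numpy sketch of the sparse-to-dense step. Note the simplification: it weights support matches by Euclidean distance, whereas EpicFlow proper uses geodesic distance over the SED edge map so support does not leak across object boundaries; all names here are mine:

```python
import numpy as np

def sparse_to_dense(matches, shape, k=25, sigma=10.0):
    """Interpolate sparse correspondences into a dense flow field.

    matches: (n, 4) array of sparse correspondences (x, y, dx, dy).
    shape:   (h, w) of the output flow field.
    For each pixel, fit an affine flow model to the k nearest support
    matches, weighted by distance, then evaluate it at the pixel.
    """
    h, w = shape
    pts, flows = matches[:, :2], matches[:, 2:]
    dense = np.zeros((h, w, 2))
    for y in range(h):
        for x in range(w):
            d = np.linalg.norm(pts - np.array([x, y]), axis=1)
            idx = np.argsort(d)[:k]                            # nearest support matches
            wgt = np.exp(-d[idx] / sigma)[:, None]             # distance-based weights
            A = np.hstack([pts[idx], np.ones((len(idx), 1))])  # affine model [x y 1] @ M
            M, *_ = np.linalg.lstsq(wgt * A, wgt * flows[idx], rcond=None)
            dense[y, x] = np.array([x, y, 1.0]) @ M            # evaluate local model
    return dense
```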

SLIDE 16

PatchBatch - CNN


Batch Normalization - addresses the "internal covariate shift" problem

Per pixel instead of per feature map
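
The distinction can be written out directly (a PyTorch sketch I am adding for illustration, not the paper's code): standard BatchNorm2d keeps one statistic per channel, averaged over batch and spatial dimensions, while the per-pixel variant keeps a separate statistic for every (channel, y, x) location, averaging over the batch dimension only:

```python
import torch

def batchnorm_per_feature_map(x, eps=1e-5):
    # x: (N, C, H, W). One statistic per channel: average over N, H, W.
    mu = x.mean(dim=(0, 2, 3), keepdim=True)
    var = x.var(dim=(0, 2, 3), keepdim=True, unbiased=False)
    return (x - mu) / torch.sqrt(var + eps)

def batchnorm_per_pixel(x, eps=1e-5):
    # One statistic per (channel, y, x) location: average over the batch only.
    mu = x.mean(dim=0, keepdim=True)
    var = x.var(dim=0, keepdim=True, unbiased=False)
    return (x - mu) / torch.sqrt(var + eps)
```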

SLIDE 17

PatchBatch - Loss

  • DrLIM - Dimensionality Reduction by Learning an Invariant Mapping (LeCun, 2006)

[Figure: loss variants on positive and negative pairs - original DrLIM (SPRING) vs. CENT and CENT+SD]
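
For reference, a sketch of the classic DrLIM contrastive ("spring") loss, plus one way to add a per-batch standard-deviation (SD) term. The exact CENT and CENT+SD formulations are in the PatchBatch paper; the SD term below is an illustrative assumption only:

```python
import torch

def drlim_loss(dist, is_match, margin=1.0):
    """DrLIM spring loss on L2 distances between descriptor pairs.

    dist: (batch,) L2 distances; is_match: (batch,) 1.0 for positive pairs.
    Positive pairs are pulled together, negatives pushed past the margin.
    """
    pos = is_match * dist.pow(2)
    neg = (1 - is_match) * torch.clamp(margin - dist, min=0).pow(2)
    return (pos + neg).mean()

def sd_term(dist, is_match):
    # Per-batch statistic (my illustration): penalize the spread of
    # positive-pair distances within the batch.
    return dist[is_match.bool()].std()
```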

SLIDE 18

PatchBatch - Training Method

Negative sample – random, 1-8 pixels from the true match. Data augmentation – flipping, rotating by 90°.

SLIDE 19

Results

SLIDE 20

Benchmarks

SLIDE 21

How Can We Improve the Results?

SLIDE 22

Architecture Modifications

SLIDE 23

PatchBatch - CNN


Increased Patch and Descriptor sizes

SLIDE 24

Hinge Loss with SD

  • Hinge loss instead of DrLIM
  • Trained on triplets - <A, B-match, B-non-match>
  • Keeping the additional SD component
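
A minimal sketch of such a triplet hinge loss (margin value is a placeholder of mine):

```python
import torch

def triplet_hinge_loss(anchor, pos, neg, margin=0.2):
    """Hinge loss on triplets <A, B-match, B-non-match>: push the non-match
    at least `margin` farther (in L2) from the anchor than the true match."""
    d_pos = torch.norm(anchor - pos, dim=1)   # distance to the true match
    d_neg = torch.norm(anchor - neg, dim=1)   # distance to the false match
    return torch.clamp(d_pos - d_neg + margin, min=0).mean()
```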

SLIDE 25

Failed Attempts

Data augmentation

  • Rotations (random ±α)
  • Colored patches (HSV or other decomposition)

Loss function

  • Foursome (A, B, A’, B’)

A, B – matching patches. A' – the patch from image A that is closest to B (B' defined symmetrically from image B). H(A,B) = max(0, m - L2(A,B))

L = L2(A,B) + I(B ≠ B') · H(A,B') + I(A ≠ A') · H(A',B), where I(·) is the indicator function
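
Spelled out as code (a sketch; the margin m and helper names are mine):

```python
import torch

def hinge(d, m=1.0):
    # H = max(0, m - L2), applied to a precomputed distance d
    return torch.clamp(m - d, min=0)

def foursome_loss(dA_B, dA_Bp, dAp_B, b_differs, a_differs):
    """L = L2(A,B) + I(B != B') * H(A,B') + I(A != A') * H(A',B).

    dA_B, dA_Bp, dAp_B: (batch,) L2 distances for (A,B), (A,B'), (A',B).
    b_differs, a_differs: (batch,) 0/1 indicators for B != B' and A != A'.
    """
    return (dA_B + b_differs * hinge(dA_Bp) + a_differs * hinge(dAp_B)).mean()
```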

Sample Mining

  • PatchMatch output
  • Patches distance
  • Descriptor distance

SLIDE 26

Optical Flow as a Multifaceted Problem

SLIDE 27

Success of Methods

The challenge of large displacements. KITTI 2015 average error:

  • Foreground – 26.43 %
  • Background – 11.43 %

Possible causes:

  • Matching algorithm
  • Descriptors quality

[Figures: MPI-Sintel results table; PatchBatch on KITTI 2012 - distance between true matches]

SLIDE 28

Descriptors Evaluation

Defined a quality measure of descriptors for matching: d_p is a distractor of pixel p if the L2 distance between d_p and p's descriptor is lower than the L2 distance between p's descriptor and that of its matching pixel. Counted up to 25 pixels from the examined pixel.
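
A numpy sketch of this count (names are mine; searching a window around the true match is my reading of "up to 25 pixels from the examined pixel"):

```python
import numpy as np

def count_distractors(desc_a, desc_b, flow_gt, p, radius=25):
    """Count the distractors of pixel p = (y, x) in image A.

    A pixel in image B is a distractor if its descriptor is closer (in L2)
    to p's descriptor than the descriptor of p's true matching pixel is.
    desc_a, desc_b: (h, w, d) descriptor maps; flow_gt: (h, w, 2) as (dx, dy).
    """
    y, x = p
    ty, tx = y + int(flow_gt[y, x, 1]), x + int(flow_gt[y, x, 0])  # true match in B
    d_true = np.linalg.norm(desc_a[y, x] - desc_b[ty, tx])
    h, w, _ = desc_b.shape
    count = 0
    for qy in range(max(0, ty - radius), min(h, ty + radius + 1)):
        for qx in range(max(0, tx - radius), min(w, tx + radius + 1)):
            if (qy, qx) != (ty, tx):
                if np.linalg.norm(desc_a[y, x] - desc_b[qy, qx]) < d_true:
                    count += 1
    return count
```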

SLIDE 29

Distractors by Displacement

The amount of distractors increases with the displacement range. Goal: improve results for large displacements without degrading results for other ranges.

SLIDE 30

Distractors by Displacement

The amount of distractors increases with the displacement range. Goal: improve results for large displacements without degrading results for other ranges.

Expert models:

  • Training only on sub-ranges
  • Improving results for large displacements is possible
  • Implies the need for different features for different patches

SLIDE 31

Learning with Varying Difficulty

SLIDE 32

Gradual Learning Methods

Dealing with varying difficulty.

Curriculum (Bengio et al., 2009):

  • Samples are pre-ordered
  • Curriculum by displacement
  • Curriculum by distance (of false sample)

Self-Paced (Kumar et al. 2010):

  • No need to pre-order
  • Sample hardness increases with time (by loss value)

Hard Mining (Simo-Serra et al. 2015):

  • Backpropagate only some ratio of the harder samples (sketched after this list)
  • Used for training local descriptors with triplets
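
A sketch of that mining step (the keep ratio is a placeholder): compute unreduced per-sample losses, keep only the largest fraction, and backpropagate through those alone:

```python
import torch

def hard_mining_step(per_sample_loss, keep_ratio=0.5):
    """Backpropagate only the hardest fraction of a batch.

    per_sample_loss: (batch,) unreduced losses, still attached to the graph.
    """
    k = max(1, int(keep_ratio * per_sample_loss.numel()))
    hard, _ = torch.topk(per_sample_loss, k)  # largest losses = hardest samples
    loss = hard.mean()
    loss.backward()                           # gradients flow only through `hard`
    return loss.detach()
```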

None of these methods improved matching over the baseline – why?

SLIDE 33

Need for Variant Extracting Strategies

Large motions are mostly correlated with more changes in appearance:

1. Background changes
2. Viewpoint changes → occluded parts
3. Distance and angle to the light source → illumination
4. Scale (when moving along the Z-axis)

SLIDE 34

Learning for Multiple Strategies and Varying Difficulty

SLIDE 35

Our Interleaving Learning Method

Goal: deal with multiple sub-tasks.

Learning ML models:

  • Mostly in random order (SGD)
  • Applying gradual methods can affect randomness

Interleaving Learning

  • Maintaining the random order of categories while adjusting the difficulty

Motivated by psychological research (Kornell & Bjork):

  • Massing vs. Interleaving
  • Experiments on classification tasks, sports, etc.


Learning Concepts and Categories – Kornell and Bjork (2008)

[Figure: classification of paintings to artists - massing vs. interleaving]

SLIDE 36

Same Class of Objects?

SLIDE 37

Interleaving Learning for Optical Flow

Controlling the negative sample to balance difficulty

[Figure: negative sampling - original method vs. interleaving]
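
One way to picture "controlling the negative sample" (a hypothetical sketch of mine, not the paper's exact scheme): rather than always drawing the false match 1-8 pixels from the true one, draw it from a distance range set by a difficulty knob, so negatives get closer (and harder) as training progresses while categories stay in random order:

```python
import numpy as np

def sample_negative(true_xy, difficulty, rng, d_far=8.0, d_near=1.0):
    """Draw a false-match location at a distance set by `difficulty`.

    difficulty in [0, 1]: 0 draws from the full 1-8 pixel range (easy),
    1 forces the negative right next to the true match (hard).
    """
    d_max = d_far - difficulty * (d_far - d_near)  # shrink range as training hardens
    d = rng.uniform(d_near, max(d_near + 1e-6, d_max))
    theta = rng.uniform(0.0, 2.0 * np.pi)          # random direction from the match
    return (true_xy[0] + d * np.cos(theta), true_xy[1] + d * np.sin(theta))
```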

SLIDE 38

Interleaving Learning for Optical Flow


  • Drawing the line from p improved matching results but did not affect the distractor measurement

(Due to PatchMatch initialization)

SLIDE 39

Self-Paced Curriculum Interleaving Learning (SPCI)

l_i - validation loss at epoch i; l_init - initial loss value (taken at epoch #5); m - total number of epochs

SLIDE 40

Experiments

SLIDE 41

Optical Flow

SLIDE 42

Results

SLIDE 43

Benchmarks - KITTI 2012

SLIDE 44

Benchmarks - KITTI 2015

SLIDE 45

SLIDE 46

SLIDE 47

SLIDE 48

Benchmarks - MPI-Sintel

SLIDE 49

SLIDE 50

SLIDE 51

Summary

SLIDE 52

Summary

  • Computing Optical Flow as a matching problem with a modular pipeline
  • Using a CNN to generate descriptors
  • Per-batch statistics (SD, batch normalization)
  • Interleaving Learning Method & SPCI

Addressing difficulty while maintaining a random order of the categories

  • One model to generate descriptors for both small and large displacements

SLIDE 53

THANK YOU!
