Lecture 2: Object Detection. Professor Fei-Fei Li, Stanford Vision Lab. PowerPoint presentation.



SLIDE 1

Lecture 2: Object Detection

Professor Fei-Fei Li, Stanford Vision Lab

29-Mar-11

SLIDE 2

What will we learn today?

  • Visual recognition overview
    – Representation
    – Learning
    – Recognition

  • Implicit Shape Model
    – Representation
    – Recognition
    – Experiments and results

SLIDE 3

What are the different visual recognition tasks?

SLIDE 4

Categorization vs. single-instance recognition

Does this image contain the Chicago Macy's building?

SLIDE 5

Categorization vs. single-instance recognition

Where is the crunchy nut?

SLIDE 6

Applications of computer vision

  • Recognizing landmarks on mobile platforms (+ GPS)

SLIDE 7

Classification:

Does this image contain a building? [yes/no]

Yes!

SLIDE 8

Classification:

Is this a beach?

SLIDE 9

Image Search

Organizing photo collections

SLIDE 10

Detection:

Does this image contain a car? [where?]

SLIDE 11

Detection:

Which objects does this image contain? [where?]

Labels in the image: building, clock, person, car

SLIDE 12

Detection:

Accurate localization (segmentation) [label: clock]

SLIDE 13

Detection: estimating object semantic & geometric attributes

  • Object: Person, back view; 1-2 meters away
  • Object: Police car, side view; 4-5 m away
  • Object: Building, 45º pose; 8-10 meters away; it has bricks

SLIDE 14

Applications of computer vision

  • Surveillance
  • Security
  • Assistive technologies
  • Assistive driving
  • Computational photography

SLIDE 15

Activity or Event recognition

What are these people doing?

SLIDE 16

Visual Recognition

  • Design algorithms that are capable of:
    – Classifying images or videos
    – Detecting and localizing objects
    – Estimating semantic and geometric attributes
    – Classifying human activities and events

Why is this challenging?

SLIDE 17

How many object categories are there?

SLIDE 18

Challenges: viewpoint variation

Michelangelo 1475-1564

SLIDE 19

Challenges: illumination

image credit: J. Koenderink

SLIDE 20

Challenges: scale

SLIDE 21

Challenges: deformation

SLIDE 22

Challenges: occlusion

Magritte, 1957

SLIDE 23

Challenges: background clutter

Kilmeny Niland. 1995

SLIDE 24

Challenges: intra‐class variation

SLIDE 25

Some early works on object categorization

  • Turk and Pentland, 1991
  • Belhumeur, Hespanha, & Kriegman, 1997
  • Schneiderman & Kanade, 2004
  • Viola and Jones, 2000
  • Amit and Geman, 1999
  • LeCun et al., 1998
  • Belongie and Malik, 2002
  • Agarwal and Roth, 2002
  • Poggio et al., 1993

SLIDE 26

Basic issues

  • Representation

– How to represent an object category; which classification scheme?

  • Learning

– How to learn the classifier, given training data

  • Recognition

– How the classifier is to be used on novel data

SLIDE 27

Representation

  • Building blocks: sampling strategies
    – Randomly
    – Interest operators
    – Multiple interest operators
    – Dense, uniformly

Image credits: L. Fei-Fei, E. Nowak, J. Sivic
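The sampling strategies above can be sketched in a few lines. This is a toy illustration; the function names, grid step, and sample count are ours, not from the lecture:

```python
import numpy as np

def dense_locations(h, w, step=16):
    """Patch centers on a uniform grid (dense, uniform sampling)."""
    ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
    return np.stack([ys.ravel(), xs.ravel()], axis=1)

def random_locations(h, w, n=200, seed=0):
    """Patch centers sampled uniformly at random."""
    rng = np.random.default_rng(seed)
    return np.stack([rng.integers(0, h, n), rng.integers(0, w, n)], axis=1)

locs = dense_locations(128, 128, step=16)
print(locs.shape)  # (64, 2): an 8x8 grid of patch centers
```

Interest-operator sampling would instead keep only locations where a saliency measure (e.g. a corner response) peaks; the trade-off is coverage vs. repeatability.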

SLIDE 28

Representation

– Appearance only or location and appearance

SLIDE 29

Representation

– Invariances

  • View point
  • Illumination
  • Occlusion
  • Scale
  • Deformation
  • Clutter
  • etc.

SLIDE 30

Basic issues

  • Representation

– How to represent an object category; which classification scheme?

  • Learning

– How to learn the classifier, given training data

  • Recognition

– How the classifier is to be used on novel data

SLIDE 31

Learning

  • Learning parameters: what are you maximizing?
    – Likelihood (generative) or performance on train/validation set (discriminative)

SLIDE 32

Learning

  • Learning parameters: what are you maximizing?
    – Likelihood (generative) or performance on train/validation set (discriminative)
  • Level of supervision
    – Manual segmentation; bounding box; image labels; noisy labels
  • Batch/incremental
  • Priors

SLIDE 33

Learning

  • Learning parameters: what are you maximizing?
    – Likelihood (generative) or performance on train/validation set (discriminative)
  • Level of supervision
    – Manual segmentation; bounding box; image labels; noisy labels
  • Batch/incremental
  • Training images:
    – Issue of overfitting
    – Negative images for discriminative methods
  • Priors

SLIDE 34

Basic issues

  • Representation

– How to represent an object category; which classification scheme?

  • Learning

– How to learn the classifier, given training data

  • Recognition

– How the classifier is to be used on novel data

SLIDE 35

Recognition

– Recognition task: classification, detection, etc.

SLIDE 36

Recognition

– Recognition task
– Search strategy: Sliding Windows

  • Simple
  • Computational complexity (x,y, S, θ, N of classes)

References: BSW by Lampert et al. '08; also Alexe et al. '10; Viola & Jones 2001

SLIDE 37

Recognition

– Recognition task
– Search strategy: Sliding Windows

  • Simple
  • Computational complexity (x,y, S, θ, N of classes)
  • Localization
  • Objects are not boxes

References: BSW by Lampert et al. '08; also Alexe et al. '10; Viola & Jones 2001

SLIDE 38

Recognition

– Recognition task
– Search strategy: Sliding Windows

  • Simple
  • Computational complexity (x,y, S, θ, N of classes)
  • Localization
  • Objects are not boxes
  • Prone to false positive

References: BSW by Lampert et al. '08; also Alexe et al. '10; Viola & Jones 2001. Non-max suppression: Canny '86 … Desai et al. 2009

SLIDE 39

Recognition

– Recognition task
– Search strategy
– Attributes

Example output: Category: car; Azimuth = 225º; Zenith = 30º
  • Savarese, 2007
  • Sun et al., 2009
  • Liebelt et al., '08, '10
  • Farhadi et al., '09

Example attributes: it has metal; it is glossy; it has wheels
  • Farhadi et al., '09
  • Lampert et al., '09
  • Wang & Forsyth, '09

SLIDE 40

Recognition

– Recognition task
– Search strategy
– Attributes
– Context

Semantic context:
  • Torralba et al., '03
  • Rabinovich et al., '07
  • Gupta & Davis, '08
  • Heitz & Koller, '08
  • L-J Li et al., '08
  • Yao & Fei-Fei, '10

Geometric context:
  • Hoiem et al., '06
  • Gould et al., '09
  • Bao, Sun, Savarese, '10

SLIDE 41

Basic issues

  • Representation

– How to represent an object category; which classification scheme?

  • Learning

– How to learn the classifier, given training data

  • Recognition

– How the classifier is to be used on novel data

SLIDE 42

What will we learn today?

  • Visual recognition overview
    – Representation
    – Learning
    – Recognition

  • Implicit Shape Model
    – Representation
    – Recognition
    – Experiments and results

SLIDE 43

Implicit Shape Model (ISM)

  • Basic ideas
    – Learn an appearance codebook
    – Learn a star-topology structural model
      • Features are considered independent given the object center

  • Algorithm: probabilistic Generalized Hough Transform
    – Exact correspondences → probabilistic match to object part
    – NN matching → soft matching
    – Feature location on object → part location distribution
    – Uniform votes → probabilistic vote weighting
    – Quantized Hough array → continuous Hough space

Source: Bastian Leibe

SLIDE 44

Implicit Shape Model: Basic Idea

  • A visual vocabulary is used to index votes for object position (a visual word = "part").

Training image → visual codeword with displacement vectors

Source: Bastian Leibe

  • B. Leibe, A. Leonardis, and B. Schiele, Robust Object Detection with Interleaved Categorization and Segmentation, International Journal of Computer Vision, Vol. 77(1-3), 2008.

SLIDE 45

Implicit Shape Model: Basic Idea

  • Objects are detected as consistent configurations of the observed parts (visual words).

Test image

Source: Bastian Leibe

  • B. Leibe, A. Leonardis, and B. Schiele, Robust Object Detection with Interleaved Categorization and Segmentation, International Journal of Computer Vision, Vol. 77(1-3), 2008.

SLIDE 46

Implicit Shape Model ‐ Recognition

Pipeline: Interest Points → Matched Codebook Entries → Probabilistic Voting → 3D Voting Space (continuous in x, y, s) → Object Position

For an image feature f at location ℓ, with codebook interpretations C_i, votes are cast for object positions x:

    p(o_n, x | f, ℓ) = Σ_i p(o_n, x | C_i, ℓ) · p(C_i | f)

with matching probability p(C_i | f) and occurrence distribution p(o_n, x | C_i, ℓ). Probabilistic vote weighting will be explained later in detail.

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

SLIDE 47

Implicit Shape Model - Recognition

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

Pipeline: Interest Points → Matched Codebook Entries → Probabilistic Voting → 3D Voting Space (continuous in x, y, s) → Maxima → Backprojection → Backprojected Hypotheses

SLIDE 48

Original image

Example: Results on Cows

Source: Bastian Leibe

SLIDE 49

Original image Interest points

Example: Results on Cows

Source: Bastian Leibe

SLIDE 50

Original image Interest points

Matched patches

Example: Results on Cows

Source: Bastian Leibe

SLIDE 51

Example: Results on Cows

Probabilistic votes

Source: Bastian Leibe

SLIDE 52

1st hypothesis

Example: Results on Cows

Source: K. Grauman & B. Leibe

SLIDE 53

2nd hypothesis

Example: Results on Cows

Source: Bastian Leibe

SLIDE 54

Example: Results on Cows

3rd hypothesis

Source: Bastian Leibe

SLIDE 55

Scale Invariant Voting

  • Scale-invariant feature selection
    – Scale-invariant interest points
    – Rescale extracted patches
    – Match to constant-size codebook
  • Generate scale votes
    – Scale as 3rd dimension in voting space
    – Search for maxima in the 3D voting space (x, y, s) with a search window

Source: Bastian Leibe

SLIDE 56

Scale Voting: Efficient Computation

  • Continuous Generalized Hough Transform
    – Binned accumulator array similar to the standard Generalized Hough Transform
    – Quickly identify candidate maxima locations
    – Refine locations by Mean-Shift search only around those points
  ⇒ Avoid quantization effects by keeping exact vote locations.
  ⇒ Mean-shift interpretation as kernel probability density estimation.
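The refinement step can be illustrated with a toy 1-D mean-shift over exact vote locations. This is a sketch under simplifying assumptions: a flat kernel (simple radius cutoff) instead of a smooth kernel, one dimension instead of (x, y, s), and invented vote data:

```python
import numpy as np

def mean_shift_1d(votes, start, bandwidth=1.0, iters=20):
    """Refine a candidate maximum by repeatedly moving to the mean of all
    votes within `bandwidth` of the current estimate. The votes keep their
    exact (unquantized) locations, as the slide emphasizes."""
    x = float(start)
    for _ in range(iters):
        near = votes[np.abs(votes - x) <= bandwidth]
        if near.size == 0:
            break
        x_new = near.mean()
        if abs(x_new - x) < 1e-6:   # converged
            break
        x = x_new
    return x

votes = np.array([2.1, 2.2, 1.9, 2.0, 7.5])   # a cluster near 2, one outlier
# A coarse binned accumulator would flag a candidate maximum near x = 2;
# mean-shift then refines it using the exact vote locations.
refined = mean_shift_1d(votes, start=2.5, bandwidth=1.0)
print(round(refined, 2))  # 2.05, the mean of the four clustered votes
```

The same iteration in 3D, with the window size growing with the scale coordinate, gives the scale-adaptive ("balloon") variant mentioned on the next slide.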

[Figure: scale votes (x, y, s) → binned accumulator array → candidate maxima → refinement (Mean-Shift). Source: Bastian Leibe]

SLIDE 57

Scale Voting: Efficient Computation

  • Scale-adaptive Mean-Shift search for refinement
    – Increase search window size with hypothesis scale
    – Scale-adaptive balloon density estimator


SLIDE 58

Detection Results

  • Qualitative performance
    – Recognizes different kinds of objects
    – Robust to clutter, occlusion, noise, low contrast

Source: Bastian Leibe

SLIDE 59

Figure-Ground Segregation

  • What happens first: segmentation or recognition?
  • Problem extensively studied in psychophysics
  • Experiments with ambiguous figure-ground stimuli
  • Results:
    – Evidence that object recognition can and does operate before figure-ground organization
    – Interpreted as the Gestalt cue "familiarity"

M.A. Peterson, "Object Recognition Processes Can and Do Operate Before Figure-Ground Organization", Cur. Dir. in Psych. Sc., 3:105-111, 1994.

SLIDE 60

ISM - Top-Down Segmentation

Pipeline: Interest Points → Matched Codebook Entries → Probabilistic Voting → 3D Voting Space (continuous in x, y, s) → Maxima → Backprojection → Backprojected Hypotheses → Segmentation (per-pixel p(figure) probabilities)

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

SLIDE 61

Top-Down Segmentation: Motivation

  • Secondary hypotheses ("mixtures of cars/cows/etc.")
    – Desired property of the algorithm! ⇒ robustness to occlusion
    – Standard solution: reject based on bounding box overlap
      ⇒ Problematic: may lead to missing detections!
      ⇒ Use segmentations to resolve ambiguities instead.
    – Basic idea: each observed pixel can only be explained by (at most) one detection.

Source: Bastian Leibe

SLIDE 62

Top-Down Segmentation: Motivation

Source: Bastian Leibe

SLIDE 63

Segmentation: Probabilistic Formulation

  • Influence of a patch (f, ℓ) on the object hypothesis (vote weight):

    p(f, ℓ | o_n, x) = p(o_n, x | f, ℓ) · p(f, ℓ) / p(o_n, x)

  • Backprojection to features f and pixels p yields the segmentation information:

    p(p = figure | o_n, x, f, ℓ)

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

SLIDE 64

Derivation: ISM Recognition

  • Algorithm stages:
    1. Voting
    2. Mean-shift search
    3. Backprojection

  • Vote weights: contribution of a single image feature f at location ℓ, through its codebook matches C_i, to object location (o_n, x):

    p(o_n, x | f, ℓ) = Σ_i p(o_n, x | C_i, ℓ) · p(C_i | f)

    with matching probability p(C_i | f) and occurrence distribution p(o_n, x | C_i, ℓ).

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

SLIDE 65

Derivation: ISM Recognition

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

  • Algorithm stages:
    1. Voting
    2. Mean-shift search
    3. Backprojection

  • Vote weights: contribution of a single feature f
  • Probability that object o_n occurs at location x, given feature f at location ℓ:

    p(o_n, x | f, ℓ) = Σ_i p(o_n, x | C_i, ℓ) · p(C_i | f)

    (occurrence distribution × matching probability, summed over the matched codebook entries C_i)

SLIDE 66

Derivation: ISM Recognition

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

  • Algorithm stages:
    1. Voting
    2. Mean-shift search
    3. Backprojection

  • Vote weights: contribution of a single feature f
  • Probability that object o_n occurs at location x, given feature f at location ℓ:

    p(o_n, x | f, ℓ) = Σ_i p(o_n, x | C_i, ℓ) · p(C_i | f)

  • How to measure those probabilities?

    p(C_i | f) = 1 / |C*|,  where C* = {C : d(C, f) ≤ θ}
    (uniform over the activated codebook entries)

    p(o_n, x | C_i, ℓ) = 1 / #occurrences(C_i)
    (uniform over the stored occurrences of C_i)
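The two uniform distributions on this slide translate almost directly into code. The sketch below uses invented toy data (codebook vectors, occurrence lists, distance threshold); only the two 1/|C*| and 1/#occurrences rules come from the slide:

```python
import numpy as np

def vote_weights(f, codebook, occurrences, theta=0.5):
    """Votes cast by one feature f: uniform matching probability over the
    codebook entries within distance theta of f, and uniform occurrence
    probability over each matched entry's stored object-center offsets."""
    dists = {i: np.linalg.norm(f - c) for i, c in codebook.items()}
    activated = [i for i, d in dists.items() if d <= theta]  # C*
    if not activated:
        return {}
    p_match = 1.0 / len(activated)            # p(C_i | f) = 1 / |C*|
    weights = {}
    for i in activated:
        occ = occurrences[i]
        for x in occ:                          # p(o_n, x | C_i, l) = 1/#occ
            weights[(i, x)] = p_match / len(occ)
    return weights

# Toy codebook (2-D descriptors) and stored occurrences (object centers).
codebook = {0: np.array([0.0, 0.0]), 1: np.array([1.0, 0.0]), 2: np.array([5.0, 5.0])}
occurrences = {0: [(10, 10), (12, 10)], 1: [(11, 9)], 2: [(0, 0)]}
w = vote_weights(np.array([0.4, 0.0]), codebook, occurrences, theta=0.5)
print(sum(w.values()))  # 1.0 -- all votes from one feature sum to 1
```

Because both distributions are normalized, every feature contributes exactly one unit of probability mass to the voting space, regardless of how many entries it matches.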

SLIDE 67

Derivation: ISM Recognition

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

  • Algorithm stages:
    1. Voting
    2. Mean-shift search
    3. Backprojection

  • Vote weights: contribution of a single feature f
  • Probability that object o_n occurs at location x, given (f, ℓ)
  • Likelihood of the observed features given the object hypothesis:

    p(f, ℓ | o_n, x) = p(o_n, x | f, ℓ) · p(f, ℓ) / p(o_n, x)
                     = [Σ_i p(o_n, x | C_i, ℓ) · p(C_i | f)] · p(f, ℓ) / p(o_n, x)

    p(o_n, x): prior for the object location
    p(f, ℓ): indicator variable for the sampled features

SLIDE 68

Derivation: ISM Recognition

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

  • Algorithm stages:
    1. Voting
    2. Mean-shift search
    3. Backprojection

  • Vote weights: contribution of a single feature f:

    p(f, ℓ | o_n, x) = [Σ_i p(o_n, x | C_i, ℓ) · p(C_i | f)] · p(f, ℓ) / p(o_n, x)

SLIDE 69

Derivation: ISM Recognition

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

  • Algorithm stages:
    1. Voting
    2. Mean-shift search
    3. Backprojection

  • Vote weights: contribution of a single feature f:

    p(f, ℓ | o_n, x) = [Σ_i p(o_n, x | C_i, ℓ) · p(C_i | f)] · p(f, ℓ) / p(o_n, x)

SLIDE 70

Derivation: ISM Recognition

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

  • Algorithm stages:
    1. Voting
    2. Mean-shift search
    3. Backprojection

  • Vote weights: contribution of a single feature f:

    p(f, ℓ | o_n, x) = [Σ_i p(o_n, x | C_i, ℓ) · p(C_i | f)] · p(f, ℓ) / p(o_n, x)

(3D voting space: x, y, s)

SLIDE 71

Derivation: ISM Top-Down Segmentation

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

  • Algorithm stages:
    1. Voting
    2. Mean-shift search
    3. Backprojection

  • Vote weights: contribution of a single feature f
  • Figure-ground backprojection: combine the fig./gnd. label stored for each occurrence with that occurrence's influence on the object hypothesis:

    p(p = fig. | o_n, x, C_i, ℓ) · p(o_n, x | C_i, ℓ) · p(C_i | f) · p(f, ℓ) / p(o_n, x)

SLIDE 72

Derivation: ISM Top-Down Segmentation

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

  • Algorithm stages:
    1. Voting
    2. Mean-shift search
    3. Backprojection

  • Vote weights: contribution of a single feature f
  • Figure-ground backprojection, marginalizing over all codebook entries matched to f:

    p(p = figure | o_n, x, f, ℓ) = Σ_i p(p = fig. | o_n, x, C_i, ℓ) · p(o_n, x | C_i, ℓ) · p(C_i | f) · p(f, ℓ) / p(o_n, x)

SLIDE 73

Derivation: ISM Top-Down Segmentation

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

  • Algorithm stages:
    1. Voting
    2. Mean-shift search
    3. Backprojection

  • Vote weights: contribution of a single feature f
  • Figure-ground backprojection, marginalizing over all features (f, ℓ) containing pixel p:

    p(p = figure | o_n, x) = Σ_{(f,ℓ) ∋ p} Σ_i p(p = fig. | o_n, x, C_i, ℓ) · p(o_n, x | C_i, ℓ) · p(C_i | f) · p(f, ℓ) / p(o_n, x)

SLIDE 74

Top-Down Segmentation Algorithm

  • This may sound quite complicated, but it boils down to a very simple algorithm…

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]
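One way to read that simple algorithm as a sketch: for each patch that contributed to the hypothesis, paste its stored figure/ground mask into the image, weighted by that patch's vote weight, then normalize per pixel. The masks, weights, and image size below are toy values, and this is our simplified reading, not the paper's exact implementation:

```python
import numpy as np

def backproject_segmentation(shape, contributions):
    """Accumulate per-pixel p(figure) from contributing patches.

    contributions: list of ((y, x), mask, weight) where mask is the stored
    figure/ground label patch and weight is the patch's vote weight for
    the hypothesis being backprojected."""
    fig = np.zeros(shape)    # weighted figure evidence
    wsum = np.zeros(shape)   # total weight touching each pixel
    for (y, x), mask, weight in contributions:
        h, w = mask.shape
        fig[y:y + h, x:x + w] += weight * mask
        wsum[y:y + h, x:x + w] += weight
    with np.errstate(divide="ignore", invalid="ignore"):
        p_figure = np.where(wsum > 0, fig / wsum, 0.0)
    return p_figure

# Two overlapping toy patches voting on a 2x2 image with different masks.
mask = np.array([[1.0, 0.0], [1.0, 1.0]])
contribs = [((0, 0), mask, 0.8), ((0, 0), 1 - mask, 0.2)]
p = backproject_segmentation((2, 2), contribs)
print(p[0, 0], p[0, 1])  # 0.8 0.2
```

Pixels where high-weight patches agree on "figure" get confidence near 1, which is exactly the per-pixel confidence map the next slide interprets.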

SLIDE 75

Segmentation

  • Interpretation of the p(figure) map:
    – per-pixel confidence in the object hypothesis
    – use for hypothesis verification

Original image → segmentation → p(figure) / p(ground)

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

SLIDE 76

Example Results: Motorbikes

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

SLIDE 77

Example Results: Cows

  • Training
    – 112 hand-segmented images
  • Results on novel sequences:
    – Single-frame recognition; no temporal continuity used!

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

SLIDE 78

Example Results: Chairs

Office chairs; dining room chairs

Source: Bastian Leibe

SLIDE 79

Detections Using Ground Plane Constraints

  • Left camera, 1175 frames
  • Battery of 5 ISM detectors for different car views

[Leibe, Leonardis, Schiele, SLCV'04; IJCV'08]

SLIDE 80

Inferring Other Information: Part Labels (1)

Training → Test → Output

[Thomas, Ferrari, Tuytelaars, Leibe, Van Gool, 3DRR'07; RSS'08]

SLIDE 81

Inferring Other Information: Part Labels (2)

[Thomas, Ferrari, Tuytelaars, Leibe, Van Gool, 3DRR'07; RSS'08]

SLIDE 82

Inferring Other Information: Depth Maps

“Depth from a single image”

[Thomas, Ferrari, Tuytelaars, Leibe, Van Gool, 3DRR'07; RSS'08]

SLIDE 83

Extension: Estimating Articulation

  • Try to fit a silhouette to the detected person
  • Basic idea
    – Search for the silhouette that simultaneously optimizes:
      • the Chamfer match to the distance-transformed edge image
      • the overlap with the top-down segmentation
    – Enforces global consistency
    – Caveat: re-introduces reliance on a global model

[Leibe, Seemann, Schiele, CVPR'05]

SLIDE 84

Extension: Rotation-Invariant Detection

  • Polar instead of Cartesian voting scheme
  • Benefits:
    – Recognize objects under image-plane rotations
    – Possibility to share parts between articulations
  • Caveats:
    – Rotation invariance should only be used when it's really needed (it also increases false positive detections)

[Mikolajczyk, Leibe, Schiele, CVPR'06]

SLIDE 85

Sometimes, Rotation Invariance Is Needed…

[Mikolajczyk et al., CVPR'06]

SLIDE 86

You Can Try It At Home…

  • Linux binaries available
    – Including datasets & several pre-trained detectors
    – http://www.vision.ee.ethz.ch/bleibe/code

Source: Bastian Leibe

SLIDE 87

Discussion: Implicit Shape Model

  • Pros:
    – Works well for many different object categories
      • Both rigid and articulated objects
    – Flexible geometric model
      • Can recombine parts seen on different training examples
    – Learning from relatively few (50-100) training examples
    – Optimized for detection; good localization properties

  • Cons:
    – Needs supervised training data
      • Object bounding boxes for detection
      • Reference segmentations for top-down segmentation
    – Only weak geometric constraints
      • Result segmentations may contain superfluous body parts
    – Purely representative model
      • No discriminative learning

Source: Bastian Leibe

SLIDE 88

What we have learned today

  • Visual recognition overview
    – Representation
    – Learning
    – Recognition

  • Implicit Shape Model
    – Representation
    – Recognition
    – Experiments and results