Lecture 2 -
Fei-Fei Li
Lecture 2: Object Detection
Professor Fei‐Fei Li Stanford Vision Lab
29‐Mar‐11 1
What we will learn today?
– Visual recognition overview
– Representation
– Learning
– Recognition
– Implicit Shape Model
mobile platforms
car
Building clock person car
clock
Object: Person, back; 1‐2 meters away
Object: Police car, side view, 4‐5 m away
Object: Building, 45º pose, 8‐10 meters away. It has bricks.
Surveillance
Assistive technologies
Security
Assistive driving
Computational photography
Michelangelo 1475-1564
image credit: J. Koenderink
Magritte, 1957
Kilmeny Niland. 1995
[Figure panels: Randomly | Multiple interest operators | Interest operators | Dense, uniformly]
Image credits: L. Fei‐Fei, E. Nowak, J. Sivic
Viola, Jones 2001
‐ BSW by Lampert et al. 08
‐ Also, Alexe et al. 10
Non‐max suppression: Canny ’86 … Desai et al., 2009
Category: car
Azimuth = 225º, Zenith = 30º
‐ It has metal
‐ it is glossy
‐ has wheels
– Learn an appearance codebook
– Learn a star‐topology structural model
– Exact correspondences →
– NN matching → Soft matching
– Feature location on obj. → Part location distribution
– Uniform votes → Probabilistic vote weighting
– Quantized Hough array → Continuous Hough space
[Figure: star‐topology model with parts x1 … x6]
Source: Bastian Leibe
Training image Visual codeword with displacement vectors
Source: Bastian Leibe
Segmentation, International Journal of Computer Vision, Vol. 77(1‐3), 2008.
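The training step above (codewords stored together with displacement vectors to the object center) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the tuple layout of the features, and the use of a single nearest codeword with squared Euclidean distance are all assumptions made for the example.

```python
from collections import defaultdict

def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to `descriptor` (squared Euclidean).
    Assumption: one nearest entry per feature, for simplicity."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(descriptor, codebook[i]))

def learn_occurrences(training_features, object_center, codebook):
    """For each codeword, store the displacement from every matched
    training feature to the object center."""
    occurrences = defaultdict(list)
    cx, cy = object_center
    for (x, y), descriptor in training_features:
        i = nearest_codeword(descriptor, codebook)
        occurrences[i].append((cx - x, cy - y))
    return dict(occurrences)

# Hypothetical toy data: two codewords, three training features.
codebook = [(0.0, 0.0), (1.0, 1.0)]
features = [((10, 20), (0.1, 0.0)), ((30, 20), (0.9, 1.0)), ((20, 40), (0.0, 0.2))]
occ = learn_occurrences(features, object_center=(20, 30), codebook=codebook)
print(occ)
```

Each stored displacement is what later lets a matched test feature vote for an object center.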
Test image
Source: Bastian Leibe
Segmentation, International Journal of Computer Vision, Vol. 77(1‐3), 2008.
Interest Points → Matched Codebook Entries → Probabilistic Voting → 3D Voting Space (continuous; x, y, s)
Image feature $f$ → interpretation (codebook match) $C_i$, with probability $p(C_i \mid f)$ → object position $(o_n, x)$, with probability $p(o_n, x \mid C_i, \ell)$
Probabilistic vote weighting (will be explained later in detail)
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
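The voting pipeline above could look roughly like the sketch below. The data layout is hypothetical: each feature is `(x, y, scale)`, each match is `(codeword_index, p_match)`, and each stored occurrence is `(dx, dy, p_occurrence)`. Scaling the displacement by the feature scale and voting for that same scale is an assumption of this example.

```python
def cast_votes(features, matches, occurrences):
    """Return weighted votes (x, y, s, weight) in the continuous voting space.

    features:    list of (x, y, scale) for each interest point
    matches:     matches[k] = list of (codeword_index, p_match) for feature k
    occurrences: occurrences[i] = list of (dx, dy, p_occurrence) for codeword i
    """
    votes = []
    for k, (fx, fy, fs) in enumerate(features):
        for i, p_match in matches[k]:
            for dx, dy, p_occ in occurrences[i]:
                # Displacement learned at unit scale, rescaled to the feature scale.
                votes.append((fx + fs * dx, fy + fs * dy, fs, p_match * p_occ))
    return votes

# Hypothetical toy example: one feature matched to two codewords.
features = [(100.0, 50.0, 2.0)]
matches = [[(0, 0.5), (1, 0.5)]]
occurrences = {0: [(10.0, 0.0, 1.0)], 1: [(0.0, 5.0, 0.5), (0.0, -5.0, 0.5)]}
votes = cast_votes(features, matches, occurrences)
print(votes)
```

Votes keep their exact continuous locations, so no quantization is introduced at this stage.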
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
Interest Points → Matched Codebook Entries → Probabilistic Voting → 3D Voting Space (continuous; x, y, s) → Backprojection → Backprojected Hypotheses
Original image
Source: Bastian Leibe
Original image Interest points
Source: Bastian Leibe
Matched patches
Source: Bastian Leibe
Source: Bastian Leibe
1st hypothesis
Source: K. Grauman & B. Leibe
2nd hypothesis
Source: Bastian Leibe
3rd hypothesis
Source: Bastian Leibe
Search window in the (x, y, s) voting space
Source: Bastian Leibe
– Binned accumulator array similar to the standard Generalized Hough Transform
– Quickly identify candidate maxima locations
– Refine locations by Mean‐Shift search only around those points
⇒ Avoid quantization effects by keeping exact vote locations.
⇒ Mean‐shift interpretation as kernel probability density estimation.
[Figure panels over (x, y, s): Scale votes → Binned accumulator array → Candidate maxima → Refinement (Mean‐Shift)]
Source: Bastian Leibe
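The coarse-to-fine search described above (binned accumulator for candidates, then mean-shift over the exact votes) can be sketched as follows. This is only an illustration under stated assumptions: votes are `(x, y, s, weight)` tuples, the kernel is a flat box window, and the names `candidate_maxima`, `mean_shift`, `bin_size`, and `radius` are made up for the example.

```python
import math
from collections import defaultdict

def candidate_maxima(votes, bin_size, min_weight):
    """Coarse step: accumulate vote weights in bins, return centers of strong bins."""
    bins = defaultdict(float)
    for x, y, s, w in votes:
        key = (math.floor(x / bin_size[0]),
               math.floor(y / bin_size[1]),
               math.floor(s / bin_size[2]))
        bins[key] += w
    return [tuple((k + 0.5) * b for k, b in zip(key, bin_size))
            for key, total in bins.items() if total >= min_weight]

def mean_shift(votes, start, radius, iterations=20):
    """Fine step: repeatedly move to the weighted mean of the exact votes
    that fall inside a box window centered on the current position."""
    pos = start
    for _ in range(iterations):
        inside = [(v[:3], v[3]) for v in votes
                  if all(abs(a - b) <= r for a, b, r in zip(v[:3], pos, radius))]
        total = sum(w for _, w in inside)
        if total == 0:
            break
        new_pos = tuple(sum(p[d] * w for p, w in inside) / total for d in range(3))
        if new_pos == pos:
            break
        pos = new_pos
    return pos

# Hypothetical votes: two agree on one object, one is an outlier.
votes = [(100.0, 50.0, 1.0, 0.5), (102.0, 52.0, 1.0, 0.5), (300.0, 10.0, 2.0, 0.2)]
cands = candidate_maxima(votes, bin_size=(10.0, 10.0, 0.5), min_weight=0.8)
refined = mean_shift(votes, cands[0], radius=(10.0, 10.0, 0.5))
print(cands, refined)
```

The bin center is only a starting point; the refined maximum comes from the unquantized vote positions.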
– Increase search window size with hypothesis scale
– Scale‐adaptive balloon density estimator
[Figure panels over (x, y, s): Scale votes → Binned accumulator array → Candidate maxima → Refinement (Mean‐Shift)]
Source: Bastian Leibe
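One way to picture the scale-adaptive window is the sketch below, where the window half-widths grow linearly with the hypothesis scale and the summed vote weight is divided by the window volume. The linear growth, the box-shaped kernel, and the name `balloon_density` are assumptions for illustration, not details given on the slide.

```python
def balloon_density(votes, point, base_radius):
    """Weighted vote density at `point` = (x, y, s), using a box window whose
    half-widths are `base_radius` multiplied by the hypothesis scale s (s > 0)."""
    x, y, s = point
    rx, ry, rs = (r * s for r in base_radius)
    volume = 8.0 * rx * ry * rs  # box with side lengths 2*rx, 2*ry, 2*rs
    inside = sum(w for vx, vy, vs, w in votes
                 if abs(vx - x) <= rx and abs(vy - y) <= ry and abs(vs - s) <= rs)
    return inside / volume

# Hypothetical votes: a tight cluster at scale 1, a wider cluster at scale 2.
votes = [(100.0, 50.0, 1.0, 0.5), (104.0, 50.0, 1.0, 0.5),
         (196.0, 50.0, 2.0, 0.5), (212.0, 50.0, 2.0, 0.5)]
base_radius = (5.0, 5.0, 0.25)
small = balloon_density(votes, (102.0, 50.0, 1.0), base_radius)
large = balloon_density(votes, (204.0, 50.0, 2.0), base_radius)
print(small, large)
```

With a fixed 5-pixel window the two scale-2 votes (8 pixels from the center) would be missed; the scaled window collects both.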
Source: Bastian Leibe
M.A. Peterson, “Object Recognition Processes Can and Do Operate Before Figure-Ground Organization”, Cur. Dir. in Psych. Sc., 3:105-111, 1994.
Interest Points → Matched Codebook Entries → Probabilistic Voting → 3D Voting Space (continuous; x, y, s) → Backprojection → Backprojected Hypotheses → Segmentation (per‐pixel probabilities p(figure))
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
– Desired property of algorithm! ⇒ robustness to occlusion
– Standard solution: reject based on bounding box overlap
⇒ Problematic ‐ may lead to missing detections!
⇒ Use segmentations to resolve ambiguities instead.
– Basic idea: each observed pixel can only be explained by (at most) one detection.
Source: Bastian Leibe
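The basic idea above (each pixel explained by at most one detection) can be illustrated with a simple greedy pass over segmented hypotheses. This is a simplified stand-in chosen for the example, not the verification procedure used in the cited work; the `min_new_fraction` threshold and the representation of a segmentation as a set of pixel coordinates are assumptions.

```python
def select_detections(hypotheses, min_new_fraction=0.5):
    """Greedy selection: visit hypotheses by decreasing score and accept one
    only if enough of its segmented pixels are not yet explained.

    hypotheses: list of (score, set_of_pixel_coordinates)
    Returns the indices of accepted hypotheses, in order of acceptance.
    """
    explained = set()
    accepted = []
    order = sorted(range(len(hypotheses)), key=lambda i: -hypotheses[i][0])
    for idx in order:
        _, pixels = hypotheses[idx]
        new_pixels = pixels - explained
        if pixels and len(new_pixels) / len(pixels) >= min_new_fraction:
            accepted.append(idx)
            explained |= new_pixels  # each pixel is assigned to one detection only
    return accepted

# Hypothetical 1-D "segmentations": B partially overlaps A, C lies inside A.
hyp_a = (0.9, {(x, 0) for x in range(0, 10)})
hyp_b = (0.8, {(x, 0) for x in range(8, 18)})
hyp_c = (0.5, {(x, 0) for x in range(2, 8)})
accepted = select_detections([hyp_a, hyp_b, hyp_c])
print(accepted)
```

Here the partially occluded hypothesis B survives because most of its pixels are new, which a strict bounding-box overlap test might not allow.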
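$$p(\mathbf{p}=\text{figure} \mid o_n, x) = \sum_{(f,\ell) \ni \mathbf{p}} \sum_i p(\mathbf{p}=\text{figure} \mid o_n, x, f, C_i, \ell)\; p(f, C_i, \ell \mid o_n, x)$$
First factor: segmentation information. Second factor: influence on the object hypothesis.
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]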
1. Voting
2. Mean‐shift search
3. Backprojection
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
Image feature $f$ at location $\ell$ → codebook matches $C_i$, with matching probability $p(C_i \mid f)$ → object location $(o_n, x)$, with occurrence distribution $p(o_n, x \mid C_i, \ell)$
1. Voting
2. Mean‐shift search
3. Backprojection
$$p(o_n, x \mid f, \ell) = \sum_i p(o_n, x \mid C_i, \ell)\; p(C_i \mid f)$$
$p(C_i \mid f)$: matching probability
$p(o_n, x \mid C_i, \ell)$: occurrence distribution
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
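[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
1. Voting
2. Mean‐shift search
3. Backprojection
$$p(o_n, x \mid f, \ell) = \sum_i p(o_n, x \mid C_i, \ell)\; p(C_i \mid f)$$
Matching probability, uniform over the activated codebook entries (those within distance $\theta$ of $f$):
$$p(C_i \mid f) = \frac{1}{|C^*|}, \quad \text{where } C^* = \{ C_i \mid d(C_i, f) \le \theta \}$$
Occurrence distribution, uniform over the stored occurrences of $C_i$:
$$p(o_n, x \mid C_i, \ell) = \frac{1}{\#\text{occurrences}(C_i)}$$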
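The two uniform weights above can be computed with a few lines of code. In this sketch the distance d is taken to be Euclidean, which is an assumption of the example, and the function names are illustrative only.

```python
import math

def matching_probabilities(descriptor, codebook, theta):
    """p(C_i | f) = 1 / |C*| for every activated entry, i.e. every codebook
    entry within distance theta of the feature descriptor."""
    active = [i for i, c in enumerate(codebook) if math.dist(descriptor, c) <= theta]
    return {i: 1.0 / len(active) for i in active}

def occurrence_probability(occurrences, i):
    """p(o_n, x | C_i, l) = 1 / #occurrences(C_i) for each stored occurrence."""
    return 1.0 / len(occurrences[i])

# Hypothetical toy data: three codewords, two of them within theta of f.
codebook = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
p_match = matching_probabilities((0.5, 0.0), codebook, theta=0.6)
occurrences = {0: [(10, 0)], 1: [(0, 5), (0, -5)]}
vote_weights = {i: p * occurrence_probability(occurrences, i) for i, p in p_match.items()}
print(p_match, vote_weights)
```

With these choices, the weights of all votes cast by one feature sum to 1 (here 0.5 × 1 plus 2 × 0.25).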
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
1. Voting
2. Mean‐shift search
3. Backprojection
$$p(o_n, x \mid f, \ell) = \sum_i p(o_n, x \mid C_i, \ell)\; p(C_i \mid f)$$
$$p(f, \ell \mid o_n, x) = \frac{p(o_n, x \mid f, \ell)\; p(f, \ell)}{p(o_n, x)} = \frac{\sum_i p(o_n, x \mid C_i, \ell)\; p(C_i \mid f)\; p(f, \ell)}{p(o_n, x)}$$
$p(o_n, x)$: prior for the object location
$p(f, \ell)$: indicator variable for sampled features
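The backprojection step can be sketched as collecting the votes that support a hypothesis and normalizing each feature's share. This is an illustration only: votes are assumed to carry their source feature id, support is decided with a box window, and the normalizing sum over supporting votes stands in for $p(o_n, x)$.

```python
def backproject(votes, hypothesis, radius):
    """Return {feature_id: contribution} for the votes that fall inside the
    window around `hypothesis`, normalized so the contributions sum to 1.

    votes: list of (feature_id, codeword_id, x, y, s, weight)
    """
    support = {}
    for fid, _cid, x, y, s, w in votes:
        if all(abs(a - b) <= r for a, b, r in zip((x, y, s), hypothesis, radius)):
            support[fid] = support.get(fid, 0.0) + w
    total = sum(support.values())
    if total == 0:
        return {}
    return {fid: w / total for fid, w in support.items()}

# Hypothetical votes: features 0 and 1 support the hypothesis, feature 2 does not.
votes = [(0, 3, 100.0, 50.0, 1.0, 0.5),
         (1, 3, 101.0, 50.0, 1.0, 0.25),
         (1, 7, 99.0, 51.0, 1.0, 0.25),
         (2, 4, 300.0, 10.0, 2.0, 1.0)]
contrib = backproject(votes, hypothesis=(100.0, 50.0, 1.0), radius=(5.0, 5.0, 0.5))
print(contrib)
```

Summing over codewords per feature mirrors the sum over $i$ in the numerator of the backprojection formula.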
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
1. Voting
2. Mean‐shift search
3. Backprojection
$$p(\mathbf{p}=\text{figure} \mid o_n, x) = \sum_{(f,\ell) \ni \mathbf{p}} \sum_i p(\mathbf{p}=\text{figure} \mid o_n, x, f, C_i, \ell)\; \frac{p(o_n, x \mid C_i, \ell)\; p(C_i \mid f)\; p(f, \ell)}{p(o_n, x)}$$
$p(\mathbf{p}=\text{figure} \mid o_n, x, f, C_i, \ell)$: Fig./Gnd. label for each occurrence
$\frac{p(o_n, x \mid C_i, \ell)\; p(C_i \mid f)\; p(f, \ell)}{p(o_n, x)} = p(f, C_i, \ell \mid o_n, x)$: influence on the object hypothesis
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
1. Voting
2. Mean‐shift search
3. Backprojection
Marginalize over all codebook entries matched to $f$:
$$p(\mathbf{p}=\text{figure} \mid o_n, x, f, \ell) = \sum_i p(\mathbf{p}=\text{figure} \mid o_n, x, f, C_i, \ell)\; p(C_i \mid o_n, x, f, \ell), \qquad p(C_i \mid o_n, x, f, \ell) = \frac{p(o_n, x \mid C_i, \ell)\; p(C_i \mid f)}{p(o_n, x \mid f, \ell)}$$
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
1. Voting
2. Mean‐shift search
3. Backprojection
Marginalize over all features containing pixel $\mathbf{p}$:
$$p(\mathbf{p}=\text{figure} \mid o_n, x) = \sum_{(f,\ell) \ni \mathbf{p}} p(\mathbf{p}=\text{figure} \mid o_n, x, f, \ell)\; p(f, \ell \mid o_n, x)$$
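The per-pixel marginalization above could be computed roughly as in the sketch below. The data layout is hypothetical: each backprojected occurrence is a pair of a stored figure/ground mask (pixel → figure label in [0, 1]) and a weight standing in for $p(f, C_i, \ell \mid o_n, x)$.

```python
def figure_ground(pixel, backprojected):
    """Accumulate p(figure) and p(ground) for one pixel over every
    backprojected occurrence whose patch contains that pixel.

    backprojected: list of (mask, weight), where mask maps pixel -> figure
    label in [0, 1] and weight is the occurrence's influence on the hypothesis.
    """
    p_fig = 0.0
    p_gnd = 0.0
    for mask, weight in backprojected:
        if pixel in mask:  # only patches that contain the pixel contribute
            p_fig += weight * mask[pixel]
            p_gnd += weight * (1.0 - mask[pixel])
    return p_fig, p_gnd

# Hypothetical example: two overlapping patches that disagree on pixel (1, 0).
backprojected = [({(0, 0): 1.0, (1, 0): 0.0}, 0.5),
                 ({(1, 0): 1.0, (2, 0): 1.0}, 0.25)]
result_overlap = figure_ground((1, 0), backprojected)
result_single = figure_ground((0, 0), backprojected)
result_outside = figure_ground((9, 9), backprojected)
print(result_overlap, result_single, result_outside)
```

Comparing the two accumulated values per pixel gives a figure/ground decision for the hypothesis.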
[Figure panels: Original image | Segmentation | p(figure) | p(ground)]
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
– 112 hand‐segmented images
Single‐frame recognition ‐ No temporal continuity used!
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
Office chairs Dining room chairs
Source: Bastian Leibe
Left camera, 1175 frames
Battery of 5 ISM detectors for different car views
[Leibe, Leonardis, Schiele, SLCV’04; IJCV’08]
Training Test Output
[Thomas, Ferrari, Tuytelaars, Leibe, Van Gool, 3DRR’07; RSS’08]
“Depth from a single image”
[Thomas, Ferrari, Tuytelaars, Leibe, Van Gool, 3DRR’07; RSS’08]
– Search for the silhouette that simultaneously optimizes the
– Enforces global consistency
– Caveat: again introduces reliance on a global model
[Leibe, Seemann, Schiele, CVPR’05]
– Recognize objects under image‐plane rotations
– Possibility to share parts between articulations.
– Rotation invariance should only be used when it’s really needed. (Also increases false positive detections)
[Mikolajczyk, Leibe, Schiele, CVPR’06]
[Mikolajczyk et al., CVPR’06]
– Including datasets & several pre‐trained detectors
– http://www.vision.ee.ethz.ch/bleibe/code
Source: Bastian Leibe
– Works well for many different object categories
– Flexible geometric model
– Learning from relatively few (50‐100) training examples
– Optimized for detection, good localization properties
– Needs supervised training data
– Only weak geometric constraints
body parts.
– Purely representative model
Source: Bastian Leibe