SLIDE 1

CS 378 Computer Vision Oct 22, 2009 Outline: Stereopsis and calibration

  • I. Computing correspondences for stereo
    • A. Epipolar geometry gives a hard geometric constraint, but only reduces the match for a point to a line. Other “soft” constraints are needed to assign corresponding points:
      ‐ Similarity: how well do the pixels match in a local region around the point?
        • Normalized cross correlation
        • Dense vs. sparse correspondences
        • Effect of window size
      ‐ Uniqueness: up to one match for every point
      ‐ Disparity gradient: smooth surfaces should lead to smooth disparities
      ‐ Ordering: points on the same surface are imaged in the same order
        • Enforcing the ordering constraint with scanline stereo + dynamic programming
      (Aside from point‐based matching or order‐constrained DP, graph cuts can be used to minimize an energy function expressing a preference for well‐matched local windows and smooth disparity labels.)
      Sources of error when computing correspondences for stereo
    • B. Examples of applications leveraging stereo
      ‐ Segmentation with depth and spatial gradients
      ‐ Body tracking with fitting and depth
      ‐ Camera + microphone stereo system
      ‐ Virtual viewpoint video
  • II. Camera calibration
    • A. Estimating the projection matrix
      ‐ Intrinsic and extrinsic parameters; we can relate them to image pixel coordinates and world point coordinates via perspective projection.
      ‐ Use a calibration object to collect correspondences.
      ‐ Set up equations to solve for the projection matrix given the correspondences.
    • B. Weak calibration
      ‐ When all we have are corresponding image points (and no camera parameters), we can solve for the fundamental matrix. This gives the epipolar constraint but, unlike the essential matrix, does not require knowing the camera parameters.
      ‐ Stereo pipeline with weak calibration: must estimate both the fundamental matrix and the correspondences. Start from correspondences, estimate geometry, refine.
SLIDE 2

10/22/2009 1

Stereo matching and calibration

Thursday, Oct 22
Kristen Grauman, UT Austin

Today

  • Correspondences, matching for stereo

– A few stereo applications

  • Camera calibration
SLIDE 3

Last time: Estimating depth with stereo

  • Stereo: shape from “motion” between two views
  • We need to consider:
    – Info on camera pose (“calibration”)
    – Image point correspondences

[Figure: two views of a scene point, showing the optical centers and image planes]

Last time: Epipolar constraint

  • Potential matches for p have to lie on the corresponding epipolar line l’.
  • Potential matches for p’ have to lie on the corresponding epipolar line l.

Slide credit: M. Pollefeys

SLIDE 4

An audio camera & epipolar geometry

Spherical microphone array

Adam O' Donovan, Ramani Duraiswami and Jan Neumann Microphone Arrays as Generalized Cameras for Integrated Audio Visual Processing, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Minneapolis, 2007


SLIDE 5

Correspondence problem

Multiple match hypotheses for p satisfy the epipolar constraint, but which is correct?

Figure from Gee & Cipolla 1999

Correspondence problem

  • Beyond the hard constraint of epipolar geometry, there are “soft” constraints to help identify corresponding points
    – Similarity
    – Uniqueness
    – Ordering
    – Disparity gradient
  • To find matches in the image pair, we will assume
    – Most scene points are visible from both views
    – Image regions for the matches are similar in appearance

SLIDE 6

Correspondence problem

Parallel camera example: epipolar lines are corresponding image scanlines

Source: Andrew Zisserman

Correspondence problem

Intensity profiles

Source: Andrew Zisserman

SLIDE 7

Correspondence problem

Neighborhoods of corresponding points are similar in intensity patterns.

Source: Andrew Zisserman

Normalized cross correlation

Source: Andrew Zisserman
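Normalized cross correlation scores how well two windows match while staying invariant to affine intensity changes (gain and offset), which raw SSD is not. A minimal NumPy sketch; the function name and the zero-variance fallback for textureless windows are our own choices:

```python
import numpy as np

def ncc(patch1, patch2):
    """Normalized cross correlation between two equal-size windows."""
    a = patch1.astype(float) - patch1.mean()
    b = patch2.astype(float) - patch2.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    if denom == 0:            # textureless window: correlation is undefined
        return 0.0
    return float((a * b).sum() / denom)
```

NCC is 1 for a perfect match, and stays 1 even if one window is brightened or contrast-stretched, which is why it is preferred over SSD when the two cameras have different exposures.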

SLIDE 8

Correlation‐based window matching

Source: Andrew Zisserman

Dense correspondence search

For each epipolar line
  For each pixel / window in the left image
    • compare with every pixel / window on the same epipolar line in the right image
    • pick the position with minimum match cost (e.g., SSD, correlation)

Adapted from Li Zhang
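The search loop above can be written out directly. A brute-force sketch for rectified grayscale images, using SSD as the match cost (one of the costs the slide names); all parameter names and defaults are illustrative:

```python
import numpy as np

def scanline_disparity(left, right, window=5, max_disp=16):
    """Brute-force dense disparity for rectified images: for each left
    pixel, slide a window along the same row of the right image and
    keep the shift with the smallest SSD cost."""
    h, w = left.shape
    half = window // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y-half:y+half+1, x-half:x+half+1].astype(float)
            best, best_d = np.inf, 0
            for d in range(0, min(max_disp, x - half) + 1):
                cand = right[y-half:y+half+1, x-d-half:x-d+half+1].astype(float)
                cost = ((patch - cand) ** 2).sum()   # SSD match cost
                if cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

Real implementations vectorize over disparities and add checks such as left-right consistency; this sketch only illustrates the search structure.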

SLIDE 9

Textureless regions

Textureless regions are non‐distinct; high ambiguity for matches.

Source: Andrew Zisserman

Effect of window size

Source: Andrew Zisserman

SLIDE 10

Effect of window size

[Figure: disparity maps computed with window sizes W = 3 and W = 20]

Figures from Li Zhang

Want window large enough to have sufficient intensity variation, yet small enough to contain only pixels with about the same disparity.

Foreshortening effects

Source: Andrew Zisserman

SLIDE 11

Occlusion

Slide credit: David Kriegman

Sparse correspondence search

  • Restrict search to a sparse set of detected features
  • Rather than pixel values (or lists of pixel values), use a feature descriptor and an associated feature distance
  • Still narrow the search further using epipolar geometry
SLIDE 12

Correspondence problem

  • Beyond the hard constraint of epipolar geometry, there are “soft” constraints to help identify corresponding points
    – Similarity
    – Uniqueness
    – Disparity gradient
    – Ordering

Uniqueness constraint

  • Up to one match in the right image for every point in the left image

Figure from Gee & Cipolla 1999

SLIDE 13

Disparity gradient constraint

  • Assume a piecewise continuous surface, so we want disparity estimates to be locally smooth

Figure from Gee & Cipolla 1999

Ordering constraint

  • Points on the same surface (opaque object) will be imaged in the same order in both views

Figure from Gee & Cipolla 1999

SLIDE 14

Ordering constraint

  • Won’t always hold, e.g. consider a transparent object, or an occluding surface

Figures from Forsyth & Ponce

Scanline stereo

  • Try to coherently match pixels on the entire scanline
  • Different scanlines are still optimized independently

[Figure: left and right image intensity profiles along one scanline]

SLIDE 15

“Shortest paths” for scan-line stereo

[Figure: disparity-space grid for one scanline, with axes for the left image I and right image I′; a shortest path from node s to node t assigns each step a match, a left occlusion, or a right occlusion]

Can be implemented with dynamic programming [Ohta & Kanade ’85, Cox et al. ’96]

Slide credit: Y. Boykov
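The shortest-path formulation can be sketched with a small dynamic program. This uses a simplified cost model (squared intensity difference for matches, a constant per-pixel occlusion penalty), not the exact formulations of Ohta & Kanade or Cox et al.:

```python
import numpy as np

def dp_scanline(left_row, right_row, occ_cost=0.5):
    """Match two scanlines with dynamic programming: each left pixel is
    either matched to a right pixel (cost = squared intensity
    difference) or skipped as an occlusion (cost = occ_cost)."""
    n, m = len(left_row), len(right_row)
    C = np.zeros((n + 1, m + 1))
    C[0, :] = occ_cost * np.arange(m + 1)        # leading right occlusions
    C[:, 0] = occ_cost * np.arange(n + 1)        # leading left occlusions
    back = np.zeros((n + 1, m + 1), dtype=int)   # 0=match, 1=left occ, 2=right occ
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            costs = [C[i-1, j-1] + (left_row[i-1] - right_row[j-1]) ** 2,
                     C[i-1, j] + occ_cost,
                     C[i, j-1] + occ_cost]
            back[i, j] = int(np.argmin(costs))
            C[i, j] = costs[back[i, j]]
    disp = {}                                    # left pixel index -> disparity
    i, j = n, m
    while i > 0 and j > 0:                       # backtrack the optimal path
        if back[i, j] == 0:
            disp[i - 1] = (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif back[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return disp
```

Occluded pixels simply get no entry in the returned map, which mirrors the left/right occlusion moves in the grid figure.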

Coherent stereo on 2D grid

  • Scanline stereo generates streaking artifacts
  • Can’t use dynamic programming to find spatially coherent disparities / correspondences on a 2D grid

SLIDE 16

  • Example depth maps (Pentagon image pair)

Stereo matching as energy minimization

Given images I1, I2, find the disparity map D minimizing

E(I1, I2, D) = α E_data(I1, I2, D) + β E_smooth(D)

where

E_data(I1, I2, D) = Σ_i ( W1(i) − W2(i + D(i)) )²

E_smooth(D) = Σ_{neighbors i, j} ρ( D(i) − D(j) )

and W1(i), W2(i + D(i)) are corresponding windows in the two images.

  • Energy functions of this form can be minimized using graph cuts
  • Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001

Source: Steve Seitz
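To make the energy concrete, here is a sketch that just evaluates E for a candidate disparity map, using single-pixel windows and a truncated-linear ρ (our simplifying choices). Minimizing such an energy is what the graph-cut construction is for; that part is not shown:

```python
import numpy as np

def stereo_energy(I1, I2, D, alpha=1.0, beta=1.0):
    """Evaluate E = alpha*E_data + beta*E_smooth for a candidate
    disparity map D on rectified images, with single-pixel windows and
    a truncated-linear smoothness penalty rho."""
    rho = lambda d: min(abs(d), 2)               # robust smoothness cost
    h, w = I1.shape
    e_data = 0.0
    e_smooth = 0.0
    for y in range(h):
        for x in range(w):
            xr = x + int(D[y, x])                # W2(i + D(i)) on the slide
            if 0 <= xr < w:
                e_data += (float(I1[y, x]) - float(I2[y, xr])) ** 2
            if x + 1 < w:                        # right neighbor
                e_smooth += rho(int(D[y, x]) - int(D[y, x + 1]))
            if y + 1 < h:                        # lower neighbor
                e_smooth += rho(int(D[y, x]) - int(D[y + 1, x]))
    return alpha * e_data + beta * e_smooth
```

A correct, constant disparity map scores both terms at zero; a wrong one pays in the data term, and a noisy one pays in the smoothness term, which is exactly the trade-off α and β control.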

SLIDE 17

Recap: stereo with calibrated cameras

  • Image pair
  • Detect some features
  • Compute E from given R and T
  • Match features using the epipolar and other constraints
  • Triangulate for 3d structure

Error sources

  • Low-contrast; textureless image regions
  • Occlusions
  • Camera calibration errors
  • Violations of brightness constancy (e.g., specular reflections)
  • Large motions
SLIDE 18

Today

  • Correspondences, matching for stereo

– A few stereo applications

  • Camera calibration

Depth for segmentation

Edges in disparity, used in conjunction with image edges, enhance the contours found.

Danijela Markovic and Margrit Gelautz, Interactive Media Systems Group, Vienna University of Technology

SLIDE 19

Depth for segmentation

Danijela Markovic and Margrit Gelautz, Interactive Media Systems Group, Vienna University of Technology

Stereo in machine vision systems

Left: The Stanford cart sports a single camera moving in discrete increments along a straight line, providing multiple snapshots of outdoor scenes.

Right: The INRIA mobile robot uses three cameras to map its environment.

Forsyth & Ponce

SLIDE 20

Model-based body tracking, stereo input

David Demirdjian, MIT Vision Interface Group http://people.csail.mit.edu/demirdji/movie/artic-tracker/turn-around.m1v

  • Adam O' Donovan, Ramani Duraiswami and Jan Neumann.

Microphone Arrays as Generalized Cameras for Integrated Audio Visual Processing, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Minneapolis, 2007

SLIDE 21

Virtual viewpoint video

  • C. Zitnick et al., High-quality video view interpolation using a layered representation, SIGGRAPH 2004.

Virtual viewpoint video

http://research.microsoft.com/IVM/VVV/

SLIDE 22

Uncalibrated case

  • What if we don’t know the camera parameters?

Today

  • Correspondences, matching for stereo

– A few stereo applications

  • Camera calibration
SLIDE 23

Perspective projection

[Figure: perspective projection: image plane, focal length, camera frame, optical axis, scene point, image coordinates]

Thus far, in camera’s reference frame only.

Camera parameters

  • Extrinsic: location and orientation of the camera frame with respect to the reference frame
  • Intrinsic: how to map pixel coordinates to image plane coordinates

[Figure: reference frame and camera 1 frame]

SLIDE 24

Extrinsic camera parameters

P_c = R (P_w − T)

where P_w is the point in the world reference frame, P_c = (X, Y, Z)^T is the same point in the camera reference frame, R is the rotation, and T the translation between the frames.

Camera parameters

  • Extrinsic: location and orientation of the camera frame with respect to the reference frame
  • Intrinsic: how to map pixel coordinates to image plane coordinates

[Figure: reference frame and camera 1 frame]

SLIDE 25

Intrinsic camera parameters

  • Ignoring any geometric distortions from optics, we can describe them by:

x = −(x_im − o_x) s_x
y = −(y_im − o_y) s_y

where (x, y) are the coordinates of the projected point in the camera reference frame, (x_im, y_im) the coordinates of the image point in pixel units, (o_x, o_y) the coordinates of the image center in pixel units, and (s_x, s_y) the effective size of a pixel (mm).

Camera parameters

  • We know that in terms of the camera reference frame: P_c = R (P_w − T), with P_c = (X, Y, Z)^T
  • Substituting the previous equations describing the intrinsic and extrinsic parameters, we can relate pixel coordinates to world points:

x_im − o_x = −(f / s_x) · [R_1 · (P_w − T)] / [R_3 · (P_w − T)]
y_im − o_y = −(f / s_y) · [R_2 · (P_w − T)] / [R_3 · (P_w − T)]

where R_i = row i of the rotation matrix.

SLIDE 26

Projection matrix

  • This can be rewritten as a matrix product using homogeneous coordinates:

[wx_im]                 [X_w]
[wy_im] = M_int M_ext · [Y_w]
[  w  ]                 [Z_w]
                        [ 1 ]

i.e.  [wx_im, wy_im, w]^T = M P_w,  with M = M_int M_ext

where:

M_int = [ −f/s_x     0      o_x ]
        [    0     −f/s_y   o_y ]
        [    0        0      1  ]

M_ext = [ r11  r12  r13  −R_1^T T ]
        [ r21  r22  r23  −R_2^T T ]
        [ r31  r32  r33  −R_3^T T ]
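The factorization is easy to check numerically. A sketch composing M = M_int M_ext with the slide's sign conventions; the function and parameter names are ours:

```python
import numpy as np

def projection_matrix(f, sx, sy, ox, oy, R, T):
    """Compose M = M_int @ M_ext, following the slide's convention
    x = -(x_im - o_x) * s_x."""
    M_int = np.array([[-f / sx, 0.0,     ox],
                      [0.0,    -f / sy,  oy],
                      [0.0,     0.0,    1.0]])
    M_ext = np.hstack([R, (-R @ T).reshape(3, 1)])  # last column: -R_i^T T
    return M_int @ M_ext

def project(M, P_w):
    """Project a 3D world point to pixel coordinates."""
    wx, wy, w = M @ np.append(P_w, 1.0)
    return np.array([wx / w, wy / w])
```

With R = I and T = 0, a point on the optical axis projects to the image center (o_x, o_y), which is a quick sanity check on the sign conventions.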

SLIDE 27

Calibrating a camera

  • Compute intrinsic and extrinsic parameters using observed camera data

Main idea

  • Place a “calibration object” with known geometry in the scene
  • Get correspondences
  • Solve for the mapping from scene to image: estimate M = M_int M_ext

Estimating the projection matrix

[wx_im, wy_im, w]^T = M P_w

For a given feature point:

x_im = (M_1 · P_w) / (M_3 · P_w)
y_im = (M_2 · P_w) / (M_3 · P_w)

SLIDE 28

Estimating the projection matrix

Clearing the denominators:

x_im (M_3 · P_w) = M_1 · P_w   →   (M_1 − x_im M_3) · P_w = 0
y_im (M_3 · P_w) = M_2 · P_w   →   (M_2 − y_im M_3) · P_w = 0

Writing the entries of M as a vector m of unknowns, each feature point gives two linear equations:

[ X_w  Y_w  Z_w  1   0    0    0   0   −x_im X_w  −x_im Y_w  −x_im Z_w  −x_im ] m = 0
[  0    0    0   0  X_w  Y_w  Z_w  1   −y_im X_w  −y_im Y_w  −y_im Z_w  −y_im ]

SLIDE 29

Estimating the projection matrix

This is true for every feature point, so we can stack up n observed image features and their associated 3d points in a single equation, with superscripts (1)…(n) indexing the points:

[ X_w^(1)  Y_w^(1)  Z_w^(1)  1   0    0    0   0   −x_im^(1) X_w^(1)  −x_im^(1) Y_w^(1)  −x_im^(1) Z_w^(1)  −x_im^(1) ]
[  0        0        0       0  X_w^(1)  Y_w^(1)  Z_w^(1)  1   −y_im^(1) X_w^(1)  −y_im^(1) Y_w^(1)  −y_im^(1) Z_w^(1)  −y_im^(1) ]
[ …                                                                                                                    ]  m = 0
[ X_w^(n)  Y_w^(n)  Z_w^(n)  1   0    0    0   0   −x_im^(n) X_w^(n)  −x_im^(n) Y_w^(n)  −x_im^(n) Z_w^(n)  −x_im^(n) ]
[  0        0        0       0  X_w^(n)  Y_w^(n)  Z_w^(n)  1   −y_im^(n) X_w^(n)  −y_im^(n) Y_w^(n)  −y_im^(n) Z_w^(n)  −y_im^(n) ]

Solve for the m_ij’s (the calibration information) [F&P Section 3.1]
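The stacked homogeneous system A m = 0 is conventionally solved by taking the right singular vector of A with the smallest singular value. A sketch of that standard least-squares approach (function name ours):

```python
import numpy as np

def estimate_projection_matrix(pts3d, pts2d):
    """Direct linear solve for the 3x4 projection matrix M from n >= 6
    world<->image correspondences: stack two equations per point and
    take the null vector via SVD (least-squares for noisy data)."""
    rows = []
    for (X, Y, Z), (x, y) in zip(pts3d, pts2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -x*X, -x*Y, -x*Z, -x])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -y*X, -y*Y, -y*Z, -y])
    A = np.asarray(rows, dtype=float)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)   # m minimizing ||A m|| with ||m|| = 1
```

Since M is homogeneous, the solution is only defined up to scale (and sign), so comparisons against a known M should be done after normalizing both.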

SLIDE 30

Calibrating a camera

  • Compute intrinsic and extrinsic parameters using observed camera data

Main idea

  • Place a “calibration object” with known geometry in the scene
  • Get correspondences
  • Solve for the mapping from scene to image: estimate M = M_int M_ext

When would we calibrate this way?

  • Makes sense when the geometry of the system is not going to change over time
  • …When would it change?
SLIDE 31

Weak calibration

  • Want to estimate world geometry without requiring calibrated cameras
    – Archival videos
    – Photos from multiple unrelated users
    – Dynamic camera system
  • Main idea: estimate the epipolar geometry from a (redundant) set of point correspondences between two uncalibrated cameras

Uncalibrated case

For a given camera, pixel coordinates relate to camera coordinates through the internal calibration matrix:

p = M_int p̂

So, for two cameras (left and right):

p̂_left = M_int,left^(−1) p_left
p̂_right = M_int,right^(−1) p_right

where p̂ denotes camera coordinates, p denotes image pixel coordinates, and M_int,left, M_int,right are the internal calibration matrices, one per camera.

SLIDE 32

Uncalibrated case: fundamental matrix

From before, the essential matrix gives the epipolar constraint in camera coordinates:

p̂_right^T E p̂_left = 0

Substituting p̂_left = M_int,left^(−1) p_left and p̂_right = M_int,right^(−1) p_right:

(M_int,right^(−1) p_right)^T E (M_int,left^(−1) p_left) = 0

p_right^T (M_int,right^(−T) E M_int,left^(−1)) p_left = 0

p_right^T F p_left = 0,  where F = M_int,right^(−T) E M_int,left^(−1) is the fundamental matrix.

Fundamental matrix

  • Relates pixel coordinates in the two views
  • More general form than the essential matrix: we remove the need to know the intrinsic parameters
  • If we estimate the fundamental matrix from correspondences in pixel coordinates, we can reconstruct the epipolar geometry without intrinsic or extrinsic parameters
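The relation F = M_int,right^(−T) E M_int,left^(−1) is a one-line computation. A sketch, with K standing in for M_int:

```python
import numpy as np

def fundamental_from_essential(E, K_left, K_right):
    """F = K_right^{-T} E K_left^{-1}, mapping the epipolar constraint
    from camera coordinates to pixel coordinates."""
    return np.linalg.inv(K_right).T @ E @ np.linalg.inv(K_left)
```

By construction, any pair of camera-coordinate vectors satisfying the essential-matrix constraint also satisfies the fundamental-matrix constraint once mapped to pixels, since the K factors cancel.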

SLIDE 33

Computing F from correspondences

F = M_int,right^(−T) E M_int,left^(−1)

  • Cameras are uncalibrated: we don’t know E or the left or right M_int matrices
  • But we can estimate F directly from 8+ point correspondences satisfying

p_right^T F p_left = 0

Each point correspondence generates one constraint on F. Collect n of these constraints, and solve for f, the vector of F’s parameters.
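Stacking one linear constraint per correspondence gives the classic eight-point estimate of F. A minimal sketch; a practical implementation would add Hartley's point normalization for conditioning, which is omitted here:

```python
import numpy as np

def eight_point(pts_left, pts_right):
    """Linear eight-point estimate of F: each correspondence (p, p')
    with p'^T F p = 0 gives one row of a homogeneous system in F's 9
    entries; solve with SVD and enforce rank 2."""
    A = []
    for (x, y), (xp, yp) in zip(pts_left, pts_right):
        A.append([xp*x, xp*y, xp, yp*x, yp*y, yp, x, y, 1.0])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    F = Vt[-1].reshape(3, 3)
    U, S, Vt2 = np.linalg.svd(F)
    S[2] = 0.0                        # project to rank 2
    return U @ np.diag(S) @ Vt2
```

The rank-2 projection enforces that all epipolar lines pass through a single epipole, which the unconstrained linear solution does not guarantee under noise.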

SLIDE 34

Stereo pipeline with weak calibration

  • So, where to start with uncalibrated cameras?
    – Need to find the fundamental matrix F and the correspondences (pairs of points (u’,v’) ↔ (u,v)).

  • 1) Find interest points in each image (more on this later)
  • 2) Compute correspondences
  • 3) Compute epipolar geometry
  • 4) Refine

Example from Andrew Zisserman

Stereo pipeline with weak calibration

1) Find interest points (next week)

SLIDE 35

Stereo pipeline with weak calibration

2) Match points using only proximity: putative matches based on correlation search

SLIDE 36

RANSAC for robust estimation of the fundamental matrix

  • Select a random sample of correspondences
  • Compute F using them
    – This determines an epipolar constraint
  • Evaluate the amount of support: inliers within a threshold distance of the epipolar line
  • Choose the F with the most support (inliers)

Putative matches based on correlation search
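The loop above can be sketched as follows. For brevity the support test uses the algebraic residual |p′ᵀ F p| rather than the geometric point-to-epipolar-line distance the slide describes; names and defaults are ours:

```python
import numpy as np

def ransac_fundamental(left, right, fit_fn, n_iters=500, thresh=1e-3, seed=0):
    """RANSAC over correspondences: repeatedly fit F to a random sample
    of 8 pairs (fit_fn, e.g. an eight-point solver), count pairs whose
    epipolar residual falls under thresh, and keep the best F."""
    rng = np.random.default_rng(seed)
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = len(left)
    hl = np.column_stack([left, np.ones(n)])     # homogeneous left points
    hr = np.column_stack([right, np.ones(n)])    # homogeneous right points
    best_F, best_count = None, -1
    for _ in range(n_iters):
        idx = rng.choice(n, size=8, replace=False)
        F = fit_fn(left[idx], right[idx])
        resid = np.abs(np.einsum('ij,jk,ik->i', hr, F, hl))  # |p'^T F p|
        count = int((resid < thresh).sum())
        if count > best_count:
            best_F, best_count = F, count
    return best_F, best_count
```

A final refit of F on all inliers (and iterating) is the usual last step, matching the "refine" stage of the pipeline.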

SLIDE 37

Pruned matches

  • Correspondences consistent with epipolar geometry
  • Resulting epipolar geometry
SLIDE 38

Next:

  • Tuesday: local invariant features
    – How to find interest points?
    – How to describe local neighborhoods more robustly than with a list of pixel intensities?