A Hierarchical Matching of Deformable Shapes Pedro Felzenszwalb - - PowerPoint PPT Presentation



SLIDE 1

A Hierarchical Matching of Deformable Shapes

Pedro Felzenszwalb Department of Computer Science University of Chicago

Joint work with Joshua Schwartz

SLIDE 2

Shape-based recognition

  • Humans can recognize many objects based on shape alone.
  • Fundamental cue for many object categories.
  • Classical approach for recognizing rigid objects.
  • Invariant to photometric variation.
SLIDE 3

Comparing and matching shapes

  • Related problems:
      • Measuring the similarity between shapes.
      • Finding a set of correspondences between shapes.
      • Finding a shape similar to a model in an image.

SLIDE 4

Elastic matching

  • Measure amount of bending and stretching necessary to turn one curve into another [Basri et al 95], [Sebastian et al 03].
  • Similar to computing edit distance between strings.
  • Efficient dynamic programming algorithms.
  • Can capture some but not all important shape aspects.

Can turn these into each other without much bending anywhere.

Similar objects with completely different local boundary properties.
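The analogy with string edit distance can be made concrete with the textbook dynamic program for strings. This is only a sketch of the string analogue, not the elastic matching of the cited papers; in curve matching the unit edit costs would be replaced by bending and stretching costs.

```python
def edit_distance(s, t):
    """Classic O(len(s) * len(t)) dynamic program for string edit distance."""
    n, m = len(s), len(t)
    # d[i][j] = minimum number of edits turning s[:i] into t[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all i characters
    for j in range(m + 1):
        d[0][j] = j  # insert all j characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitute = d[i - 1][j - 1] + (s[i - 1] != t[j - 1])
            d[i][j] = min(substitute, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[n][m]
```

Every table entry depends only on three neighbouring entries, which is what makes this kind of matching efficient.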

SLIDE 5

Our approach: compositional method

[Figure labels: A1, A2, B1, B2; p1, p2, p3; q1, q2, q3]

  • Compose matchings between subcurves to get longer matchings.
  • Different kind of dynamic programming.
  • Cost of combination depends on:
      • Cost of matchings being combined.
      • Arrangement of endpoints.

For long matchings the endpoints are far away and we capture global geometric properties.

SLIDE 6

Shape-tree

[Figure: curve through points a, f, e, g, c, h, d, i, b and its shape-tree, with root c | a,b; children e | a,c and d | c,b; leaves f | a,e, g | e,c, h | c,d, i | d,b]

  • Shape-tree of curve from a to b:
      • Select midpoint c, store relative location c | a,b.
      • Left child is a shape-tree of sub-curve from a to c.
      • Right child is a shape-tree of sub-curve from c to b.
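The recursive definition above can be sketched in a few lines. This is a minimal illustration, assuming the curve is given as an ordered list of sample points and that the "midpoint" is the sample with the middle index; the names `Node`, `build_shape_tree` and `describe` are hypothetical. Only indices are stored here; the relative location c | a,b would be computed from the point coordinates.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    a: int  # index of the sub-curve's first point
    c: int  # index of the chosen midpoint
    b: int  # index of the sub-curve's last point
    left: Optional["Node"] = None   # shape-tree of sub-curve a..c
    right: Optional["Node"] = None  # shape-tree of sub-curve c..b


def build_shape_tree(a: int, b: int) -> Optional[Node]:
    """Shape-tree for the samples with indices a..b; None if no interior point."""
    if b - a < 2:
        return None
    c = (a + b) // 2  # assumption: midpoint = middle sample index
    return Node(a, c, b, build_shape_tree(a, c), build_shape_tree(c, b))


def describe(node: Optional[Node], labels: str) -> List[str]:
    """List the nodes in pre-order using the slide's 'c | a,b' notation."""
    if node is None:
        return []
    here = f"{labels[node.c]} | {labels[node.a]},{labels[node.b]}"
    return [here] + describe(node.left, labels) + describe(node.right, labels)


# Nine samples, labelled in the order they appear along the curve.
labels = "afegchdib"
tree = build_shape_tree(0, len(labels) - 1)
print(describe(tree, labels))
```

With the labels ordered a, f, e, g, c, h, d, i, b along the curve, the printed nodes are the same seven relative locations that label the tree in the slide's figure.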
SLIDE 7

Shape-tree

[Figure: curve through points a, f, e, g, c, h, d, i, b and its shape-tree, with root c | a,b; children e | a,c and d | c,b; leaves f | a,e, g | e,c, h | c,d, i | d,b]

  • Invariant to similarity transformation.
  • Sub-tree is shape-tree of sub-curve.
  • Given placement for a,b we can reconstruct the curve.
  • Bottom nodes capture local curvature.
  • Top nodes capture curvature of sub-sampled curve.
SLIDE 8

Relative locations

  • Bookstein coordinates for representing B | A,C.
  • There exists a unique similarity transformation T taking:
      • A to (-1/2,0)
      • C to (1/2,0)
  • Look at T(B).

[Figure: points A, B, C before and after applying T, with A at (-1/2,0) and C at (1/2,0)]
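One convenient way to compute T(B) is to treat points in the plane as complex numbers, since an orientation-preserving similarity transformation is then a map z ↦ αz + β. The sketch below assumes that representation; the function names `bookstein` and `place` are illustrative and not from the slides.

```python
def bookstein(b: complex, a: complex, c: complex) -> complex:
    """Bookstein coordinates of b relative to a and c, i.e. T(b) where
    T is the similarity transformation with T(a) = -1/2 and T(c) = 1/2."""
    if a == c:
        raise ValueError("a and c must be distinct points")
    return (b - (a + c) / 2) / (c - a)


def place(z: complex, a: complex, c: complex) -> complex:
    """Inverse map: recover the point with Bookstein coordinates z,
    given a placement of a and c."""
    return z * (c - a) + (a + c) / 2


# A point one unit above the midpoint of the segment from 0 to 1.
print(bookstein(0.5 + 1j, 0j, 1 + 0j))
```

Because translating, rotating, or uniformly scaling all three points leaves the result unchanged, storing these coordinates is what makes the shape-tree invariant to similarity transformations.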

SLIDE 9

Deformation model

  • Independently perturb relative locations stored in a shape-tree.
  • Reconstructed curve is perceptually similar to original.
  • Local and global properties are preserved.
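A small sketch of this deformation model, under stated assumptions: points are complex numbers, the midpoint of each sub-curve is the middle sample, relative locations are Bookstein coordinates, and the perturbation is isotropic Gaussian noise with a hypothetical parameter `sigma` (the slides do not specify the noise model). All consecutive sub-curve endpoints are assumed distinct.

```python
import random


def rel(b, a, c):
    """Bookstein coordinates of b relative to a and c."""
    return (b - (a + c) / 2) / (c - a)


def build(points, lo, hi):
    """Shape-tree as nested tuples (z, left, right); None for a single segment."""
    if hi - lo < 2:
        return None
    mid = (lo + hi) // 2
    z = rel(points[mid], points[lo], points[hi])
    return (z, build(points, lo, mid), build(points, mid, hi))


def reconstruct(tree, a, b):
    """Rebuild the list of points from a to b given a placement of a and b."""
    if tree is None:
        return [a, b]
    z, left, right = tree
    c = z * (b - a) + (a + b) / 2  # undo the Bookstein transform
    # Drop the duplicated midpoint when joining the two halves.
    return reconstruct(left, a, c)[:-1] + reconstruct(right, c, b)


def perturb(tree, sigma, rng):
    """Independently add Gaussian noise to every stored relative location."""
    if tree is None:
        return None
    z, left, right = tree
    noise = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
    return (z + noise, perturb(left, sigma, rng), perturb(right, sigma, rng))


pts = [complex(k, (k % 3) * 0.5) for k in range(9)]
tree = build(pts, 0, len(pts) - 1)
deformed = reconstruct(perturb(tree, 0.05, random.Random(0)), pts[0], pts[-1])
```

Noise added near the root moves large portions of the curve together, while noise near the leaves only changes local detail.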
SLIDE 10

Distance between curves

  • Given curves A and B.
  • Can’t compare shape-trees for A and B built separately.
  • Search over shape-trees for A and B looking for similar pair.
  • Can be done in O(n^3 m^3) time using DP (n = |A|, m = |B|).
  • Our current approach:
      • Fix shape-tree for A and look for map from points in A to points in B that doesn’t deform the shape-tree much.
      • Efficient O(n m^3) DP algorithm.
SLIDE 11

Matching open curves

  • Curves: A = (a1, ..., an) and B = (b1, ..., bm).
  • Assume a1 → b1 and an → bm.
  • Shape-tree defines midpoint ai dividing A into A1 and A2.
  • Search over corresponding point bj dividing B.

$$\psi(A, B) = \min_{b_j} \Big[ \psi(A_1, B_1) + \psi(A_2, B_2) + \lambda \cdot \mathrm{dif}\big((a_i \mid a_1, a_n),\, (b_j \mid b_1, b_m)\big) \Big]$$

  • ψ is similarity between A and B.
  • dif measures difference in relative locations.
  • λ is a scaling factor.

SLIDE 12

Dynamic programming

  • Let v be node in shape-tree of A.
      • Corresponds to subcurve A’.
  • Table T(v):
      • T(v)[i][j] is cost of matching A’ to (bi, ..., bj).
  • T(v) can be computed using T(u) and T(w) where u and w are children of v in shape-tree.
  • O(n) tables, O(m^2) entries per table, O(m) to compute entry.
      • O(n m^3) algorithm.
  • Generalization can cut off sub-trees to allow for missing parts.
  • Can also handle closed curves...
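The table-based computation for open curves can be sketched with memoised recursion, where each cached value plays the role of an entry T(v)[i][j]. Several details below are assumptions not given in the slides: points are complex numbers, the shape-tree of A splits at the middle sample index, dif is the squared distance between Bookstein coordinates, a single segment of A matches any stretch of B at zero cost, all points involved are distinct, and B has at least as many samples as A.

```python
from functools import lru_cache


def bookstein(b, a, c):
    """Relative location b | a,c as a complex number."""
    return (b - (a + c) / 2) / (c - a)


def match_cost(A, B, lam=1.0):
    """psi(A, B) with A[0] -> B[0] and A[-1] -> B[-1]."""

    def dif(z1, z2):
        return abs(z1 - z2) ** 2  # assumption: squared Euclidean distance

    @lru_cache(maxsize=None)
    def psi(lo, hi, i, j):
        # Cost of matching A[lo..hi] to B[i..j], endpoints mapped to endpoints.
        if hi - lo < 2:
            return 0.0  # assumption: leaf segments match for free
        if j - i < 2:
            return float("inf")  # no interior point of B left for the midpoint
        mid = (lo + hi) // 2
        za = bookstein(A[mid], A[lo], A[hi])
        best = float("inf")
        for k in range(i + 1, j):  # O(m) candidate midpoints b_k
            zb = bookstein(B[k], B[i], B[j])
            cost = psi(lo, mid, i, k) + psi(mid, hi, k, j) + lam * dif(za, zb)
            best = min(best, cost)
        return best

    return psi(0, len(A) - 1, 0, len(B) - 1)


zigzag = [0j, 1 + 1j, 2 + 0j, 3 + 1j, 4 + 0j]
line = [0j, 1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j]
print(match_cost(zigzag, line))
```

There are O(n) tree nodes, O(m^2) index pairs (i, j) per node, and O(m) work per entry, which gives the O(n m^3) bound stated above.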
SLIDE 13

Recognition results - Swedish leaves

Nearest neighbor classification

  Method            Score
  Shape-tree        96.28
  Inner distance    94.13
  Shape context     88.12

15 species, 75 examples per species (25 training, 50 test).

SLIDE 14

Recognition results - MPEG7

Bullseye score

  Method              Score
  Shape-tree          87.70
  Hier. Procrustes    86.35
  Inner distance      85.40
  Curve edit          78.14
  Shape context       76.51

70 categories, 20 shapes per category.

[Figure: example categories]
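The slides do not define the bullseye score. The sketch below follows the protocol commonly used for this benchmark, stated here as an assumption: every shape is used as a query, the 40 most similar shapes (including the query itself) are retrieved, and the score is the fraction of same-category shapes found out of the 20 possible per query. The function name and the toy data are illustrative only.

```python
def bullseye_score(dist, labels, top_k=40, class_size=20):
    """Fraction of same-class shapes among the top_k retrievals, averaged
    over all queries. dist[q][i] is the distance from shape q to shape i."""
    n = len(labels)
    hits = 0
    for q in range(n):
        ranked = sorted(range(n), key=lambda i: dist[q][i])[:top_k]
        hits += sum(labels[i] == labels[q] for i in ranked)
    return hits / (n * class_size)


# Toy example: four "shapes" on a line, two per class, retrieving the top 2.
positions = [0, 1, 10, 11]
dist = [[abs(x - y) for y in positions] for x in positions]
print(bullseye_score(dist, [0, 0, 1, 1], top_k=2, class_size=2))
```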

SLIDE 15

Cluttered images

[Figure panels: (1) input, (2) edges, (3) contours, (4) detection; model]

SLIDE 16

Matching to cluttered images

[Figure: model M with points a, b; image curves C with points p, q]

  • M: model (closed curve).
  • C: curves in image.
  • P: endpoints of curves in C.
  • Match([a,b], [p,q]): matching of subcurve of M from a to b to subset of C with a → p and b → q.

SLIDE 17

Matching to cluttered images

  • Use DP to match each curve in C to every subcurve of M.
      • Generate a set of initial matchings Match([a,b], [p,q]).
      • Running time is linear in total length of image contours.
  • Define gap matching Match([a,b], [p,q]) from every subcurve of M to every pair of anchor points in the image.
      • Cost depends on length of [a,b].
  • Stitch partial matchings together to form longer matchings.
      • Using compositional rule.
      • Second phase of DP.
SLIDE 18

Compositional rule

If ||q-r|| < T can compose Match([a,b], [p,q]) and Match([b,c], [r,s]):

Match([a,b], [p,q]) = w1
Match([b,c], [r,s]) = w2
m = (q+r)/2
Match([a,c], [p,s]) = w1 + w2 + dif((b|a,c), (m|p,s))

[Figure: model M with points a, b, c; image curves C with points p, q, r, s]
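The rule can be written as a small function. This is a sketch under assumptions: model and image points are complex numbers, dif is taken to be the squared distance between Bookstein coordinates (the slides do not specify it), and the `Match` record and `compose` function are hypothetical names.

```python
from typing import NamedTuple, Optional


class Match(NamedTuple):
    a: complex   # model point where the subcurve starts
    b: complex   # model point where the subcurve ends
    p: complex   # image point matched to a
    q: complex   # image point matched to b
    cost: float


def bookstein(b, a, c):
    """Relative location b | a,c as a complex number."""
    return (b - (a + c) / 2) / (c - a)


def compose(m1: Match, m2: Match, threshold: float) -> Optional[Match]:
    """Compose Match([a,b],[p,q]) with Match([b,c],[r,s]) if ||q - r|| < T."""
    if m1.b != m2.a:
        return None  # model subcurves must share the point b
    q, r = m1.q, m2.p
    if abs(q - r) >= threshold:
        return None  # image endpoints too far apart to stitch
    m = (q + r) / 2
    z_model = bookstein(m1.b, m1.a, m2.b)  # b | a,c
    z_image = bookstein(m, m1.p, m2.q)     # m | p,s
    penalty = abs(z_model - z_image) ** 2  # assumption: dif = squared distance
    return Match(m1.a, m2.b, m1.p, m2.q, m1.cost + m2.cost + penalty)


first = Match(0j, 1 + 0j, 0j, 1 + 0j, 1.0)
second = Match(1 + 0j, 2 + 0j, 1.1 + 0j, 2 + 0j, 2.0)
print(compose(first, second, threshold=0.5))
```

A gap matching can be represented by the same record, so the second phase of dynamic programming only needs this one operation.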

SLIDE 19

Example compositions

  • Composing Match([c,d], [r,s]) and Match([d,e], [t,u]):
      • Get Match([c,e], [r,u]).
  • Composing Match([a,b], [p,q]) with gap matching Match([b,c], [q,r]):
      • Get matching Match([a,c], [p,r]).
  • ...
SLIDE 20

Example results

[Figure: model, and best match in each image]

SLIDE 21

More results

[Figure: model, and best match in each image]

SLIDE 22

Object detection

SLIDE 23

Part-based models

  • Sub-trees usually represent fairly generic curves.
  • We can share sub-trees among different models.
  • Leads to a notion of parts.
  • Useful for bottom-up matching.
  • We can generalize shape-tree models using grammars.
  • Allows for models to share parts.
  • Parts can share sub-parts.
  • Objects can have variable structure.
SLIDE 24

Hierarchical curve models (HCM)

  • Underlying PCFG defining the “syntactic” structure of objects.
  • Single terminal l corresponding to line segment.
  • Productions:
      • X → l
      • X → YZ
  • X(a,b) is curve of type X from a to b.
  • Geometry of curve is defined by its structure and conditional distributions over midpoint choice.
  • For each rule r = X → YZ we have Pr(c | a,b).
SLIDE 25
  • To generate a curve of type X from a to b:
      • Pick production r = X → YZ.
      • Pick midpoint c from Pr(c | a,b).
      • Generate curve of type Y from a to c and Z from c to b.
  • Get a new stochastic grammar:
      • Nonterminals X(a,b) and terminals l(a,b).
      • Sentences are polygonal chains: l(a_1,a_2) l(a_2,a_3) ... l(a_{n-1},a_n).
      • P(X(a,b) → Y(a,c)Z(c,b)) = P(X → YZ) Pr(c | a,b).
      • P(X(a,b) → l(a,b)) = P(X → l).
SLIDE 26

Examples

  • Shape-tree deformation model is HCM with fixed structure.
  • Underlying grammar has one non-terminal and production for each node in shape-tree.

  • Always generates the same structure.
  • Pr(c | a,b) is parametric model defined by midpoint location.
  • L(a,b) generates an “almost straight curve” from a to b:
  • L(a,b) → L(a,c) L(c,b) where c ~ (a+b)/2
  • L(a,b) → l(a,b)
  • Recursive model with a single non-terminal.
  • These are two extremes...
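Sampling from the recursive "almost straight curve" model can be sketched as follows. The parameters are assumptions made for illustration: `p_stop` stands in for P(L → l), the midpoint is drawn from a Gaussian centred at (a+b)/2 with spread `sigma` times the distance from a to b, and `max_depth` is a hypothetical safeguard that forces termination. Points are complex numbers.

```python
import random


def sample_L(a, b, rng, p_stop=0.3, sigma=0.1, max_depth=6):
    """Sample a polygonal chain from L(a,b) as a list of points from a to b."""
    if max_depth == 0 or rng.random() < p_stop:
        return [a, b]  # production L(a,b) -> l(a,b)
    # Production L(a,b) -> L(a,c) L(c,b) with c close to (a+b)/2.
    scale = abs(b - a)
    c = (a + b) / 2 + scale * complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
    left = sample_L(a, c, rng, p_stop, sigma, max_depth - 1)
    right = sample_L(c, b, rng, p_stop, sigma, max_depth - 1)
    return left[:-1] + right  # drop the duplicated midpoint c


chain = sample_L(0j, 1 + 0j, random.Random(1))
print(len(chain), chain[0], chain[-1])
```

Scaling the noise by the distance between a and b keeps the midpoint distribution expressed relative to the endpoints, in the same spirit as the relative locations used by the shape-tree.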
SLIDE 27

Current and future work...

  • Relationship to wavelets.
  • Learning HCMs from example shapes.
  • Using HCMs for parsing images.
  • ...

[Figure: random shapes]