Discriminant Hypothesis for Visual Saliency and its Applications in Computer Vision



SLIDE 1

SVCL

Discriminant Hypothesis for Visual Saliency and its Applications in Computer Vision

Dashan Gao
Joint work with Nuno Vasconcelos

Statistical Visual Computing Laboratory Department of Electrical and Computer Engineering, University of California, San Diego

SLIDE 2

What is visual saliency?

  • certain image features that attract visual attention

(Yarbus, 1967) (Treisman & Gormican, 1988)

SLIDE 3

What is known about saliency?

  • Bottom-up (BU) saliency
    – stimulus-driven mechanism, fast
    – goal independent
    – e.g. traffic signs
  • Top-down (TD) saliency
    – goal-driven mechanism, slower
    – provides informative locations for the specific task

Yarbus 1967:

  • 1. Free viewing
  • 2. Estimate the economic level of the people
  • 3. Judge their age

SLIDE 4

In computer vision

  • saliency is widely used in visual recognition systems, as a pre-processing stage
    – a sparse image representation
    – reduces computation
    – example: weakly supervised learning of object categories

Fergus et al. 2003; Sivic et al. 2003

SLIDE 5

however, in computer vision

  • most saliency definitions are universal, divorced from the recognition problem
    – repeatability (stability) [Forstner (94), Harris-Stephens (88), Shi-Tomasi (94), Mikolajczyk (01, 04)]
    – continuity (curvature) [Sha’ashua-Ullman (88), Asada-Brady (86)]
    – complexity (information content) [Kadir-Brady (01), Sebe-Lew (03)]
    – rarity (low probability) [Walker et al. (98), Oliva et al. (03), Bruce-Tsotsos (05)]
  • as a result,
    – detected salient locations may not be very informative for recognition.

[Figure: Harris detection]

SLIDE 6

Discriminant saliency

  • Hypothesis: saliency is a discriminant process
  • requires a stimulus of interest, and a null hypothesis of stimuli that are not salient
    – context dependent: salient attributes depend on the object of interest and the context (null hypothesis)
  • Definition: salient features are those that best distinguish the given visual concept from the null hypothesis

(NIPS 2004)

SLIDE 7

Infomax feature selection

  • solution: features that maximize the mutual information between the features (X) and the class label (Y), with X = {X1, …, Xn}
  • under a constraint of computational parsimony
  • we use marginal mutual information (more on this later)
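The selection rule above can be illustrated with a minimal sketch. It assumes quantized (discrete) feature responses and a plug-in histogram estimate of the mutual information; the function names `mutual_information` and `select_features` and the toy data are hypothetical and are not taken from the slides.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def select_features(samples, labels, k):
    """Rank features by marginal MI with the class label and keep the top k.

    samples: list of feature vectors with discrete entries (assumption).
    Returns (indices of the k best features, list of all MI scores).
    """
    n_features = len(samples[0])
    scores = [
        mutual_information([s[j] for s in samples], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data: feature 0 copies the label, feature 1 is independent of it.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]
top, scores = select_features(samples, labels, k=1)
```

Scoring each feature on its own keeps the cost linear in the number of features, which is one way to read the "computational parsimony" constraint.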

SLIDE 8

Top-down Discriminant Saliency Model

[Diagram: Training and Testing stages. Labels: Faces, Background, Original Feature Set, Discriminant Feature Selection, Salient Features, Scale Selection, Wj, WTA, Saliency Map]

Malik-Perona pre-attentive perception model [M-K90]

SLIDE 9

Qualitative evaluation

  • discriminant saliency correlates better with target objects

[Figure panels: Original images; Saliency Maps by Discriminant Saliency; Salient locations by Discriminant Saliency; Scale Saliency Detector [K-B 01]; Harris Saliency Detector [H-S 88]]

SLIDE 10

Visual classification

  • how informative are the salient points?
    – evaluated by measuring actual recognition rates
    – classifier: histogram of saliency values, fed to an SVM for a presence/absence test
    – compared with two standard saliency detectors, and two benchmarks

[Bar chart, axis from 60 to 100: Faces, Motorbikes, Airplanes; methods: DiscSal-DCT, Scale Saliency, Harris Saliency, Image Pixel, Constellation]
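The descriptor mentioned in the classifier bullet could look like the sketch below. It assumes one histogram per image, saliency values normalized to [0, 1], and a fixed number of bins; none of these details are stated on the slide, and `saliency_histogram` is a hypothetical name. The SVM stage is left out.

```python
def saliency_histogram(saliency_map, n_bins=8):
    """Normalized histogram of the saliency values of one image.

    Assumes every value lies in [0, 1] and the map is not empty.
    """
    counts = [0] * n_bins
    total = 0
    for row in saliency_map:
        for v in row:
            # Clamp the index so that v == 1.0 falls into the last bin.
            b = min(int(v * n_bins), n_bins - 1)
            counts[b] += 1
            total += 1
    return [c / total for c in counts]


# Toy 2x3 saliency map; the resulting vector would be the SVM input.
toy_map = [[0.0, 0.1, 0.9], [0.95, 1.0, 0.5]]
descriptor = saliency_histogram(toy_map, n_bins=4)
```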

SLIDE 11

Repeatability test

  • robustness to various transformations

[Four plots of repeatability (%), each comparing DSD, SSD, HarrLap, HesLap and Mser: increasing scale changes (scale + rotation); increasing viewpoint angle (view angle); increasing blur (blurring); increasing JPEG compression]

SSD: Kadir-Brady 2001; HarrLap/HesLap: Mikolajczyk 2004; Mser: Matas et al. 2004

SLIDE 12

Other research work based on Top-down Discriminant Saliency

  • a hierarchical model for learning complex features and detectors for classification

[Diagram labels: Faces, Background, Initial Feature Set, Discriminant Feature Selection, Salient (complex) Features, Saliency Map & Salient Locations, Object Detection, Complex Feature Generation, Feature Complexity Control, New Complex Feature Set]

(CVPR 2005)

SLIDE 13

Bayesian Integration

  • Advantage
    – the trade-off between high selectivity (by TD) and high accuracy (by BU)
  • Probabilistic formulation of salient locations
    – saliency output: probability distributions of salient locations over the image plane
  • Saliency as Bayesian inference
    – BU saliency -> prior, TD saliency -> likelihood
    – inference from the two observations: both accuracy and selectivity -> a posterior
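The prior/likelihood reading above can be sketched on a discrete grid of image locations. This is a minimal illustration, assuming both saliency outputs are given as non-negative maps of the same size; `integrate_saliency` and the toy maps are hypothetical, and the slides do not say how the distributions are actually represented.

```python
def integrate_saliency(bu_prior, td_likelihood):
    """Posterior over locations: BU prior times TD likelihood, renormalized."""
    unnormalized = [
        [p * l for p, l in zip(prior_row, lik_row)]
        for prior_row, lik_row in zip(bu_prior, td_likelihood)
    ]
    z = sum(sum(row) for row in unnormalized)
    if z == 0:
        raise ValueError("prior and likelihood have no overlapping support")
    return [[v / z for v in row] for row in unnormalized]


# Toy 2x2 maps: BU favors the two diagonal cells, TD favors the bottom-right one.
bu = [[0.4, 0.1], [0.1, 0.4]]
td = [[0.2, 0.2], [0.2, 0.8]]
posterior = integrate_saliency(bu, td)
```

In this toy case the posterior concentrates on the bottom-right cell, where both observations agree.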

SLIDE 14

Results

  • better locations
  • better selectivity (classification accuracy)

[Figure columns: BU, TD, Integration]

SLIDE 15

Applications

  • Automated gathering of training examples

SLIDE 16

Applications

  • Region-of-interest (ROI) based image compression

[Figure: normal JPEG at low bit rate; ROI compression]

Joint work with Sunhyoung Han

SLIDE 17

Discriminant saliency hypothesis

  • these results are encouraging, but how do we evaluate the hypothesis as a whole?
  • two fundamental questions
    – can it explain biological saliency?
    – can it drive both bottom-up and top-down saliency?
  • bottom-up saliency particularly interesting
    – bottom-up visual pathway much better understood than its top-down counterpart
  • motivated us to study bottom-up discriminant saliency

SLIDE 18

Bottom-up discriminant saliency

  • Recall: bottom-up saliency
    – stimulus-driven mechanism
    – saliency detection on a single image
  • Center-surround mechanism
    – X: features, Y: {center, surround}

[Figure: center and surround windows W_l^1 and W_l^0 at location l, with the feature distributions P(x|center) and P(x|surround)]
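For a single feature at one location, the center-surround formulation can be sketched as the mutual information between the quantized feature response and the window label. This is a minimal illustration under assumptions not stated on the slide: responses are already quantized, the class prior P(y) is taken as the fraction of samples in each window, and `center_surround_saliency` is a hypothetical name.

```python
import math
from collections import Counter


def center_surround_saliency(center, surround):
    """I(X;Y) in bits, with Y in {center, surround}.

    center, surround: lists of quantized feature responses sampled from
    the two windows around one image location.
    """
    n1, n0 = len(center), len(surround)
    n = n1 + n0
    h1, h0 = Counter(center), Counter(surround)
    saliency = 0.0
    for x in set(h1) | set(h0):
        px = (h1[x] + h0[x]) / n  # pooled P(x)
        for count, n_class in ((h1[x], n1), (h0[x], n0)):
            if count:
                p_x_given_y = count / n_class
                # P(y) * P(x|y) * log2( P(x|y) / P(x) )
                saliency += (n_class / n) * p_x_given_y * math.log2(p_x_given_y / px)
    return saliency


# Fully distinct windows are maximally salient; identical ones are not salient.
distinct = center_surround_saliency([1, 1, 1, 1], [0, 0, 0, 0])
identical = center_surround_saliency([0, 1], [0, 1])
```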

SLIDE 19

Joint feature distribution

  • band-pass features exhibit regular patterns of response to natural images
    – bow-tie shaped conditional distributions (Buccigrossi & Simoncelli, 1999; Huang & Mumford, 1999)
    – although fine details of feature dependency may vary from scene to scene, coarse structure follows a universal law for all classes
    – feature dependencies are not informative about image classes

[Figure: top: three images. bottom: conditional histogram of the same coefficient, conditioned on the value of its parent, P(xi|xj)]

SLIDE 20

Joint feature distribution

  • enables the approximation of mutual information by the sum of marginal mutual information (Vasconcelos & Vasconcelos, 2004)
    – all complexity is encoded in the second term
  • from a computational standpoint, the sum of marginal terms is extremely simple to compute (computational parsimony)

[Equation annotations: first term: discriminant info of individual features; second term: discriminant info of feature dependencies]
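The equation itself did not survive extraction. One way to write a decomposition matching the two annotated terms is shown below; it follows from the chain rule of mutual information, assumes the notation X = {X1, …, Xn} with X_{1,k-1} standing for {X1, …, X_{k-1}}, and is a reconstruction rather than a copy of the slide.

```latex
I(\mathbf{X};Y)
  = \underbrace{\sum_{k=1}^{n} I(X_k;Y)}_{\text{individual features}}
  + \underbrace{\sum_{k=2}^{n}\Big[\, I(X_k;\mathbf{X}_{1,k-1}\mid Y) - I(X_k;\mathbf{X}_{1,k-1}) \,\Big]}_{\text{feature dependencies}}
```

If the feature dependencies carry no information about the class, each bracketed difference is zero and only the sum of marginal mutual informations remains.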

SLIDE 21

Generalized Gaussian density (GGD)

  • the marginal distributions of natural image features follow a generalized Gaussian density (GGD)
    – a Laplace distribution for β = 1, and a Gaussian for β = 2

[Figure: Examples of GGD fit for responses of two Gabor filters. Each panel plots P(X) against X on a logarithmic axis, comparing the histogram with the GGD.]
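The density formula is missing from the extracted slide. The sketch below uses the common parametrization of the GGD with scale α and shape β, which is an assumption about the form intended here, and checks the two special cases the slide names. The function names are hypothetical.

```python
import math


def ggd_pdf(x, alpha, beta):
    """Generalized Gaussian density with scale alpha > 0 and shape beta > 0.

    P(x) = beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x| / alpha) ** beta)
    """
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x) / alpha) ** beta))


def laplace_pdf(x, b):
    """Laplace density with scale b."""
    return math.exp(-abs(x) / b) / (2.0 * b)


def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


# beta = 1 gives a Laplace with b = alpha;
# beta = 2 gives a Gaussian with alpha = sigma * sqrt(2).
x, sigma = 0.7, 1.3
laplace_match = math.isclose(ggd_pdf(x, 2.0, 1.0), laplace_pdf(x, 2.0))
gaussian_match = math.isclose(ggd_pdf(x, sigma * math.sqrt(2.0), 2.0), gaussian_pdf(x, sigma))
```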

SLIDE 22

In summary

  • combining principles of
    – infomax organization
    – computational parsimony
    – neural tuning to stimulus statistics
  • leads to a very simple saliency operator
  • which is approximately optimal in the minimum error probability sense

[Diagram labels: Feature decomposition; Color (R/G, B/Y), Intensity, Orientation; Feature maps; Discriminant measure; Feature saliency maps; Σ]

SLIDE 23

Biological plausibility

  • discriminant saliency can be implemented with a three-layer neural network

[Network diagram labels: Layer 1, Layer 2, Layer 3; simple cell, differential simple cell, complex cell, cortical columns; input xj; windows W_l^0 and W_l^1; |.|^β0 and |.|^β1; ψ[xj, Φ0], ψ[xj, Φ1]; g[xj]; φ(.); Tl; I_l(Xk, Y); H(Y); output S(l). Inset: plot of φ(x) and φ′(x) against x.]

SLIDE 24

Single vs. conjunctive feature search

Find a bar different from all others

discriminant saliency prediction

SLIDE 25

Asymmetries in visual search

Time(Find a “Q” among “O”s) < Time(Find an “O” among “Q”s)

[Figure: presence of a feature vs. absence of a feature; saliency prediction; search time against # of distractors]

SLIDE 26

Distractor heterogeneity (relevant dimension)

  • Saliency perception is nonlinear and affected by heterogeneous distractors if the heterogeneity is in the same dimension

[Figure: stimuli with bg = 0°, bg = 10°, bg = 20°. Plots of relative saliency against orientation contrast (deg) for bg = 0, 10, 20, labeled Nothdurft (1993) and Discriminant saliency]

SLIDE 27

Prediction of human eye fixations on natural images (qualitative)

[Figure panels labeled S and H]

SLIDE 28

Applications in motion saliency

  • discriminant saliency is combined with the motion field
    – optical flow
    – removes the camera motion

Joint work with Vijay Mahadevan

(NIPS 2007)

SLIDE 29

Dynamic background

  • discriminant saliency is combined with a dynamic texture model
    – deals with complex motion of the background scene
  • background subtraction in dynamic scenes

Joint work with Vijay Mahadevan

(NIPS 2007)

SLIDE 30

Discussion

  • discriminant saliency is
    – biologically plausible
    – consistent with psychophysics of saliency
  • connects a number of “disjoint” observations from neurophysiology and psychophysics
    – divisive normalization and saliency asymmetries
  • a (unified) holistic functional justification for V1
    – optimally detects salient locations in the visual field in a decision-theoretic sense, under certain approximations for the sake of computational parsimony