SLIDE 1

SVCL

Discriminant Hypothesis for Visual Saliency and its Applications in Computer Vision

Dashan Gao

Joint work with Nuno Vasconcelos

Statistical Visual Computing Laboratory, Department of Electrical and Computer Engineering, University of California, San Diego

SLIDE 2

What is visual saliency?

certain image features that attract visual attention

(Yarbus, 1967) (Treisman & Gormican, 1988)

SLIDE 3

What is known about saliency?

Bottom-up (BU) saliency

– stimulus-driven mechanism, fast
– goal independent
– e.g. traffic signs

Top-down (TD) saliency

– goal-driven mechanism, slower
– provides informative locations for the specific task

Yarbus 1967:

1. Free viewing
2. Estimate the economic level of the people
3. Judge their age

SLIDE 4

In computer vision

saliency is widely used in visual recognition systems, as a pre-processing stage

– a sparse image representation
– reduces computation
– example: weakly supervised learning of object categories

Fergus et al. 2003; Sivic et al. 2003

SLIDE 5

however, in computer vision

most saliency definitions are universal, divorced from the recognition problem

– repeatability (stability) [Förstner (94), Harris-Stephens (88), Shi-Tomasi (94), Mikolajczyk (01, 04)]
– continuity (curvature) [Sha’ashua-Ullman (88), Asada-Brady (86)]
– complexity (information content) [Kadir-Brady (01), Sebe-Lew (03)]
– rarity (low probability) [Walker et al. (98), Oliva et al. (03), Bruce-Tsotsos (05)]

as a result,

– detected salient locations may not be very informative for recognition.

Harris detection

SLIDE 6

Hypothesis: saliency is a discriminant process

requires a stimulus of interest, and a null hypothesis of stimuli that are not salient

– context dependent: salient attributes depend on the object of interest and the context (null hypothesis)

Definition:

salient features are those that best distinguish the given visual concept from the null hypothesis

Discriminant saliency

(NIPS 2004)

SLIDE 7

Infomax feature selection

solution:

features that maximize the mutual information between the features (X) and the class label (Y)

under a constraint of computational parsimony
we use marginal mutual information (more on this later)

X = {X1, …, Xn}
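
A minimal sketch of ranking features by marginal mutual information with the class label. It assumes discrete (quantized) feature responses and plug-in probability estimates from counts. The function names and the toy data are hypothetical and do not come from the slides.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def rank_by_marginal_mi(features, labels):
    """Return (name, I(X_k;Y)) pairs sorted from most to least informative."""
    scores = {name: mutual_information(col, labels) for name, col in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): quantized responses of three features on eight patches.
labels = [1, 1, 1, 1, 0, 0, 0, 0]        # 1 = object of interest, 0 = null hypothesis
features = {
    "X1": [1, 1, 1, 1, 0, 0, 0, 0],      # identical to the label
    "X2": [1, 1, 1, 0, 1, 0, 0, 0],      # partially informative
    "X3": [1, 0, 1, 0, 1, 0, 1, 0],      # independent of the label
}
ranking = rank_by_marginal_mi(features, labels)
```

Each feature is scored on its own, so the cost grows linearly with the number of features. This is one reading of the computational parsimony constraint named on the slide.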

SLIDE 8

Top-down Discriminant Saliency Model

[Figure: block diagram of the model. Labels: Training, Testing, Faces, Background, Original Feature Set, Discriminant Feature Selection, Salient Features, W_j, WTA, Scale Selection, Saliency Map]

Malik-Perona pre-attentive perception model [ M-K90]

SLIDE 9

Qualitative evaluation

discriminant saliency correlates better with target objects

[Figure panels: original images; saliency maps by discriminant saliency; Scale Saliency Detector [K-B 01]; Harris Saliency Detector [H-S 88]; salient locations by discriminant saliency]

SLIDE 10

Visual classification

how informative are the salient points?

– evaluated by measuring actual recognition rates
– classifier: histogram of saliency values, fed to an SVM for a presence/absence test
– compared with two standard saliency detectors, and two benchmarks

[Figure: bar chart with axis ticks from 60 to 100 for three classes (Faces, Motorbikes, Airplanes). Legend: DiscSal-DCT, Scale Saliency, Harris Saliency, Image Pixel, Constellation]
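
A minimal sketch of the image descriptor named above, a histogram of saliency values. It assumes saliency values scaled to [0, 1] and a fixed number of equal-width bins; both choices, and the function name, are hypothetical. The SVM stage is not shown because it would need a third-party library.

```python
def saliency_histogram(values, bins=8, lo=0.0, hi=1.0):
    """Normalized histogram of saliency values in [lo, hi], used as an image descriptor."""
    if not values:
        raise ValueError("need at least one saliency value")
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp to [lo, hi] so that v == hi falls into the last bin.
        idx = int((min(max(v, lo), hi) - lo) / width)
        counts[min(idx, bins - 1)] += 1
    n = len(values)
    return [c / n for c in counts]


# Toy example (hypothetical values): four salient points, four bins.
descriptor = saliency_histogram([0.05, 0.1, 0.9, 1.0], bins=4)
```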

SLIDE 11

Repeatability test

robustness to various transformations

[Figure: four plots of repeatability (%) for DSD, SSD, HarrLap, HesLap, and Mser, under increasing scale changes (scale + rotation), increasing viewpoint angle (view angle), increasing blur (blurring), and increasing JPEG compression]

SSD: Kadir-Brady 2001; HarrLap/HesLap: Mikolajczyk 2004; Mser: Matas et al. 2004

SLIDE 12

Other research work based on Top-down Discriminant Saliency

a hierarchical model for learning complex features and detectors for classification

[Figure: block diagram. Labels: Faces, Background, Initial Feature Set, Discriminant Feature Selection, Saliency Map & Salient Locations, Salient (complex) Features, Complex Feature Generation, Feature Complexity Control, New Complex Feature Set, Object Detection]

(CVPR 2005)

SLIDE 13

Bayesian Integration

Advantage

– the trade-off between high selectivity (by TD) and high accuracy (by BU)

Probabilistic formulation of salient locations

– saliency output: probability distributions of salient locations over the image plane

Saliency as Bayesian inference

– BU saliency -> prior, TD saliency -> likelihood
– inference from the two observations: both accuracy and selectivity -> a posterior
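
A minimal sketch of this integration, assuming both saliency maps are given as small non-negative grids over the same image locations. The function names and the toy maps are hypothetical; the slides do not specify how the maps are computed or normalized.

```python
def normalize(grid):
    """Scale a 2-D list of non-negative values so that it sums to 1."""
    total = sum(sum(row) for row in grid)
    if total <= 0:
        raise ValueError("saliency map must have positive mass")
    return [[v / total for v in row] for row in grid]


def integrate(bu_map, td_map):
    """Posterior over locations, with the BU map as prior and the TD map as likelihood."""
    prior = normalize(bu_map)
    product = [[p * t for p, t in zip(p_row, t_row)]
               for p_row, t_row in zip(prior, td_map)]
    return normalize(product)  # Bayes rule: posterior is proportional to prior * likelihood


# Toy maps (hypothetical values) on a 2 x 3 grid.
bu = [[1.0, 4.0, 1.0],
      [1.0, 1.0, 4.0]]
td = [[0.1, 0.2, 0.1],
      [0.1, 0.1, 0.8]]
posterior = integrate(bu, td)
best = max(((r, c) for r in range(2) for c in range(3)),
           key=lambda rc: posterior[rc[0]][rc[1]])
```

In the toy maps, BU saliency ties two locations and the TD likelihood breaks the tie, so the posterior peaks at the location both cues support.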

SLIDE 14

Results

better locations
better selectivity (classification accuracy)

[Figure panels: BU, TD, Integration]

SLIDE 15

Applications

Automated gathering of training examples

SLIDE 16

Applications

Region-of-interest (ROI) based image compression

[Figure panels: normal JPEG at low bit rate; ROI compression]

Joint work with Sunhyoung Han

SLIDE 17

Discriminant saliency hypothesis

these results are encouraging, but how do we evaluate the hypothesis as a whole?

two fundamental questions

– can it explain biological saliency?
– can it drive both bottom-up and top-down saliency?

bottom-up saliency is particularly interesting

– bottom-up visual pathway much better understood than its top-down counterpart

motivated us to study bottom-up discriminant saliency

SLIDE 18

Recall: bottom-up saliency

– stimulus-driven mechanism
– saliency detection on a single image

Center-surround mechanism

Bottom-up discriminant saliency

X: features, Y: {center, surround}

[Figure: windows W_l^0 and W_l^1 at location l, with the densities P(x|center) and P(x|surround)]
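
A minimal sketch of a center-surround discriminant measure for one feature at one location, written as the mutual information between the feature response X and the label Y in {center, surround}. It assumes responses already quantized into integer bins, histogram estimates of P(x|center) and P(x|surround), and a label prior set by the window sizes. These choices and the function names are assumptions, not details from the slides.

```python
import math
from collections import Counter


def histogram(samples, bins):
    """Relative frequencies of the integer bins 0..bins-1 in `samples`."""
    counts = Counter(samples)
    n = len(samples)
    return [counts.get(b, 0) / n for b in range(bins)]


def center_surround_mi(center, surround, bins):
    """I(X;Y) in bits for Y in {center, surround}, from quantized responses."""
    n_c, n_s = len(center), len(surround)
    p_center = n_c / (n_c + n_s)               # P(Y = center), assumed from window sizes
    p_surround = 1.0 - p_center
    h_c = histogram(center, bins)              # P(x | center)
    h_s = histogram(surround, bins)            # P(x | surround)
    mi = 0.0
    for pc, ps in zip(h_c, h_s):
        px = p_center * pc + p_surround * ps   # marginal P(x)
        # I(X;Y) = sum over classes of P(class) * KL(P(x|class) || P(x))
        if pc > 0:
            mi += p_center * pc * math.log2(pc / px)
        if ps > 0:
            mi += p_surround * ps * math.log2(ps / px)
    return mi


# Toy example (hypothetical values): one feature quantized into 4 bins.
popout = center_surround_mi([3, 3, 3, 3], [0] * 12, bins=4)      # center differs from surround
uniform = center_surround_mi([0, 1, 0, 1], [0, 1] * 6, bins=4)   # center matches surround
```

When the center and surround responses share the same distribution the measure is zero, and when they are fully separable it reaches the entropy of the label.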

SLIDE 19

band-pass features exhibit regular patterns of response to natural images

– bow-tie shaped conditional distributions (Buccigrossi & Simoncelli, 1999; Huang & Mumford, 1999)
– although fine details of feature dependency may vary from scene to scene, coarse structure follows a universal law for all classes
– feature dependencies are not informative about image classes

Joint feature distribution

top: three images. bottom: conditional histogram of the same coefficient, conditioned on the value of its parent, P(xi|xj).

SLIDE 20

enables the approximation of mutual information by the sum of marginal mutual information (Vasconcelos & Vasconcelos, 2004)

– all complexity is encoded in the second term

from a computational standpoint, this is extremely simple to compute (computational parsimony)

Joint feature distribution

first term: discriminant info of individual features; second term: discriminant info of feature dependencies
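
The equation on this slide did not survive extraction. Assuming it is the chain-rule decomposition of the mutual information, which matches the two term labels on this slide, it would read as follows, with the shorthand \mathbf{X}_{1,k-1} = \{X_1, \ldots, X_{k-1}\} introduced here for illustration:

```latex
I(\mathbf{X};Y)
  = \sum_{k=1}^{n} I(X_k;Y)
  + \sum_{k=2}^{n} \Big[ I(X_k;\mathbf{X}_{1,k-1} \mid Y) - I(X_k;\mathbf{X}_{1,k-1}) \Big]
```

The identity follows from I(\mathbf{X};Y) = \sum_k I(X_k;Y \mid \mathbf{X}_{1,k-1}). If the feature dependencies carry no information about the class, each bracket in the second sum is zero and only the sum of marginal mutual informations remains.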

SLIDE 21

Generalized Gaussian density (GGD)

the marginal distributions of natural image features follow a generalized Gaussian density (GGD)

for β = 1 the GGD is a Laplace distribution, and for β = 2 a Gaussian

[Figure: two log-scale plots of P(X) against X, each overlaying a histogram and the fitted GGD]

Examples of GGD fit for responses of two Gabor filters
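
A minimal sketch of the GGD, assuming the common parameterization with scale α and shape β, P(x) = β / (2αΓ(1/β)) · exp(−(|x|/α)^β). The density formula on the slide was lost in extraction, so this form is an assumption, chosen because it reproduces the two special cases the slide names. The function names are hypothetical.

```python
import math


def ggd_pdf(x, alpha, beta):
    """Generalized Gaussian density with scale alpha > 0 and shape beta > 0."""
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x) / alpha) ** beta))


def laplace_pdf(x, b):
    """Laplace density with scale b."""
    return math.exp(-abs(x) / b) / (2.0 * b)


def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


# beta = 1 matches a Laplace density with b = alpha;
# beta = 2 matches a Gaussian with sigma = alpha / sqrt(2).
alpha = 0.7
laplace_gap = max(abs(ggd_pdf(x, alpha, 1.0) - laplace_pdf(x, alpha))
                  for x in (-2.0, -0.3, 0.0, 0.5, 1.5))
gaussian_gap = max(abs(ggd_pdf(x, alpha, 2.0) - gaussian_pdf(x, alpha / math.sqrt(2.0)))
                   for x in (-2.0, -0.3, 0.0, 0.5, 1.5))
```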

SLIDE 22

In summary

combining principles of

– infomax organization
– computational parsimony
– neural tuning to stimulus statistics

leads to a very simple saliency operator, which is approximately optimal in the minimum error probability sense

[Figure: saliency pipeline. Labels: Feature decomposition, Color (R/G, B/Y), Intensity, Orientation, Feature maps, Discriminant measure, Feature saliency maps, Σ]

SLIDE 23

Biological plausibility

discriminant saliency can be implemented with a three-layer neural network

[Figure: network diagram. Labels: Layer 1, Layer 2, Layer 3, simple cell, differential simple cell, complex cell, cortical columns; rectifications |.|^β0 and |.|^β1 of x_j; windows W_l^0 and W_l^1; ψ[x_j, Φ0], ψ[x_j, Φ1], g[x_j], φ(.), T_l, I_l(X_k, Y), H(Y), S(l). Inset: plot of φ(x) and φ′(x)]

SLIDE 24

Single vs. conjunctive feature search

Find a bar different from all others

discriminant saliency prediction

SLIDE 25

Asymmetries in visual search

Time(Find a “Q” among “O”s) < Time(Find an “O” among “Q”s)

presence of a feature vs. absence of a feature

[Figure: saliency prediction, and search time against # of distractors]

SLIDE 26

Distractor heterogeneity (relevant dimension)

Saliency perception is nonlinear and affected by heterogeneous distractors if the heterogeneity is in the same dimension

bg = 0°, bg = 10°, bg = 20°

Nothdurft (1993)

[Figure: relative saliency against orientation contrast (deg), with curves for bg = 0, bg = 10, bg = 20]

Discriminant saliency

SLIDE 27

Prediction of human eye fixations on natural images (qualitative)

SLIDE 28

Applications in motion saliency

discriminant saliency is combined with the motion field

– optical flow
– removes the camera motion

Joint work with Vijay Mahadevan

(NIPS 2007)

SLIDE 29

Dynamic background

discriminant saliency is combined with a dynamic texture model

– dealing with complex motion of the background scene

background subtraction in dynamic scenes

Joint work with Vijay Mahadevan

(NIPS 2007)

SLIDE 30

Discussion

discriminant saliency is

– biologically plausible
– consistent with psychophysics of saliency

connects a number of “disjoint” observations from neurophysiology and psychophysics

– divisive normalization and saliency asymmetries

a (unified) holistic functional justification for V1

– optimally detects salient locations in the visual field in a decision-theoretic sense, under certain approximations for the sake of computational parsimony