SLIDE 1

A Kernel Density Based Approach for Large Scale Image Retrieval and Its Application to Tattoo Identification

Wei Tong CMU

SLIDE 2

Outline

Why tattoo retrieval

– Used in forensics to identify victims and criminals

Content-based image retrieval

– Bag of visual words model

Our method

– Similar to the language model

Examples
Other interesting applications

– Graffiti, trademark…

Summary

SLIDE 3

Introduction to Tattoos

Tattoo facts

– A form of body modification
– Embedded deeply into the skin
– Over 5,000 years of history

SLIDE 4

Introduction to Tattoos

Tattoo statistics (2003)

Americans who have at least one tattoo:

– Young adults (18–29 age group): ~40%

SLIDE 5

Introduction to Tattoos

Tattoo is a (soft) biometric trait

– Primary traits

Identify an individual
May not be available or work well in certain conditions

– Soft biometric traits

Provide some identifying information
But lack the distinctiveness and permanence to sufficiently differentiate between two individuals

Gender, height, weight, tattoo, scar, birthmark

SLIDE 6

Introduction to Tattoos

Tattoos for Law Enforcement

– Victim identification, e.g., the 9/11 attacks

[Figure: (a) a tsunami victim, (b) body of an unidentified murdered woman, (c) and (d) body parts found in a Florida state park]

SLIDE 7

Introduction to Tattoos

Tattoos for Law Enforcement
Identify criminals

– Often contain hidden meaning about a suspect’s criminal history
  » e.g., previous convictions, years spent in jail
– Gang membership: about 800,000 gang members on the streets nationwide; 100,000 in the greater LA area alone

18th Street gang, the largest LA-based street gang

SLIDE 8

Introduction to Tattoos

Law enforcement agencies photograph tattoos

– When a suspect is arrested
– They have done this for many years

SLIDE 9

Introduction to Tattoos

Michigan State Police Database

~100,000 images (JPEG, 640x480 color images)

SLIDE 10

Tattoo Image Matching and Retrieval

ANSI/NIST

Tattoo Classes

– 8 major classes
– 70 subclasses

Major classes: Human, Animal, Plant, Flag, Object, Abstract, Symbol, Other

SLIDE 11

Tattoo Image Matching and Retrieval

Existing practice is text search

– Class label/keyword based
– Tedious manual annotation
– Same image, different labels/keywords from different individuals
– One image, multiple keywords
– Non-uniform class/keyword distribution: a keyword returns thousands of images
– Change in keywords requires re-annotation

SLIDE 12

Tattoo Image Matching and Retrieval

Content-based image retrieval

– Given a tattoo image query
– Find the most similar tattoos in the database

Query

SLIDE 13

Introduction to Tattoos

Tattoo-ID system
Prototype tattoo retrieval system
Delivered to FBI in 2011
Licensed the technology to MorphoTrak

SLIDE 14

Introduction to Tattoos

A recent story on tattoo identification

This person gave his name as “Darnell Lewis” to a police officer, but the officer noticed that the man had “Frazier”, his real surname, tattooed on his neck. He was arrested on four misdemeanor warrants.

SLIDE 15

Introduction to Tattoos

The FBI is developing the Next Generation Identification (NGI) system for identifying criminals

– $1 billion program
– SMT (scar, mark, tattoo) is one of the major components
– Expected by 2014

SLIDE 16

Content-based Image Retrieval

What “content” should be used

– Difficult to understand the information needs of a user from a query image

SLIDE 17

Content-based Image Retrieval

Images with similar color

SLIDE 18

Content-based Image Retrieval

Images with similar shape

SLIDE 19

Content-based Image Retrieval

Images with similar semantics

SLIDE 20

Content-based Image Retrieval

Challenges in CBIR

– You get drunk,
– REALLY drunk,
– Hit over the head,
– Kidnapped to another city in a country on the other side of the world
– When you wake up,
– You try to figure out what city you are in, and what is going on

That’s what it’s like to be a CBIR system!

SLIDE 21

Content-based Image Retrieval

Near Duplicate Image Retrieval

– Given a query image, identify gallery images with high visual similarity

SLIDE 22

Content-based Image Retrieval

Bag-of-features

– Detect local interest points
– Represent each interest point by a descriptor
– An image is a collection of those points

[Figure: original image, detected key points, and descriptors of the key points; each descriptor is 5-dimensional]

SLIDE 23

Content-based Image Retrieval

Bag-of-words Model

A document: a collection of the words in the document
An image: a collection of the key points of the image

What is the difference?

– The same word appears in many documents
– There is no “same key point”, but “similar key points” appear in many images that have similar “visual content”
– Group “similar key points” from different images into “visual words”

SLIDE 24

Content-based Image Retrieval

Bag-of-words Model

– Group key points into visual words
– Represent images by histograms of visual words

[Figure: key points grouped into visual words, and per-image histograms over bins b1, b2, b3, …]
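The two steps on this slide (group key points into visual words, then represent an image by a histogram of visual words) can be sketched in a few lines. This is only an illustration: the vocabulary below is a hand-picked toy set of 2-D centers rather than the output of a clustering run, and the names `nearest_word` and `bow_histogram` are hypothetical.

```python
import math
from collections import Counter

def nearest_word(descriptor, vocabulary):
    """Index of the visual word (cluster center) closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda j: math.dist(descriptor, vocabulary[j]))

def bow_histogram(descriptors, vocabulary):
    """Count how many key points fall into each visual word."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    return [counts.get(j, 0) for j in range(len(vocabulary))]

# Toy data (assumed for illustration): three 2-D visual words, four descriptors.
vocabulary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
descriptors = [(1.0, 1.0), (9.0, 1.0), (0.5, 0.2), (1.0, 8.0)]
histogram = bow_histogram(descriptors, vocabulary)
print(histogram)
```

Note that every descriptor is forced into some bin, however far it is from the nearest center, which is the outlier sensitivity discussed for the bag-of-words model.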

SLIDE 25

Content-based Image Retrieval

Shortcomings of bag-of-words model
Independent steps

– Generating “bag-of-words” representation
– Image retrieval

Separate steps hurt performance

Computationally expensive

– Clustering key points into “visual words”

Inconsistent mapping

– Distant keypoints may belong to the same cluster

Influenced by outliers

– Every keypoint must be mapped to a cluster center

SLIDE 26

Our Method

Database

– Image -> distribution
– Keypoints -> sampled from the distribution
– Distribution -> kernel density estimation

Retrieval

– Query likelihood

Given a distribution, how likely would it generate the query keypoints

……
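A minimal sketch of this idea, in its most direct form: each image's distribution is a kernel density estimate built from its own keypoints, and images are ranked by the log-likelihood of the query keypoints. The Gaussian kernel, the bandwidth `sigma`, the small `eps` floor, and all function names are assumptions made for illustration; the slides do not specify them. The ranking here is a linear scan over the database.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Unnormalized isotropic Gaussian kernel (kernel choice is an assumption)."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))

def density(x, image_keypoints, sigma=1.0):
    """Kernel density estimate of an image's keypoint distribution at x."""
    total = sum(gaussian_kernel(x, k, sigma) for k in image_keypoints)
    return total / len(image_keypoints)

def log_query_likelihood(query_keypoints, image_keypoints, sigma=1.0, eps=1e-12):
    """Log-likelihood that the image's density generated the query keypoints."""
    return sum(math.log(density(q, image_keypoints, sigma) + eps)
               for q in query_keypoints)

def rank_images(query_keypoints, database, sigma=1.0):
    """Linear scan: image ids sorted by decreasing query likelihood."""
    scores = {name: log_query_likelihood(query_keypoints, kps, sigma)
              for name, kps in database.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy database (assumed): two images with three 2-D keypoints each.
database = {
    "img_a": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    "img_b": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)],
}
query = [(0.2, 0.1), (0.9, 0.2)]
ranking = rank_images(query, database)
print(ranking)
```

Each score touches every keypoint of every image, so the cost grows with the size of the whole database.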

SLIDE 27

Efficient Kernel Density Estimation

A straightforward way

– Kernel density estimation
– Retrieval: query likelihood

Two challenges:

– How to efficiently estimate the density function of each image
– How to avoid a linear scan of the database for retrieval

SLIDE 28

Efficient Kernel Density Estimation

Weighted mixture model for each image
When is very large

SLIDE 29

Efficient Kernel Density Estimation

The image is eventually represented by

– Estimate by MLE
– Problem

is high dimensional
Need to compute for every image

SLIDE 30

Efficient Kernel Density Estimation

Use a local kernel
is a sparse vector with proper

SLIDE 31

Efficient Kernel Density Estimation

Updating rule for MLE

– It is globally optimal

Approximation

– Only update once
– Exact optimal solution when are far apart from each other

SLIDE 32

Efficient Kernel Density Estimation

Algorithm to compute of each image

– Range search
– Compute
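The formulas for this algorithm did not survive on the slide, so the following is only a sketch of how a range search plus a single weight update could produce a sparse per-image weight vector. It assumes a Gaussian kernel truncated at `radius` as the local kernel, a uniform starting weight, and a brute-force range search; `local_kernel`, `range_search`, and `estimate_theta` are hypothetical names, not the author's.

```python
import math

def local_kernel(x, c, radius, sigma):
    """Gaussian kernel truncated to zero outside `radius` (assumed form)."""
    d = math.dist(x, c)
    return math.exp(-d * d / (2.0 * sigma * sigma)) if d <= radius else 0.0

def range_search(x, centers, radius):
    """Brute-force range search: indices of centers within `radius` of x."""
    return [j for j, c in enumerate(centers) if math.dist(x, c) <= radius]

def estimate_theta(keypoints, centers, radius, sigma=1.0):
    """Single-update weight estimate as a sparse dict {center index: weight}."""
    theta = {}
    used = 0
    for x in keypoints:
        near = range_search(x, centers, radius)
        if not near:
            continue  # keypoint far from every center: dropped as an outlier
        k = {j: local_kernel(x, centers[j], radius, sigma) for j in near}
        total = sum(k.values())
        for j, v in k.items():
            # Share of this keypoint credited to center j.
            theta[j] = theta.get(j, 0.0) + v / total
        used += 1
    return {j: w / used for j, w in theta.items()} if used else {}

# Toy data (assumed): four centers, four keypoints, one of them an outlier.
centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (20.0, 20.0)]
keypoints = [(0.1, 0.0), (3.9, 0.2), (4.2, -0.1), (50.0, 50.0)]
theta = estimate_theta(keypoints, centers, radius=1.0)
print(theta)
```

Only centers that have a keypoint within the radius receive a nonzero weight, which is what keeps the representation sparse. A production system would replace the brute-force search with a spatial index.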

SLIDE 33

Regularization

MLE of is poor

– with limited number of keypoints

Regularization

– global
– KL divergence

SLIDE 34

Regularization

The solution of becomes:
Global is:

Global MLE of
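The regularized solution itself is missing from the extracted slides. As a rough illustration of the idea of pulling a sparse per-image weight vector toward a global one, the sketch below uses a linear interpolation in the style of language-model smoothing. The interpolation form, the weight `lam`, and the choice of the global vector as the average over images are all assumptions, not the slide's formulas.

```python
def global_theta(thetas, num_centers):
    """Average the sparse per-image weights into one dense global vector."""
    g = [0.0] * num_centers
    for theta in thetas:
        for j, w in theta.items():
            g[j] += w / len(thetas)
    return g

def smooth(theta, g, lam=0.2):
    """Interpolate sparse image weights with the dense global weights."""
    return [(1.0 - lam) * theta.get(j, 0.0) + lam * g[j]
            for j in range(len(g))]

# Toy sparse weights (assumed) for two images over three centers.
thetas = [{0: 0.5, 1: 0.5}, {1: 0.25, 2: 0.75}]
g = global_theta(thetas, num_centers=3)
smoothed = smooth(thetas[0], g)
print(smoothed)
```

After smoothing, every center has a nonzero weight, so the vector is no longer sparse even though its image-specific part still is.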

SLIDE 35

Retrieval

Query Likelihood:
Problem: is not sparse anymore
has two parts

Not sparse Sparse

SLIDE 36

Retrieval

Decompose the query likelihood

Sparse, inverted index can be utilized to identify a small set of candidate images.

Constant, independent of individual images, ignored in image search

…

SLIDE 37

Retrieval

Algorithm: query

– Construct for active centers:
– Construct candidate image set
– Compute query likelihood for every candidate image
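A sketch of how these three steps might look with an inverted index. It assumes each image's weights have the form (1 - lam) * sparse part + lam * global part, so the per-keypoint log-likelihood splits into an image-independent constant (dropped) plus a term that is nonzero only for images sharing an active center with the query. The decomposition, `lam`, and the function names are illustrative assumptions; the slide's own formulas were not recoverable.

```python
import math
from collections import defaultdict

def build_inverted_index(thetas):
    """Map each center index to the (image id, weight) pairs that use it."""
    index = defaultdict(list)
    for image_id, theta in thetas.items():
        for j, w in theta.items():
            index[j].append((image_id, w))
    return index

def search(query_kernels, index, g, lam=0.2):
    """Score only candidate images that share an active center with the query.

    query_kernels: one sparse dict {center index: kernel value} per query keypoint.
    """
    scores = defaultdict(float)
    for k in query_kernels:
        background = lam * sum(g[j] * v for j, v in k.items())
        if background <= 0.0:
            continue
        # Accumulate the sparse part (theta_i . k) for every candidate image.
        sparse = defaultdict(float)
        for j, v in k.items():
            for image_id, w in index.get(j, ()):
                sparse[image_id] += w * v
        for image_id, s in sparse.items():
            scores[image_id] += math.log1p((1.0 - lam) * s / background)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (assumed): sparse weights for three images over four centers.
thetas = {"img_a": {0: 0.5, 1: 0.5}, "img_b": {1: 0.25, 2: 0.75}, "img_c": {3: 1.0}}
g = [0.2, 0.3, 0.3, 0.2]
index = build_inverted_index(thetas)
query_kernels = [{0: 1.0}, {1: 0.8, 0: 0.1}]
results = search(query_kernels, index, g)
print(results)
```

Images with no active center in common with the query (here `img_c`) are never touched, which is how the index avoids a linear scan.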

SLIDE 38

Compare to BoW

Our Method | BoW
Unified framework | BoW generation and retrieval are separated
Randomly sample centers (efficient) | Clustering (very slow)
Only close-by keypoints are mapped to nearby centers (consistent mapping) | Distant keypoints may belong to the same cluster (inconsistent mapping)
Keypoints far away from all centers are discarded (robust to outliers) | Every keypoint must be mapped to a center (influenced by outliers)

SLIDE 39

Experiments

Three datasets

Dataset            | # of images | # of features | Size of DB | # of queries
Tattoo             | 101,754     | 10,843,145    | 3.4GB      | 995
Oxford5K           | 5,062       | 14,972,956    | 4.7GB      | 55
Oxford5K+Flickr1M  | 1,002,805   | 823,297,045   | 252.7GB    | 55

SLIDE 40

Experiments

Tattoo dataset

– Provided by Michigan State Police Department
– Images are manually cropped prior to feature extraction
– Examples of near duplicates

SLIDE 41

Experiments

Oxford building dataset

– From the VGG group at Oxford University
– 11 Oxford landmarks

SLIDE 42

Experiments

Metrics

– CMC scores for tattoo dataset

Percentage of queries whose matched images are found in the first k retrieved images

– mAP for the other two datasets

AP: the area under precision-recall curve
mAP: mean AP of all the queries
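Both metrics are straightforward to compute from ranked results. The sketch below uses one common discrete form of average precision (the mean of the precision values at each rank where a relevant image appears) as an approximation to the area under the precision-recall curve; the exact evaluation protocol used in the experiments is not stated on the slide, and the ranks and relevance flags are made-up toy values.

```python
def cmc_at_k(first_match_ranks, k):
    """Fraction of queries whose first correct match is within the top k.

    Ranks are 1-based positions of the first correct match for each query.
    """
    return sum(1 for r in first_match_ranks if r <= k) / len(first_match_ranks)

def average_precision(ranked_relevance, num_relevant):
    """Mean of the precision values at each rank where a relevant image appears."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / num_relevant if num_relevant else 0.0

def mean_average_precision(queries):
    """Mean AP over (ranked_relevance, num_relevant) pairs, one per query."""
    return sum(average_precision(r, n) for r, n in queries) / len(queries)

# Toy values (assumed for illustration only).
ranks = [1, 3, 45, 200]
cmc10 = cmc_at_k(ranks, 10)
ap = average_precision([True, False, True, False], num_relevant=2)
m_ap = mean_average_precision([([True, False, True, False], 2),
                               ([False, True], 1)])
print(cmc10, ap, m_ap)
```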

SLIDE 43

Results

Retrieval accuracy
Speed

– Similar retrieval speed to BoW+TF-IDF
– Tattoo: 0.01s/query
– Oxford5K+Flickr1M: 0.9s/query

Method              | Tattoo (CMC rank 10) | Oxford5K | Oxford5K+Flickr1M
Our method          | 0.82                 | 0.61     | 0.45
BoW (AKM) + TF-IDF  | 0.78                 | 0.57     | 0.39

SLIDE 44

Experiments on Tattoo Dataset

Number of random centers:

SLIDE 45

Experiments on Tattoo Dataset

Radius for range search:

– : average pairwise distance

SLIDE 46

Experiments on Tattoo Dataset

Not sensitive to

SLIDE 47

Examples on Tattoo Dataset

Successful examples: top10 retrievals

Query

SLIDE 48

Examples on Tattoo Dataset

Failed cases: top10 retrievals

rank 45 rank 7512 rank 12785 rank 7420

Query

Truth

SLIDE 49

Examples on Oxford Dataset

Query

SLIDE 50

Other Applications

Graffiti

– Graffiti are common in almost all big cities all over the world
– Even around CMU

SLIDE 51

Graffiti Around CMU

SLIDE 52

Graffiti Around CMU

SLIDE 53

Graffiti

Graffiti is a big problem for many cities.

– Broken Window Theory
– City of Riverside in California spends more than $1 million each year to remove graffiti
– Identify moniker

Graffiti also play an important role in gang culture

Graffiti of the Six Duce East Coast Crips, founded in Los Angeles in 1971, which is one of the largest and most violent street gangs in the United States, with an estimated membership of 30,000. Notice the use of the basic lettering style. The spelling of six is done with a “c” to reinforce the Crip identity. The arrow is used among African-American gangs to express their territory.

SLIDE 54

Graffiti Retrieval

More challenging than tattoos

SLIDE 55

Trademark Retrieval

SLIDE 56

Summary

Tattoos are a soft biometric trait

– Identify victims, criminals
– An application of CBIR

BoW for CBIR

– Inspired by text retrieval
– Separate steps:

Clustering for generating BoW representation
Standard text retrieval

SLIDE 57

Summary

Our integrated method:

– Image -> underlying distribution
– Keypoints -> sampled from the distribution
– Distribution -> kernel density estimation

Weighted mixture model
Local kernel for

– Sparse representation
– Efficiency

– Retrieval: query likelihood

SLIDE 58

Summary

Contributions:

– Unified framework
– Efficient algorithms

Density estimation
Retrieval
Impact

– Not limited to retrieval

Classification: represent an image by

– Applicable to other domains

Audio, e.g., MFCC
Text: long query problem

SLIDE 59

Thanks!

SLIDE 60

SLIDE 61

SLIDE 62

SLIDE 63

Efficient Kernel Density Estimation

Solve MLE by bound optimization

Start with an initial guess $\{\theta_1^0, \theta_2^0\}$

[Figure: log-likelihood $l(\theta_1, \theta_2)$ plotted over $(\theta_1, \theta_2)$]

SLIDE 64

Efficient Kernel Density Estimation

Solve MLE by bound optimization

Start with an initial guess $\{\theta_1^0, \theta_2^0\}$
Come up with a lower bound:

$l(\theta_1, \theta_2) \ge l(\theta_1^0, \theta_2^0) + Q(\theta_1, \theta_2)$

$Q(\theta_1, \theta_2)$ is a concave function
Touch point: $Q(\theta_1 = \theta_1^0, \theta_2 = \theta_2^0) = 0$

[Figure: $l(\theta_1, \theta_2)$ over $(\theta_1, \theta_2)$, with the touch point at $\{\theta_1^0, \theta_2^0\}$]

SLIDE 65

Efficient Kernel Density Estimation

Solve MLE by bound optimization

Start with an initial guess $\{\theta_1^0, \theta_2^0\}$
Come up with a lower bound:

$l(\theta_1, \theta_2) \ge l(\theta_1^0, \theta_2^0) + Q(\theta_1, \theta_2)$

$Q(\theta_1, \theta_2)$ is a concave function
Touch point: $Q(\theta_1 = \theta_1^0, \theta_2 = \theta_2^0) = 0$

Search for the optimal solution $\{\theta_1^1, \theta_2^1\}$ that maximizes $Q(\theta_1, \theta_2)$

[Figure: $l(\theta_1, \theta_2)$ over $(\theta_1, \theta_2)$, with the points $\{\theta_1^0, \theta_2^0\}$ and $\{\theta_1^1, \theta_2^1\}$ marked]

SLIDE 66

Efficient Kernel Density Estimation

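Solve MLE by bound optimization

Start with an initial guess $\{\theta_1^0, \theta_2^0\}$
Come up with a lower bound:

$l(\theta_1, \theta_2) \ge l(\theta_1^0, \theta_2^0) + Q(\theta_1, \theta_2)$

$Q(\theta_1, \theta_2)$ is a concave function
Touch point: $Q(\theta_1 = \theta_1^0, \theta_2 = \theta_2^0) = 0$

Search for the optimal solution $\{\theta_1^1, \theta_2^1\}$ that maximizes $Q(\theta_1, \theta_2)$
Repeat the procedure with a new bound at $\{\theta_1^1, \theta_2^1\}$:

$l(\theta_1, \theta_2) \ge l(\theta_1^1, \theta_2^1) + Q(\theta_1, \theta_2)$

[Figure: $l(\theta_1, \theta_2)$ over $(\theta_1, \theta_2)$, with the points $\{\theta_1^0, \theta_2^0\}$, $\{\theta_1^1, \theta_2^1\}$, and $\{\theta_1^2, \theta_2^2\}$ marked]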

SLIDE 67

Efficient Kernel Density Estimation

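Solve MLE by bound optimization

Start with an initial guess $\{\theta_1^0, \theta_2^0\}$
Come up with a lower bound:

$l(\theta_1, \theta_2) \ge l(\theta_1^0, \theta_2^0) + Q(\theta_1, \theta_2)$

$Q(\theta_1, \theta_2)$ is a concave function
Touch point: $Q(\theta_1 = \theta_1^0, \theta_2 = \theta_2^0) = 0$

Search for the optimal solution that maximizes $Q(\theta_1, \theta_2)$
Repeat the procedure
Converge to a local optimum

[Figure: $l(\theta_1, \theta_2)$ over $(\theta_1, \theta_2)$, with the iterates $\{\theta_1^0, \theta_2^0\}$, $\{\theta_1^1, \theta_2^1\}$, $\{\theta_1^2, \theta_2^2\}, \ldots$ approaching the optimal point]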
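As a concrete instance of this procedure, the sketch below runs the standard EM-style update for the mixing weights of a mixture with fixed kernels, which is one well-known bound-optimization scheme: each step maximizes a lower bound that touches the log-likelihood at the current weights, so the log-likelihood never decreases. Whether this is the exact update used in the talk is an assumption; the kernel matrix `K` holds made-up toy values and the function names are hypothetical.

```python
import math

def log_likelihood(theta, K):
    """l(theta) = sum_i log(sum_j theta_j * K[i][j])."""
    return sum(math.log(sum(t * k for t, k in zip(theta, row))) for row in K)

def bound_step(theta, K):
    """Maximize the lower bound that touches l at the current theta."""
    new = [0.0] * len(theta)
    for row in K:
        mix = sum(t * k for t, k in zip(theta, row))
        for j, (t, k) in enumerate(zip(theta, row)):
            new[j] += t * k / mix  # share of keypoint i credited to center j
    return [v / len(K) for v in new]

# K[i][j]: kernel value between keypoint i and center j (toy numbers).
K = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7]]
theta = [0.5, 0.5]                    # initial guess
history = [log_likelihood(theta, K)]
for _ in range(20):                   # repeat the procedure
    theta = bound_step(theta, K)
    history.append(log_likelihood(theta, K))
print(theta, history[0], history[-1])
```

With two centers, `theta` plays the role of $(\theta_1, \theta_2)$ on the slides, and `history` traces the value of $l$ at each iterate.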

SLIDE 68

Our Method

Notations: