A Kernel Density Based Approach for Large Scale Image Retrieval and Its Application to Tattoo Identification
Wei Tong
CMU
Outline
- Why tattoo retrieval
– Used in forensics to identify
- Victims, criminals
- Content-based image retrieval
– Bag of visual words model
- Our method
– Similar to the language model
- Examples
- Other interesting applications
– Graffiti, trademark…
- Summary
Introduction to Tattoos
- Tattoo facts
– A form of body modification
– Embedded deeply into the skin
– Over 5000 years
Introduction to Tattoos
- Tattoo statistics (2003)
Americans that have at least one tattoo:
– Young adults (18-29 age group): ~40%
Introduction to Tattoos
- Tattoo is a (soft) biometric trait
– Primary traits
- Identify an individual
- May not be available or work well in certain conditions
– Soft biometric traits
- Provide some identifying information
- But lack distinctiveness and permanence to sufficiently differentiate between two individuals
- Gender, height, weight, tattoo, scar, birthmark
Introduction to Tattoos
- Tattoos for Law Enforcement
– Victim identification, e.g. 9/11 attacks
Figure: (a) a tsunami victim, (b) body of an unidentified murdered woman, (c) and (d) body parts found in a Florida state park
Introduction to Tattoos
- Tattoos for Law Enforcement
- Identify criminals
– Often contain hidden meaning of a suspect’s criminal history
» e.g. previous convictions, years spent in jail
– Gang membership: about 800,000 gang members on the streets nationwide; 100,000 in the greater LA area alone
Figure: 18th Street gang, the largest LA-based street gang
Introduction to Tattoos
- Law enforcement agencies photograph tattoos
– When a suspect is arrested
– They have done this for many years
Introduction to Tattoos
- Michigan State Police Database
~100,000 images (JPEG, 640x480 color images)
Tattoo Image Matching and Retrieval
- ANSI/NIST Tattoo Classes
– 8 major classes
– 70 subclasses
– Major classes: Human, Animal, Plant, Flag, Object, Abstract, Symbol, Other
Tattoo Image Matching and Retrieval
- Existing practice is text search
– Class label/keyword based
– Tedious manual annotation
– Same image, different labels/keywords by different individuals
– One image, multiple keywords
– Non-uniform class/keyword distribution
- A keyword returns thousands of images
– Change in keywords requires re-annotation
Tattoo Image Matching and Retrieval
- Content-based image retrieval
– Given a tattoo image query
– Find the most similar tattoos in the database
Query
Introduction to Tattoos
- Tattoo-ID system
- Prototype tattoo retrieval system
- Delivered to FBI in 2011
- Licensed the technology to MorphoTrak
Introduction to Tattoos
- A recent story on tattoo identification
This person gave his name as “Darnell Lewis” to a police officer, but the officer noticed that the man had “Frazier” tattooed on his neck, which is his real surname. He was arrested on four misdemeanor warrants.
Introduction to Tattoos
- FBI is developing the Next Generation Identification (NGI) system for identifying criminals
– $1 billion program
– SMT (scar, mark, tattoo) is one of the major components
– Expected by 2014
Content-based Image Retrieval
- What “content” should be used
– Difficult to understand the information needs of a user from a query image
Content-based Image Retrieval
- Images with similar color
Content-based Image Retrieval
- Images with similar shape
Content-based Image Retrieval
- Images with similar semantics
Content-based Image Retrieval
- Challenges in CBIR
– You get drunk,
– REALLY drunk,
– Hit over the head,
– Kidnapped to another city in a country on the other side of the world
– When you wake up,
– You try to figure out what city you are in, and what is going on
- That’s what it’s like to be a CBIR system!
Content-based Image Retrieval
- Near Duplicate Image Retrieval
– Given a query image, identify gallery images with high visual similarity
Content-based Image Retrieval
- Bag-of-features
– Detect local interest points
– Represent each interest point by a descriptor
– An image is a collection of those points
Figure: original image, detected key points, and descriptors of the key points (each descriptor is 5-dimensional)
Content-based Image Retrieval
- Bag-of-words Model
– A document: a collection of the words in the document
– An image: a collection of the key points of the image
- What is the difference
– The same word appears in many documents
– No “same key point”, but “similar key points” appear in many images which have similar “visual content”
– Group “similar key points” in different images into “visual words”
Content-based Image Retrieval
- Bag-of-words Model
– Group key points into visual words (b1, b2, b3, …)
– Represent images by histograms of visual words
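The two steps on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the visual-word centers are taken as given (in practice they come from clustering), descriptors are toy 2-D tuples, and the function names `nearest_center` and `bow_histogram` are hypothetical.

```python
import math

def nearest_center(descriptor, centers):
    """Index of the visual-word center closest to the descriptor (Euclidean)."""
    return min(range(len(centers)), key=lambda j: math.dist(descriptor, centers[j]))

def bow_histogram(descriptors, centers):
    """Count how many keypoint descriptors fall into each visual word."""
    hist = [0] * len(centers)
    for d in descriptors:
        hist[nearest_center(d, centers)] += 1
    return hist

# Toy data (assumed): three visual-word centers and four keypoint descriptors.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
image_descriptors = [(1.0, 1.0), (9.0, 1.0), (0.5, 0.2), (1.0, 9.0)]
histogram = bow_histogram(image_descriptors, centers)
```

Note that every descriptor is forced into some bin, however far it is from the nearest center.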
Content-based Image Retrieval
- Shortcomings of bag-of-words model
- Independent steps
– Generating “bag-of-words” representation
– Image retrieval
– Separate steps hurt performance
- Computationally expensive
– Clustering key points into “visual words”
- Inconsistent mapping
– Distant keypoints may belong to the same cluster
- Influenced by outliers
– Every keypoint must be mapped to a cluster center
Our Method
- Database
– Image -> distribution
– Keypoints -> sampled from the distribution
– Distribution -> kernel density estimation
- Retrieval
– Query likelihood
- Given a distribution, how likely would it generate the query keypoints
……
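A direct reading of this idea, as a minimal sketch: each database image gets a kernel density estimate built from its own keypoints, and images are ranked by the log-likelihood of the query keypoints. The Gaussian kernel, the bandwidth `sigma`, the small constant `eps`, and all function names are assumptions made for illustration. This version scans every image and every keypoint.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Unnormalized isotropic Gaussian kernel between two descriptors."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))

def log_query_likelihood(query_keypoints, image_keypoints, sigma=1.0, eps=1e-12):
    """Sum over query keypoints of the log KDE density under one image."""
    total = 0.0
    for q in query_keypoints:
        density = sum(gaussian_kernel(q, x, sigma) for x in image_keypoints)
        density /= len(image_keypoints)
        total += math.log(density + eps)  # eps avoids log(0) for distant queries
    return total

def rank_images(query_keypoints, database, sigma=1.0):
    """Linear scan: score every image, sort by decreasing likelihood."""
    scores = {name: log_query_likelihood(query_keypoints, kps, sigma)
              for name, kps in database.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy database (assumed): two images with two 2-D keypoints each.
database = {
    "img_a": [(0.0, 0.0), (1.0, 0.0)],
    "img_b": [(8.0, 8.0), (9.0, 8.0)],
}
ranking = rank_images([(0.5, 0.1), (0.9, 0.0)], database)
```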
Efficient Kernel Density Estimation
- A straightforward way
– Kernel density estimation
– Retrieval: query likelihood
- Two Challenges:
– How to efficiently estimate the density function of each image
– How to avoid linear scan of the database for retrieval
Efficient Kernel Density Estimation
- Weighted mixture model for each image
- When is very large
Efficient Kernel Density Estimation
- The image is eventually represented by
– Estimate by MLE
– Problem
- is high dimensional
- Need to compute for every image
- Use a local kernel
- is a sparse vector with proper
Efficient Kernel Density Estimation
- Updating rule for MLE
– It is globally optimal
- Approximation
– Only update once
– Exact optimal solution when are far apart from each other
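The slide's update formula did not survive extraction. As a stand-in, the sketch below uses the standard EM update for the weights of a mixture with fixed kernels, applied once from a uniform start. The function name and the `kernel_rows` layout are hypothetical. If every keypoint has a non-zero kernel value for only one center, a single step already lands on a fixed point of the update.

```python
def update_weights(kernel_rows, weights):
    """One EM step for mixture weights with fixed kernels.

    kernel_rows[k][i] is the kernel value between keypoint k and center i.
    """
    n_centers = len(weights)
    new_weights = [0.0] * n_centers
    used = 0
    for row in kernel_rows:
        mix = sum(w * v for w, v in zip(weights, row))
        if mix == 0.0:
            continue  # keypoint is far from every center, so it is skipped
        used += 1
        for i in range(n_centers):
            # responsibility of center i for this keypoint
            new_weights[i] += weights[i] * row[i] / mix
    return [w / used for w in new_weights] if used else list(weights)

# Toy kernel values (assumed): three keypoints, three centers.
kernel_rows = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0],  # outlier: zero kernel value for all centers
]
uniform = [1.0 / 3] * 3
alpha = update_weights(kernel_rows, uniform)
```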
Efficient Kernel Density Estimation
- Algorithm to compute of each image
– Range search
– Compute
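A rough sketch of the range-search step, under loud assumptions: the search is brute force (a real system would use a spatial index), the local kernel is flat (1 inside the radius, 0 outside), and the names `range_search` and `sparse_weights` are hypothetical. Only centers within the radius of some keypoint receive weight, so the result is a sparse dictionary, and keypoints with no center in range are dropped.

```python
import math

def range_search(keypoint, centers, radius):
    """Brute-force range search: indices of centers within `radius`."""
    return [i for i, c in enumerate(centers) if math.dist(keypoint, c) <= radius]

def sparse_weights(keypoints, centers, radius):
    """Sparse weight vector {center index: weight} for one image."""
    counts, used = {}, 0
    for x in keypoints:
        near = range_search(x, centers, radius)
        if not near:
            continue  # no center within range: the keypoint is discarded
        used += 1
        for i in near:
            # flat kernel: the keypoint is shared equally among nearby centers
            counts[i] = counts.get(i, 0.0) + 1.0 / len(near)
    return {i: c / used for i, c in counts.items()} if used else {}

# Toy data (assumed): the third keypoint is an outlier.
centers = [(0.0, 0.0), (4.0, 0.0), (20.0, 20.0)]
keypoints = [(0.5, 0.0), (2.0, 0.0), (50.0, 50.0)]
weights = sparse_weights(keypoints, centers, radius=2.5)
```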
Regularization
- MLE of is poor
– with limited number of keypoints
- Regularization
– global
– KL divergence
Regularization
- The solution of becomes:
- Global is:
Global MLE of
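The closed-form solution on these slides is not recoverable from the extracted text. As an assumed stand-in, the sketch below shrinks each image's sparse weights toward a global weight vector using a Dirichlet-style interpolation of the kind used for language-model smoothing. The parameter `lam` and both function names are hypothetical. It also shows why smoothing destroys sparsity: every center with non-zero global weight ends up with non-zero weight in every image.

```python
def global_weights(per_image_weights, n_centers):
    """Average the sparse per-image weight vectors into one dense vector."""
    total = [0.0] * n_centers
    for sparse in per_image_weights:
        for i, w in sparse.items():
            total[i] += w
    s = sum(total)
    return [t / s for t in total]

def smooth(sparse, global_w, n_keypoints, lam):
    """Shrink an image's weights toward the global weights.

    Images with few keypoints are pulled more strongly toward the global vector.
    """
    mix = lam / (n_keypoints + lam)
    return [(1.0 - mix) * sparse.get(i, 0.0) + mix * g
            for i, g in enumerate(global_w)]

# Toy data (assumed): two images over three centers.
images = [{0: 0.75, 1: 0.25}, {1: 0.5, 2: 0.5}]
g = global_weights(images, n_centers=3)
smoothed = smooth(images[0], g, n_keypoints=2, lam=2.0)
```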
Retrieval
- Query Likelihood:
- Problem: is not sparse anymore
- has two parts
Not sparse Sparse
Retrieval
- Decompose the query likelihood
– Sparse part: an inverted index can be utilized to identify a small set of candidate images.
– Constant part: independent of individual images, ignored in image search
Retrieval
- Algorithm: query
– Construct for active centers:
– Construct candidate image set
– Compute query likelihood for every candidate image
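The query procedure can be sketched as follows. The exact decomposition on the slides was lost in extraction, so this assumes smoothed weights of the form (1 - mix) * image weight + mix * global weight. The log-likelihood of each query keypoint then splits into a background term shared by all images, which is dropped, and an image-dependent term that is non-zero only for images sharing an active center with the query. All names and the `mix` parameter are hypothetical.

```python
import math
from collections import defaultdict

def build_inverted_index(image_weights):
    """Map each center id to the images with non-zero sparse weight on it."""
    index = defaultdict(list)
    for image_id, sparse in image_weights.items():
        for center_id, w in sparse.items():
            index[center_id].append((image_id, w))
    return index

def search(query_kernels, index, global_w, mix):
    """Score candidate images using only the image-dependent term.

    query_kernels: one dict per query keypoint, {center_id: kernel value},
    listing only the active (nearby) centers.
    """
    scores = defaultdict(float)
    for active in query_kernels:
        background = mix * sum(global_w[i] * k for i, k in active.items())
        if background == 0.0:
            continue
        per_image = defaultdict(float)
        for i, k in active.items():
            for image_id, w in index.get(i, []):
                per_image[image_id] += (1.0 - mix) * w * k
        for image_id, foreground in per_image.items():
            scores[image_id] += math.log(1.0 + foreground / background)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (assumed): img_c shares no active center with the query.
image_weights = {"img_a": {0: 0.75, 1: 0.25},
                 "img_b": {1: 0.5, 2: 0.5},
                 "img_c": {3: 1.0}}
global_w = [0.25, 0.25, 0.25, 0.25]
index = build_inverted_index(image_weights)
results = search([{0: 1.0}, {0: 0.5, 1: 0.5}], index, global_w, mix=0.5)
```

Images that share no active center with the query never enter the candidate set, so they are never scored.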
Compare to BoW
Our Method | BoW
--- | ---
Unified framework | BoW generation and retrieval are separated
Randomly sample centers (efficient) | Clustering (very slow)
Keypoints are mapped only to nearby centers (consistent mapping) | Distant keypoints may belong to the same cluster (inconsistent mapping)
Keypoints far away from all centers are discarded (robust to outliers) | Every keypoint must be mapped to a center (influenced by outliers)
Experiments
- Three datasets
Dataset | # of images | # of features | Size of DB | # of queries
--- | --- | --- | --- | ---
Tattoo | 101,754 | 10,843,145 | 3.4GB | 995
Oxford5K | 5,062 | 14,972,956 | 4.7GB | 55
Oxford5K + Flickr1M | 1,002,805 | 823,297,045 | 252.7GB | 55
Experiments
- Tattoo dataset
– Provided by Michigan State Police Department
– Images are manually cropped prior to feature extraction
– Examples of near duplicates
Experiments
- Oxford building dataset
– VGG group from Oxford University
– 11 Oxford landmarks
Experiments
- Metrics
– CMC scores for tattoo dataset
- Percentage of queries whose matched images are found in the first k retrieved images
– mAP for the other two datasets
- AP: the area under precision-recall curve
- mAP: mean AP of all the queries
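Both metrics are easy to state in code. The sketch below is an illustration only: AP is computed with the common non-interpolated approximation (mean precision at the ranks of the relevant images), which may differ slightly from the area computation used by the official benchmark scripts, and the function names and toy rankings are hypothetical.

```python
def cmc_at_k(rankings, ground_truth, k):
    """Fraction of queries with at least one true match in the top k."""
    hits = sum(1 for q, ranked in rankings.items()
               if any(img in ground_truth[q] for img in ranked[:k]))
    return hits / len(rankings)

def average_precision(ranked, relevant):
    """Mean of the precision values at ranks where relevant images appear."""
    hits, total = 0, 0.0
    for rank, img in enumerate(ranked, start=1):
        if img in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, ground_truth):
    """Mean AP over all queries."""
    aps = [average_precision(r, ground_truth[q]) for q, r in rankings.items()]
    return sum(aps) / len(aps)

# Toy rankings (assumed): two queries over four gallery images.
rankings = {"q1": ["a", "b", "c", "d"], "q2": ["c", "a", "d", "b"]}
ground_truth = {"q1": {"a", "c"}, "q2": {"b"}}
cmc1 = cmc_at_k(rankings, ground_truth, 1)
m_ap = mean_average_precision(rankings, ground_truth)
```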
Results
- Retrieval accuracy

Method | Tattoo (CMC rank 10) | Oxford5K | Oxford5K + Flickr1M
--- | --- | --- | ---
Our method | 0.82 | 0.61 | 0.45
BoW (AKM) + TF-IDF | 0.78 | 0.57 | 0.39

- Speed
– Similar retrieval speed as BoW+TF-IDF
– Tattoo: 0.01s/query
– Oxford5K + Flickr1M: 0.9s/query
Experiments on Tattoo Dataset
- Number of random centers:
Experiments on Tattoo Dataset
- Radius for range search:
– : average pairwise distance
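The average pairwise distance can be estimated from a sample of descriptors. The sketch below assumes the radius is set to some multiple of that average; the multiplier `factor`, the sampling scheme, and the function names are hypothetical, since the slide's formula was lost.

```python
import math
import random
from itertools import combinations

def average_pairwise_distance(points):
    """Mean Euclidean distance over all unordered pairs of points."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

def radius_from_sample(points, factor, sample_size=1000, seed=0):
    """Radius as `factor` times the average pairwise distance of a sample."""
    rng = random.Random(seed)
    sample = points if len(points) <= sample_size else rng.sample(points, sample_size)
    return factor * average_pairwise_distance(sample)

# Toy descriptors (assumed): pairwise distances are 5, 10 and 5.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
avg = average_pairwise_distance(points)
radius = radius_from_sample(points, factor=0.5)
```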
Experiments on Tattoo Dataset
- Not sensitive to
Examples on Tattoo Dataset
- Successful examples: top10 retrievals
Query
Examples on Tattoo Dataset
- Failed cases: top10 retrievals
Figure: queries and their true matches, which were retrieved at rank 45, rank 7512, rank 12785, and rank 7420
Examples on Oxford Dataset
Query
Other Applications
- Graffiti
– Graffiti is common in almost all big cities all over the world
– Even around CMU
Graffiti Around CMU
Graffiti
- Graffiti is a big problem for many cities.
– Broken Window Theory
– City of Riverside in California spends more than $1 million each year to remove graffiti
– Identify moniker
- Graffiti also plays an important role in gang culture
Figure: Graffiti of the Six Duce East Coast Crips, founded in Los Angeles in 1971, which is one of the largest and most violent street gangs in the United States, with an estimated membership of 30,000. Notice the use of the basic lettering style. The spelling of six is done with a “c” to reinforce the Crip identity. The arrow is used among African-American gangs to express their territory.
Graffiti Retrieval
- More challenging than tattoo retrieval
Trademark Retrieval
Summary
- Tattoos are a soft biometric trait
– Identify victims, criminals
– An application of CBIR
- BoW for CBIR
– Inspired by text retrieval
– Separate steps:
- Clustering for generating BoW representation
- Standard text retrieval
Summary
- Our integrated method:
– Image -> underlying distribution
– Keypoints -> sampled from the distribution
– Distribution -> kernel density estimation
- Weighted mixture model
- Local kernel for
– Sparse representation
– Efficiency
– Retrieval: query likelihood
Summary
- Contributions:
– Unified framework
– Efficient algorithms
- Density estimation
- Retrieval
- Impact
– Not limited to retrieval
- Classification: represent an image by
– Applicable to other domains
- Audio, e.g., MFCC
- Text: long query problem
Thanks!
Efficient Kernel Density Estimation
- Solve MLE by Bound optimization
- Start with initial guess $\{\theta_1^0, \theta_2^0\}$
- Come up with a lower bound: $l(\theta_1, \theta_2) \geq l(\theta_1^0, \theta_2^0) + Q(\theta_1, \theta_2)$
– $Q(\theta_1, \theta_2)$ is a concave function
– Touch point: $Q(\theta_1 = \theta_1^0, \theta_2 = \theta_2^0) = 0$
- Search the optimal solution $\{\theta_1^1, \theta_2^1\}$ that maximizes $Q(\theta_1, \theta_2)$
- Repeat the procedure: $\{\theta_1^2, \theta_2^2\}, \ldots$
- Converge to local optimal
Figure: curve of $l(\theta_1, \theta_2)$ showing the touch point, the successive lower bounds $Q(\theta_1, \theta_2)$, and the optimal point
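As a worked instance of the bound-optimization recipe, consider the weights of a mixture with fixed kernels. The symbols $\alpha_i$ (weight of center $c_i$), $\kappa$ (kernel), $x_k$ (keypoints, $k = 1, \dots, n$) and $r_{ki}$ are assumed notation, since the slides' own notation was lost in extraction. The lower bound follows from Jensen's inequality, touches the log-likelihood at the current estimate $\alpha^t$, and is maximized in closed form.

```latex
\begin{aligned}
l(\alpha) &= \sum_{k=1}^{n} \log \sum_{i} \alpha_i \,\kappa(x_k, c_i),
\qquad
r_{ki} = \frac{\alpha_i^{t}\,\kappa(x_k, c_i)}{\sum_{j} \alpha_j^{t}\,\kappa(x_k, c_j)} \\
l(\alpha) &\geq l(\alpha^{t}) + Q(\alpha),
\qquad
Q(\alpha) = \sum_{k=1}^{n} \sum_{i} r_{ki} \log \frac{\alpha_i}{\alpha_i^{t}},
\qquad
Q(\alpha^{t}) = 0 \\
\alpha_i^{t+1} &= \arg\max_{\alpha:\ \sum_i \alpha_i = 1} Q(\alpha)
= \frac{1}{n} \sum_{k=1}^{n} r_{ki}
\end{aligned}
```

$Q$ is concave in $\alpha$, so each step has a unique maximizer, and repeating the step never decreases $l(\alpha)$.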
Our Method
- Notations: