In Search of Art
Elliot J. Crowley and Andrew Zisserman
Visual Geometry Group, Department of Engineering Science, University of Oxford
The Goal
- An on-the-fly system for searching paintings visually
- A user can type in the name of any category...
- Then hundreds of paintings containing that category will be retrieved in a matter of seconds
dog
Benefits
- In many instances, the retrieved paintings will not have been known to contain the category
- This means they are new discoveries for the Art History community
dog
Why is this good?
- Art historians can discover when something first appeared in paintings
- They can also observe how things have changed over time
How is this achieved?
- Natural images annotated with object categories are everywhere
- These can be used to learn object classifiers
Google images of dog
Dataset of Paintings
- We use 'Your Paintings' as the dataset
- 'Your Paintings' consists of over 210,000 paintings from UK galleries: http://www.bbc.co.uk/arts/yourpaintings/
- The method is independent of the dataset, however
- Other datasets can be used, e.g. Rijksmuseum or PrintART
Outline
- Methodology
- Quantitative Evaluation
- Aligning retrieved objects
What do we do?
- We crawl Google Images for a given category and learn a CNN-based classifier
- This classifier is applied to a dataset of paintings, retrieving paintings containing the category
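The two-step recipe above can be sketched end-to-end on synthetic data. Everything below is a stand-in: random vectors play the role of CNN features, and a least-squares linear model replaces the SVM described later, purely to keep the sketch dependency-free.

```python
import numpy as np

# Synthetic stand-ins (all data is random, purely for illustration):
# CNN features of the top Google Image hits (positives), pre-computed
# negatives, and pre-computed features for every painting.
rng = np.random.default_rng(0)
positives = rng.normal(+1.0, 1.0, size=(200, 128))
negatives = rng.normal(-1.0, 1.0, size=(1000, 128))
paintings = rng.normal(0.0, 1.5, size=(5000, 128))

# Step 1: learn a classifier on positives vs. negatives. The paper uses
# an SVM; a least-squares linear model is used here as a simple stand-in.
X = np.vstack([positives, negatives])
y = np.concatenate([np.ones(len(positives)), -np.ones(len(negatives))])
Xb = np.c_[X, np.ones(len(X))]              # append a bias column
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

# Step 2: score every painting and keep the top-ranked ones.
scores = np.c_[paintings, np.ones(len(paintings))] @ w
top_100 = np.argsort(-scores)[:100]
```

In the real system the painting features are pre-computed, so step 2 reduces to one matrix-vector product followed by a sort.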
The Architecture
How do we do this quickly?
- The bulk of the data has been pre-processed offline (negative training data, dataset of paintings)
- Online processing of Google Images is done in parallel across multiple cores
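A minimal sketch of the parallel online stage. The `cnn_feature` function below is a toy stand-in (the real system runs a CNN forward pass per downloaded image), and a thread pool replaces the paper's per-core parallelism so the sketch stays self-contained.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(0)

def cnn_feature(image):
    # Stand-in for the real CNN forward pass: the mean colour of the
    # image, purely so the sketch runs without a deep-learning library.
    return image.mean(axis=(0, 1))

# 200 downloaded "images" (random arrays standing in for Google hits).
images = [rng.random((32, 32, 3)) for _ in range(200)]

# The online stage computes features in parallel across workers.
with ThreadPoolExecutor(max_workers=8) as pool:
    features = np.stack(list(pool.map(cnn_feature, images)))
```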
In more detail…
- For a given query, the top 200 Google Image hits are downloaded
- For each of these, a CNN feature is computed online
- This is the positive training data
Negative Training Data
- Offline, images are downloaded for Google searches of 'things' and 'photos'
- The features for these are pre-computed
Classification
- A Support Vector Machine is used to learn a classifier that discriminates the positive training data from the negative data
beard vs. not beard
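A minimal sketch of this classification step, using a Pegasos-style linear SVM trained by SGD on synthetic clusters. This is our own toy solver, not the authors' implementation (they presumably use an off-the-shelf SVM package).

```python
import numpy as np

def train_linear_svm(X, y, lam=1e-2, epochs=20, seed=0):
    """Minimal Pegasos-style linear SVM (hinge loss, SGD).

    X: (n, d) features; y: labels in {-1, +1}. A sketch only.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b, t = np.zeros(d), 0.0, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)            # Pegasos step size
            violated = y[i] * (X[i] @ w + b) < 1
            w *= (1.0 - eta * lam)           # regularisation shrinkage
            if violated:                     # hinge-loss subgradient step
                w += eta * y[i] * X[i]
                b += eta * y[i]
    return w, b

# Positives: features of the Google Image hits for the query; negatives:
# pre-computed 'things'/'photos' features (both synthetic here).
rng = np.random.default_rng(1)
pos = rng.normal(+1.0, 1.0, size=(200, 64))
neg = rng.normal(-1.0, 1.0, size=(1000, 64))
X = np.vstack([pos, neg])
y = np.concatenate([np.ones(200), -np.ones(1000)])
w, b = train_linear_svm(X, y)
```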
Retrieval
- The classifier is applied to the pre-processed features of 'Your Paintings'
- Each painting is given a score by the classifier
- The paintings are displayed in order of score
beard
The Architecture - Timings
(per-stage timings: 0.5s, 4.5s, <0.5s, <0.5s, 2s)
Example Queries
- bridge
- carriage
- flower
- house
Outline
- Methodology
- Quantitative Evaluation
- Aligning retrieved objects
Quantitative Evaluation
- Evaluating the domain transfer problem of learning classifiers on natural images and applying these to paintings
Test Set
- For this, an annotated dataset of paintings is required
- 10,000 paintings in 'Your Paintings' have been tagged by the public
- These tags + painting titles are used to form the 'Paintings Dataset', with annotations corresponding to classes of PASCAL VOC
The Paintings Dataset
Class         Paintings with Class
Aeroplane      200
Bird           805
Boat          2143
Chair         1202
Cow            625
Dining-table  1201
Dog           1145
Horse         1493
Sheep          751
Train          329
- Assume complete annotation in the PASCAL sense
- Assess by calculating Average Precision (AP) per class
(example classes: train, dog, horse)
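The AP calculation on a ranked retrieval list can be sketched as follows. This is the standard non-interpolated AP; the exact PASCAL variant may use interpolation, so treat this as illustrative.

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at each positive in the list
    ranked by descending classifier score."""
    order = np.argsort(-scores)
    hits = np.asarray(labels, dtype=bool)[order]
    ranks = np.flatnonzero(hits) + 1               # 1-based ranks of positives
    precision_at_hit = np.cumsum(hits)[hits] / ranks
    return float(precision_at_hit.mean())

# Toy ranked list: positives retrieved at ranks 1 and 3.
ap = average_precision(np.array([0.9, 0.8, 0.7, 0.6]),
                       np.array([1, 0, 1, 0]))
# ap = (1/1 + 2/3) / 2 = 5/6
```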
Training Datasets
- 4 Datasets of natural images are used for training
- VOC12, VOC12+, Net Noisy, Net Curated
Experiments
Features compared:
- Shallow features: Fisher Vectors
vs.
- Deep features: Convolutional Neural Networks (CNNs)
Experiments - Features
- Fisher Vectors vs. CNN features
- CNNs outperform Fisher Vectors
- Added advantage of being lower-dimensional
Augmentation
- No augmentation
- C+F augmentation
(224×224 windows taken from a 256×256 image)
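Assuming 'C+F' denotes the common crop-and-flip recipe (four corner crops plus a centre crop of a 256×256 image, each with its horizontal mirror, giving 10 windows), the augmentation can be sketched as below. The slide does not spell out the exact windows used, so this is an assumption.

```python
import numpy as np

def crop_flip_windows(img, crop=224):
    """Crop+flip ('C+F') augmentation sketch: 4 corner crops + centre
    crop, each with its horizontal mirror (10 windows total)."""
    h, w = img.shape[:2]
    c = crop
    corners = [(0, 0), (0, w - c), (h - c, 0), (h - c, w - c),
               ((h - c) // 2, (w - c) // 2)]       # last entry = centre
    crops = [img[t:t + c, l:l + c] for t, l in corners]
    flips = [win[:, ::-1] for win in crops]        # horizontal mirrors
    return crops + flips

img = np.arange(256 * 256 * 3).reshape(256, 256, 3)
windows = crop_flip_windows(img)
```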
Experiments - Augmentation
- Sum Pool: classifier applied to the mean of the augmented windows
- Max Pool: classifier applied to each augmented window and the maximum score recorded
- Best performance is aug + sum pool, but almost as good with no aug + sum pool
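For a linear classifier, sum pooling (classify the mean feature) is equivalent to averaging the per-window scores, while max pooling keeps the best-scoring window. A numpy sketch with random stand-in features and a hypothetical classifier:

```python
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.standard_normal(16), 0.1        # a hypothetical linear classifier
feats = rng.standard_normal((10, 16))      # features of 10 augmented windows

# Sum pool: average the window features, then classify once.
sum_pool_score = feats.mean(axis=0) @ w + b

# Max pool: classify every window and keep the maximum score.
window_scores = feats @ w + b
max_pool_score = window_scores.max()
```

The sum-pool equivalence is why a single pooled feature per painting can be pre-computed offline and scored with one dot product at query time.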
Experiments - Dimensionality
- 1K performs best
- Not that different from the others, however
Experiment Conclusions
- For the on-the-fly system, 1K CNN features are used as these performed the best
- Sum-pooled features are used for 'Your Paintings', as time is not a factor in computing these offline
- No augmentation is used on the images downloaded from Google (0.3s per image per core vs. 2.4s)
Outline
- Methodology
- Quantitative Evaluation
- Aligning retrieved objects
Alignment
- Some objects are automatically aligned…
moustache
The Pencil Moustache
Anonymous Trendsetter, 1565
Copycats, Now
Alignment
- Other objects require some work…
train
Solution
Learn a DPM [1] on either:
- 1. annotated bounding boxes (e.g. PASCAL VOC), or
- 2. the downloaded Google Images
[1] P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan. Object Detection with Discriminatively Trained Part-Based Models. IEEE PAMI, 2010
Auto-alignment
train
Auto-alignment
horse
Conclusion
- We provide a system that can find objects in paintings with high precision in very little time
- The objects found can be further curated using a DPM
Links
- VISOR: Visual Search of BBC News [1]
http://www.robots.ox.ac.uk/~vgg/research/on-the-fly/
- CNN code [2]
http://www.robots.ox.ac.uk/~vgg/research/deep_eval/
- Our system
COMING SHORTLY!
[1] K. Chatfield and A. Zisserman. VISOR: Towards On-the-Fly Large-Scale Object Category Retrieval. ACCV, 2012
[2] K. Chatfield, K. Simonyan, A. Vedaldi and A. Zisserman. Return of the Devil in the Details: Delving Deep into Convolutional Nets. BMVC, 2014
Thank you
- Any questions?
- Or email elliot@robots.ox.ac.uk