In Search of Art - Elliot J. Crowley and Andrew Zisserman (PowerPoint presentation)


SLIDE 1

In Search of Art

Elliot J. Crowley and Andrew Zisserman Visual Geometry Group Department of Engineering Science University of Oxford

SLIDE 2

The Goal

  • An on-the-fly system for searching paintings visually
  • A user can type in the name of any category...
  • Then hundreds of paintings containing that category will be retrieved in a matter of seconds

[Image: dog]

SLIDE 3

Benefits

  • In many instances, the retrieved paintings will not have been known to contain the category
  • Meaning these are new discoveries for the Art History community

[Image: dog]

SLIDE 4

Why is this good?

  • Art historians can discover when something first appeared in paintings
  • They can also observe how things have changed over time

SLIDE 5

How is this achieved?

  • Natural images annotated with object categories are everywhere.

  • These can be used to learn object classifiers.

[Image: Google Images results for 'dog']

SLIDE 6

Dataset of Paintings

  • We use 'Your Paintings' as the dataset
  • 'Your Paintings' consists of over 210,000 paintings from UK galleries: http://www.bbc.co.uk/arts/yourpaintings/
  • The method is independent of the dataset, however
  • Other datasets can be used, e.g. Rijksmuseum or PrintART

SLIDE 7

Outline

  • Methodology
  • Quantitative Evaluation
  • Aligning retrieved objects
SLIDE 8

What do we do?

  • We crawl Google Images for a given category and learn a CNN-based classifier
  • This classifier is applied to a dataset of paintings, retrieving paintings containing the category
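The two steps above can be sketched end to end. This is a minimal illustration only: random vectors stand in for CNN features, and a simple mean-difference linear scorer stands in for the classifier described on later slides; all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def google_image_features(category, n=200, dim=128):
    """Stand-in for crawling Google Images and computing CNN features online."""
    return rng.normal(loc=1.0, size=(n, dim))

def train_classifier(pos, neg):
    """Stand-in classifier: score along the mean-difference direction."""
    w = pos.mean(axis=0) - neg.mean(axis=0)
    return lambda X: X @ w

def search_paintings(category, painting_features, negative_features):
    """Crawl the web for `category`, learn a classifier, rank the paintings."""
    pos = google_image_features(category)
    score = train_classifier(pos, negative_features)
    s = score(painting_features)
    return np.argsort(-s), s            # painting indices, best first

negatives = rng.normal(loc=-1.0, size=(500, 128))  # pre-computed offline
paintings = rng.normal(size=(1000, 128))           # pre-computed offline
order, scores = search_paintings("dog", paintings, negatives)
```

The key design point is that everything query-independent (negative features, painting features) is computed once offline; only the positive features and the classifier are computed at query time.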

SLIDE 9

The Architecture

SLIDE 10

How do we do this quickly?

  • The bulk of the data has been pre-processed offline (negative training data, dataset of paintings)

  • Online processing of Google Images is done in parallel across multiple cores
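A minimal sketch of the online stage, assuming the per-image work (download plus feature computation) is wrapped in one function; threads are used here for brevity, whereas the real system distributes work across CPU cores, and `process_image` is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def process_image(url):
    """Stand-in for downloading one Google Image hit and computing its CNN feature."""
    rng = np.random.default_rng(abs(hash(url)) % (2**32))
    return rng.normal(size=128)         # placeholder 128-D feature

def process_all(urls, workers=8):
    """Process the downloaded hits in parallel; results stay in `urls` order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return np.stack(list(pool.map(process_image, urls)))

features = process_all([f"http://example.com/img{i}.jpg" for i in range(200)])
```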

SLIDE 11

In more detail…

  • For a given query, the top 200 Google Image hits are downloaded
  • For each of these, a CNN feature is computed online
  • This is the positive training data

SLIDE 12

Negative Training Data

  • Offline, images are downloaded for Google searches of 'things' and 'photos'
  • The features for these are pre-computed

SLIDE 13

Classification

  • A Support Vector Machine is used to learn a classifier that discriminates the positive training data from the negative data

[Images: beard vs. not beard]
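A compact sketch of this step, training a linear SVM with hinge-loss subgradient descent (Pegasos-style) on synthetic stand-in features; a real system would more likely call an off-the-shelf solver such as liblinear, and the data here is illustrative.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style SGD for a linear SVM; labels y are in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)            # decaying step size
            margin = y[i] * (X[i] @ w + b)
            w *= (1 - eta * lam)             # regularisation shrinkage
            if margin < 1:                   # hinge-loss violation
                w += eta * y[i] * X[i]
                b += eta * y[i]
    return w, b

rng = np.random.default_rng(1)
pos = rng.normal(loc=+1.0, size=(200, 64))   # features of Google Image hits
neg = rng.normal(loc=-1.0, size=(400, 64))   # pre-computed negative features
X = np.vstack([pos, neg])
y = np.r_[np.ones(200), -np.ones(400)]
w, b = train_linear_svm(X, y)
acc = np.mean(np.sign(X @ w + b) == y)       # training accuracy
```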

SLIDE 14

Retrieval

  • The classifier is applied to the pre-processed features of 'Your Paintings'
  • Each painting is given a score by the classifier
SLIDE 15

Retrieval

  • The paintings are displayed in order of score.

[Image: beard]

SLIDE 16

The Architecture - Timings

[Architecture diagram with per-stage timings: 0.5s, 4.5s, <0.5s, <0.5s, 2s]

SLIDE 17

Example Queries

[Image: bridge]

SLIDE 18

Example Queries

[Image: carriage]

SLIDE 19

Example Queries

[Image: flower]

SLIDE 20

Example Queries

[Image: house]

SLIDE 21

Outline

  • Methodology
  • Quantitative Evaluation
  • Aligning retrieved objects
SLIDE 22

Quantitative Evaluation

  • Evaluating the domain transfer problem of learning classifiers on natural images and applying these to paintings

SLIDE 23

Test Set

  • For this, an annotated dataset of paintings is required
  • 10,000 paintings in 'Your Paintings' have been tagged by the public
  • These tags + painting titles are used to form the 'Paintings Dataset', with annotations corresponding to classes of PASCAL VOC

SLIDE 24

The Paintings Dataset

  Class          Paintings with Class
  Aeroplane       200
  Bird            805
  Boat           2143
  Chair          1202
  Cow             625
  Dining-table   1201
  Dog            1145
  Horse          1493
  Sheep           751
  Train           329

  • Assume complete annotation in the PASCAL sense
  • Assess by calculating APs

[Images: train, dog, horse]
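Average precision over a ranked list can be computed as below; a small sketch with toy data, not the exact VOC protocol (which interpolates precision).

```python
import numpy as np

def average_precision(scores, labels):
    """AP of a ranking: mean of precision@k taken at each positive item."""
    order = np.argsort(-np.asarray(scores))          # rank by descending score
    hits = np.asarray(labels)[order].astype(float)   # 1 where a positive sits
    precision_at_k = np.cumsum(hits) / (np.arange(len(hits)) + 1)
    return float((precision_at_k * hits).sum() / hits.sum())

# Toy ranking: 1 = painting truly contains the class
ap = average_precision([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```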

SLIDE 25

Training Datasets

  • 4 Datasets of natural images are used for training
  • VOC12, VOC12+, Net Noisy, Net Curated
SLIDE 26

Experiments

Features compared:

  • Shallow features: Fisher Vectors
  • Deep features: Convolutional Neural Networks (CNNs)

SLIDE 27

Experiments - Features

  • Fisher Vectors vs. CNN features
  • CNNs outperform Fisher Vectors
  • Added advantage of being lower dimensionality

SLIDE 28

Augmentation

  • No augmentation
  • C+F (crop + flip) augmentation

[Diagram: 224×224 crops sampled from a 256-pixel image]
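The crops can be generated as below; a sketch assuming 256×256 inputs and the standard four-corner plus centre crops, each also horizontally flipped, giving 10 windows.

```python
import numpy as np

def crop_flip_windows(image, crop=224):
    """Four corner crops + centre crop of a square image, plus horizontal flips."""
    H, W = image.shape[:2]
    c = ((H - crop) // 2, (W - crop) // 2)
    corners = [(0, 0), (0, W - crop), (H - crop, 0), (H - crop, W - crop), c]
    windows = [image[y:y + crop, x:x + crop] for y, x in corners]
    windows += [w[:, ::-1] for w in windows]      # horizontal flips
    return np.stack(windows)

windows = crop_flip_windows(np.zeros((256, 256, 3)))  # 10 windows of 224x224
```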

SLIDE 29

Experiments - Augmentation

  • Sum Pool: classifier applied to the mean of the augmented windows
  • Max Pool: classifier applied to each augmented window and the maximum score recorded
  • Best performance is aug + sum pool, but almost as good with no aug + sum pool
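Given per-window features, the two pooling schemes differ only in where the aggregation happens; a minimal sketch with a linear classifier and random stand-in data.

```python
import numpy as np

def sum_pool_score(w, window_features):
    """Classifier applied to the mean of the augmented windows."""
    return float(window_features.mean(axis=0) @ w)

def max_pool_score(w, window_features):
    """Classifier applied to each window; maximum score kept."""
    return float((window_features @ w).max())

rng = np.random.default_rng(0)
w = rng.normal(size=128)                 # linear classifier weights
feats = rng.normal(size=(10, 128))       # features of the 10 augmented windows
s_sum, s_max = sum_pool_score(w, feats), max_pool_score(w, feats)
```

Note that for a linear classifier, sum-pooling the features is equivalent to averaging the per-window scores, which is why the pooled feature can be pre-computed once per painting.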

SLIDE 30

Experiments - Dimensionality

  • 1K performs best
  • Not that different from the others, however
SLIDE 31

Experiment Conclusions

  • For the on-the-fly system, 1K CNN features are used as these performed the best
  • Sum-pooled features are used for 'Your Paintings', as time is not a factor in computing these
  • No augmentation is used on the images downloaded from Google (0.3s per image per core vs. 2.4s)

SLIDE 32

Outline

  • Methodology
  • Quantitative Evaluation
  • Aligning retrieved objects
SLIDE 33

Alignment

  • Some objects are automatically aligned…

[Image: moustache]

SLIDE 34

The Pencil Moustache

Anonymous Trendsetter, 1565

Copycats, Now

SLIDE 35

Alignment

  • Other objects require some work…

[Image: train]

SLIDE 36

Solution

Learn a DPM [1] on either:

  1. annotated bounding boxes (e.g. PASCAL VOC), or
  2. the downloaded Google Images

[1] P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan. Object Detection with Discriminatively Trained Part-Based Models. IEEE TPAMI, 2010.
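Once a detector such as a DPM returns bounding boxes, alignment amounts to cropping each detection and rescaling it to a canonical frame. A sketch with nearest-neighbour resampling; the box format (x0, y0, x1, y1) and output size are illustrative assumptions.

```python
import numpy as np

def align_detection(image, box, out=64):
    """Crop `box` = (x0, y0, x1, y1) and resize to out x out (nearest neighbour)."""
    x0, y0, x1, y1 = box
    patch = image[y0:y1, x0:x1]
    ys = np.linspace(0, patch.shape[0] - 1, out).round().astype(int)
    xs = np.linspace(0, patch.shape[1] - 1, out).round().astype(int)
    return patch[np.ix_(ys, xs)]

aligned = align_detection(np.zeros((300, 400)), (50, 40, 250, 200))
```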

SLIDE 37

Auto-alignment

[Image: train]

SLIDE 38

Auto-alignment

[Image: horse]

SLIDE 39

Conclusion

  • We provide a system that can find objects in paintings with high precision in very little time
  • The objects found can be further curated using a DPM

SLIDE 40

Links

  • VISOR: Visual Search of BBC News [1]: http://www.robots.ox.ac.uk/~vgg/research/on-the-fly/
  • CNN code [2]: http://www.robots.ox.ac.uk/~vgg/research/deep_eval/
  • Our system: COMING SHORTLY!

[1] K. Chatfield, A. Zisserman. VISOR: Towards On-the-Fly Large-Scale Object Category Retrieval. ACCV, 2012.
[2] K. Chatfield, K. Simonyan, A. Vedaldi, A. Zisserman. Return of the Devil in the Details: Delving Deep into Convolutional Nets. BMVC, 2014.

SLIDE 41

Thank you

  • Any questions?
  • Or email elliot@robots.ox.ac.uk