Learning Deep Features for Scene Recognition using Places Database
Learning Deep Features for Scene Recognition using Places Database. Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, Aude Oliva. NIPS 2014. Bora Çelikkale
INTRODUCTION: Human Visual Recognition. Samples the world several times per second; ~millions of images within a year.
INTRODUCTION: Primate Brain. Hierarchical organization in layers of increasing processing complexity; inspired CNNs.
PROBLEM & MOTIVATION: Object classification has obtained astonishing performance with large databases (ImageNet). Iconic images do not contain the richness and diversity of visual information found in scenes.
CONTRIBUTIONS: Scene-centric database 60x larger than SUN. Comparison metrics for scene datasets: density and diversity.
SCENE DATASETS
Scene15 (Lazebnik et al. 2006): 15 categories, ~3000 imgs
MIT Indoor67 (Quattoni & Torralba 2009): 67 categories of indoor places, 15,620 imgs
SUN (Xiao et al. 2010): 397 (well-sampled) categories, 130,519 imgs
Places (Zhou et al. 2014): 476 categories, 7,076,580 imgs
PLACES DATASET 1: Image sources: Google Images, Bing Images, Flickr. Query terms: the same categories as SUN, combined with 696 popular adjectives in English. >40M imgs downloaded.
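The query construction on this slide can be sketched as follows. This is only an illustration, assuming each query is an adjective followed by a category name; the function name `build_queries` and the sample adjectives and categories are hypothetical placeholders, not the lists used for Places.

```python
from itertools import product


def build_queries(categories, adjectives):
    """Return one 'adjective category' search query per combination."""
    return [f"{adj} {cat}" for adj, cat in product(adjectives, categories)]


# Hypothetical sample inputs; the slide mentions 696 adjectives and the SUN categories.
categories = ["living room", "kitchen"]
adjectives = ["messy", "spare", "sunny"]

queries = build_queries(categories, adjectives)
print(len(queries))   # number of adjectives times number of categories
print(queries[:2])
```

With the full lists, the number of queries grows as the product of the two list sizes, which is one plausible reason the download reaches tens of millions of images.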
PLACES DATASET 2: PCA-based duplicate removal across SUN. Places & SUN therefore contain different images, which allows combining Places & SUN.
PLACES DATASET 3: Annotations (with AMT). Workers answer yes/no questions (e.g.: is this a living room?). Two-round setup: 1. default answer is NO; 2. default answer is YES. Imgs shown per round: 750 + 60 from SUN for control. Only rounds with >90% accuracy on the control images are taken.
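The control-image check can be sketched as below. This assumes a round is stored as a mapping from image id to the worker's yes/no answer and that the correct answers for the SUN control images are known; the names `control_accuracy` and `keep_round` and the data layout are hypothetical.

```python
def control_accuracy(answers, control_truth):
    """Fraction of control images that the worker answered correctly."""
    correct = sum(
        1 for img, truth in control_truth.items() if answers.get(img) == truth
    )
    return correct / len(control_truth)


def keep_round(answers, control_truth, threshold=0.9):
    """Keep a round only if control accuracy exceeds the threshold (>90% on the slide)."""
    return control_accuracy(answers, control_truth) > threshold


# Hypothetical example with 10 control images instead of the 60 used per round.
control_truth = {f"sun_{i}": True for i in range(10)}
careful_worker = dict(control_truth)
careless_worker = {**control_truth, "sun_0": False, "sun_1": False}

print(keep_round(careful_worker, control_truth))
print(keep_round(careless_worker, control_truth))
```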
COMPARISON METRICS: Relative Density. In a denser dataset, images have more similar nearest neighbors. Figure labels: NN of a1, NN of b1.
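The slide's formula did not survive, so the following is only a sketch of one reading of the idea: dataset A is denser than B if a random image a1 from A tends to be closer to its nearest neighbor than a random image b1 from B is to its own. Euclidean distance on feature vectors is an assumption standing in for a similarity judgment, and the names `nn_distance` and `relative_density` are hypothetical.

```python
import math
import random


def nn_distance(query_index, points, k=1):
    """Distance from points[query_index] to its k-th nearest neighbor in the same set."""
    q = points[query_index]
    dists = sorted(math.dist(q, p) for i, p in enumerate(points) if i != query_index)
    return dists[k - 1]


def relative_density(a, b, trials=1000, k=1, seed=0):
    """Estimate how often a random a1 is closer to its NN than a random b1 is to its NN."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        da = nn_distance(rng.randrange(len(a)), a, k)
        db = nn_distance(rng.randrange(len(b)), b, k)
        wins += da < db
    return wins / trials


# Toy data: tightly spaced points versus widely spaced points.
dense = [(i * 0.1, 0.0) for i in range(10)]
sparse = [(float(i), 0.0) for i in range(10)]

print(relative_density(dense, sparse))
print(relative_density(sparse, dense))
```

Raising `k` makes the comparison use a farther neighbor, which is one way to avoid near-duplicate images dominating the estimate.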
COMPARISON METRICS: Relative Diversity. Simpson index: the probability that two random individuals belong to the same species. Figure labels: NN of a1, NN of b1.
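As a rough illustration, the Simpson index can be computed from species labels, and the same idea can be carried over to images by comparing random pairs from two datasets. The second function is an assumption about how the slide's relative diversity might be estimated (A is more diverse than B if random pairs from A are less often the closer pair); Euclidean distance and the names `simpson_index` and `relative_diversity` are hypothetical choices.

```python
import math
import random
from collections import Counter


def simpson_index(labels):
    """Probability that two individuals drawn with replacement share a species."""
    counts = Counter(labels)
    total = len(labels)
    return sum((n / total) ** 2 for n in counts.values())


def relative_diversity(a, b, trials=1000, seed=0):
    """One minus the estimated chance that a random pair from a is closer than one from b."""
    rng = random.Random(seed)
    closer = 0
    for _ in range(trials):
        a1, a2 = rng.sample(a, 2)
        b1, b2 = rng.sample(b, 2)
        closer += math.dist(a1, a2) < math.dist(b1, b2)
    return 1 - closer / trials


# Toy data: widely spread points versus tightly clustered points.
spread = [(float(i), 0.0) for i in range(10)]
tight = [(i * 0.1, 0.0) for i in range(10)]

print(simpson_index(["oak", "oak", "pine", "pine"]))
print(relative_diversity(spread, tight))
```

A low Simpson index means high diversity, which is why the image version subtracts the probability from one.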
EXPERIMENTS 1: Density & Diversity Comparison (AMT). Relative diversity vs. relative density for each category and dataset. Workers are shown 12 pairs of images and select the most similar pair. Diversity: pairs are chosen at random from each dataset. Density: each image is paired with its 5th NN under GIST (to avoid near duplicates).
EXPERIMENTS 2: Cross-Dataset Generalization. Training and testing across different datasets, with ImageNet-CNN features and a linear SVM.
EXPERIMENTS 3: Comparison with Hand-designed Features.
EXPERIMENTS 4: Training CNN for Scene Recognition. 2.5M imgs from 205 categories, on AlexNet.
PLACES-CNNs
Hybrid-AlexNet (Places + ImageNet; 3.5M imgs, 1183 categories): accuracy = 0.5230 on validation set
Places205-GoogLeNet (205 categories): top1 = 0.5567, top5 = 0.8541 on validation set
Places205-VGG16 (205 categories): top1 = 0.5890, top5 = 0.8770 on validation set
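For readers unfamiliar with the top1 and top5 figures, the sketch below shows how such accuracies are typically computed from per-class scores. The function name `top_k_accuracy` and the toy scores are illustrative only and are not taken from the Places evaluation code.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


# Toy example: 3 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 0]

print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))
```

Top5 is always at least as high as top1, since a prediction that is correct at rank 1 is also within the top 5.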
PLACES2 DATASET: 400+ unique scene categories; >10M images. AlexNet top1 accuracy: 43.0%; VGG16 top1 accuracy: 47.6%.
DEMO http://places.csail.mit.edu/demo.html http://places2.csail.mit.edu/demo.html
THANK YOU