SLIDE 1

Energy-Aware Recognition for Man-Made Structures and other research projects at the American University of Beirut

Mariette Awad, Assistant Professor, Electrical and Computer Engineering Department, American University of Beirut, Lebanon

SLIDE 2

Outline

  • Ongoing Projects
  • uSee for Man-Made Structures
  • Biologically Inspired Deep Visual Networks
SLIDE 3

Lebanon

SLIDE 4

American University of Beirut

SLIDE 5

AUB General Info

  • AUB founded in 1866
  • Since 2004, accredited by the Commission on Higher Education of the Middle States Association of Colleges and Schools in the US
  • 120 programs: Bachelor, Masters, and PhD degrees
  • 6 faculties: Agriculture and Food Sciences, Arts and Sciences, Engineering and Architecture, Health Sciences, Medicine, Business
  • Faculty of Engineering since 1944
SLIDE 6

Mood-Based Internet Radio-Tuner App

Undergraduate Students: David Matchoulian, Yara Rizk, Maya Safieddine

Emotion Detector → Song Fetcher & Classifier → Song Selector
SLIDE 7

MusAR for an Augmented Museum Experience

Undergraduate Students: Anis Halabi, Giorgio Saad, Piotr Yordanov
SLIDE 8
SLIDE 9

Gesture-Based Piano App

Undergraduate Students: Haya Mortada, Sara Kheireddine, Fadi Chammas, Bahaa El Hakim
SLIDE 10
  • Frame rate of 3.33 frames per second, for both single-key and multiple-key generation of notes
  • Equivalent to 200 frames per minute
  • This frame rate allows for a maximum tempo of 100 beats per minute, assuming the majority of the notes played are half notes
  • Given that moderate pieces are usually played at a tempo of 108 beats per minute, and that most beginner pieces do not use notes shorter than half notes, playability is acceptable
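As a quick check on these numbers (my reconstruction of the arithmetic, which the slide leaves implicit): 3.33 frames/s × 60 ≈ 200 frames/min; a half note at 100 beats/min spans 2 beats, so at most 100 / 2 = 50 notes occur per minute, leaving roughly 200 / 50 = 4 frames in which to detect each note.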

SLIDE 11

PicoSpaces: A Mobile Projected Touch Interface for Collaborative Applications

Prior Art vs. Proposed Solution

Undergraduate Students: Marc Farra, Maya Kreidieh, Mohamed Mehanna

Master/slave pipeline: Segmentation → Feature extraction → Synchronization → Correspondences and Matching → Event Trigger
SLIDE 12

Outline

  • Ongoing Projects
  • uSee Project
  • Biologically Inspired Deep Visual Networks
SLIDE 13

uSee: An Energy-Aware SIFT-Based Framework for Supervised Visual Search of Man-Made Structures

Graduate Student: Ayman El Mobacher
SLIDE 14

Problem Statement

  • Proliferation of digital images and videos + the ease of acquisition using smart-phones => opportunities for novel visual mining and search
  • Energy-aware computing trends and the somewhat limited processing capabilities of these handheld devices
  • Required: to better fit an environment where “green”, “mobility”, and “on-the-go” are prevailing
  • uSee: a supervised learning framework using SIFT keypoints
    ▫ exploits the physical world
    ▫ delivers context-based services
SLIDE 15

Prior Work

  • Visual salient regions and attention-model-based filtration, so that only keypoints within the region of interest are used in the matching process while dropping those in the background [Zhang et al., Bonaiuto et al.]
  • Consistent line clusters as mid-level features for performing content-based image retrieval (CBIR), utilizing relations among and within the clusters for high-level object (buildings) detection [Shapiro et al.]
  • Causal multi-scale random fields to create a structured vs. non-structured dichotomy using image sections [Kumar and Hebert]
  • Scale-invariant descriptors followed by a nearest-neighbor search of the database for the best match, based on “hyper polyhedron with adaptive threshold” indexing [Shao et al.]
SLIDE 16

Methodology

  • Implemented as an on-demand pull service
  • Based on energy-aware processing of building images
  • Pre-processing phase:
    ▫ via cloud (porting it locally now)
    ▫ highlights the areas with high variation in gradient angle using an entropy-based metric
    ▫ image is divided into 2 clusters: low gradient-angle variation vs. high gradient-angle variation
  • Signature Extraction (see the sketch after this list):
    ▫ exploits the inherent symmetry and repetitive patterns in man-made structures
    ▫ guarantees an energy-aware framework for SIFT keypoint matching
    ▫ SIFT keypoints are extracted -> correlated -> clustered based on threshold
    ▫ r (%) SIFT keypoints are selected from clusters (r pre-defined for image)
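A minimal sketch of this signature-extraction step in Python (my illustration, not the authors' code: it assumes OpenCV's SIFT, a normalized-correlation similarity, and a placeholder similarity threshold, since the slides leave the exact clustering rule unspecified):

import cv2
import numpy as np

def select_keypoints(image_path, r=0.068):
    """Extract SIFT keypoints, group mutually similar descriptors,
    and keep roughly a fraction r of the keypoints as a signature."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    kps, desc = cv2.SIFT_create().detectAndCompute(img, None)

    # Correlate descriptors pairwise (normalized dot product); high
    # similarity reflects the repetitive patterns of man-made structures.
    d = desc / np.linalg.norm(desc, axis=1, keepdims=True)
    sim = d @ d.T
    cluster_size = (sim > 0.9).sum(axis=1)   # 0.9 is an assumed threshold

    # Keep the r% of keypoints belonging to the largest clusters,
    # i.e. the most repetitive/symmetric features.
    n_keep = max(1, int(r * len(kps)))
    keep = np.argsort(-cluster_size)[:n_keep]
    return [kps[i] for i in keep], desc[keep]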

SLIDE 17

Preprocessing

SLIDE 18

Methodology - Workflow

Figures: uSee clustering workflow and uSee keypoints selection workflow

SLIDE 19

Signature Extraction

[Chart: number of similar keypoints within each cluster, for clusters 1–27]

  • Identification: when a new image is acquired (a sketch follows)
    ▫ Extract its signature
    ▫ Compute the L2 norm between the query’s and all the database’s signatures
    ▫ Identification based on a maximum voting scheme
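A matching sketch of this identification step (again an illustration; it assumes each signature is encoded as a fixed-length vector, which the slide does not pin down):

import numpy as np
from collections import Counter

def identify(query_sig, db, k=5):
    """db: list of (building_id, reference_signature) pairs,
    5 reference signatures per building as in ZuBuD."""
    # L2 distance between the query signature and every stored one.
    ranked = sorted(db, key=lambda item: np.linalg.norm(query_sig - item[1]))
    # Maximum voting: each of the k closest references votes for
    # its building; the most-voted building wins.
    votes = Counter(building for building, _ in ranked[:k])
    return votes.most_common(1)[0][0]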

SLIDE 20

Validation 1

  • ZuBuD database
    ▫ 201 buildings with 5 reference views and 1 query image for each building
  • Several values for r were tested for both the reference and the query images
  • Average number of SIFT keypoints in a given image: about 740
SLIDE 21
Results 1

  • Reduction in operational complexity at runtime: instead of comparing a new query image’s n keypoints against the 5*n*d keypoints in the database, thus performing 5*n²*d comparisons, r keypoints with r << n are used, so only 5*r²*d comparisons are needed.
  • With 50 keypoints (r/n = 6.8%), we save 99.54% on computing energy without affecting accuracy results.
  • Using 15.5% of SIFT keypoints exceeded all prior results achieved, to the best of our knowledge, on ZuBuD: 99.1% accuracy in building recognition.
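The 99.54% figure follows from the quadratic dependence on the keypoint count (a reconstruction of the arithmetic from the numbers above): (5*r²*d) / (5*n²*d) = (r/n)² = 0.068² ≈ 0.0046, so only about 0.46% of the original comparisons remain.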

SLIDE 22

Method | # keypoints (reference image) | # keypoints (query image) | r/n   | Recognition rate
[8]    | All | All |  —    | 94.80%
[24]   | All | All |  —    | 90.4% (correct classification); 94.8% (correct match in top 5)
[26]   | All | All |  —    | 90.4% (correct classification); 96.5% (correct match in top 5)
[27]   | 335 | 335 | 45.3% | 96.50%
uSee   | 20  | 20  | 2.7%  | 91.30%
uSee   | 30  | 30  | 4.1%  | 94.80%
uSee   | 40  | 40  | 5.4%  | 95.70%
uSee   | 50  | 50  | 6.8%  | 96.50%
uSee   | 100 | 75  | 10.1% | 98.30%
uSee   | 100 | 115 | 15.5% | 99.10%
SLIDE 23

Validation 2

  • Further tests conducted on a home-grown database of buildings from the city of Beirut (Beirut Database)
    ▫ 5 reference images taken at the same time of day
    ▫ 1 query image taken at different times and weather conditions
    ▫ total of 38 buildings
SLIDE 24

The test images in the Beirut db differ from their corresponding references in illumination, camera angle, and scale, which are major image processing challenges not present in the ZuBuD db.

SLIDE 25
SLIDE 26

Outline

  • Ongoing Projects
  • uSee Project
  • Biologically Inspired Deep Visual Networks
SLIDE 27

Biologically Inspired Deep Networks for Visual Identification

Graduate Student: L’emir Salim Chehab
SLIDE 28

Deep Belief Networks

  • Deep belief networks: probabilistic generative models composed of multiple layers of stochastic variables (Boltzmann Machines)
  • The first two layers form an undirected bipartite graph (bidirectional connections); the rest of the connections are feedforward and unidirectional (a structural sketch follows)

Deep Belief Network
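A minimal structural sketch of this connectivity in Python (illustrative only: the layer widths are placeholders and nothing here is trained):

import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 500, 500, 200]            # placeholder layer widths

# First two layers: an undirected bipartite graph (an RBM), so a
# single weight matrix serves both directions.
W_rbm = rng.normal(0.0, 0.01, (sizes[0], sizes[1]))

# Remaining connections: feedforward and unidirectional.
W_ff = [rng.normal(0.0, 0.01, (sizes[i], sizes[i + 1]))
        for i in range(1, len(sizes) - 1)]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def up_pass(v):
    h = sigmoid(v @ W_rbm)              # bidirectional stage, used upward
    for W in W_ff:
        h = sigmoid(h @ W)              # unidirectional stages
    return h

def down_pass(h):
    return sigmoid(h @ W_rbm.T)         # same RBM weights, used downward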

SLIDE 29

Fukushima Work

  • Four stages of alternating S-cells and C-cells, in addition to an inhibitory surround from S-cells to C-cells and a contrast-extracting layer
  • S-cells are equivalent to simple cells in the primary visual cortex and are responsible for feature extraction
  • C-cells allow for positional errors

Connection of S-cells and C-cells

SLIDE 30

Riesenhuber Prior Work

  • Hierarchical feedforward architecture composed of 2 stages and 2 operations (illustrated in the sketch below):
    ▫ weighted linear summation
    ▫ nonlinear maximum operation

Riesenhuber et al. simple feedforward network
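A toy illustration of these two operations (not Riesenhuber's code; the input and templates below are arbitrary):

import numpy as np

def s_unit(inputs, weights):
    """Weighted linear summation (template-matching stage)."""
    return float(np.dot(inputs, weights))

def c_unit(s_responses):
    """Nonlinear maximum over a pool of S-unit responses, giving
    tolerance to shifts in position."""
    return float(np.max(s_responses))

# One C-unit pooling three S-units over a small input vector.
x = np.array([0.2, 0.8, 0.5])
templates = [np.array([1.0, 0.0, 0.0]),
             np.array([0.0, 1.0, 0.0]),
             np.array([0.0, 0.0, 1.0])]
print(c_unit([s_unit(x, w) for w in templates]))   # -> 0.8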

SLIDE 31

Poggio Prior Work

  • A battery of Gabor filters is applied to obtain the S1 stage’s response (see the sketch below)
  • S1 proceeds to C1, and layers alternate between S-cells and C-cells
  • A total of 10⁸–10⁹ neurons used
  • Accounts mainly for the ventral-stream part of the visual cortex

Poggio et al. feedforward model
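A small Gabor filter bank sketch for such an S1-like stage (my illustration; the kernel parameters are placeholders, not the model's published values):

import cv2
import numpy as np

def s1_responses(gray_img, n_orientations=4):
    """Convolve the image with Gabor filters at several orientations
    and return the stack of response maps."""
    responses = []
    for i in range(n_orientations):
        theta = i * np.pi / n_orientations
        kernel = cv2.getGaborKernel(ksize=(11, 11), sigma=3.0,
                                    theta=theta, lambd=6.0,
                                    gamma=0.5, psi=0)
        responses.append(cv2.filter2D(gray_img, cv2.CV_32F, kernel))
    return np.stack(responses)

# A C1-like stage would then take local maxima over position and
# scale within each orientation band.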

SLIDE 32

Methodology

  • Includes the “Photoreceptor layer” before the first layer
  • 8 main layers: every two consecutive layers represent a stage of one of the four visual regions

V1 - V2 Layers:
  • Consecutive S- and C-cells in each layer
  • Bidirectional connections similar to a deep belief network
  • Supervised training

V4 - IT Layers:
  • Unidirectional feedforward connections leading to the output
  • Unsupervised learning
SLIDE 33

Proposed Model

  • Network structure: input – photoreceptor layer – 1000 – 1000 – 500 – 500 – 200 – 200 – 100 – 100 – output (sketched below)
  • The cross-entropy error as the cost function
  • Training using Hinton’s algorithm
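An illustrative sketch of the proposed layer stack (layer widths from the slide; the input resolution, activation, and output width are assumptions, and the bidirectional V1-V2 stages are collapsed into plain feedforward steps):

import numpy as np

rng = np.random.default_rng(0)

input_size = 32 * 32                  # hypothetical input resolution
photoreceptor = input_size            # photoreceptor layer precedes layer 1
hidden = [1000, 1000, 500, 500, 200, 200, 100, 100]
n_classes = 9                         # e.g. the 9 MIT-CBCL categories

dims = [input_size, photoreceptor] + hidden + [n_classes]
weights = [rng.normal(0.0, 0.01, (a, b)) for a, b in zip(dims, dims[1:])]

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(x):
    for W in weights[:-1]:
        x = np.tanh(x @ W)
    return softmax(x @ weights[-1])

def cross_entropy(pred, target_onehot):
    """The cross-entropy cost named on the slide."""
    return -float(np.sum(target_onehot * np.log(pred + 1e-12)))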
SLIDE 34

Results

  • MIT-CBCL Street Scene Database: 3547 images, 9 object categories = [cars, pedestrians, bicycles, buildings, trees, skies, roads, sidewalks, stores]
  • Data set is split into 70% training and 30% testing
  • Results are based on the average of 15 runs
  • 90% correct classification rate, versus the 88% achieved by Poggio’s model on the same data set
SLIDE 35

Contact info: mariette.awad@aub.edu.lb