SLIDE 1

Energy-Aware Recognition for Man-Made Structures and other research projects at the American University of Beirut

Mariette Awad, Assistant Professor, Electrical and Computer Engineering Department, American University of Beirut, Lebanon

SLIDE 2

Outline

  • Ongoing Projects
  • uSee for Man-Made Structures
  • Biologically Inspired Deep Visual Networks
SLIDE 3

Lebanon

SLIDE 4

American University of Beirut

SLIDE 5

AUB General Info

  • AUB founded in 1866
  • Since 2004, accredited by the Commission on Higher Education of the Middle States Association of Colleges and Schools in the US
  • 120 programs: Bachelor, Masters, and PhD degrees
  • 6 faculties: Agriculture and Food Sciences, Arts and Sciences, Engineering and Architecture, Health Sciences, Medicine, Business
  • Faculty of Engineering since 1944
SLIDE 6

Mood-Based Internet Radio-Tuner App

Undergraduate Students: David Matchoulian, Yara Rizk, Maya Safieddine

Emotion Detector → Song Fetcher & Classifier → Song Selector
SLIDE 7

MusAR for an Augmented Museum Experience

Undergraduate Students: Anis Halabi, Giorgio Saad, Piotr Yordanov
SLIDE 8
SLIDE 9

Gesture-Based Piano App

Undergraduate Students: Haya Mortada, Sara Kheireddine, Fadi Chammas, Bahaa El Hakim
SLIDE 10
  • Frame rate of 3.33 frames per second, for both single-key and multiple-key generation of notes
  • Equivalent to 200 frames per minute
  • This frame rate allows for a maximum tempo of 100 beats per minute, assuming the majority of the notes played are half notes
  • Given that moderate pieces are usually played at a tempo of 108 beats per minute, and that most beginner pieces do not use notes shorter than half notes, playability is acceptable
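As a quick check on these numbers (my reconstruction of the arithmetic, which the slide leaves implicit): 3.33 frames/s × 60 ≈ 200 frames/min; a half note at 100 beats/min spans 2 beats, so at most 100 / 2 = 50 notes occur per minute, leaving roughly 200 / 50 = 4 frames in which to detect each note.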

SLIDE 11

PicoSpaces: A Mobile Projected Touch Interface for Collaborative Applications

Prior Art vs. Proposed Solution

Undergraduate Students: Marc Farra, Maya Kreidieh, Mohamed Mehanna

Master/slave pipeline: Segmentation → Feature extraction → Synchronization → Correspondences and Matching → Event Trigger
SLIDE 12

Outline

  • Ongoing Projects
  • uSee Project
  • Biologically Inspired Deep Visual Networks
SLIDE 13

uSee: An Energy-Aware SIFT-Based Framework for Supervised Visual Search of Man-Made Structures

Graduate Student: Ayman El Mobacher
SLIDE 14

Problem Statement

  • Proliferation of digital images and videos + the ease of acquisition using smart-phones => opportunities for novel visual mining and search
  • Energy-aware computing trends and the somewhat limited processing capabilities of these handheld devices
  • Required: to better fit an environment where “green”, “mobility”, and “on-the-go” are prevailing
  • uSee: a supervised learning framework using SIFT keypoints
    ▫ exploits the physical world
    ▫ delivers context-based services
SLIDE 15

Prior Work

  • Visual salient regions and attention-model-based filtration, so that only keypoints within the region of interest are used in the matching process while dropping those in the background [Zhang et al., Bonaiuto et al.]
  • Consistent line clusters as mid-level features for performing content-based image retrieval (CBIR), utilizing relations among and within the clusters for high-level object (buildings) detection [Shapiro et al.]
  • Causal multi-scale random fields to create a structured vs. non-structured dichotomy using image sections [Kumar and Hebert]
  • Scale-invariant descriptors followed by a nearest-neighbor search of the database for the best match, based on “hyper polyhedron with adaptive threshold” indexing [Shao et al.]
SLIDE 16

Methodology

  • Implemented as an on-demand pull service
  • Based on energy-aware processing of building images
  • Pre-processing phase:
    ▫ via cloud (porting it locally now)
    ▫ highlights the areas with high variation in gradient angle using an entropy-based metric
    ▫ image is divided into 2 clusters: low gradient-angle variation vs. high gradient-angle variation
  • Signature Extraction (see the sketch after this list):
    ▫ exploits the inherent symmetry and repetitive patterns in man-made structures
    ▫ guarantees an energy-aware framework for SIFT keypoint matching
    ▫ SIFT keypoints are extracted -> correlated -> clustered based on threshold
    ▫ r (%) SIFT keypoints are selected from clusters (r pre-defined for image)
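A minimal sketch of this signature-extraction step in Python (my illustration, not the authors' code: it assumes OpenCV's SIFT, a normalized-correlation similarity, and a placeholder similarity threshold, since the slides leave the exact clustering rule unspecified):

import cv2
import numpy as np

def select_keypoints(image_path, r=0.068):
    """Extract SIFT keypoints, group mutually similar descriptors,
    and keep roughly a fraction r of the keypoints as a signature."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    kps, desc = cv2.SIFT_create().detectAndCompute(img, None)

    # Correlate descriptors pairwise (normalized dot product); high
    # similarity reflects the repetitive patterns of man-made structures.
    d = desc / np.linalg.norm(desc, axis=1, keepdims=True)
    sim = d @ d.T
    cluster_size = (sim > 0.9).sum(axis=1)   # 0.9 is an assumed threshold

    # Keep the r% of keypoints belonging to the largest clusters,
    # i.e. the most repetitive/symmetric features.
    n_keep = max(1, int(r * len(kps)))
    keep = np.argsort(-cluster_size)[:n_keep]
    return [kps[i] for i in keep], desc[keep]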

SLIDE 17

Preprocessing

SLIDE 18

Methodology - Workflow

Figures: uSee clustering workflow and uSee keypoints selection workflow

SLIDE 19

Signature Extraction

[Chart: number of similar keypoints within each cluster, for clusters 1–27]

  • Identification: when a new image is acquired (a sketch follows)
    ▫ Extract its signature
    ▫ Compute the L2 norm between the query’s and all the database’s signatures
    ▫ Identification based on a maximum voting scheme
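A matching sketch of this identification step (again an illustration; it assumes each signature is encoded as a fixed-length vector, which the slide does not pin down):

import numpy as np
from collections import Counter

def identify(query_sig, db, k=5):
    """db: list of (building_id, reference_signature) pairs,
    5 reference signatures per building as in ZuBuD."""
    # L2 distance between the query signature and every stored one.
    ranked = sorted(db, key=lambda item: np.linalg.norm(query_sig - item[1]))
    # Maximum voting: each of the k closest references votes for
    # its building; the most-voted building wins.
    votes = Counter(building for building, _ in ranked[:k])
    return votes.most_common(1)[0][0]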

SLIDE 20

Validation 1

  • ZuBuD database
    ▫ 201 buildings with 5 reference views and 1 query image for each building
  • Several values for r were tested for both the reference and the query images
  • Average number of SIFT keypoints in a given image: about 740
SLIDE 21
Results 1

  • Reduction in operational complexity at runtime: instead of comparing a new query image’s n keypoints against the 5*n*d keypoints in the database, thus performing 5*n²*d comparisons, r keypoints with r << n are used, so only 5*r²*d comparisons are needed.
  • With 50 keypoints (r/n = 6.8%), we save 99.54% on computing energy without affecting accuracy results.
  • Using 15.5% of SIFT keypoints exceeded all prior results achieved, to the best of our knowledge, on ZuBuD: 99.1% accuracy in building recognition.
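The 99.54% figure follows from the quadratic dependence on the keypoint count (a reconstruction of the arithmetic from the numbers above): (5*r²*d) / (5*n²*d) = (r/n)² = 0.068² ≈ 0.0046, so only about 0.46% of the original comparisons remain.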

SLIDE 22

Method | # keypoints (reference image) | # keypoints (query image) | r/n   | Recognition rate
[8]    | All | All |  —    | 94.80%
[24]   | All | All |  —    | 90.4% (correct classification); 94.8% (correct match in top 5)
[26]   | All | All |  —    | 90.4% (correct classification); 96.5% (correct match in top 5)
[27]   | 335 | 335 | 45.3% | 96.50%
uSee   | 20  | 20  | 2.7%  | 91.30%
uSee   | 30  | 30  | 4.1%  | 94.80%
uSee   | 40  | 40  | 5.4%  | 95.70%
uSee   | 50  | 50  | 6.8%  | 96.50%
uSee   | 100 | 75  | 10.1% | 98.30%
uSee   | 100 | 115 | 15.5% | 99.10%
SLIDE 23

Validation 2

  • Further tests conducted on a home-grown database of buildings from the city of Beirut (Beirut Database)
    ▫ 5 reference images taken at the same time of day
    ▫ 1 query image taken at different times and weather conditions
    ▫ total of 38 buildings
SLIDE 24

The test images in the Beirut db differ from their corresponding references in illumination, camera angle, and scale, which are major image processing challenges not present in the ZuBuD db.

SLIDE 25
SLIDE 26

Outline

  • Ongoing Projects
  • uSee Project
  • Biologically Inspired Deep Visual Networks
SLIDE 27

Biologically Inspired Deep Networks for Visual Identification

Graduate Student: L’emir Salim Chehab
SLIDE 28

Deep Belief Networks

  • Deep belief networks: probabilistic generative models composed of multiple layers of stochastic variables (Boltzmann Machines)
  • The first two layers form an undirected bipartite graph (bidirectional connections); the rest of the connections are feedforward and unidirectional (a structural sketch follows)

Deep Belief Network
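A minimal structural sketch of this connectivity in Python (illustrative only: the layer widths are placeholders and nothing here is trained):

import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 500, 500, 200]            # placeholder layer widths

# First two layers: an undirected bipartite graph (an RBM), so a
# single weight matrix serves both directions.
W_rbm = rng.normal(0.0, 0.01, (sizes[0], sizes[1]))

# Remaining connections: feedforward and unidirectional.
W_ff = [rng.normal(0.0, 0.01, (sizes[i], sizes[i + 1]))
        for i in range(1, len(sizes) - 1)]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def up_pass(v):
    h = sigmoid(v @ W_rbm)              # bidirectional stage, used upward
    for W in W_ff:
        h = sigmoid(h @ W)              # unidirectional stages
    return h

def down_pass(h):
    return sigmoid(h @ W_rbm.T)         # same RBM weights, used downward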

SLIDE 29

Fukushima Work

  • Four stages of alternating S-cells and C-cells, in addition to an inhibitory surround from S-cells to C-cells and a contrast-extracting layer
  • S-cells are equivalent to simple cells in the primary visual cortex and are responsible for feature extraction
  • C-cells allow for positional errors

Connection of S-cells and C-cells

SLIDE 30

Riesenhuber Prior Work

  • Hierarchical feedforward architecture composed of 2 stages and 2 operations (illustrated in the sketch below):
    ▫ weighted linear summation
    ▫ nonlinear maximum operation

Riesenhuber et al. simple feedforward network
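A toy illustration of these two operations (not Riesenhuber's code; the input and templates below are arbitrary):

import numpy as np

def s_unit(inputs, weights):
    """Weighted linear summation (template-matching stage)."""
    return float(np.dot(inputs, weights))

def c_unit(s_responses):
    """Nonlinear maximum over a pool of S-unit responses, giving
    tolerance to shifts in position."""
    return float(np.max(s_responses))

# One C-unit pooling three S-units over a small input vector.
x = np.array([0.2, 0.8, 0.5])
templates = [np.array([1.0, 0.0, 0.0]),
             np.array([0.0, 1.0, 0.0]),
             np.array([0.0, 0.0, 1.0])]
print(c_unit([s_unit(x, w) for w in templates]))   # -> 0.8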

SLIDE 31

Poggio Prior Work

  • A battery of Gabor filters is applied to obtain the S1 stage’s response (see the sketch below)
  • S1 proceeds to C1, and layers alternate between S-cells and C-cells
  • A total of 10⁸–10⁹ neurons used
  • Accounts mainly for the ventral-stream part of the visual cortex

Poggio et al. feedforward model
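A small Gabor filter bank sketch for such an S1-like stage (my illustration; the kernel parameters are placeholders, not the model's published values):

import cv2
import numpy as np

def s1_responses(gray_img, n_orientations=4):
    """Convolve the image with Gabor filters at several orientations
    and return the stack of response maps."""
    responses = []
    for i in range(n_orientations):
        theta = i * np.pi / n_orientations
        kernel = cv2.getGaborKernel(ksize=(11, 11), sigma=3.0,
                                    theta=theta, lambd=6.0,
                                    gamma=0.5, psi=0)
        responses.append(cv2.filter2D(gray_img, cv2.CV_32F, kernel))
    return np.stack(responses)

# A C1-like stage would then take local maxima over position and
# scale within each orientation band.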

SLIDE 32

Methodology

  • Includes the “Photoreceptor layer” before the first layer
  • 8 main layers: every two consecutive layers represent a stage of one of the four visual regions

V1 - V2 Layers:
  • Consecutive S- and C-cells in each layer
  • Bidirectional connections similar to a deep belief network
  • Supervised training

V4 - IT Layers:
  • Unidirectional feedforward connections leading to the output
  • Unsupervised learning
SLIDE 33

Proposed Model

  • Network structure: input – photoreceptor layer – 1000 – 1000 – 500 – 500 – 200 – 200 – 100 – 100 – output (sketched below)
  • The cross-entropy error as the cost function
  • Training using Hinton’s algorithm
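An illustrative sketch of the proposed layer stack (layer widths from the slide; the input resolution, activation, and output width are assumptions, and the bidirectional V1-V2 stages are collapsed into plain feedforward steps):

import numpy as np

rng = np.random.default_rng(0)

input_size = 32 * 32                  # hypothetical input resolution
photoreceptor = input_size            # photoreceptor layer precedes layer 1
hidden = [1000, 1000, 500, 500, 200, 200, 100, 100]
n_classes = 9                         # e.g. the 9 MIT-CBCL categories

dims = [input_size, photoreceptor] + hidden + [n_classes]
weights = [rng.normal(0.0, 0.01, (a, b)) for a, b in zip(dims, dims[1:])]

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(x):
    for W in weights[:-1]:
        x = np.tanh(x @ W)
    return softmax(x @ weights[-1])

def cross_entropy(pred, target_onehot):
    """The cross-entropy cost named on the slide."""
    return -float(np.sum(target_onehot * np.log(pred + 1e-12)))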
SLIDE 34

Results

  • MIT-CBCL Street Scene Database: 3547 images, 9 object categories = [cars, pedestrians, bicycles, buildings, trees, skies, roads, sidewalks, stores]
  • Data set is split into 70% training and 30% testing
  • Results are based on the average of 15 runs
  • 90% correct classification rate, versus the 88% achieved by Poggio’s model on the same data set
SLIDE 35

Contact info: mariette.awad@aub.edu.lb