
New Challenges in Semantic Concept Detection

M.-F. Weng, C.-K. Chen, Y.-H. Yang, R.-E. Fan, Y.-T. Hsieh, Y.-Y. Chuang, W. H. Hsu, and C.-J. Lin

National Taiwan University

2

Preliminaries for Semantic Concept Detection

  • What are the preliminaries for building a semantic concept detection system?
    – A lexicon of well-defined concepts
    – Training resources
      • Video data
      • Annotations
      • Features
    – Tools
      • Tagging or labeling tools (e.g., the CMU and IBM tools)
      • Feature extractors
      • Machine learning tools (e.g., LIBSVM)
      • Tools tailored for semantic concept detection

3

Semantic Concepts

Concept lexicons at several scales: TRECVID-2005 (10 concepts), LSCOM-Lite / TRECVID-2006 (39 concepts), MediaMill (101 concepts), Columbia374 (374 concepts), and LSCOM (449 concepts).

4

Video Data Sets

  • TV 03 / TV 04: ~190 hours of news video
  • TV 05 / TV 06: ~330 hours of multilingual broadcast news video
  • TV 07: 100 hours of Sound and Vision video

Each dataset is divided into a development set and a test set.


5

Annotations

[Chart: development and test set sizes (hours) for the TV 05–07 datasets. Development sets carry common annotations; test sets carry NIST truth judgments.]

6

Features, Detectors, Scores

  • VIREO-374: 374 detectors; scores on the TV 07 dataset; features: color moment, wavelet texture, keypoint feature
  • Columbia374: 374 detectors; scores on the TV 06/07 datasets; features: EDH, GBR, GCM
  • MediaMill: 5 sets of 101 classifiers; baseline scores on the TV 05/06 datasets; features: visual and text


7

Available Resources

  • Concept definitions are sufficient
  • Training resources are plentiful
  • No feature extractors or tailored tools are available

[Diagram: the resource stack — well-defined concepts; training resources (video data, annotations, features); tools (tagging tools, feature extractors, machine learning tools, tailored tools).]

8

The New Challenges

  • Challenge 1: Easy and Efficient Tools
    – L datasets, M concepts, and N features imply L×M×N classifiers
    – Each classifier has many parameters to consider
    – Time seems very limited to validate each parameter and to train all classifiers
  • Challenge 2: Resource Exploitation or Reuse
    – Resources are precious
    – Existing resources are potentially useful for new datasets
    – Plentiful resources have not been fully utilized


9

Facing the New Challenges

  • Challenge 1:
    – Extended LIBSVM to improve training efficiency
    – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
  • Challenge 2:
    – Reused classifiers of past data to improve accuracy by late aggregation
    – Exploited contextual relationships and temporal dependencies in annotations to boost accuracy

10

Facing the New Challenges

  • Challenge 1: Easy and Efficient Tools
    – Extended LIBSVM to improve training efficiency
    – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
  • Challenge 2:
    – Reused classifiers of past data to improve accuracy by late aggregation
    – Exploited contextual relationships and temporal dependencies in annotations to boost accuracy


11

A Tailored Toolkit

  • We extended LIBSVM in three aspects for semantic concept detection:
    – Using dense representations
    – Exploiting parallelism across independent concepts, features, and SVM model parameters
    – Narrowing the parameter search down to a safe range
  • Overall, the training time of our baseline was reduced from about 14 days to about 3 days
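The second aspect, parallelism, works because every (concept, feature, parameter) training job is independent. A minimal sketch of that dispatch pattern follows; the grid values and the `train_and_validate` stub are illustrative stand-ins for actual LIBSVM runs, not the toolkit's real code.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def train_and_validate(concept, feature, log2c, log2g):
    """Hypothetical stand-in for one LIBSVM train/validate run; the real
    toolkit would train an SVM on a dense feature matrix here.  The dummy
    validation score peaks at C = 2^1, gamma = 2^-3."""
    score = -abs(log2c - 1) - abs(log2g + 3)
    return (concept, feature), (score, (log2c, log2g))

concepts = ["sports", "weather"]            # M concepts
features = ["color_moment", "wavelet"]      # N features
log2c_range = range(-1, 4)                  # narrowed "safe" range for C
log2g_range = range(-5, 0)                  # narrowed "safe" range for gamma

# All (concept, feature, C, gamma) jobs are independent, so they can be
# dispatched in parallel; keep the best parameters per (concept, feature).
jobs = list(product(concepts, features, log2c_range, log2g_range))
best = {}
with ThreadPoolExecutor(max_workers=8) as pool:
    for key, result in pool.map(lambda args: train_and_validate(*args), jobs):
        if key not in best or result > best[key]:
            best[key] = result
```

Narrowing the two ranges above is where the bulk of the saving comes from: the grid has |C-range| × |gamma-range| cells per classifier, so shrinking each range cuts the job count quadratically.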

12

Facing the New Challenges

  • Challenge 1:
    – Extended LIBSVM to improve training efficiency
    – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
  • Challenge 2: Resource Exploitation or Reuse
    – Reused classifiers of past data to improve accuracy by late aggregation
    – Exploited contextual relationships and temporal dependencies in annotations to boost accuracy


13

Reuse Past Data

  • Early aggregation
    – Must re-train classifiers
    – Causes considerable training time
  • Late aggregation
    – Simple and direct
    – May be biased

14

Late Aggregation

  • We adopt late aggregation to reuse existing classifiers through two strategies:
    – Equally averaged aggregation
      • Simply average the scores of past and newly trained classifiers
    – Concept-dependent weighted aggregation
      • Use concept-dependent weights to aggregate classifiers
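Both strategies reduce to a one-line score combination per shot. A minimal sketch, where the function names and the single per-concept weight `w` are my framing of the two strategies above:

```python
def average_aggregation(old_scores, new_scores):
    """Equally averaged aggregation: the mean of the past classifier's and
    the newly trained classifier's scores for each shot."""
    return [(o + n) / 2.0 for o, n in zip(old_scores, new_scores)]

def weighted_aggregation(old_scores, new_scores, w):
    """Concept-dependent weighted aggregation: w is chosen per concept,
    e.g. according to each classifier's validation performance."""
    return [w * o + (1.0 - w) * n for o, n in zip(old_scores, new_scores)]

old = [0.9, 0.2, 0.6]   # scores from a classifier trained on past data
new = [0.7, 0.4, 0.8]   # scores from the newly trained classifier
avg = average_aggregation(old, new)
wgt = weighted_aggregation(old, new, w=0.7)   # trust the old classifier more
```

For a concept where the past data still matches the new domain, the weight would lean toward the old classifier; for a concept whose appearance has drifted, toward the new one — which is why the concept-dependent variant can outperform the equal average.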


15

Aggregation Benefits

[Chart: per-concept infAP (0.05–0.3) for concepts including Flag-US, Police_Security, Explosion_Fire, Airplane, Charts, Military, Desert, Truck, Weather, Mountain, People-Marching, Sports, Boat_Ship, Maps, Meeting, Computer_TV-screen, Car, Animal, Office, Waterscape_Waterfront, and Overall, comparing the TV07 classifiers with average aggregation and weighted aggregation.]

Overall improvement ratio: average aggregation 22%, weighted aggregation 30%.

16

Facing the New Challenges

  • Challenge 1:
    – Extended LIBSVM to improve training efficiency
    – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
  • Challenge 2: Resource Exploitation or Reuse
    – Reused classifiers of past data to improve accuracy by late aggregation
    – Exploited contextual relationships and temporal dependencies in annotations to boost accuracy


17

Observation in Annotations

[Table: binary annotations over a sequence of video shots for a lexicon of concepts (car, outdoor, building, sky, people, urban). Labels of different concepts co-occur within the same shot (contextual relationship), and labels of the same concept persist across neighboring shots (temporal dependency).]

18

Post Post-

  • processing Framework

processing Framework

Video Segmentation Feature Extraction Concept Detection Temporal Filtering Shot Ranking Video Sequence Concept Reranking Combination Annotation Temporal Dependency Mining Unsupervised Contextual Fusion Temporal Filter Design

Mining phase: Processing phase: Detecting phase:


19

Temporal Filtering

[Framework diagram repeated, with the temporal filtering path highlighted.]

20

Temporal Dependency

  • Different concepts have different levels of dependency at different temporal distances
    – E.g., sports, weather, maps, explosion

[Chart: chi-square test statistic χ²_k versus temporal distance k = 1…20 for sports, weather, maps, and explosion.]
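The chi-square statistic on the chart can be computed directly from a concept's binary annotation track. The sketch below is my own 2×2 contingency-table implementation on a synthetic track; it measures the dependency between a shot's label and the label k shots later.

```python
def chi_square_at_distance(labels, k):
    """Chi-square statistic of the 2x2 contingency table between a concept's
    binary label at shot t and at shot t+k (higher = stronger dependency)."""
    pairs = list(zip(labels, labels[k:]))
    n = len(pairs)
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for pair in pairs:
        counts[pair] += 1
    chi2 = 0.0
    for a in (0, 1):
        for b in (0, 1):
            row = counts[(a, 0)] + counts[(a, 1)]   # marginal of label at t
            col = counts[(0, b)] + counts[(1, b)]   # marginal of label at t+k
            expected = row * col / n
            if expected > 0:
                chi2 += (counts[(a, b)] - expected) ** 2 / expected
    return chi2

# Synthetic annotation track for a bursty, sports-like concept: labels of
# adjacent shots are strongly dependent; distant shots much less so.
track = [1] * 6 + [0] * 10
near = chi_square_at_distance(track, 1)
far = chi_square_at_distance(track, 8)
```

Plotting this statistic over k for each concept is what yields curves like the ones on the slide: a long-duration concept such as sports stays high over many shots, while a brief one such as explosion decays quickly.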


21

Temporal Filter

The temporal filter aggregates the SVM classifier's scores for the shots surrounding shot t, using symmetric weights w_0, w_1, …, w_d:

$$\hat{P}(l_t = 1) = \sum_{k=-d}^{d} w_{|k|}\, P(l_{t+k} = 1 \mid \mathbf{x}_{t+k})$$

where $P(l_{t+k} = 1 \mid \mathbf{x}_{t+k})$ is the SVM classifier's score for shot $t+k$ and $w_{|k|}$ decays with the temporal distance $|k|$.
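A direct implementation of this filter is a short weighted moving average over the score sequence. In the sketch below the weights `[w_0, ..., w_d]` are illustrative, and the renormalization at the sequence borders is my own choice, not something the slide specifies.

```python
def temporal_filter(scores, weights):
    """Weighted aggregation of neighboring shots' classifier scores:
    P_hat(l_t = 1) = sum_{k=-d}^{d} w_|k| * P(l_{t+k} = 1 | x_{t+k}).
    `weights` holds [w_0, w_1, ..., w_d]; at the borders the sum is
    renormalized over the weights that fall inside the sequence."""
    d = len(weights) - 1
    filtered = []
    for t in range(len(scores)):
        num = den = 0.0
        for k in range(-d, d + 1):
            if 0 <= t + k < len(scores):
                w = weights[abs(k)]
                num += w * scores[t + k]
                den += w
        filtered.append(num / den)
    return filtered

# An isolated low score inside a run of confident 'sports' shots is pulled up.
raw = [0.8, 0.9, 0.1, 0.85, 0.9]
smoothed = temporal_filter(raw, [0.5, 0.25])
```

In the framework, the weights would come from the temporal-filter-design step of the mining phase (e.g., derived per concept from the measured temporal dependency) rather than being fixed by hand.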

22

Filtering Prediction

  • A sequence of shots for predicting sports
    – Classifier prediction results
    – After temporal filtering

[Charts: raw classifier scores and temporally filtered scores over shots 1–20; after filtering, a misclassified shot is picked up and its rank rises.]


23

Concept Reranking

[Framework diagram repeated, with the concept reranking path highlighted.]

24

Target concept: 'boat' (search or detection). An initial ranking list is produced by a baseline method.


25

Step 1. Randomly split the initial list into training and test sets.

26

Step 2. Learn to maintain the ranking orders on the training set: from shot pairs, learn (1) the related concepts and (2) the importance of each concept. Related concepts for 'boat': ocean, waterscape.


27

Step 3. Context fusion on the test data, using the related concepts (ocean, waterscape).

28

Step 4. Merge the reranked test shots with the training shots into the final list.
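Steps 1–4 can be sketched end to end. Everything concrete below — the synthetic scores, the fixed importance weights, the linear fusion form, and the simple merge — is my illustration; the slides leave the pair-based learning step abstract.

```python
import random

random.seed(0)

# Synthetic detection scores for the target concept 'boat' and two related
# concepts over 10 shots of an initial ranking list.
shots = list(range(10))
boat = [random.random() for _ in shots]
ocean = [random.random() for _ in shots]
waterscape = [random.random() for _ in shots]

# Step 1: randomly split the initial list into training and test halves.
random.shuffle(shots)
train, test = shots[:5], shots[5:]

# Step 2 (stub): the related concepts and their importances would be learned
# from ranking-order-preserving shot pairs on `train`; fixed weights here.
importance = {"ocean": 0.3, "waterscape": 0.2}

# Step 3: context fusion on the test half only.
fused = {t: boat[t]
            + importance["ocean"] * ocean[t]
            + importance["waterscape"] * waterscape[t]
         for t in test}

# Step 4: merge the reranked test shots back with the training shots.
merged = (sorted(train, key=lambda t: boat[t], reverse=True)
          + sorted(test, key=lambda t: fused[t], reverse=True))
```

The split-learn-fuse loop is what makes the fusion unsupervised with respect to the test data: the importance weights are fitted only on the training half, so the reranking of the test half never sees its own ground truth.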


29

Combination

[Framework diagram repeated, with the combination step highlighted.]

30

Post-processing Benefits

[Chart: per-concept infAP (0.05–0.3) comparing classification, concept reranking, temporal filtering, and parallel combination over the same concept set as before, plus Overall.]

Overall improvement ratio: parallel combination +10% (other figures labeled in the chart: +45%, +29%, +37%, +17%).


31

Conclusion

  • We reduce the training time of detectors by using a toolkit tailored for semantic concept detection
  • The proposed aggregation methods reuse the classifiers of past data and can boost detection accuracy
  • Our post-processing approaches exploit existing resources and can further improve detection results

32

Thank You For Your Attention