
New Challenges in Semantic Concept Detection

M.-F. Weng, C.-K. Chen, Y.-H. Yang, R.-E. Fan, Y.-T. Hsieh, Y.-Y. Chuang, W. H. Hsu, and C.-J. Lin

National Taiwan University

2

Preliminaries for Semantic Concept Detection

  • What are the preliminaries for building a semantic concept detection system?
    – A lexicon of well-defined concepts
    – Training resources
      • Video data
      • Annotations
      • Features
    – Tools
      • Tagging or labeling tools (e.g., the CMU and IBM tools)
      • Feature extractors
      • Machine learning tools (e.g., LIBSVM)
      • Tools tailored for semantic concept detection

3

Semantic Concepts

Concept lexicons at several scales: TRECVID-2005 (10 concepts), LSCOM-Lite / TRECVID-2006 (39 concepts), MediaMill (101 concepts), Columbia374 (374 concepts), and LSCOM (449 concepts).

4

Video Data Sets

  • TV 03 / TV 04: ~190 hours of news video
  • TV 05 / TV 06: ~330 hours of multilingual broadcast news video
  • TV 07: 100 hours of Sound and Vision video

Each dataset is divided into a development set and a test set.


5

Annotations

[Chart: development and test set sizes (hours) for the TV 05–07 datasets. Development sets carry common annotations; test sets carry NIST truth judgments.]

6

Features, Detectors, Scores

  • VIREO-374: 374 detectors; scores on the TV 07 dataset; features: color moment, wavelet texture, keypoint feature
  • Columbia374: 374 detectors; scores on the TV 06/07 datasets; features: EDH, GBR, GCM
  • MediaMill: 5 sets of 101 classifiers; baseline scores on the TV 05/06 datasets; features: visual and text


7

Available Resources

  • Concept definitions are sufficient
  • Training resources are plentiful
  • No feature extractors or tailored tools are available

[Diagram: the resource stack — well-defined concepts; training resources (video data, annotations, features); tools (tagging tools, feature extractors, machine learning tools, tailored tools).]

8

The New Challenges

  • Challenge 1: Easy and Efficient Tools
    – L datasets, M concepts, and N features imply L×M×N classifiers
    – Each classifier has many parameters to consider
    – Time seems very limited to validate each parameter and to train all classifiers
  • Challenge 2: Resource Exploitation or Reuse
    – Resources are precious
    – Existing resources are potentially useful for new datasets
    – Plentiful resources have not been fully utilized


9

Facing the New Challenges

  • Challenge 1:
    – Extended LIBSVM to improve training efficiency
    – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
  • Challenge 2:
    – Reused classifiers of past data to improve accuracy by late aggregation
    – Exploited contextual relationships and temporal dependencies in annotations to boost accuracy

10

Facing the New Challenges

  • Challenge 1: Easy and Efficient Tools
    – Extended LIBSVM to improve training efficiency
    – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
  • Challenge 2:
    – Reused classifiers of past data to improve accuracy by late aggregation
    – Exploited contextual relationships and temporal dependencies in annotations to boost accuracy


11

A Tailored Toolkit

  • We extended LIBSVM in three aspects for semantic concept detection:
    – Using dense representations
    – Exploiting parallelism across independent concepts, features, and SVM model parameters
    – Narrowing the parameter search down to a safe range
  • Overall, the training time of our baseline was reduced from about 14 days to about 3 days
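The second aspect, parallelism, works because every (concept, feature, parameter) training job is independent. A minimal sketch of that dispatch pattern follows; the grid values and the `train_and_validate` stub are illustrative stand-ins for actual LIBSVM runs, not the toolkit's real code.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def train_and_validate(concept, feature, log2c, log2g):
    """Hypothetical stand-in for one LIBSVM train/validate run; the real
    toolkit would train an SVM on a dense feature matrix here.  The dummy
    validation score peaks at C = 2^1, gamma = 2^-3."""
    score = -abs(log2c - 1) - abs(log2g + 3)
    return (concept, feature), (score, (log2c, log2g))

concepts = ["sports", "weather"]            # M concepts
features = ["color_moment", "wavelet"]      # N features
log2c_range = range(-1, 4)                  # narrowed "safe" range for C
log2g_range = range(-5, 0)                  # narrowed "safe" range for gamma

# All (concept, feature, C, gamma) jobs are independent, so they can be
# dispatched in parallel; keep the best parameters per (concept, feature).
jobs = list(product(concepts, features, log2c_range, log2g_range))
best = {}
with ThreadPoolExecutor(max_workers=8) as pool:
    for key, result in pool.map(lambda args: train_and_validate(*args), jobs):
        if key not in best or result > best[key]:
            best[key] = result
```

Narrowing the two ranges above is where the bulk of the saving comes from: the grid has |C-range| × |gamma-range| cells per classifier, so shrinking each range cuts the job count quadratically.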

12

Facing the New Challenges

  • Challenge 1:
    – Extended LIBSVM to improve training efficiency
    – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
  • Challenge 2: Resource Exploitation or Reuse
    – Reused classifiers of past data to improve accuracy by late aggregation
    – Exploited contextual relationships and temporal dependencies in annotations to boost accuracy


13

Reuse Past Data

  • Early aggregation
    – Must re-train classifiers
    – Causes considerable training time
  • Late aggregation
    – Simple and direct
    – May be biased

14

Late Aggregation

  • We adopt late aggregation to reuse existing classifiers through two strategies:
    – Equally averaged aggregation
      • Simply average the scores of past and newly trained classifiers
    – Concept-dependent weighted aggregation
      • Use concept-dependent weights to aggregate classifiers
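Both strategies reduce to a one-line score combination per shot. A minimal sketch, where the function names and the single per-concept weight `w` are my framing of the two strategies above:

```python
def average_aggregation(old_scores, new_scores):
    """Equally averaged aggregation: the mean of the past classifier's and
    the newly trained classifier's scores for each shot."""
    return [(o + n) / 2.0 for o, n in zip(old_scores, new_scores)]

def weighted_aggregation(old_scores, new_scores, w):
    """Concept-dependent weighted aggregation: w is chosen per concept,
    e.g. according to each classifier's validation performance."""
    return [w * o + (1.0 - w) * n for o, n in zip(old_scores, new_scores)]

old = [0.9, 0.2, 0.6]   # scores from a classifier trained on past data
new = [0.7, 0.4, 0.8]   # scores from the newly trained classifier
avg = average_aggregation(old, new)
wgt = weighted_aggregation(old, new, w=0.7)   # trust the old classifier more
```

For a concept where the past data still matches the new domain, the weight would lean toward the old classifier; for a concept whose appearance has drifted, toward the new one — which is why the concept-dependent variant can outperform the equal average.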


15

Aggregation Benefits

[Chart: per-concept infAP (0.05–0.3) for concepts including Flag-US, Police_Security, Explosion_Fire, Airplane, Charts, Military, Desert, Truck, Weather, Mountain, People-Marching, Sports, Boat_Ship, Maps, Meeting, Computer_TV-screen, Car, Animal, Office, Waterscape_Waterfront, and Overall, comparing the TV07 classifiers with average aggregation and weighted aggregation.]

Overall improvement ratio: average aggregation 22%, weighted aggregation 30%.

16

Facing the New Challenges

  • Challenge 1:
    – Extended LIBSVM to improve training efficiency
    – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
  • Challenge 2: Resource Exploitation or Reuse
    – Reused classifiers of past data to improve accuracy by late aggregation
    – Exploited contextual relationships and temporal dependencies in annotations to boost accuracy


17

Observation in Annotations

[Table: binary annotations over a sequence of video shots for a lexicon of concepts (car, outdoor, building, sky, people, urban). Labels of different concepts co-occur within the same shot (contextual relationship), and labels of the same concept persist across neighboring shots (temporal dependency).]

18

Post Post-

  • processing Framework

processing Framework

Video Segmentation Feature Extraction Concept Detection Temporal Filtering Shot Ranking Video Sequence Concept Reranking Combination Annotation Temporal Dependency Mining Unsupervised Contextual Fusion Temporal Filter Design

Mining phase: Processing phase: Detecting phase:


19

Temporal Filtering

[Framework diagram repeated, with the temporal filtering path highlighted.]

20

Temporal Dependency

  • Different concepts have different levels of dependency at different temporal distances
    – E.g., sports, weather, maps, explosion

[Chart: chi-square test statistic χ²_k versus temporal distance k = 1…20 for sports, weather, maps, and explosion.]
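The chi-square statistic on the chart can be computed directly from a concept's binary annotation track. The sketch below is my own 2×2 contingency-table implementation on a synthetic track; it measures the dependency between a shot's label and the label k shots later.

```python
def chi_square_at_distance(labels, k):
    """Chi-square statistic of the 2x2 contingency table between a concept's
    binary label at shot t and at shot t+k (higher = stronger dependency)."""
    pairs = list(zip(labels, labels[k:]))
    n = len(pairs)
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for pair in pairs:
        counts[pair] += 1
    chi2 = 0.0
    for a in (0, 1):
        for b in (0, 1):
            row = counts[(a, 0)] + counts[(a, 1)]   # marginal of label at t
            col = counts[(0, b)] + counts[(1, b)]   # marginal of label at t+k
            expected = row * col / n
            if expected > 0:
                chi2 += (counts[(a, b)] - expected) ** 2 / expected
    return chi2

# Synthetic annotation track for a bursty, sports-like concept: labels of
# adjacent shots are strongly dependent; distant shots much less so.
track = [1] * 6 + [0] * 10
near = chi_square_at_distance(track, 1)
far = chi_square_at_distance(track, 8)
```

Plotting this statistic over k for each concept is what yields curves like the ones on the slide: a long-duration concept such as sports stays high over many shots, while a brief one such as explosion decays quickly.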


21

Temporal Filter

The temporal filter aggregates the SVM classifier's scores for the shots surrounding shot t, using symmetric weights w_0, w_1, …, w_d:

$$\hat{P}(l_t = 1) = \sum_{k=-d}^{d} w_{|k|}\, P(l_{t+k} = 1 \mid \mathbf{x}_{t+k})$$

where $P(l_{t+k} = 1 \mid \mathbf{x}_{t+k})$ is the SVM classifier's score for shot $t+k$ and $w_{|k|}$ decays with the temporal distance $|k|$.
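A direct implementation of this filter is a short weighted moving average over the score sequence. In the sketch below the weights `[w_0, ..., w_d]` are illustrative, and the renormalization at the sequence borders is my own choice, not something the slide specifies.

```python
def temporal_filter(scores, weights):
    """Weighted aggregation of neighboring shots' classifier scores:
    P_hat(l_t = 1) = sum_{k=-d}^{d} w_|k| * P(l_{t+k} = 1 | x_{t+k}).
    `weights` holds [w_0, w_1, ..., w_d]; at the borders the sum is
    renormalized over the weights that fall inside the sequence."""
    d = len(weights) - 1
    filtered = []
    for t in range(len(scores)):
        num = den = 0.0
        for k in range(-d, d + 1):
            if 0 <= t + k < len(scores):
                w = weights[abs(k)]
                num += w * scores[t + k]
                den += w
        filtered.append(num / den)
    return filtered

# An isolated low score inside a run of confident 'sports' shots is pulled up.
raw = [0.8, 0.9, 0.1, 0.85, 0.9]
smoothed = temporal_filter(raw, [0.5, 0.25])
```

In the framework, the weights would come from the temporal-filter-design step of the mining phase (e.g., derived per concept from the measured temporal dependency) rather than being fixed by hand.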

22

Filtering Prediction

  • A sequence of shots for predicting sports
    – Classifier prediction results
    – After temporal filtering

[Charts: raw classifier scores and temporally filtered scores over shots 1–20; after filtering, a misclassified shot is picked up and its rank rises.]


23

Concept Reranking

[Framework diagram repeated, with the concept reranking path highlighted.]

24

Target concept: 'boat' (search or detection). An initial ranking list is produced by a baseline method.


25

Step 1. Randomly split the initial list into training and test sets.

26

Step 2. Learn to maintain the ranking orders on the training set: from shot pairs, learn (1) the related concepts and (2) the importance of each concept. Related concepts for 'boat': ocean, waterscape.


27

Step 3. Context fusion on the test data, using the related concepts (ocean, waterscape).

28

Step 4. Merge the reranked test shots with the training shots into the final list.
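Steps 1–4 can be sketched end to end. Everything concrete below — the synthetic scores, the fixed importance weights, the linear fusion form, and the simple merge — is my illustration; the slides leave the pair-based learning step abstract.

```python
import random

random.seed(0)

# Synthetic detection scores for the target concept 'boat' and two related
# concepts over 10 shots of an initial ranking list.
shots = list(range(10))
boat = [random.random() for _ in shots]
ocean = [random.random() for _ in shots]
waterscape = [random.random() for _ in shots]

# Step 1: randomly split the initial list into training and test halves.
random.shuffle(shots)
train, test = shots[:5], shots[5:]

# Step 2 (stub): the related concepts and their importances would be learned
# from ranking-order-preserving shot pairs on `train`; fixed weights here.
importance = {"ocean": 0.3, "waterscape": 0.2}

# Step 3: context fusion on the test half only.
fused = {t: boat[t]
            + importance["ocean"] * ocean[t]
            + importance["waterscape"] * waterscape[t]
         for t in test}

# Step 4: merge the reranked test shots back with the training shots.
merged = (sorted(train, key=lambda t: boat[t], reverse=True)
          + sorted(test, key=lambda t: fused[t], reverse=True))
```

The split-learn-fuse loop is what makes the fusion unsupervised with respect to the test data: the importance weights are fitted only on the training half, so the reranking of the test half never sees its own ground truth.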


29

Combination

[Framework diagram repeated, with the combination step highlighted.]

30

Post-processing Benefits

[Chart: per-concept infAP (0.05–0.3) comparing classification, concept reranking, temporal filtering, and parallel combination over the same concept set as before, plus Overall.]

Overall improvement ratio: parallel combination +10% (other figures labeled in the chart: +45%, +29%, +37%, +17%).


31

Conclusion

  • We reduce the training time of detectors by using a toolkit tailored for semantic concept detection
  • The proposed aggregation methods reuse the classifiers of past data and can boost detection accuracy
  • Our post-processing approaches exploit existing resources and can further improve detection results

32

Thank You For Your Attention