AITOK at the NTCIR-14 OpenLiveQ-2 (Tokushima University, Hiroki Tanioka)

SLIDE 1

AITOK at the NTCIR-14 OpenLiveQ-2

Tokushima University Hiroki Tanioka

SLIDE 2

Good Morning!

I am Hiroki Tanioka.

I am here because I got a notice from OpenLiveQ-2.

You can call me just “Hiroki”.

SLIDE 3

NTCIR-14 OpenLiveQ-2

WHAT IS TARGET?

OpenLiveQ-2 requires participants to submit a sorted QA list for each query. The queries are short and consist of a few keywords. The QAs are released with statistics including clickthrough rate, view count, updated time, etc. 1,000 QA lists are for training; another 1,000 QA lists are for testing.

HOW TO EVALUATE?

Submitted QA lists are evaluated in two phases: an offline test and an online test.

The offline test calculates accuracy with measures such as nDCG and Q-measure, using a prepared answer list. The online test compares submitted QA lists against each other with live users at Yahoo! Chiebukuro.

More info at http://www.openliveq.net/
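
As an illustration of the offline measures named above, here is a minimal sketch of nDCG in one common formulation (graded gain with a log2 rank discount). The slides do not state which variant the task uses, so the gain definition, the cutoff `k`, and the example grades are assumptions made for illustration.

```python
import math
from typing import Sequence


def dcg(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k gains (log2 rank discount)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg(gains: Sequence[float], k: int) -> float:
    """DCG of the submitted order divided by the DCG of the ideal order."""
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades of the questions, in submitted rank order.
submitted = [1, 0, 2, 0]
print(round(ndcg(submitted, k=4), 4))
```

A list that is already in ideal order scores 1.0, and any misordering of graded items lowers the score.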

SLIDE 4

Questions are commonly expressed so as to elicit information and to require resolution or discussion from users. But the readers are not all the same.

SLIDE 5

What is a question?

To find out which statistics make a question catchy in QA systems, I participated in OpenLiveQ-2.

SLIDE 6

1.

Offline Test

Why climb a mountain? Because there is a mountain.

SLIDE 7

My Strategy to Climb the Mountain

▧ Research the previous case in NTCIR-13
▧ Gather available information
▧ Tune in a way similar to linear programming

Anyway, I climbed to the top of the mountain.

SLIDE 8

Let’s review some available information

Questions

Query ID, rank of the question in the search results, title of the question, snippet, body of the question

Status

Status of the question

Updated time

Last update time of the question

Answers

Number of answers for the question, body of the best answer to the question

Views

Page views of the question

Clickthrough

Most frequent rank of the question, clickthrough rate

SLIDE 9

Where is my blue bird? (Q-score)

▧ In the offline test, we continued to submit a run a day from the end of August.

[Chart: AITOK rank (axis 1 to 61) by submission date]

Date  Q-measure  Rank  Desc
8/24  0.38194    56    This result is only for uploading test from AITOK.
8/25  0.39724    45    1-gram TF-IDF with click through rate with cutoff
8/26  0.39852    44    1-gram TF-IDF+ with click through rate with cutoff
8/27  0.40479    43    2-gram TF-IDF+ with click through rate with cutoff
8/28  0.42008    40    2-gram TF-IDF+ with click through rate with cutoff without rank
8/29  0.41748    42    Dependent 2-gram TF-IDF with click through rate with cutoff without rank
8/30  0.4391     31    2-gram TF-IDF+ with click with cutoff and view without rank
8/31  0.42676    36    2-gram TF-IDF+ with click and view with cutoff without rank
9/1   0.43231    33    cutoff and view
9/2   0.49363    14    view count
9/3   0.49319    16    click through and view count
9/4   0.49347    15    view count sorted with click, updated, answers, order, rank and cutoff
9/5   0.49393    13    view count sorted with answers, cutoff, click, updated, order and rank
9/6   0.499      4     view count sorted with answers x tf-idf weighted by query
9/7   0.5        3     view count + answers x 2-gram tf-idf weighted by query
9/8   0.50152    1     view count + answers x snippet 2-gram tf-idf weighted by query
9/9   0.49838    5     view count + answers x snippet 2-gram tf-idf weighted by query
9/10  0.50028    2     view count + answers x snippet 2-gram tf-idf double-weighted by norm query
9/11  0.49483    7     view count + answers x snippet word2vec double-weighted by norm query
9/12  0.49427    9     view count + answers x snippet word2vec double-weighted by norm query v2
9/13  0.49412    12    view count + answers x snippet L1 word2vec double-weighted by norm query
9/14  0.49437    8     view count + answers x snippet cos word2vec double-weighted by norm query
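
Several runs above are described as 1-gram or 2-gram TF-IDF. The slides do not say how the n-grams or weights were computed, so the following is only a sketch under stated assumptions: character n-grams, a smoothed IDF, and a score that sums the TF-IDF weights of the query's n-grams found in the document. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def char_ngrams(text: str, n: int = 2) -> List[str]:
    """Split text into overlapping character n-grams (an assumption)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def idf_table(docs: List[str], n: int = 2) -> Dict[str, float]:
    """Smoothed inverse document frequency for every n-gram seen in docs."""
    df = Counter(g for d in docs for g in set(char_ngrams(d, n)))
    return {g: math.log((1 + len(docs)) / (1 + c)) + 1.0 for g, c in df.items()}


def tfidf_score(query: str, doc: str, idf: Dict[str, float], n: int = 2) -> float:
    """Sum the TF-IDF weights of the query's n-grams that occur in doc."""
    tf = Counter(char_ngrams(doc, n))
    return sum(tf[g] * idf.get(g, 0.0) for g in set(char_ngrams(query, n)))


# Hypothetical snippets; real inputs would be question titles or snippets.
snippets = ["tokyo ramen shop", "ramen recipe at home", "train to osaka"]
idf = idf_table(snippets)
scores = [tfidf_score("ramen", s, idf) for s in snippets]
print(scores)
```

Character n-grams avoid the need for a word segmenter, which is one reason they are often used for Japanese text; whether the runs above did so is not stated in the slides.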

SLIDE 10

Where is my blue bird? (Q-score)

▧ Which is important ? (view count, answers, click-through, update date, etc.)

[Chart: AITOK rank (axis 1 to 61) by submission date, with 8/24, 8/30, 9/1, 9/5, and 9/14 marked, shown next to the same daily run table as on Slide 9]

SLIDE 11

Where is my blue bird? (Q-score)

▧ View count and answers contribute most to the top scores in the offline test.

[Chart: AITOK rank (axis 1 to 61) by submission date, shown next to the same daily run table as on Slide 9]

SLIDE 12

Where is my blue bird? (Q-score)

▧ I tried using word2vec.

[Chart: AITOK rank (axis 1 to 61) by submission date, with 8/24, 8/30, 9/2, 9/8, 9/11, and 9/14 marked, shown next to the same daily run table as on Slide 9]

SLIDE 13

Offline Test Result (Q-measure, nDCG, ERR)

▧ We took first place in almost all measures.

[Bar chart: Q-score (axis 0.1 to 0.5) for each run, labeled as run ID[team] for teams ADAPT, YJRS, ORG, OKSAT, and AITOK; run 131[AITOK] is marked with an asterisk, and AITOK runs 107, 109, 111, and 115 are marked with @.]

SLIDE 14

My blue bird is here!

My blue bird, "Catchy", is hiding in the view count and TF-IDF:

▧ Views: the page view count of the question
▧ Questions: body and snippet with TF-IDF

Besides,

▧ Status: unconfirmed
▧ Updated time: unconfirmed
▧ Answers: a slight effect
▧ Clickthrough: a slight effect
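
The best offline runs are described as "view count + answers x snippet 2-gram tf-idf weighted by query", but the slides do not give the formula. The sketch below shows one way such a blend could look; the log scaling of views, the `text_weight` parameter, and all names are assumptions for illustration, not the author's method.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Question:
    question_id: str
    views: int         # page views of the question
    answers: int       # number of answers
    text_score: float  # precomputed query/snippet TF-IDF score


def catchy_score(q: Question, text_weight: float = 1.0) -> float:
    """Hypothetical blend: log-scaled views plus answers times text score."""
    return math.log1p(q.views) + text_weight * q.answers * q.text_score


def rank(questions: List[Question]) -> List[str]:
    """Return question IDs ordered from highest to lowest blended score."""
    ordered = sorted(questions, key=catchy_score, reverse=True)
    return [q.question_id for q in ordered]


# Hypothetical candidates for one query.
candidates = [
    Question("a", views=10, answers=1, text_score=0.0),
    Question("b", views=1000, answers=0, text_score=0.0),
    Question("c", views=10, answers=5, text_score=2.0),
]
print(rank(candidates))
```

Log scaling keeps a few very popular questions from dominating the text signal; the actual runs may have combined the features differently.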

SLIDE 15

2.

Online Test

To the forest…

SLIDE 16

Online Test Result (top60 credit)

▧ ID131 is overtaken by ID111.

[Bar chart: credit for the top 60 runs, with axes labeled "Credit" and "Team ID(Run ID)"; runs are labeled as run ID[team] for teams ORG, OKSAT, YJRS, ADAPT, and AITOK, and run 131[AITOK] is marked with an asterisk.]

SLIDE 17

Online Test Result (top30 credit)

▧ Where did my blue bird fly away?

[Bar chart: credit for the top 30 runs, labeled as run ID[team] for teams OKSAT, AITOK, ADAPT, and YJRS; run 131[AITOK] is marked with an asterisk, and AITOK runs 107, 109, 111, and 115 are marked with @.]

SLIDE 18

What’s happened?

ID131 was at the top of the offline test, but ID111 took over in the online test.

SLIDE 19

To the forest again… (top30 credit)

▧ What is ID111?

[Table: the same daily run table as on Slide 9 (Date, Q-measure, Rank, Desc), annotated with two labels:]

Using clickthrough rate
Using updated time

SLIDE 20

To the forest again… (top30 credit)

▧ What is ID111?

[Bar chart: the same top-30 credit chart as on Slide 17, annotated with three labels:]

Using clickthrough rate
Using updated time
Not using clickthrough rate

SLIDE 21

Comparison of Two Types of Tests

Offline

View count and TF-IDF-based query search are effective in every metric when evaluated with relevance judgment data.

Online

Clickthrough rate and updated time are effective in credit when evaluated with real users.
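
To make the online finding concrete, here is a minimal sketch of ordering questions by clickthrough rate, with the more recent update time breaking ties. It is not a reconstruction of run ID111, whose exact method the slides do not give; the field names and sample values are hypothetical.

```python
from datetime import datetime
from typing import List, NamedTuple


class Question(NamedTuple):
    question_id: str
    ctr: float            # clickthrough rate
    updated_at: datetime  # last update time


def rank_by_ctr_then_freshness(questions: List[Question]) -> List[str]:
    """Order by clickthrough rate, breaking ties with the newer update time."""
    ordered = sorted(questions, key=lambda q: (q.ctr, q.updated_at), reverse=True)
    return [q.question_id for q in ordered]


# Hypothetical candidates for one query.
candidates = [
    Question("a", 0.2, datetime(2018, 1, 1)),
    Question("b", 0.5, datetime(2017, 1, 1)),
    Question("c", 0.2, datetime(2018, 6, 1)),
]
print(rank_by_ctr_then_freshness(candidates))
```

Sorting on a tuple key keeps the tie-breaking rule explicit, which makes it easy to try a different feature order.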

SLIDE 22

Summary

Offline test results were incredibly good:

▧ TF-IDF guesses the intention of users.
▧ View count represents reputation.

Online test results were unexpected:

▧ Clickthrough is effective for live users.
▧ Updated time is also effective.

SLIDE 23

What is Catchy?

Reputation & Freshness

SLIDE 24

Thanks!

Any questions?

You can find me at: @taniokah tanioka.hiroki@tokushima-u.ac.jp
