Relevance of Time Spent on Web Pages, WEBKDD, August 20, 2006

SLIDE 1

Relevance of Time Spent on Web Pages

Peter I. Hofgesang

hpi@few.vu.nl

WEBKDD August 20, 2006, Philadelphia, USA

SLIDE 2

Intention of an online visitor

  • Real-world: customers have the ability to explicitly express what they are looking for
  • Web: intention is hidden and can only be partially revealed from implicit indicators in the traces users leave behind

WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA

SLIDE 3

(Broadly) Available information

  • Order of visited pages (P1P2P3 …)
  • Page popularity (nr. of times visited)
  • Time Spent on Page (TSP)?

    – claimed to be important in IR, HCI, E-learning
    – only rarely used in WUM
    – details are often not reported (however, preprocessing is not obvious!)

SLIDE 4

Example I

[Figure: TSoP (seconds, axis ticks 20 to 120) for each of the 26 pages visited in one session.]

Pages visited, in order: Home (1), SALES-Rest (2), SALES-Household (3–4), SHOP-Household (5–10), SALES-Rest (11), SALES-Man (12–19), SALES-Woman (20–26)

SLIDE 5

Example II

[Figure: TSoP (seconds, axis ticks 20 to 200) for each of the 11 pages visited in one session.]

Pages visited, in order: SHOP-PC (1), Info (2), SHOP-TV (3), SHOP-Child (4), SHOP-PC (5), Info (6), SHOP-Child (7), SHOP-Home (8), SHOP-Child (9), SHOP-Home (10), SHOP-PC (11)

SLIDE 6

Example III

[Figure: TSoP (seconds, axis ticks 500 to 1500) for each of the 25 pages visited in one session.]

Pages visited, in order: Home (1), SHOP-TV (2), CART-Home (3), CART-Add (4–5), CART-Home (6), Order (7–9), Personal (10–15), Home (16), SHOP-DVD (17–25)

SLIDE 7

Influential factors I

$TSP_1 = T_2 - T_1$ (optimistic!)

  • Data preprocessing

    – filtering out robot transactions
    – session identification

  • Distraction
[Figure: Number of clicks (scale ×10^5, ticks 0.2 to 2.2) against Time (seconds, ticks 10 to 110) for three data sets: Bank, Retail 1, Retail 2.]
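
A minimal Python sketch of the optimistic estimate (time of the next click minus time of the current click), applied after a simple timeout-based session split. The record layout, the 30-minute threshold, and the function names are assumptions made for illustration; the slides do not specify them. The last page of a session has no following click, so its time is left unknown.

```python
from typing import List, Optional, Tuple

Click = Tuple[float, str]  # (timestamp in seconds, page label); assumed layout

SESSION_TIMEOUT = 30 * 60  # assumed threshold: a longer gap starts a new session


def split_sessions(clicks: List[Click], timeout: float = SESSION_TIMEOUT) -> List[List[Click]]:
    """Split one visitor's clicks into sessions wherever the gap exceeds the timeout."""
    sessions: List[List[Click]] = []
    for click in sorted(clicks):  # order by timestamp
        if sessions and click[0] - sessions[-1][-1][0] <= timeout:
            sessions[-1].append(click)
        else:
            sessions.append([click])
    return sessions


def tsp1(session: List[Click]) -> List[Tuple[str, Optional[float]]]:
    """Optimistic time spent on page: next timestamp minus current timestamp."""
    result: List[Tuple[str, Optional[float]]] = []
    for i, (t, page) in enumerate(session):
        if i + 1 < len(session):
            result.append((page, session[i + 1][0] - t))
        else:
            result.append((page, None))  # no following click: time unknown
    return result


clicks = [(0.0, "Home"), (12.0, "SHOP-TV"), (95.0, "CART-Add"), (4000.0, "Home")]
for session in split_sessions(clicks):
    print(tsp1(session))
```

The estimate is optimistic because everything between two clicks, including page loading and any time the visitor spent away from the screen, is counted as time on the page.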

SLIDE 8

Influential factors II

  • Page type (Granularity of pages)
  • Hierarchy
  • Network bandwidth and server load
  • Speed of reading, etc.

$TSP_2 = T_2 - T_1 - T_{\text{networkTraffic}} - T_{\text{serverPageGeneration}} - T_{\text{distraction}}$
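
The corrected estimate can be sketched in Python as below. How the three overhead terms are measured is not given on the slide, so they are plain parameters here; flooring the result at zero and the cap-based distraction estimate (`estimate_distraction`, with its 600-second default) are assumptions added for illustration only.

```python
def estimate_distraction(raw_seconds: float, cap: float = 600.0) -> float:
    """Hypothetical heuristic: treat any time above `cap` as distraction."""
    return max(0.0, raw_seconds - cap)


def tsp2(t1: float, t2: float, network: float = 0.0,
         server: float = 0.0, distraction: float = 0.0) -> float:
    """Raw gap between two clicks minus the estimated overheads.

    The result is floored at 0 (an assumption) so that over-estimated
    overheads cannot produce a negative time on page.
    """
    return max(0.0, (t2 - t1) - network - server - distraction)


# A 60-second gap with 1.5 s network time, 0.5 s page generation, 20 s distraction.
print(tsp2(100.0, 160.0, network=1.5, server=0.5, distraction=20.0))  # 38.0
```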

SLIDE 9

Clustering

[Figure: two sets of five cluster panels plotted over PageId (1 to 40); each panel is labeled with a percentage. The sets are captioned "Cadez et al. (2001)" and "Similarity based".]

First set: 27.63%, 21.18%, 19.45%, 19.14%, 12.6%
Second set: 33.03%, 22.01%, 19.56%, 17.63%, 7.78%

PageId legend: 1 Home, 2 Personal, 3 SHOP-Woman, 4 SHOP-Man, 5 SHOP-Child, 6 SHOP-Beauty, 7 SHOP-Home, 8 SHOP-Household, 9 SHOP-Sport, 10 SHOP-PC, 11 SHOP-GSM, 12 SHOP-DVD, 13 SHOP-TV, 14 SHOP-Games, 15 SHOP-Special, 16 SHOP-Rest, 17 SALES-Woman, 18 SALES-Man, 19 SALES-Child, 20 SALES-Beauty, 21 SALES-Home, 22 SALES-Household, 23 SALES-Sport, 24 SALES-PC, 25 SALES-GSM, 26 SALES-DVD, 27 SALES-TV, 28 SALES-Games, 29 SALES-Rest, 30 Search, 31 CART-Home, 32 CART-Add, 33 CART-Change, 34 CART-Remove, 35 DirectOrder, 36 Order, 37 ThankYou, 38 Info, 39 Contact, 40 Catalog
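
The slide does not say which similarity measure the similarity-based clustering uses. As a hedged illustration only, the Python sketch below represents each session as the fraction of its time spent per page category and compares two sessions with cosine similarity; both the representation and the choice of cosine similarity are assumptions, not the method behind the figure.

```python
import math
from typing import Dict, List, Tuple


def time_profile(session: List[Tuple[str, float]]) -> Dict[str, float]:
    """Fraction of the session's total time spent in each page category."""
    totals: Dict[str, float] = {}
    for page, seconds in session:
        totals[page] = totals.get(page, 0.0) + seconds
    total = sum(totals.values())
    if total <= 0:
        return {}
    return {page: seconds / total for page, seconds in totals.items()}


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse profiles; 0.0 if either one is empty."""
    dot = sum(value * b.get(page, 0.0) for page, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


s1 = time_profile([("SHOP-DVD", 30.0), ("SHOP-DVD", 30.0), ("Home", 60.0)])
s2 = time_profile([("SHOP-DVD", 90.0), ("Home", 10.0)])
print(cosine_similarity(s1, s2))
```

A pairwise similarity of this kind could feed any standard clustering procedure; which one fits best would depend on the data.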

SLIDE 10

Conclusion

  • TSP is a sensitive measure
  • Web log data preprocessing and time normalization required
  • Added value in identifying user intention
  • For many applications the combination of TSP and frequency may be the optimal choice
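
One way to combine TSP and frequency, sketched in Python under stated assumptions: each page category gets a weighted sum of its share of session time and its share of visits. The linear form, the default weight of 0.5, and the function name `combined_relevance` are illustrative choices, not something the slides prescribe.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def combined_relevance(session: List[Tuple[str, float]],
                       weight: float = 0.5) -> Dict[str, float]:
    """Score per page: weight * time share + (1 - weight) * visit share."""
    visits = Counter(page for page, _ in session)
    seconds: Dict[str, float] = defaultdict(float)
    for page, t in session:
        seconds[page] += t
    total_visits = sum(visits.values())
    total_seconds = sum(seconds.values())
    scores: Dict[str, float] = {}
    for page in visits:
        time_share = seconds[page] / total_seconds if total_seconds > 0 else 0.0
        visit_share = visits[page] / total_visits
        scores[page] = weight * time_share + (1.0 - weight) * visit_share
    return scores


session = [("Home", 10.0), ("SHOP-DVD", 50.0), ("SHOP-DVD", 40.0), ("Info", 0.0)]
for page, score in sorted(combined_relevance(session).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

Setting `weight` to 1.0 ranks by time only and 0.0 ranks by frequency only, which makes the two signals easy to compare on the same data.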

SLIDE 11

Future (current) work

  • Objective measures of relevance
  • Normally a field experiment is needed to provide some kind of labeled data
  • Special testbed
    – e.g., in case of a retail shop environment we have special labels for buyers
    – the purchased items indicate user interest and can be compared with the visit
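
A hedged sketch of how such a comparison might be scored in Python: for each buyer session, check whether the page category with the most time matches the category of the purchased item. The mapping of purchases to page categories, the "top category" criterion, and the names `top_category_by_time` and `hit_rate` are assumptions for illustration; the slides do not define an evaluation procedure.

```python
from typing import Dict, List, Optional, Tuple

Session = List[Tuple[str, float]]  # (page category, seconds on page); assumed layout


def top_category_by_time(session: Session) -> Optional[str]:
    """Page category with the largest total time; None for an empty session."""
    totals: Dict[str, float] = {}
    for page, seconds in session:
        totals[page] = totals.get(page, 0.0) + seconds
    if not totals:
        return None
    return max(totals, key=lambda page: totals[page])  # ties: first category seen


def hit_rate(labeled: List[Tuple[Session, str]]) -> float:
    """Fraction of buyer sessions whose top category equals the purchased one."""
    if not labeled:
        return 0.0
    hits = sum(1 for session, bought in labeled
               if top_category_by_time(session) == bought)
    return hits / len(labeled)


labeled_sessions = [
    ([("Home", 5.0), ("SHOP-DVD", 120.0), ("SHOP-TV", 30.0)], "SHOP-DVD"),
    ([("SHOP-PC", 200.0), ("SHOP-TV", 40.0)], "SHOP-TV"),
]
print(hit_rate(labeled_sessions))  # 0.5
```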

SLIDE 12

Questions?