Web Mining
Based on several presentations found on the web: Shapiro, Ullman, Terziyan, Pedersen ...
What is Web Mining?
Web mining is the use of data mining techniques to
automatically discover and extract information from Web documents/services (Etzioni, 1996, CACM 39(11))
Web mining aims to discover useful information or
knowledge from the Web hyperlink structure, page content and usage data. (Bing Liu, 2007, Web Data Mining, Springer)
What is Web Mining?
Motivation / Opportunity
The WWW is a huge, widely distributed, global information service
centre and therefore constitutes a rich source for data mining
Intelligent Web Search
Personalization, Recommendation Engines
Web-commerce applications
Building the Semantic Web
Web page classification and categorization
News classification and clustering
Information / trend monitoring
Analysis of online communities
Web and mail spam filtering
Abundance and authority crisis
Liberal and informal culture of content generation and
dissemination
Redundancy and non-standard form and content
Millions of qualifying pages for most broad queries
Example: java or kayaking
No authoritative information about the reliability of a site
Little support for adapting to the background of specific users
Pages are added continuously, and the average page changes within a few weeks