SLIDE 1

Mining Interesting Link Formation Rules in Social Networks

Cane Wing-Ki Leung, Ee-Peng Lim, David Lo, Jianshu Weng
School of Information Systems, Singapore Management University

SLIDE 2

Outline

  • Introduction
  • Methodology
  • Empirical Study
  • Conclusions

2 CIKM'10

27/10/2010

SLIDE 3

Introduction

  • Propose the task of mining interesting link formation rules in social networks
  • Goal: examine how links are formed in social networks as a structural effect

SLIDE 4

Example: Reciprocity Effect

  • A simple example, the reciprocity effect:

– We are given a pair of nodes, called the start node s and the end node e
– Suppose we know that e trusts s at a certain time point. Questions:

  • Will s also trust e later?
  • How frequently/likely will this happen?
  • What other connections between s and e may lead to link formation?

SLIDE 5

More on the task

  • Will s trust e later?

– A temporal constraint
– A partial order in which “s trusts e” is formed after all other links connecting s and e

  • How frequently/likely will this happen?

– Quantifying the interestingness of the observed patterns

SLIDE 6

More on the task

  • What other connections between s and e may lead to link formation?

– Structural constraints require that s and e be connected in some way
– We consider dyadic and triadic structures, aka local structures, as they have long been used in sociology for studying and predicting the dynamics of large, complex networks
– Seek to mine interesting patterns that obey such constraints

SLIDE 7

Outline

  • Introduction
  • Methodology
  • Empirical Study
  • Conclusions

SLIDE 8

Methodology

  • We propose to study local structures for link formation in social networks

– Introduce link formation rules (LF-rules) as special subgraph patterns
– Formulate our task as a subgraph mining task in a social network, modeled as a directed, labeled, temporal graph
– Devise a subgraph mining approach (introduced next)
– Apply the proposed approach to two real-world datasets
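As a rough illustration of the graph model named above (directed, labeled, temporal), the following Python sketch stores each link with a source, a destination, a label and a timestamp. The names `Link` and `TemporalGraph`, the "+"/"-" labels and the integer timestamps are assumptions made for this example; they do not come from the talk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    src: str    # node that creates the link
    dst: str    # node that receives the link
    label: str  # assumed encoding: "+" for trust/friend, "-" for distrust/foe
    time: int   # assumed integer timestamp of link formation


class TemporalGraph:
    """Hypothetical container for a directed, labeled, temporal graph."""

    def __init__(self, links):
        self.links = list(links)
        self.out = defaultdict(list)  # outgoing links per node
        self.inc = defaultdict(list)  # incoming links per node
        for link in self.links:
            self.out[link.src].append(link)
            self.inc[link.dst].append(link)

    def nodes(self):
        return set(self.out) | set(self.inc)

    def link(self, src, dst):
        """Return the link src -> dst, or None if it does not exist."""
        for candidate in self.out.get(src, []):
            if candidate.dst == dst:
                return candidate
        return None


g = TemporalGraph([
    Link("a", "b", "+", 1),
    Link("b", "a", "+", 2),
    Link("a", "c", "-", 3),
])
```

Keeping the timestamp on each link is what later allows a pattern to require that one link was formed after the others.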

SLIDE 9

Methodology Overview

– Mine LF-rules from a given social network
– Apply a randomization technique to the network to estimate the expected support of LF-rules in a random graph
– Evaluate interesting rules with higher-than-expected support

SLIDE 10

LF-Patterns and Rules

  • LF-pattern:

– a graph pattern built upon dyadic and/or triadic structures
– in any actual occurrence of an LF-pattern, the link from s to e, or simply (s,e), is formed after all other links in the same pattern
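The temporal constraint can be sketched for one triadic pattern: s links to an intermediary m, m links to e, and the (s,e) link forms afterwards. The function name `triad_occurrences`, the dictionary layout and the strict "earlier than" comparison are assumptions for illustration; the talk does not say how timestamp ties are handled.

```python
def triad_occurrences(times):
    """Return sorted (s, m, e) triples where s->m and m->e both precede s->e.

    `times` maps a directed link (u, v) to its formation timestamp.
    """
    found = []
    for (s, e), t_se in times.items():
        for (u, m), t_sm in times.items():
            # The intermediary must be reached from s and differ from s and e.
            if u != s or m in (s, e):
                continue
            t_me = times.get((m, e))
            # Temporal constraint: (s, e) is formed after all other links.
            if t_me is not None and t_sm < t_se and t_me < t_se:
                found.append((s, m, e))
    return sorted(found)


times = {
    ("s", "m"): 1, ("m", "e"): 2, ("s", "e"): 5,  # valid occurrence
    ("x", "y"): 1, ("y", "z"): 4, ("x", "z"): 3,  # (x, z) formed too early
}
occurrences = triad_occurrences(times)
```

The second triangle has the same shape as the first but is rejected, because its (x,z) link predates one of the links it would depend on.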

SLIDE 11

LF-Patterns and Rules

  • LF-rule:

– generated from an LF-pattern
– consists of a precondition and a postcondition
– the (s,e) link in our illustrations is always the postcondition

SLIDE 12

Mining LF-Rules

  • LF-patterns define the structural constraints of LF-rules

– each LF-pattern captures the formation of a link from a node s, called the start node, to another node e, called the end node

  • Mining LF-rules:

– we are given a graph G, a predefined minimum frequency (support) and a predefined minimum confidence
– find all LF-patterns that satisfy the frequency threshold
– generate LF-rules from the frequent LF-patterns and compute their confidence values
– retain those that satisfy the confidence threshold
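The two thresholds act as a two-stage filter, which the sketch below illustrates on precomputed statistics. The function `mine_rules`, the pattern names P1 to P3 and all numbers are made up for this example; the actual pattern enumeration in G is not shown.

```python
def mine_rules(pattern_stats, min_support, min_confidence):
    """Keep frequent patterns first, then keep rules that are confident enough."""
    frequent = {
        name: stats
        for name, stats in pattern_stats.items()
        if stats["support"] >= min_support
    }
    return sorted(
        name
        for name, stats in frequent.items()
        if stats["confidence"] >= min_confidence
    )


# Made-up statistics for three hypothetical patterns.
stats = {
    "P1": {"support": 0.30, "confidence": 0.40},
    "P2": {"support": 0.02, "confidence": 0.90},  # fails the support threshold
    "P3": {"support": 0.25, "confidence": 0.05},  # frequent, but low confidence
}
rules = mine_rules(stats, min_support=0.10, min_confidence=0.20)
```

Applying the support threshold first means confidence only has to be computed for patterns that are already known to be frequent.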

SLIDE 13

Mining LF-Rules

  • Each LF-rule is associated with

– a support value: percentage of nodes in G that served as the node s of the rule at least once
– a confidence value: the likelihood that the (s,e) link exists given that the precondition connecting s and e exists

  • Example:

– Support: ~24% of nodes in G served as node s of this rule
– Confidence: Among the nodes that received a link from another node, ~32% of them reciprocated the link
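For the reciprocity rule, the two measures can be computed on a toy graph as follows. This is a minimal sketch under a node-based reading of the example above: confidence is taken as the share of link-receiving nodes that later reciprocated. The function name `reciprocity_scores` and the toy data are assumptions, not the authors' implementation.

```python
def reciprocity_scores(times):
    """Support and confidence of the rule: e links to s, then s links to e.

    `times` maps a directed link (u, v) to its formation timestamp.
    """
    nodes = {node for link in times for node in link}
    # Nodes that received at least one link: the precondition holds for them.
    receivers = {dst for (_src, dst) in times}
    # Nodes that served as s at least once: they linked back after being linked.
    reciprocators = {
        s for (s, e), t in times.items()
        if (e, s) in times and times[(e, s)] < t
    }
    support = len(reciprocators) / len(nodes)
    confidence = len(reciprocators) / len(receivers)
    return support, confidence


toy = {
    ("b", "a"): 1, ("a", "b"): 2,  # a reciprocates b's earlier link
    ("c", "a"): 3, ("a", "d"): 4, ("d", "c"): 5, ("e", "b"): 6,
}
support, confidence = reciprocity_scores(toy)
```

In the toy graph only node a reciprocates, so support is 1 out of 5 nodes and confidence is 1 out of the 4 nodes that received a link.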

SLIDE 14

Graph Randomization

  • Why?

– LF-rules may exist in the network just by chance

  • How?

– One possibility is graph randomization: randomize an input graph G, but preserve important nodal properties
– Compute the support of LF-rules from the randomized graph, called expected support

  • We randomized the connectivity in G while preserving its in-degree, out-degree, label and timestamp distributions
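One common way to randomize connectivity while keeping these distributions is a degree-preserving edge swap, sketched below. This is a swapped-in, generic technique and not necessarily the procedure used in the talk; the function name `randomize`, the tuple layout and the number of swaps are assumptions.

```python
import random
from collections import Counter


def randomize(links, n_swaps, seed=0):
    """Rewire a list of (src, dst, label, time) links by swapping targets.

    Each swap turns a->b, c->d into a->d, c->b. Every link keeps its own
    label and timestamp, so those distributions are unchanged, as are the
    in-degree and out-degree of every node.
    """
    rng = random.Random(seed)
    links = list(links)
    present = {(s, d) for s, d, _, _ in links}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(links)), 2)
        (a, b, l1, t1), (c, d, l2, t2) = links[i], links[j]
        # Reject swaps that would create a self-loop or a duplicate link.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        links[i], links[j] = (a, d, l1, t1), (c, b, l2, t2)
    return links


g = [("a", "b", "+", 1), ("b", "c", "+", 2), ("c", "d", "-", 3),
     ("d", "a", "+", 4), ("a", "c", "-", 5)]
r = randomize(g, n_swaps=100, seed=1)
out_degree_kept = Counter(s for s, *_ in r) == Counter(s for s, *_ in g)
in_degree_kept = Counter(d for _, d, *_ in r) == Counter(d for _, d, *_ in g)
```

Mining the same rules on `r` instead of `g` would then give one sample of their support under chance connectivity.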

SLIDE 15

Measuring (Un)Expectedness

  • Expected Support of a rule w.r.t. G

– its support in G’, the randomized version of G

  • Surprise of a rule

– support divided by expected support of a rule
– the higher, the more “surprising” or “unexpected”

  • If link formation does follow some rules, we should expect those rules to have higher support in G than in G’
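The surprise measure is a simple ratio, sketched below. Averaging the support over several randomized graphs is an assumption about how the samples are combined, and the function name `surprise` and the input numbers are hypothetical.

```python
from statistics import mean


def surprise(support, randomized_supports):
    """Support in G divided by the expected support.

    The expected support is assumed here to be the mean support over
    the randomized graphs.
    """
    expected = mean(randomized_supports)
    if expected == 0:
        return float("inf")  # the rule never occurs by chance in the samples
    return support / expected


# Hypothetical rule: support 0.5 in G, lower support in three randomized graphs.
score = surprise(0.5, [0.125, 0.25, 0.375])
```

A score near 1 would mean the rule is about as frequent as chance connectivity predicts, while a score well above 1 marks it as unexpected.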

SLIDE 16

Summary of Methodology

  • Introduce LF-patterns and LF-rules

– capture structural and temporal constraints

  • Devise a subgraph mining algorithm to find and count such patterns in a graph G

– output: a set of LF-rules R with sufficient support and confidence

  • Conduct graph randomization on G

– measure the expected support and surprise values of all rules in R

  • Present interesting rules in R with high surprise values

SLIDE 17

Outline

  • Introduction
  • Methodology
  • Empirical Study
  • Conclusions

SLIDE 18

Datasets

  • Epinions

– Web of Trust, with trust (+ve) and distrust (-ve) links

  • myGamma, courtesy of BuzzCity

– friendship network, with friend (+ve) and foe (-ve) links

  • Expected support computed based on 10 randomized samples of the graphs

SLIDE 19

Interesting LF-rules in myGamma

  • We focus on myGamma, for which the complete history and ordering of friendship links are available
  • Top-5 rules in terms of support

– report their interestingness scores in terms of support, expected support, surprise (supp/exp. supp), and confidence

SLIDE 20

Interestingness scores

support   expected support   surprise (supp/exp. supp)   confidence
28.91%    22.41%             1.29                        43.22%
28.38%    22.37%             1.27                        43.1%
25.42%    13.54%             1.88                        39.15%
24.37%    1.22%              20.06                       31.98%
20.55%    11.49%             1.79                        27.52%
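The surprise column in the scores above can be recomputed from the other two columns as a quick consistency check. The sketch below uses only the numbers shown on the slide; the variable names are arbitrary.

```python
# Rows copied from the slide: (support %, expected support %, reported surprise).
rows = [
    (28.91, 22.41, 1.29),
    (28.38, 22.37, 1.27),
    (25.42, 13.54, 1.88),
    (24.37, 1.22, 20.06),
    (20.55, 11.49, 1.79),
]

# Surprise is support divided by expected support.
recomputed = [round(supp / exp, 2) for supp, exp, _ in rows]
gaps = [round(abs(r - reported), 2) for r, (_, _, reported) in zip(recomputed, rows)]

for (supp, exp, reported), r in zip(rows, recomputed):
    print(f"{supp:6.2f} / {exp:5.2f} = {r:5.2f} (reported {reported})")
```

Four rows reproduce exactly. The fourth gives 19.98 rather than 20.06, a small gap that is plausibly due to the displayed percentages being rounded, since its expected support is very small.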

SLIDE 21

Other Observations

  • Users tend to rely more on mutually trusted friends in forming new friendship links. For example,

– R12 (right) has much higher confidence (~34% vs. ~22%) and surprise values (5.32 vs. 3.52) than R11 (left)

SLIDE 22

Other Observations

  • In myGamma, 3.45% of users reciprocated a friend link from another user with a foe link, but with a much lower likelihood (15.98%) than reciprocal friend links (31.98%)

– probably due to “unwanted friendship”
– not frequent/interesting in Epinions, as “unwanted trustor” is not an issue

SLIDE 23

Other Observations

  • If a user has formed a link based on a given precondition through an intermediary (e.g. a common friend), then there is a good chance that s(he) has formed a link based on multiple occurrences of the same precondition

– 29% of users support R5 (left)

  • About two-thirds of them also support R32 (middle)
  • About one-third of them also support R34 (right)

SLIDE 24

Outline

  • Background

– our task and motivations

  • Methodology
  • Results on myGamma
  • Conclusions

SLIDE 25

Conclusions

  • We proposed the task of mining interesting link formation rules in social networks

– Introduced the notions of LF-patterns and LF-rules, in which a new link between a node pair is formed as a structural effect of preexisting links
– Formulated as a subgraph mining task from a directed, labeled, temporal graph

  • Proposed a comprehensive subgraph mining approach

– Devised an LF-rule mining algorithm based on gSpan
– Presented LF-rules with higher-than-expected support

SLIDE 26

Thank You!
