SLIDE 1

Determining Top-k Nodes in Social Networks using Shapley Value

Research Supervisor: Prof. Y. Narahari

Ramasuri Narayanam nrsuri@csa.iisc.ernet.in Electronic Commerce Laboratory Department of Computer Science and Automation Indian Institute of Science Bangalore, India

May 30, 2009

Research Supervisor: Prof. Y. Narahari (IISc) ECL, CSA, IISc May 30, 2009 1 / 26

SLIDE 2

Outline of the Presentation

1. Social Networks: Introduction
2. Influential Nodes in Social Networks
3. Shapley Value based Algorithm for Top-k Nodes Problem
4. Experimental Results

SLIDE 3

Social Networks : Introduction

Social Networks

Social Networks: A social structure made up of nodes that are tied by one or more specific types of relationships.

Examples: Friendship networks, coauthorship networks, trade networks, etc.

SLIDE 6

Social Networks : Introduction

  • Real-world social networks: Orkut, wikis, blogs, etc.
  • Social networks are modeled using a graph where nodes represent individuals and edges represent the relationships between them.
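As a small illustration of the graph model, the sketch below stores an undirected network as a mapping from each node to the set of its neighbors. The helper name `build_graph` and the example names are assumptions made for illustration; the slides do not prescribe any particular data structure.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected graph as a dict: node -> set of neighbouring nodes."""
    neighbors = defaultdict(set)
    for u, v in edges:
        # A relationship between u and v is stored in both directions.
        neighbors[u].add(v)
        neighbors[v].add(u)
    return dict(neighbors)


# Hypothetical friendship ties; the names are made up for illustration.
friendships = [("asha", "bala"), ("bala", "chitra"),
               ("asha", "chitra"), ("chitra", "dev")]
graph = build_graph(friendships)
print(sorted(graph["chitra"]))
```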

SLIDE 7

Social Networks : Introduction

Features of Social Networks

SLIDE 8

Influential Nodes in Social Networks

Influential Nodes in Social Networks

SLIDE 9

Influential Nodes in Social Networks

Motivating Example 1: Diffusion of Information

  • Social networks play a key role in the spread of an innovation or technology.
  • We would like to market a new product that we hope will be adopted by a large fraction of the network.
  • Which set of individuals should we target?
  • The idea is to initially target a few influential individuals in the network who will recommend the product to their friends, and so on.
  • A natural question is to find a target set of desired cardinality consisting of influential nodes to maximize the volume of the information cascade.

SLIDE 10

Influential Nodes in Social Networks

Motivating Example 2: Co-authorship Networks

  • A co-authorship network is concerned with the collaboration patterns among research communities.
  • Nodes correspond to researchers, and an edge exists if the two corresponding researchers collaborate on a paper.
  • It is interesting to find the most prolific researchers since they are most likely to be the trend setters for breakthroughs.

SLIDE 11

Influential Nodes in Social Networks

Linear Thresholds Model

  • Call a node active if it has adopted the information.
  • Initially every node is inactive.
  • Consider a node i and let N(i) denote the set of its neighbors.
  • Node i is influenced by a neighbor node j according to a weight $w_{ij}$. These weights are normalized in such a way that $\sum_{j \in N(i)} w_{ij} \le 1$.
  • Further, each node i chooses a threshold, say $\theta_i$, uniformly at random from the interval [0, 1].
  • This threshold represents the weighted fraction of node i's neighbors that must become active in order for node i to become active.
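The setup of the model can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the dict-of-dicts layout `weights[i][j]`, and the rescaling rule used to enforce the normalization are all assumptions.

```python
import random


def normalize_weights(raw_weights):
    """Rescale each node's incoming weights so that they sum to at most 1.

    raw_weights[i][j] is a non-negative raw influence of neighbour j on node i.
    Rescaling is only one way to satisfy the constraint (an assumption here).
    """
    weights = {}
    for i, incoming in raw_weights.items():
        total = sum(incoming.values())
        scale = 1.0 / total if total > 1.0 else 1.0
        weights[i] = {j: w * scale for j, w in incoming.items()}
    return weights


def draw_thresholds(nodes, seed=None):
    """Draw one threshold per node uniformly at random.

    random.random() samples from [0.0, 1.0), which stands in for [0, 1].
    """
    rng = random.Random(seed)
    return {i: rng.random() for i in nodes}


raw = {1: {2: 2.0, 3: 2.0}, 2: {1: 0.5}, 3: {}}
weights = normalize_weights(raw)
thresholds = draw_thresholds(raw.keys(), seed=42)
```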

SLIDE 12

Influential Nodes in Social Networks

Given a random choice of thresholds and an initial set (call it S) of active nodes, the diffusion process propagates as follows:
  • In time step t, all nodes that were active in step (t − 1) remain active.
  • We activate every node i for which the total weight of its active neighbors is at least $\theta_i$.
  • If A(i) is the set of active neighbors of node i, then i gets activated if $\sum_{j \in A(i)} w_{ij} \ge \theta_i$.
  • This process stops when no new node becomes active in a time step.
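The propagation rule above can be sketched as a short simulation for one fixed choice of thresholds. The function name and the dict-of-dicts layout `weights[i][j]` (influence of neighbor j on node i) are assumptions for illustration; every node is assumed to appear as a key of `weights`.

```python
def linear_threshold_diffusion(weights, thresholds, seeds):
    """Run the linear threshold process and return the final set of active nodes.

    weights[i][j] is the influence of neighbour j on node i,
    thresholds[i] is the threshold of node i, seeds is the initial active set S.
    """
    active = set(seeds)
    while True:
        newly_active = set()
        for i, incoming in weights.items():
            if i in active:
                continue  # active nodes stay active
            # Total weight of the neighbours that were active in the previous step.
            total = sum(w for j, w in incoming.items() if j in active)
            if total >= thresholds[i]:
                newly_active.add(i)
        if not newly_active:
            return active  # no new activation in this step: stop
        active |= newly_active


# Tiny chain a -> b -> c with hand-picked (not random) thresholds.
weights = {"a": {}, "b": {"a": 0.6}, "c": {"b": 0.4}}
thresholds = {"a": 0.9, "b": 0.5, "c": 0.5}
final_active = linear_threshold_diffusion(weights, thresholds, {"a"})
```

In this example b is activated by a, but c is not, since the weight 0.4 from b is below c's threshold of 0.5.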

SLIDE 13

Influential Nodes in Social Networks

Illustrating Linear Threshold Model

SLIDE 19

Influential Nodes in Social Networks

Top-k Nodes Problem

Top-k Nodes Problem:

  • Let us define an objective function σ(.) to be the expected number of active nodes at the end of the diffusion process.
  • If S is the initial set of target nodes, then σ(S) is the expected number of active nodes at the end of the diffusion process.
  • For economic reasons, we want to limit the size of the initial active set S.
  • For a given constant k, the top-k nodes problem seeks a subset of nodes S of cardinality k that maximizes σ(S).
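Written as an optimization problem, the statement above reads as follows. The symbols V (the set of all nodes) and S* (an optimal target set) are not used on the slide and are introduced here only for illustration.

```latex
S^{*} \in \arg\max_{S \subseteq V,\; |S| = k} \sigma(S)
```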

SLIDE 20

Influential Nodes in Social Networks

Applications

  • Viral Marketing
  • Databases
  • Water Distribution Networks
  • Blogspace
  • Newsgroups
  • Virus propagation networks

----

  • R. Akbarinia, F.E. Pacitti, and F.P. Valduriez. Best Position Algorithms for Top-k Queries. In VLDB, 2007.
  • J. Leskovec, A. Krause, and C. Guestrin. Cost-effective outbreak detection in networks. In ACM KDD, 2007.
  • N. Agarwal, H. Liu, L. Tang, and P.S. Yu. Identifying influential bloggers in a community. In WSDM, 2008.

SLIDE 21

Shapley Value based Algorithm for Top-k Nodes Problem

Shapley Value based Algorithm for Top-k Nodes Problem

SLIDE 22

Shapley Value based Algorithm for Top-k Nodes Problem

Our Algorithm

  • Influence of a node: the expected number of other nodes that become active using this node.
  • We approach the top-k nodes problem using cooperative game theory.
  • We measure the influential capabilities of the nodes as provided by the Shapley value.
  • Our proposed algorithm is in two steps:
      1. construction of RankList[]
      2. choosing the top-k nodes from RankList[]
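For reference, the standard permutation form of the Shapley value is given below; the slides do not state it explicitly. Here v is the characteristic function of the cooperative game, Ω is the set of all n! orderings of the n nodes, and S_i(π) is the set of nodes that precede node i in the ordering π. Reading v(S) as the value of targeting the set S is an assumption made here.

```latex
\phi_i(v) = \frac{1}{n!} \sum_{\pi \in \Omega}
  \left[ v\bigl(S_i(\pi) \cup \{i\}\bigr) - v\bigl(S_i(\pi)\bigr) \right]
```

Since summing over all n! orderings is infeasible for large n, a common approximation averages the same marginal contribution over a sample of t orderings.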

SLIDE 23

Shapley Value based Algorithm for Top-k Nodes Problem

Construction of RankList[]

Let πj be the j-th permutation in Ω̂.
for j = 1 to t do
    for i = 1 to n do
        MC[i] ← MC[i] + v(Si(πj) ∪ {i}) − v(Si(πj))
    end for
end for
for i = 1 to n do
    compute Φ[i] ← MC[i] / t
end for
Use an efficient sorting algorithm to sort the nodes in non-increasing order based on average marginal contribution values.
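The pseudocode above can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the authors' implementation: the function name `build_rank_list`, the `seed` parameter, and the choice to draw the t permutations uniformly at random are all assumed, and the characteristic function `v` is passed in by the caller because the slide does not define it.

```python
import random


def build_rank_list(nodes, v, t, seed=0):
    """Rank nodes by their average marginal contribution over t random permutations.

    v maps a frozenset of nodes to a number (the characteristic function).
    Returns (rank_list, phi), where phi[i] approximates the Shapley value of i.
    """
    rng = random.Random(seed)
    mc = {i: 0.0 for i in nodes}
    for _ in range(t):
        perm = list(nodes)
        rng.shuffle(perm)                # one sampled permutation
        predecessors = set()             # S_i(pi): nodes placed before i
        value_before = v(frozenset(predecessors))
        for i in perm:
            predecessors.add(i)
            value_after = v(frozenset(predecessors))
            mc[i] += value_after - value_before  # v(S ∪ {i}) − v(S)
            value_before = value_after
    phi = {i: mc[i] / t for i in nodes}
    # Non-increasing order of average marginal contribution.
    rank_list = sorted(nodes, key=lambda i: phi[i], reverse=True)
    return rank_list, phi


# Toy additive game (hypothetical): each node contributes a fixed amount.
node_worth = {1: 3.0, 2: 1.0, 3: 2.0}
rank_list, phi = build_rank_list(
    [1, 2, 3], lambda s: sum(node_worth[i] for i in s), t=5)
```

In an additive game every marginal contribution equals the node's own worth, so the ranking here is exact for any sample of permutations. For the influence setting, `v` would need to be replaced by an estimate of the spread of a node set, which is much more expensive to evaluate.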

SLIDE 24

Shapley Value based Algorithm for Top-k Nodes Problem

Choosing Top-k Nodes

1. The naive approach is to choose the first k nodes in RankList[] as the top-k nodes.
2. Drawback: the nodes may be clustered.
3. RankList[] = {5, 4, 2, 7, 11, 15, 9, 13, 12, 10, 6, 14, 3, 1, 8}
4. The top 4 nodes are clustered.
5. Choose nodes that respect the ranking order while spreading over the network.
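One possible reading of the selection rule is sketched below: walk RankList[] in order and skip any node adjacent to an already chosen node, then fill any remaining slots in rank order. The slide does not spell out the exact rule, so the adjacency test, the fallback pass, and the name `choose_top_k` are assumptions.

```python
def choose_top_k(rank_list, neighbors, k):
    """Pick k nodes in rank order while avoiding neighbours of chosen nodes.

    neighbors[u] is the set of nodes adjacent to u (undirected graph assumed).
    """
    chosen = []
    # First pass: keep the ranking order, but skip nodes next to a chosen node.
    for node in rank_list:
        if len(chosen) == k:
            break
        if all(node not in neighbors[c] for c in chosen):
            chosen.append(node)
    # Second pass: if fewer than k were found, fill up in plain rank order.
    for node in rank_list:
        if len(chosen) == k:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


# Path graph 1 - 2 - 3 - 4 with a hypothetical ranking.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
picked = choose_top_k([2, 3, 1, 4], path, k=2)
```

On this path graph the naive choice would be nodes 2 and 3, which are adjacent; the sketch picks 2 and 4 instead.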

SLIDE 25

Shapley Value based Algorithm for Top-k Nodes Problem

k value   Greedy Algorithm   Shapley Value based Algorithm   MDH   HCH
1         4                  4                               4     2
2         8                  7                               7     4
3         10                 10                              8     6
4         12                 12                              8     7
5         13                 13                              10    8
6         14                 14                              13    8
7         15                 15                              13    8
8         15                 15                              13    8
9         15                 15                              13    10
10        15                 15                              13    11
11        15                 15                              13    13
12        15                 15                              13    13
13        15                 15                              14    14
14        15                 15                              15    15
15        15                 15                              15    15

SLIDE 26

Experimental Results

Experimental Results

SLIDE 27

Experimental Results

Benchmark Algorithms for Top-k Nodes

1. Greedy Algorithm
2. Maximum Degree Heuristic (MDH) based Algorithm
3. High Clustering Coefficient Heuristic (HCH) based Algorithm
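The two heuristic baselines can be sketched as below. The slides only name them, so this assumes the usual definitions: node degree for the maximum degree heuristic and the local clustering coefficient for the clustering based one. The function names and the tie-breaking by node id are illustrative choices.

```python
from itertools import combinations


def max_degree_heuristic(neighbors, k):
    """Return the k nodes with the highest degree (ties broken by node id)."""
    return sorted(neighbors, key=lambda n: (-len(neighbors[n]), n))[:k]


def clustering_coefficient(neighbors, node):
    """Fraction of pairs of neighbours of `node` that are themselves adjacent."""
    nbrs = neighbors[node]
    d = len(nbrs)
    if d < 2:
        return 0.0  # fewer than two neighbours: no pairs to connect
    links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
    return links / (d * (d - 1) / 2)


def high_clustering_heuristic(neighbors, k):
    """Return the k nodes with the highest local clustering coefficient."""
    return sorted(
        neighbors,
        key=lambda n: (-clustering_coefficient(neighbors, n), n))[:k]


# Triangle 1-2-3 with a pendant node 4 attached to node 3.
g = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
by_degree = max_degree_heuristic(g, 2)
by_clustering = high_clustering_heuristic(g, 2)
```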

SLIDE 28

Experimental Results

Network Datasets

Dataset               Number of Nodes
Sparse Random Graph   500
Scale-free Graph      500
Jazz                  198
NIPS                  1061
Netscience            1589
HEP                   10748

SLIDE 29

Experimental Results

Experiments: Synthetic Datasets

[Figure: four plots of Number of Active Nodes (y-axis) against Initial Target Set Size (x-axis, 5 to 30). Three panels compare Greedy Algorithm, Proposed Algorithm, MDH and HCH; one panel compares Proposed Algorithm, Naive Shapley, MDH and HCCH.]

SLIDE 30

Experimental Results

Experiments: Real World Datasets

[Figure: four plots of Number of Active Nodes (y-axis) against Initial Target Set Size (x-axis), each comparing Greedy Algorithm, Proposed Algorithm, MDH and HCH.]

SLIDE 31

Experimental Results

Visualization of Jazz Dataset

SLIDE 32

Experimental Results

Visualization of NIPS Dataset

SLIDE 33

Thank You
