A common-neighbors-based random graph model for community structure

SLIDE 1

A common-neighbors-based random graph model for community structure

Emily Fischer

Cornell University

May 12, 2017

SLIDE 2

A Exam Emily Fischer

Outline

  1. Introduction
      • Preferential Attachment (PA)
  2. Common Neighbors Model (CN)
      • Degree distribution
      • Community structure
SLIDE 4

Preferential Attachment

  • Users prefer to connect to nodes of high degree
  • Results in heavy-tailed degree distribution
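
A minimal sketch of the preferential attachment rule, assuming one undirected edge per arriving node and a single-edge seed graph (both are illustrative choices, not taken from the slides). Drawing uniformly from a list of edge endpoints selects a node with probability proportional to its degree.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, seed=0):
    """Grow a graph where each new node links to one existing node
    chosen with probability proportional to its degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]          # assumed seed graph: a single edge
    endpoints = [0, 1]        # node v appears deg(v) times in this list
    for v in range(2, n_nodes):
        w = rng.choice(endpoints)   # degree-proportional choice
        edges.append((v, w))
        endpoints.extend((v, w))
    return edges


def degree_counts(edges):
    """Return a Counter mapping node -> degree."""
    deg = Counter()
    for v, w in edges:
        deg[v] += 1
        deg[w] += 1
    return deg


edges = preferential_attachment(2000)
deg = degree_counts(edges)
histogram = Counter(deg.values())   # degree -> number of nodes with that degree
```

Plotting `histogram` on log-log axes is the usual way to inspect the tail of the degree distribution.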

SLIDE 5

Issues with Preferential Attachment

The LinkedIn graph

  1. does NOT have a power-law degree distribution
  2. has “community structure”
SLIDE 6

Log-log plots of degree distribution

SLIDE 8

What is “community structure”?

  • Strong community structure
  • More edges within community than between communities

SLIDE 9

What is “community structure”?

  • Preferential attachment
  • One central hub around high-degree node

SLIDE 10

Common Neighbors Model

SLIDE 11

Common Neighbors Model

Users prefer to connect to nodes with whom they share many mutual friends

SLIDE 16

Common Neighbors Model

Sequence of graphs $(G_t)_{t \ge 0}$.

  • Given graph $G_t$ with $n(t)$ nodes and $m(t)$ edges
  • At time $t + 1$, a new node $v$ arrives with probability $\alpha$
  • If no new arrival, select $v$ uniformly among existing nodes
  • Select receiving node $w$ with probability proportional to the number of common neighbors between $v$ and $w$
      • $\Gamma_v(t)$ is the neighborhood of $v$ at time $t$
      • $K_{vw}(t) = |\Gamma_v(t) \cap \Gamma_w(t)|$

$$P(\text{select } w \mid \text{sender} = v) = \frac{K_{vw}(t) + \delta}{\sum_u K_{vu}(t) + \delta\, n(t)}$$

  • Form directed edge $(v, w)$.
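
One time step of the model can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: neighborhoods ignore edge direction, the sender is excluded from the candidate receivers (so the normalizing sum runs over the other nodes only), and the seed graph is a single edge. The values α = 0.2 and δ = 0.5 are the ones quoted with the modularity results.

```python
import random


def common_neighbors_step(neighbors, alpha, delta, rng):
    """Apply one time step of the model to `neighbors` (node -> set of
    neighbors, direction ignored) and return the new directed edge (v, w)."""
    nodes = list(neighbors)
    if rng.random() < alpha:
        v = len(nodes)                 # new node arrives (labels are 0..n-1)
        neighbors[v] = set()
    else:
        v = rng.choice(nodes)          # existing sender, chosen uniformly
    # Assumption: any node other than v may receive the edge.
    candidates = [u for u in neighbors if u != v]
    # Weight K_vu(t) + delta, where K_vu is the common-neighbor count.
    weights = [len(neighbors[v] & neighbors[u]) + delta for u in candidates]
    w = rng.choices(candidates, weights=weights, k=1)[0]
    neighbors[v].add(w)
    neighbors[w].add(v)
    return v, w


rng = random.Random(1)
graph = {0: {1}, 1: {0}}               # assumed seed graph: one edge
edge_list = [common_neighbors_step(graph, alpha=0.2, delta=0.5, rng=rng)
             for _ in range(500)]
```

A newly arrived node has no neighbors, so every weight equals δ and its first receiver is chosen uniformly. Repeated draws of the same pair are kept in `edge_list` but do not change the neighbor sets.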

SLIDE 17

Common Neighbors Model

What does $K_{vw}(t)$ look like? Hard to analyze because of feedback: each new edge changes the common-neighbor counts that drive later edge choices.

SLIDE 22

Common Neighbor Process

  • Want to model evolution of $K_{ij}(t)$ on its own.
  • Start at $\tilde{K}_{ij}(0) = 0$ for all pairs $i, j$.
  • Given $(\tilde{K}_{ij}(t))_{i,j \ge 0}$, at $t + 1$,
      • Select $i$ uniformly from existing nodes
      • Choose $\eta = c\,(n(t))^{\theta}$ nodes, $j_1, j_2, \dots, j_\eta$, preferentially with $\tilde{K}_{ij_\ell}(t)$, and increase $\tilde{K}_{ij_\ell}(t + 1) = \tilde{K}_{ij_\ell}(t) + 1$.
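
The process can be simulated directly on the pair counts, without a graph. The sketch below fills in several details the slides leave open, all of which are assumptions: nodes arrive with probability α and no counts change on an arrival step, η is rounded up to an integer, the preferential weights are the counts plus a smoothing term δ > 0 (needed because all counts start at 0), the η draws are made with replacement, and only the ordered pair (i, j) is updated. The values of `c`, `theta`, and `delta` are hypothetical.

```python
import math
import random
from collections import defaultdict


def common_neighbor_process(steps, alpha, c, theta, delta, seed=0):
    """Simulate the pair counts K~_ij without building a graph.
    Returns (K, n) where K[i][j] is the count for the ordered pair (i, j)."""
    rng = random.Random(seed)
    n = 2                                      # assumed initial number of nodes
    K = defaultdict(lambda: defaultdict(int))  # all counts start at 0
    for _ in range(steps):
        if rng.random() < alpha:               # assumed: arrival step, no update
            n += 1
            continue
        i = rng.randrange(n)                   # select i uniformly
        others = [j for j in range(n) if j != i]
        eta = math.ceil(c * n ** theta)        # assumed rounding of c * n(t)^theta
        # Weights use the counts at time t; delta > 0 is an assumed smoothing term.
        weights = [K[i][j] + delta for j in others]
        for j in rng.choices(others, weights=weights, k=eta):
            K[i][j] += 1
    return K, n


K, n = common_neighbor_process(steps=2000, alpha=0.2, c=1.0, theta=0.5, delta=0.5)
N = {i: sum(row.values()) for i, row in K.items()}   # N_i(t) = sum_j K~_ij(t)
```

The dictionary `N` holds the row sums of the counts for every node that was selected at least once.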

SLIDE 23

Common Neighbor Process

Let $N_i(t) = \sum_j \tilde{K}_{ij}(t)$. What is the distribution of $N_i(t)$?

SLIDE 24

Common Neighbor Process

Theorem

Let $N_i(t) = \sum_j \tilde{K}_{ij}(t)$. Then there exists a random variable $Z_i$ such that

$$\frac{N_i(t)}{t^{\theta}} \to Z_i \quad \text{in probability},$$

where $Z_i$ has characteristic function

$$\varphi_Z(z) = \exp\left\{ \frac{1 - \alpha}{\alpha\theta} \int_0^{\alpha^{\theta}} \frac{1}{t}\,\bigl(e^{itz} - 1\bigr)\,dt \right\}.$$
SLIDE 25

Common Neighbor Process

SLIDE 26

Common Neighbor Process

Result

  • The “total common neighbors” $N_i(t)$ converges in probability when scaled by $t^{\theta}$.

In progress/Future

  • Limiting distribution for $\tilde{K}_{ij}(t)$.
  • Use these distributions to analyze degree distribution of the graph
SLIDE 27

Community Structure

  • How to quantify “strong community structure”
  • Compare community structure of CN and PA.
SLIDE 28

Community Structure CN vs. PA

SLIDE 29

Modularity

Definition

Given a graph partitioned into $c$ communities, the modularity is

$$Q = \sum_{i=1}^{c} \left( e_{ii} - a_i^2 \right)$$

where $e_{ii}$ is the fraction of edges with both end vertices in community $i$, and $a_i$ is the fraction of edge ends attached to vertices in community $i$.
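
As a small worked example of the definition, the sketch below computes Q for an edge list and a node-to-community mapping. It assumes edges are treated as undirected; the example graph and labels are made up for illustration.

```python
def modularity(edges, community):
    """Q = sum_i (e_ii - a_i^2) for an undirected edge list and a
    node -> community label mapping."""
    m = len(edges)
    inside = {}   # community -> number of edges with both ends inside it
    ends = {}     # community -> number of edge ends attached to it
    for u, v in edges:
        cu, cv = community[u], community[v]
        ends[cu] = ends.get(cu, 0) + 1
        ends[cv] = ends.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (ends[c] / (2 * m)) ** 2 for c in ends)


# Two triangles joined by a single edge, split into their natural communities.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(edges, community)   # 3/7 + 3/7 - 2 * (1/2)**2 = 5/14
```

Putting all six nodes in one community gives Q = 0, since then the single community has $e_{11} = a_1 = 1$.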

SLIDE 30

Community Detection

  • Community detection algorithms aim to assign nodes to communities in a way that is reasonable
  • Some algorithms maximize modularity: Fast-greedy (FG), Largest-eigenvector (LE)
  • But there are other methods as well: Edge-betweenness (EB), Walktrap (WC).

SLIDE 31

Modularity

Averages of modularity over 100 trials (α = .2, δ = .5)

Graph     EB    FG    LE    WC
CN 500    .450  .472  .423  .401
PA 500    .276  .379  .333  .251
CN 1000   .310  .402  .350  .301
PA 1000   .103  .328  .279  .190

For the 5000-node graphs only three values appear per row:

CN 5000   .145  .320  .176
PA 5000   .039  .277  .120

SLIDE 32

Conclusion

  1. PA model does not match the LinkedIn network:
      • PA has a power-law degree distribution
      • PA lacks community structure
  2. Common Neighbors Model
      • Limiting distribution of $N_i(t)$ in the common neighbors process
      • Better community structure than PA
SLIDE 33

Edge Acceptance/Rejection

Node $v$ sends an invitation to a node $w$.

SLIDE 34

Model 1: Edge Acceptance/Rejection

$w$ accepts the invitation with probability $p_{vw}(t)$.

SLIDE 37

Edge Acceptance/Rejection

How can acceptance probability achieve goals of (1) non-power-law degree distribution and (2) community structure?

  • Rich may choose not to get richer: $p_{vw}(t) \downarrow 0$
  • Probability of acceptance based on communities:

$$p_{vw}(t) = \begin{cases} p & C_v = C_w \\ q & C_v \ne C_w. \end{cases}$$
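
The community-based rule can be sketched as a pair of small functions, reading the two cases as "same community" (probability p) and "different communities" (probability q). The community labels and the values 0.8 and 0.1 are hypothetical, chosen only to make the example concrete.

```python
import random


def acceptance_probability(v, w, community, p, q):
    """Return p when v and w share a community and q otherwise."""
    return p if community[v] == community[w] else q


def invitation_accepted(v, w, community, p, q, rng):
    """Simulate w's decision on an invitation sent by v."""
    return rng.random() < acceptance_probability(v, w, community, p, q)


rng = random.Random(0)
community = {"a": 0, "b": 0, "c": 1}      # hypothetical community labels
trials = 10_000
same = sum(invitation_accepted("a", "b", community, 0.8, 0.1, rng)
           for _ in range(trials)) / trials
cross = sum(invitation_accepted("a", "c", community, 0.8, 0.1, rng)
            for _ in range(trials)) / trials
```

With p > q, invitations inside a community are accepted more often than invitations across communities, which is what pushes edges to concentrate within communities.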

SLIDE 38

Edge Acceptance/Rejection

For now, constant acceptance probability $p_{vw}(t) = p$ for all $v$, $w$ and $t \ge 0$.