Efficient Anti-community Detection in Complex Networks


slide-1
SLIDE 1

Efficient Anti-community Detection in Complex Networks

Sebastian Lackner¹, Andreas Spitz¹, Mathias Weidemüller², and Michael Gertz¹

30th International Conference on Scientific and Statistical Database Management (SSDBM)
July 9-11, 2018, Bolzano-Bozen, Italy

¹ Database Systems Research Group, Heidelberg University, Germany
{lackner,spitz,gertz}@informatik.uni-heidelberg.de

² Quantum Dynamics of Atomic and Molecular Systems Group, Heidelberg University, Germany
weidemueller@uni-heidelberg.de

slide-2
SLIDE 2

Community Structure

Many networks contain community structures. Communities are characterized by
◮ many internal edges
◮ few external edges
(generalization of cliques)

Applications in sociology, computer science, physics, biology, ... [For10]

slide-3
SLIDE 3

Zachary’s Karate Club Network

[Figure: Zachary’s karate club network with the vertices "Mr. Hi" and "John A." labeled.]

|V| = 34, |E| = 156

Communities in Zachary’s karate club network [Zac77]. Colors denote membership after the fission of the club.

slide-4
SLIDE 4

Anti-community Structure

Anti-communities are characterized by
◮ few internal edges
◮ many external edges
(generalization of multipartite graphs)
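The two characterizations can be made concrete by counting edges for a given grouping. The following is a minimal Python sketch, not taken from the talk; the function name `edge_counts` and the toy graph (a complete bipartite graph) are assumptions chosen for illustration.

```python
def edge_counts(edges, groups):
    """Return (internal, external) edge counts for a vertex grouping.

    edges  -- iterable of undirected edges (u, v)
    groups -- dict mapping each vertex to its group label
    """
    internal = sum(1 for u, v in edges if groups[u] == groups[v])
    external = sum(1 for u, v in edges if groups[u] != groups[v])
    return internal, external


# Toy graph (assumption for illustration): complete bipartite graph K_{3,3}.
left, right = ["a", "b", "c"], ["x", "y", "z"]
edges = [(u, v) for u in left for v in right]

# Grouping by side: no internal edges, every edge is external.
by_side = {**{v: 0 for v in left}, **{v: 1 for v in right}}
print(edge_counts(edges, by_side))   # (0, 9)

# A grouping that mixes both sides has internal edges as well.
mixed = {"a": 0, "b": 0, "x": 0, "c": 1, "y": 1, "z": 1}
print(edge_counts(edges, mixed))     # (4, 5)
```

A grouping such as `by_side`, with few internal and many external edges, is the kind of structure the slide calls anti-communities.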

slide-5
SLIDE 5

Zachary’s Karate Club Network

[Figure: Zachary’s karate club network with the vertices "Mr. Hi" and "John A." labeled.]

|V| = 34, |E| = 156

Anti-communities in Zachary’s karate club network [Zac77]. Colors denote membership after the fission of the club.

slide-6
SLIDE 6

Challenges and Objectives

◮ Definition: How to define anti-communities?
◮ Models and Algorithms: Which algorithms can be used?
◮ Exploratory Analysis: Are anti-communities also present in other networks?

slide-7
SLIDE 7

Definition

slide-9
SLIDE 9

Graph Complement

[Figure: original network with 3 anti-communities]

[Figure: graph complement with 3 communities]

slide-11
SLIDE 11

Definition

Definition: Vertices C ⊆ V of a graph G = (V, E) form an anti-community iff C forms a community in the graph complement Ĝ = (V, Ê) with Ê := (V × V) \ E.

Conclusions:
◮ The definition is not unique (there are many definitions of communities)
◮ Many existing algorithms and methods can be reused
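As an illustration of the definition, the sketch below builds the complement of a small graph in Python. It is an example under stated assumptions rather than the authors' code: the helper `complement_edges` is hypothetical, the graph is treated as simple and undirected, and self-loops (which V × V formally contains) are left out.

```python
from itertools import combinations


def complement_edges(vertices, edges):
    """Edges of the complement of a simple undirected graph (no self-loops)."""
    present = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(vertices, 2)
            if frozenset((u, v)) not in present]


# Toy input (assumption): complete tripartite graph with parts {0,1}, {2,3}, {4,5}.
parts = [[0, 1], [2, 3], [4, 5]]
vertices = [v for part in parts for v in part]
edges = [(u, v) for p, q in combinations(parts, 2) for u in p for v in q]

print(len(edges))                          # 12
print(complement_edges(vertices, edges))   # [(0, 1), (2, 3), (4, 5)]
```

Each part of the tripartite graph becomes a small clique in the complement, so the three anti-communities of the original graph are three communities of the complement. The complement of a graph with n vertices and m edges has n(n − 1)/2 − m edges, so it is dense whenever the original graph is sparse.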

slide-12
SLIDE 12

Models and Algorithms

slide-14
SLIDE 14

Proposed Methods

Existing methods are either slow or of poor quality.

Greedy algorithms (optimization problem)
◮ using the Modularity measure [NG04]
◮ using the Anti-Modularity measure [CYC14]

Vertex similarity (clustering problem)
◮ Adjacency mapping
◮ Distance mapping

slide-15
SLIDE 15

Modularity Measure

Intuition: Number of internal edges in G = (V, E) minus the expected number of internal edges in a random graph with the same degree distribution.

Modularity of a graph:

$$M := \frac{1}{2m} \sum_{ij} \left( a_{ij} - \frac{d_i d_j}{2m} \right) \delta(g_i, g_j)$$

m: total number of edges
A = [a_ij]: adjacency matrix of G
d = [d_i]: vertex degrees
δ(g_i, g_j): 1 iff v_i and v_j are both in the same group, 0 otherwise
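The modularity formula can be evaluated directly from an edge list. The Python sketch below is one possible way to do so and is not the authors' implementation; the function name, the 0..n-1 vertex labelling, and the 4-cycle example are assumptions. It uses the fact that the double sum over d_i d_j δ(g_i, g_j) equals the sum over groups of the squared total degree of each group.

```python
def modularity(n, edges, group):
    """Modularity M of a grouping of a simple undirected graph.

    n     -- number of vertices, labelled 0..n-1
    edges -- list of undirected edges (i, j) with i != j
    group -- list where group[i] is the group label of vertex i
    """
    m = len(edges)
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    # Sum of a_ij * delta(g_i, g_j) over ordered pairs: internal edges count twice.
    internal = 2 * sum(1 for i, j in edges if group[i] == group[j])
    # Sum of d_i * d_j * delta(g_i, g_j): squared total degree per group.
    group_degree = {}
    for v in range(n):
        group_degree[group[v]] = group_degree.get(group[v], 0) + degree[v]
    expected = sum(d * d for d in group_degree.values()) / (2 * m)
    return (internal - expected) / (2 * m)


# Toy input (assumption): the 4-cycle 0-1-2-3-0, which is bipartite.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(modularity(4, cycle, [0, 1, 0, 1]))   # -0.5 (the two sides of the cycle)
print(modularity(4, cycle, [0, 0, 1, 1]))   # 0.0  (two adjacent pairs)
```

The grouping by sides has no internal edges and therefore a lower modularity than the grouping into adjacent pairs, which is why low modularity is a natural signal for anti-community structure.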

slide-18
SLIDE 18

Greedy Algorithms

Make the locally optimal choice at each step.

1. Initialization: Assign each vertex to a separate group.
2. Merge: Merge two groups such that the Modularity is minimized (or the Anti-Modularity is maximized).
3. Repeat: If more than one group is left, go to step 2. Otherwise, return the groups with the best (Anti-)Modularity.
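The three steps above can be sketched in Python as follows. This is a deliberately naive illustration that recomputes the modularity from scratch for every candidate merge and makes no attempt at efficiency; it is not the authors' algorithm, and the function names and the 4-cycle example are assumptions.

```python
from itertools import combinations


def modularity(n, edges, group):
    """Modularity M of a grouping; group[i] is the label of vertex i."""
    m = len(edges)
    internal = 2 * sum(1 for i, j in edges if group[i] == group[j])
    degree_sum = {}  # total degree per group
    for i, j in edges:
        degree_sum[group[i]] = degree_sum.get(group[i], 0) + 1
        degree_sum[group[j]] = degree_sum.get(group[j], 0) + 1
    expected = sum(d * d for d in degree_sum.values()) / (2 * m)
    return (internal - expected) / (2 * m)


def greedy_anti_communities(n, edges):
    """Greedy merging that minimizes modularity; returns (grouping, score)."""
    group = list(range(n))                      # step 1: one group per vertex
    best_score, best_group = modularity(n, edges, group), group[:]
    while len(set(group)) > 1:                  # step 3: until one group is left
        candidates = []
        for a, b in combinations(sorted(set(group)), 2):
            merged = [a if g == b else g for g in group]
            candidates.append((modularity(n, edges, merged), merged))
        # step 2: keep the merge with the lowest modularity
        score, group = min(candidates, key=lambda c: c[0])
        if score < best_score:
            best_score, best_group = score, group[:]
    return best_group, best_score


# Toy input (assumption): the 4-cycle 0-1-2-3-0.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
result = greedy_anti_communities(4, cycle)
print(result)   # ([0, 1, 0, 1], -0.5)
```

On the 4-cycle the procedure recovers the two sides of the bipartite graph. Maximizing an anti-modularity measure instead would only require swapping the scoring function and the direction of the comparison.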

slide-20
SLIDE 20

Vertex Similarity

Based on the concept of structural equivalence.

1. Mapping: Map vertices to a feature vector representation
◮ Adjacency mapping: M(v_i) := [a_ij]_j
◮ Distance mapping: M(v_i) := [d(v_i, v_1), ..., d(v_i, v_n)]
2. Clustering: Compute a clustering of the feature vectors (k-Means, ...)
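A minimal Python sketch of the two steps, using the adjacency mapping and a plain k-means iteration, is given below. It is an illustration under assumptions: the helper names, the deterministic farthest-point initialization, and the toy graph are not from the talk, and a production setup would more likely rely on an existing clustering library.

```python
def adjacency_mapping(n, edges):
    """Map vertex i to row i of the adjacency matrix."""
    rows = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        rows[i][j] = rows[j][i] = 1.0
    return rows


def sq_dist(x, y):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def k_means(points, k, iterations=20):
    """Lloyd iteration with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster runs empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy input (assumption): complete bipartite graph with sides {0, 1} and {2, 3, 4}.
edges = [(u, v) for u in (0, 1) for v in (2, 3, 4)]
labels = k_means(adjacency_mapping(5, edges), k=2)
print(labels)   # [0, 0, 1, 1, 1]
```

Vertices on the same side have identical adjacency rows, that is, they are structurally equivalent, so the clustering separates the two sides even though no edge runs inside either of them.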

slide-21
SLIDE 21

Runtime Evaluation

[Figure: runtime of the compared methods. Legend: Label propagation, Greedy Modularity, Greedy Anti-modularity, Vertex sim. Adjacency, Vertex sim. Distance, Graph Complement + Mod., Stochastic Block Model, Nested Stochastic Block M.]

Evaluation with Erdős-Rényi random graphs (sparse)

slide-22
SLIDE 22

Exploratory Analysis

slide-24
SLIDE 24

Spectral Line Networks

Goal: Encode energy states of a physical system (and their relation) in a network.

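[Figure: Bohr-model atom with nucleus charge +Ze and orbits n = 1, 2, 3. A transition between the energy levels E2 and E1 corresponds to ΔE = hf.]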
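One way to read the goal stated on this slide is that each energy state becomes a vertex and each observed spectral line becomes an edge between the two states it connects. The Python sketch below follows that reading; the function name and the list of transitions are hypothetical and only loosely modelled on atomic state labels, not data from the talk.

```python
def build_line_network(lines):
    """Build an undirected network from spectral lines.

    lines -- iterable of (lower_state, upper_state) pairs, one per line
    Returns a dict mapping each state to the set of states it is linked to.
    """
    network = {}
    for lower, upper in lines:
        network.setdefault(lower, set()).add(upper)
        network.setdefault(upper, set()).add(lower)
    return network


# Hypothetical transitions; the labels are for illustration only.
lines = [("1s", "2p"), ("2s", "2p"), ("2p", "3s"),
         ("2p", "3d"), ("1s", "3p"), ("2s", "3p")]
network = build_line_network(lines)
print(sorted(network["2p"]))   # ['1s', '2s', '3d', '3s']

# Group states by the final letter of the label: no edge stays inside a group.
internal = sum(1 for a, b in lines if a[-1] == b[-1])
print(internal)                # 0
```

In this toy list every line connects states from different groups, so grouping by the final letter yields a grouping with no internal edges.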

slide-25
SLIDE 25

Example: Spectral Line Network of Helium

Spectral line network of Helium [KRRN15] with |V| = 183, |E| = 2282. Colors show the anti-communities obtained with a vertex similarity method. Circles show the ground-truth partition by
◮ orbital angular momentum (ℓ),
◮ total angular momentum (j), and
◮ spin (s)

[Figure labels: Parahelium S = 0, Orthohelium S = 1; ℓ = 0, 1, 2, 3, 4, 5, 6, 7]

slide-27
SLIDE 27

Example: Adjectives and Nouns Network

[Figure: Adjectives and Nouns network. Legend: adjective, noun.]

|V| = 112, |E| = 425

Adjectives and Nouns network [New06]. Circles correspond to the anti-communities found by the greedy modularity minimization algorithm.

slide-28
SLIDE 28

Example: Adjectives and Nouns Network

[Figure: the adjective "perfect" linked to the noun "master", illustrated by the text "[...] and made himself a perfect master of his profession [...]".]

Adjectives and Nouns network [New06]. Circles correspond to the anti-communities found by the greedy modularity minimization algorithm.
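The example sentence suggests how such a network can be assembled: two words are linked when they appear next to each other in the text. The Python sketch below implements that idea under assumptions; the function name and the small hand-tagged vocabulary are hypothetical, and the construction used for [New06] may differ in detail.

```python
def adjacency_network(tokens, vocabulary):
    """Link vocabulary words that appear in adjacent positions in the text."""
    edges = set()
    for first, second in zip(tokens, tokens[1:]):
        if first in vocabulary and second in vocabulary and first != second:
            edges.add(frozenset((first, second)))
    return edges


# Hand-tagged toy vocabulary (assumption): word -> part of speech.
vocabulary = {"perfect": "adjective", "master": "noun", "profession": "noun"}
tokens = "and made himself a perfect master of his profession".split()

word_edges = adjacency_network(tokens, vocabulary)
print([sorted(e) for e in word_edges])   # [['master', 'perfect']]
```

Adjectives mostly precede nouns rather than other adjectives, which is a plausible reason why grouping the words by part of speech gives groups with few internal edges.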

slide-29
SLIDE 29

Example: Adjectives and Nouns Network

[Figure: excerpt of the network. Legend: adjective, noun. Labeled words: round, morning, light, low, money, possible, perfect, anything, arm, eye, mother, half, short, beautiful, bright, great, fancy, strong, pleasant.]

Adjectives and Nouns network [New06]. Circles correspond to the anti-communities found by the greedy modularity minimization algorithm.

slide-30
SLIDE 30

Summary

slide-31
SLIDE 31

Summary

◮ Anti-community structures are present in many networks, including
  ◮ networks of spectral line transitions
  ◮ Zachary’s karate club network
  ◮ ... and many more
◮ Many concepts of traditional community detection can be reused by computing the graph complement
◮ Specialized algorithms and measures are required if performance is important

slide-32
SLIDE 32

Further Reading

◮ Evaluation measures: Adaptation of the adjusted Rand index and normalized mutual information measures for anti-communities.
◮ Random graphs: Algorithms to generate Erdős-Rényi and Barabási-Albert random graph models for graphs with (anti-)community structure.
◮ Performance evaluation: Quality comparison for graphs with known community structure.
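For orientation, the standard adjusted Rand index (ARI) between two groupings can be computed as in the Python sketch below. This is the textbook form of the measure, not the adaptation for anti-communities mentioned above; the function name and the example labels are assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Standard adjusted Rand index between two groupings of the same items."""
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))          # contingency table
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    maximum = (sum_a + sum_b) / 2
    if maximum == expected:     # degenerate case, e.g. both groupings trivial
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


# Identical groupings up to renaming of the labels.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
print(same)    # 1.0

# Groupings that disagree on every pair; the value is about -0.5.
cross = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(cross)
```

The index is 1 for identical groupings and close to 0 for groupings that agree no more than expected by chance, which makes it a common choice for comparing a detected grouping with a known ground truth.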

slide-33
SLIDE 33

Resources

Implementations and datasets available at:

http://dbs.ifi.uni-heidelberg.de/resources/anticommunity

Thank you!

slide-34
SLIDE 34

Bibliography

slide-35
SLIDE 35

Bibliography i

[CYC14] L. Chen, Q. Yu, and B. Chen. “Anti-modularity and anti-community detecting in complex networks”. In: Inf. Sci. 275 (2014), pp. 293–313.

[For10] S. Fortunato. “Community detection in graphs”. In: Phys. Rep. 486.3 (2010), pp. 75–174.

[Hol04] J. M. Hollas. Modern spectroscopy. John Wiley & Sons, 2004.

[KRRN15] A. Kramida, Y. Ralchenko, J. Reader, and NIST ASD Team. NIST Atomic Spectra Database (ver. 5.3), [Online]. Available: http://physics.nist.gov/asd [2017, July 4]. National Institute of Standards and Technology, Gaithersburg, MD. 2015.

slide-36
SLIDE 36

Bibliography ii

[New06] M. E. J. Newman. “Finding community structure in networks using the eigenvectors of matrices”. In: Phys. Rev. E 74.3 (2006).

[NG04] M. E. J. Newman and M. Girvan. “Finding and evaluating community structure in networks”. In: Phys. Rev. E 69.2 (2004).

[Pei14] T. P. Peixoto. “Hierarchical block structures and high-resolution model selection in large networks”. In: Phys. Rev. X 4 (1 2014).

[Pei17] T. P. Peixoto. “Bayesian stochastic blockmodeling”. In: (2017). url: https://arxiv.org/abs/1705.10225.

[Zac77] W. W. Zachary. “An information flow model for conflict and fission in small groups”. In: J. Anthropol. Res. 33.4 (1977), pp. 452–473.

slide-37
SLIDE 37

Backup Slides

slide-38
SLIDE 38

Baseline Methods

◮ Graph complement + X: Allows existing methods to be reused, but is slow and has high memory usage.
◮ Label propagation algorithm for anti-communities [CYC14]: Fast, but poor quality.
◮ Generic methods, e.g., stochastic block models [Pei14; Pei17]

slide-41
SLIDE 41

Complexity of Greedy Algorithms

◮ Community detection:
  Naive method: O(n³)
  Skip unconnected edges: O(n(n + m))
  Use max-heap data structure: O(n log² n) ¹
◮ Anti-community detection:
  Graph complement: O(n³)
  Our method: O(n(n + m))

Result can also be used to improve community detection!

¹ for graphs with strong hierarchical structure

slide-42
SLIDE 42

Basics of the Bohr Model

Goal: Encode energy states of a physical system (and their relation) in a network.

[Figure: Bohr-model atom with a positively charged nucleus (+Ze) and an electron on circular trajectories n = 1, 2, 3; transition energy ΔE = hf.]

slide-44
SLIDE 44

Basics of the Bohr Model

Goal: Encode energy states of a physical system (and their relation) in a network.

[Figure: Bohr-model atom with nucleus charge +Ze, orbits n = 1, 2, 3, and transition energy ΔE = hf.]

◮ Energy states are defined by the possible orbits of electrons
◮ A state transition requires / releases energy ΔE → absorption or emission line

Simplified model!
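As a worked example of ΔE = hf, the textbook Bohr-model energy levels of a hydrogen-like atom can be used to estimate the frequency of one spectral line. The numbers below are standard values (13.6 eV for the hydrogen ground-state binding energy and h ≈ 4.136 × 10⁻¹⁵ eV s); the choice of the transition n = 3 → n = 2 in hydrogen is an assumption made for illustration.

```latex
% Bohr-model energy levels of a hydrogen-like atom with nuclear charge Z
E_n = -\frac{Z^2 \cdot 13.6\,\mathrm{eV}}{n^2}

% Transition n = 3 -> n = 2 in hydrogen (Z = 1)
\Delta E = E_3 - E_2
         = 13.6\,\mathrm{eV} \left( \frac{1}{2^2} - \frac{1}{3^2} \right)
         \approx 1.89\,\mathrm{eV}

% Frequency of the corresponding line, from \Delta E = h f
f = \frac{\Delta E}{h}
  \approx \frac{1.89\,\mathrm{eV}}{4.136 \times 10^{-15}\,\mathrm{eV\,s}}
  \approx 4.57 \times 10^{14}\,\mathrm{Hz}
```

This frequency corresponds to a wavelength of roughly 656 nm, the red line of the hydrogen spectrum. In the network view, this line would be one edge between the states n = 2 and n = 3.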

slide-46
SLIDE 46

Spectral Line Networks

[Figure: absorption experiment with the components Source, Absorption cell, Entrance slit, Exit slit, Dispersing element, Detector, Display.]

Overview of an absorption experiment. Visualization based on Modern Spectroscopy by Hollas [Hol04].

Spectral lines ◮ State transitions ◮ Energy states ◮ Network

slide-47
SLIDE 47

Performance evaluation

[Figure: ARI (y-axis) plotted against ∆p = p_ext − p_int (x-axis) for the methods GCM, LP, SBM, NSBM, GrM, GrAM, VSA, VSD.]

Evaluation with Erdős-Rényi random graphs (k = 5)
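The quantity ∆p = p_ext − p_int suggests random graphs in which an edge inside a group appears with probability p_int and an edge between groups with probability p_ext. The Python sketch below generates such a graph. It is an assumption about the general setup and not the authors' generator; the function name, the round-robin group assignment, and the parameter `num_groups` (which is not meant to mirror the k in the caption) are all hypothetical.

```python
import random


def planted_partition(n, num_groups, p_int, p_ext, seed=0):
    """Random graph with planted groups.

    Pairs inside a group are linked with probability p_int,
    pairs in different groups with probability p_ext.
    Returns (group, edges) where group[v] is the label of vertex v.
    """
    rng = random.Random(seed)
    group = [v % num_groups for v in range(n)]   # round-robin assignment
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_int if group[u] == group[v] else p_ext
            if rng.random() < p:
                edges.append((u, v))
    return group, edges


# Extreme case delta_p = 1: a complete multipartite graph (pure anti-communities).
group, edges = planted_partition(6, 3, p_int=0.0, p_ext=1.0)
print(len(edges))                                   # 12
print(all(group[u] != group[v] for u, v in edges))  # True
```

Positive values of ∆p give graphs with anti-community structure, negative values give graphs with community structure, and ∆p = 0 reduces to an ordinary Erdős-Rényi graph without any planted structure.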