Array Based Betweenness Centrality Eric Robinson Northeastern - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Array Based Betweenness Centrality

Eric Robinson (Northeastern University, MIT Lincoln Labs)
Jeremy Kepner (MIT Lincoln Labs)

slide-2
SLIDE 2

Vertex Betweenness Centrality Which Vertices are Important?

slide-3
SLIDE 3

Vertex Betweenness Centrality Which Vertices are Important?

  • Loss Of Communication
  • Slower Communication

slide-4
SLIDE 4

Vertex Betweenness Centrality How do we Measure Importance?

Number of Shortest Paths Through Node

[Figure: example graph with vertex labels 50, 32, 45, 8, 8, 9, 9, 1]

slide-5
SLIDE 5

Vertex Betweenness Centrality Traditional Algorithm

[Figure: same example graph]

$$C_B(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
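As a rough illustration of the definition, the sketch below evaluates the sum directly: it stores shortest-path counts for every source and then loops over all triples (s, v, t). The function names and the adjacency-list input format are assumptions made for this example and are not taken from the slides.

```python
from collections import deque

def bfs_counts(adj, s):
    """Return (dist, sigma): BFS distances and shortest-path counts from s."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:              # first time w is reached
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:     # v precedes w on a shortest path
                sigma[w] += sigma[v]
    return dist, sigma

def betweenness_by_definition(adj):
    """Sum sigma_st(v) / sigma_st over ordered pairs (s, t) with s != v != t."""
    info = {s: bfs_counts(adj, s) for s in adj}   # all-pairs data, O(N^2) storage
    cb = {v: 0.0 for v in adj}
    for s in adj:
        dist_s, sigma_s = info[s]
        for t in adj:
            if t == s or t not in dist_s:
                continue
            for v in adj:
                if v == s or v == t or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                # v is on a shortest s-t path iff d(s,v) + d(v,t) == d(s,t)
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    cb[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return cb

# Hypothetical test graph: the undirected path 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(betweenness_by_definition(path))
```

The triple loop costs O(N^3) time and the per-source tables cost O(N^2) storage. Note that this sketch counts ordered pairs, so each undirected pair contributes twice.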

slide-6
SLIDE 6

Traditional Algorithm Theoretical Time and Space

Time: O(N^3)
Storage: O(N^2)

[Figure: same example graph]

slide-7
SLIDE 7

Vertex Betweenness Centrality Updating Algorithm

  • For each starting node:
  • Once you know:
  • The depth of each node in the BFS, D
  • The centrality updates for nodes at depth d, u
  • The shortest path counts from the root, s
  • Can determine centrality of nodes at depth d-1:
  • For each node, v, at depth d-1, its update is the sum:

$$u_v = \sum_{(v,w) \in E,\; w \in D_d} (1 + u_w) \, \frac{s_v}{s_w}$$

slide-8
SLIDE 8

Updating Algorithm An Example

Find Single Source Shortest Path Counts

[Figure: 11-vertex example graph (vertices 1 to 11); path counts found so far: 1]

O(N + M)

slide-9
SLIDE 9

Updating Algorithm An Example

Find Single Source Shortest Path Counts

[Figure: 11-vertex example graph (vertices 1 to 11); path counts found so far: 1, 1, 1]

O(N + M)

slide-10
SLIDE 10

Updating Algorithm An Example

Find Single Source Shortest Path Counts

[Figure: 11-vertex example graph (vertices 1 to 11); path counts found so far: 1, 1, 1, 2]

O(N + M)

slide-11
SLIDE 11

Updating Algorithm An Example

Find Single Source Shortest Path Counts

[Figure: 11-vertex example graph (vertices 1 to 11); path counts found so far: 1, 1, 1, 2, 2, 2]

O(N + M)

slide-12
SLIDE 12

Updating Algorithm An Example

Find Single Source Shortest Path Counts

[Figure: 11-vertex example graph (vertices 1 to 11); path counts found so far: 1, 1, 1, 2, 2, 2, 2, 2]

O(N + M)

slide-13
SLIDE 13

Updating Algorithm An Example

Find Single Source Shortest Path Counts

[Figure: 11-vertex example graph (vertices 1 to 11); path counts found: 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2]

O(N + M)
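The path-counting pass can be sketched as a breadth-first search that handles one depth level at a time. The example graph is a small hypothetical one, since the edges of the 11-vertex graph in the figures are not recoverable from the text. The function name is also an assumption.

```python
def shortest_path_counts(adj, root):
    """Level-by-level BFS returning (depth, count) for every reachable vertex."""
    depth = {root: 0}
    count = {root: 1}
    frontier = [root]
    d = 0
    while frontier:
        d += 1
        next_frontier = []
        for v in frontier:
            for w in adj[v]:
                if w not in depth:         # w is discovered at depth d
                    depth[w] = d
                    count[w] = 0
                    next_frontier.append(w)
                if depth[w] == d:          # every path to v extends to w
                    count[w] += count[v]
        frontier = next_frontier
    return depth, count

# Hypothetical 4-vertex "diamond" graph: two shortest paths from 0 to 3
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
depth, count = shortest_path_counts(adj, 0)
print(depth, count)
```

Each vertex enters the frontier once and each adjacency list is scanned once, which matches the O(N + M) cost quoted on the slides.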

slide-14
SLIDE 14

Updating Algorithm An Example

Perform Updates in Reverse Depth Order

O(N + M)

[Figure: 11-vertex example graph]

Shortest Paths:

Vertex:  1  2  3  4  5  6  7  8  9 10 11
Paths:   2  2  2  2  2  2  2  2  1  1  1

slide-15
SLIDE 15

Updating Algorithm An Example

Perform Updates in Reverse Depth Order

O(N + M)

[Figure: 11-vertex example graph; centrality updates so far: 3]

Shortest Paths:

Vertex:  1  2  3  4  5  6  7  8  9 10 11
Paths:   2  2  2  2  2  2  2  2  1  1  1

slide-16
SLIDE 16

Updating Algorithm An Example

Perform Updates in Reverse Depth Order

O(N + M)

[Figure: 11-vertex example graph; centrality updates so far: 3, 1, 4]

Shortest Paths:

Vertex:  1  2  3  4  5  6  7  8  9 10 11
Paths:   2  2  2  2  2  2  2  2  1  1  1

slide-17
SLIDE 17

Updating Algorithm An Example

Perform Updates in Reverse Depth Order

O(N + M)

[Figure: 11-vertex example graph; centrality updates so far: 3, 1, 4, 7]

Shortest Paths:

Vertex:  1  2  3  4  5  6  7  8  9 10 11
Paths:   2  2  2  2  2  2  2  2  1  1  1

slide-18
SLIDE 18

Updating Algorithm An Example

Perform Updates in Reverse Depth Order

O(N + M)

[Figure: 11-vertex example graph; centrality updates so far: 3, 1, 4, 7, 4, 4]

Shortest Paths:

Vertex:  1  2  3  4  5  6  7  8  9 10 11
Paths:   2  2  2  2  2  2  2  2  1  1  1

slide-19
SLIDE 19

Updating Algorithm Theoretical Time and Space

[Figure: 11-vertex example graph]

Time: O(N^2 + NM)
Storage: O(N + M)

slide-20
SLIDE 20

Updating Algorithm Single Processor

Variables:                        Storage:
V   : set of vertices             O(M+N)
d   : depth of vertices           O(N)
Q   : BFS queue                   O(N)
P   : shortest path parents       O(M+N)
sig : number of paths             O(N)
S   : order seen                  O(N)
del : centrality update           O(N)

Storage: O(M+N)
Time: O(MN + N^2)

slide-21
SLIDE 21

Updating Algorithm P processors

For each vertex, in parallel

Variables:                        Storage:
V   : set of vertices             O(PM+PN)
d   : depth of vertices           O(PN)
Q   : BFS queue                   O(PN)
P   : shortest path parents       O(PM+PN)
sig : number of paths             O(PN)
S   : order seen                  O(PN)
del : centrality update           O(PN)

Storage: O(PM+PN)
Time: O((MN + N^2)/P)
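The per-vertex decomposition can be sketched with a worker pool: each root is processed independently with its own private working data, and the per-root updates are summed at the end. This is only an illustration of the decomposition, assuming an undirected graph stored as adjacency lists. The function names are hypothetical, and Python threads are used for simplicity, so no real speedup should be expected from this particular sketch.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def single_source_updates(adj, root):
    """One pass from a single root; returns that root's centrality updates."""
    depth, sig, order = {root: 0}, {root: 1}, []
    queue = deque([root])
    while queue:                           # forward pass: depths and path counts
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in depth:
                depth[w] = depth[v] + 1
                sig[w] = 0
                queue.append(w)
            if depth[w] == depth[v] + 1:
                sig[w] += sig[v]
    delta = {v: 0.0 for v in order}
    for w in reversed(order):              # backward pass: reverse depth order
        for v in adj[w]:
            if depth[v] == depth[w] - 1:   # v is a shortest-path parent of w
                delta[v] += (1 + delta[w]) * sig[v] / sig[w]
    delta[root] = 0.0                      # the root is an endpoint, not "between"
    return delta

def betweenness_parallel(adj, workers=4):
    """Sum the independent per-root updates computed by a pool of workers."""
    total = {v: 0.0 for v in adj}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for delta in pool.map(lambda r: single_source_updates(adj, r), adj):
            for v, x in delta.items():
                total[v] += x
    return total

# Hypothetical test graph: the undirected path 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(betweenness_parallel(path))
```

Every worker holds its own depth, count, and update tables while it runs, which is where the factor of P in the storage bounds comes from.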

slide-22
SLIDE 22

Updating Algorithm Array Based Version

Variables:                                   Storage:
A : sparse adjacency matrix, B^S(NxN)        O(M+N)
f : sparse fringe vector, Z^S(N)             O(N)
p : shortest path vector, Z^N                O(N)
S : sparse depth matrix, B^S(NxN)            O(N)
u : centrality update vector, R^N            O(N)

Storage: O(M+N)
Time: O(MN + N^2)

slide-23
SLIDE 23

Updating Algorithm Array Based Version

Discover Paths:
f = fA
f = f .* ¬p
p = p + f
S_d = boolean(f)

Variables:
A : sparse adjacency matrix
f : sparse fringe vector
p : shortest path vector
S : sparse depth matrix
u : centrality update vector
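The four Discover Paths steps can be mimicked in plain Python, with a dictionary standing in for the sparse fringe vector and adjacency lists standing in for the rows of the sparse matrix A. This is a sketch under those assumptions, not the Matlab code behind the slides. The starting values (a fringe holding only the root, and p equal to 1 at the root) are also assumed here.

```python
def discover_paths(A, root):
    """Return (p, S): path counts per vertex and the vertex set at each depth.

    A[v] lists the column indices of the nonzeros in row v of the matrix.
    """
    n = len(A)
    p = [0] * n
    p[root] = 1
    f = {root: 1}                  # sparse fringe: vertex -> path count
    S = [{root}]                   # S[d] plays the role of row d of the depth matrix
    while f:
        # f = fA : sparse vector times sparse matrix
        fa = {}
        for v, c in f.items():
            for w in A[v]:
                fa[w] = fa.get(w, 0) + c
        # f = f .* ¬p : keep only vertices that have not been reached yet
        f = {w: c for w, c in fa.items() if p[w] == 0}
        # p = p + f
        for w, c in f.items():
            p[w] += c
        # S_d = boolean(f)
        if f:
            S.append(set(f))
    return p, S

# Hypothetical 4-vertex "diamond" graph: edges 0-1, 0-2, 1-3, 2-3
A = [[1, 2], [0, 3], [0, 3], [1, 2]]
p, S = discover_paths(A, 0)
print(p, S)
```

One loop iteration advances the whole fringe by one depth, so the number of iterations equals the depth of the BFS tree plus one final pass that finds an empty fringe.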

slide-24
SLIDE 24

Updating Algorithm Array Based Version

Update Centralities:
w = S_d .* (1+u) ./ p
w = Aw
w = w .* S_{d-1} .* p
u = u + w

Variables:
A : sparse adjacency matrix
f : sparse fringe vector
p : shortest path vector
S : sparse depth matrix
u : centrality update vector
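The update steps can be sketched the same way, again with plain Python containers standing in for sparse arrays. The inputs p and S below are the values a discovery pass would give for a small hypothetical graph rooted at vertex 0. The function name and the data layout are assumptions for this example.

```python
def update_centralities(A, p, S):
    """Return u for one root, given path counts p and depth sets S[0..D]."""
    n = len(A)
    u = [0.0] * n
    for d in range(len(S) - 1, 0, -1):
        # w = S_d .* (1 + u) ./ p : nonzero only on vertices at depth d
        w = {v: (1 + u[v]) / p[v] for v in S[d]}
        # w = Aw, then w = w .* S_{d-1} .* p : only rows at depth d-1 survive
        for v in S[d - 1]:
            aw = sum(w.get(x, 0.0) for x in A[v])
            # u = u + w
            u[v] += aw * p[v]
    return u

# Hypothetical 4-vertex "diamond" graph: edges 0-1, 0-2, 1-3, 2-3
A = [[1, 2], [0, 3], [0, 3], [1, 2]]
p = [1, 1, 1, 2]                   # shortest-path counts from root 0
S = [{0}, {1, 2}, {3}]             # vertices at depth 0, 1, 2
print(update_centralities(A, p, S))
```

Computing only the rows of Aw that lie in S_{d-1} gives the same result as forming the full product and masking it afterwards, and it avoids touching the rest of the matrix.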

slide-25
SLIDE 25

Array Based Version Single Processor Performance

SSCA#2 Kernel 4 (Betweenness Centrality on Kronecker Graph)

Data Courtesy of Prof. David Bader & Kamesh Madduri (Georgia Tech)

[Figure: Traversed Edges Per Second; Nedge = 8M, Nvert = 1M, Napprox = 256]

Matlab achieves:

  • 50% of C
  • 50% of sparse matmul
  • No hidden gotchas
slide-26
SLIDE 26

Array Based Version Why is it Useful?

  • Linear Performance within: Factor of 2 of C code
  • Fewer Lines of Code (More work behind-the-scenes)
  • Natural Implementation in:
  • Matlab
  • Maple
  • ...
  • Processes full depth at a time:
  • Low-level parallelism
slide-27
SLIDE 27

Array Based Version P Processors

Discover Paths in Parallel
Update Centralities in Parallel

Storage: O(M+N)
Time: O((MN + N^2)/P)

slide-28
SLIDE 28

Array Based P Processor Version Why is it Useful?

  • Performance: Currently Untested
  • Memory per machine: Scales as Expected
  • Fewer Lines of Code (More work behind-the-scenes)
  • Natural Implementation in:
  • pMatlab
  • Star-P
  • ...
slide-29
SLIDE 29

Matrix Based Version How does it work?

Choose a vertex block size V (Optimal Size in tests, V = 16)

Variables:                                   Storage:
A : sparse adjacency matrix, B^S(NxN)        O(M+N)
f : sparse fringe vector, Z^S(VxN)           O(VN)
p : shortest path vector, Z(VxN)             O(VN)
S : sparse depth matrix, B^S(VxNxN)          O(VN)
u : centrality update vector, R(VxN)         O(VN)

Time: O(N^2 + MN)
Storage: O(VN + M)
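The blocked idea can be sketched by carrying V roots through the discovery and update phases together, so that f, p, S, and u each gain one row per root in the block. This is a plain-Python illustration under that reading of the slide, assuming an undirected graph with vertices numbered from 0. The function name and data layout are hypothetical, and the default block size of 16 is taken from the slide.

```python
def betweenness_blocked(A, block_size=16):
    """Betweenness over all roots, processing block_size roots at a time."""
    n = len(A)
    total = [0.0] * n
    for start in range(0, n, block_size):
        roots = list(range(start, min(start + block_size, n)))
        k = len(roots)
        # Row i of p, f, S[d], and u belongs to roots[i]
        p = [[0] * n for _ in range(k)]
        for i, r in enumerate(roots):
            p[i][r] = 1
        f = [{r: 1} for r in roots]
        S = [[{r} for r in roots]]
        while any(f):                              # discover paths for the block
            new_f = []
            for i in range(k):
                fa = {}
                for v, c in f[i].items():          # f = fA
                    for w in A[v]:
                        fa[w] = fa.get(w, 0) + c
                row = {w: c for w, c in fa.items() if p[i][w] == 0}  # f .* ¬p
                for w, c in row.items():           # p = p + f
                    p[i][w] += c
                new_f.append(row)
            f = new_f
            if any(f):
                S.append([set(row) for row in f])  # S_d = boolean(f)
        u = [[0.0] * n for _ in range(k)]
        for d in range(len(S) - 1, 0, -1):         # update centralities
            for i in range(k):
                w = {v: (1 + u[i][v]) / p[i][v] for v in S[d][i]}
                for v in S[d - 1][i]:
                    u[i][v] += p[i][v] * sum(w.get(x, 0.0) for x in A[v])
        for i, r in enumerate(roots):              # skip each root's own entry
            for v in range(n):
                if v != r:
                    total[v] += u[i][v]
    return total

# Hypothetical test graph: the undirected path 0 - 1 - 2 - 3
A = [[1], [0, 2], [1, 3], [2]]
print(betweenness_blocked(A, block_size=2))
```

The dense k-by-n tables for p and u are what give the O(VN) storage term. In an array language the inner loop over i would become a single matrix operation on the whole block.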

slide-30
SLIDE 30

Acknowledgements

  • Original Updating Algorithm: Ulrik Brandes
  • Parallel Updating Algorithm: David Bader, Kamesh Madduri
  • Collaboration at Lincoln Labs: Jeremy Kepner
  • LA Graph Algorithms: Jeremy Fineman, Crystal Kahn