[PPT] - EFFICIENT MAXIMUM FLOW ALGORITHM Nikolay Sakharnykh Hugo Braun; May PowerPoint Presentation

SLIDE 1

Nikolay Sakharnykh – Hugo Braun; May 10, 2017

EFFICIENT MAXIMUM FLOW ALGORITHM

SLIDE 2

MAXIMUM FLOW

Definition

Directed graph
Flow capacities on edges
Maximum flow from s to t?

Example: How much instantaneous power can Palo Alto get from this electric grid?

[Figure: electric grid drawn as a directed graph with nodes Power plant, SF, SJ and PA; edge capacities 20, 30, 8, +∞, 25, 12 and 1]

SLIDE 3

MAXIMUM FLOW

Applications

Image segmentation
Community detection

SLIDE 4

MAXIMUM FLOW SOLVERS

AUGMENTING PATHS
Iteratively find a new augmenting path
Ford–Fulkerson
Edmonds–Karp
Dinic’s/MPM

PREFLOW
Push flow locally in a preflow graph
Push-relabel and its variants

OTHER
Linear programming

SLIDE 5

AGENDA

Edmonds-Karp
Push-relabel
MPM

SLIDE 6

FORD FULKERSON

Workflow

while a path p of capacity c > 0 exists from s to t:
    maxflow += c
    for all edges in path p:
        edge.capacity -= c
        add reverse edge from edge.destination to edge.source of capacity c

[Figure: graph on s, a, b, c, d, t with edge capacities 1, 1, 3, 3, 5, 1, 1]
Augmenting path on first iteration: capacity min(1, 3, 1) = 1
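The workflow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ford_fulkerson`, the `(u, v, capacity)` edge format and the small test graph are hypothetical and do not come from the slides, and the path search here is a plain depth-first search.

```python
from collections import defaultdict


def ford_fulkerson(edges, s, t):
    """Max flow by repeatedly augmenting along any s-t path (assumes s != t).

    edges: iterable of (u, v, capacity) with non-negative integer capacities.
    """
    # residual[u][v] is the remaining capacity of edge u -> v.
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        residual[u][v] += c

    def find_path(u, limit, visited):
        # Depth-first search for a path with positive residual capacity.
        if u == t:
            return limit
        visited.add(u)
        for v, cap in list(residual[u].items()):
            if cap > 0 and v not in visited:
                pushed = find_path(v, min(limit, cap), visited)
                if pushed > 0:
                    residual[u][v] -= pushed  # edge.capacity -= c
                    residual[v][u] += pushed  # reverse edge of capacity c
                    return pushed
        return 0

    maxflow = 0
    while True:
        c = find_path(s, float("inf"), set())
        if c == 0:
            return maxflow
        maxflow += c


# Hypothetical example graph (not the one drawn on the slide).
example = [("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3)]
print(ford_fulkerson(example, "s", "t"))  # 5
```

Each call to `find_path` is one full graph traversal, which is the cost the following slides try to reduce.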

SLIDE 7

FORD FULKERSON

Workflow

[Figure: residual graph on s, a, b, c, d, t after the first augmentation, with edge capacities 1, 2, 1, 5, 1, 1, 1, 3]
Augmenting path on first iteration: capacity min(1, 3, 1) = 1

SLIDE 8

FORD FULKERSON

Workflow

[Figure: residual graph on s, a, b, c, d, t after the first augmentation, with edge capacities 1, 2, 1, 5, 1, 1, 1, 3]
Augmenting path on first iteration: capacity min(1, 3, 1) = 1
Augmenting path on second iteration: capacity min(1, 1, 1, 3, 5) = 1

SLIDE 9

EDMONDS-KARP

Edmonds-Karp: variation of Ford-Fulkerson
Main idea: use the shortest augmenting path
One augmenting path needs one BFS
Wikipedia graph: ~5000 augmenting paths
Ford-Fulkerson and its variants use too many graph traversals

[Figure: residual graph on s, a, b, c, d, t with edge capacities 1, 2, 1, 5, 1, 1, 1, 3]
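To make the "one BFS per augmenting path" cost concrete, here is a minimal Python sketch of Edmonds-Karp that also counts its BFS runs. The name `edmonds_karp`, the edge format and the test graph are assumptions made for illustration, not code from the talk.

```python
from collections import defaultdict, deque


def edmonds_karp(edges, s, t):
    """Return (maxflow, number of BFS runs) using shortest augmenting paths."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        residual[u][v] += c

    maxflow, bfs_runs = 0, 0
    while True:
        # One BFS finds one shortest augmenting path (or proves none exists).
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        bfs_runs += 1
        if t not in parent:
            return maxflow, bfs_runs

        # Bottleneck capacity along the path found by the BFS.
        c, v = float("inf"), t
        while parent[v] is not None:
            c = min(c, residual[parent[v]][v])
            v = parent[v]

        # Update the residual graph along the path.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= c
            residual[v][u] += c
            v = u
        maxflow += c


# Hypothetical example graph (not the one drawn on the slide).
example = [("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3)]
flow, bfs_runs = edmonds_karp(example, "s", "t")
print(flow, bfs_runs)
```

On this toy graph the number of BFS runs is the number of augmenting paths plus one final BFS that fails to reach t, which is why thousands of augmenting paths translate into thousands of traversals.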

SLIDE 10

PUSH-RELABEL

SLIDE 11

PUSH-RELABEL

Workflow

l[v]: label (height) of vertex v; e[v]: excess flow at v; c(u, v): residual capacity of edge (u, v)

Saturate all out-edges of s, create reverse edges
l[s] = number of vertices
l[v] = 0 for all v ∈ V \ {s}
while there is an applicable push or relabel operation:
    execute the operation

push(u, v):
    if (e[u] > 0 and c(u, v) > 0 and l[u] == l[v] + 1)
        push min(e[u], c(u, v)) units of flow from u to v

relabel(u):
    if (e[u] > 0 and l[u] <= l[v] for all current neighbors v)
        l[u] = minimum l[v] among neighbors + 1

[Figure: part of the graph with vertices s, a, b, d and unit edge capacities; labels l and excesses e not yet assigned]
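A sequential Python sketch of the push and relabel operations may help when following the next slides. It is an illustration only: the function name `push_relabel`, the stack used to pick the next active vertex and the test graph are assumptions, and a push moves at most the residual capacity of the edge. Self-loops are assumed absent.

```python
from collections import defaultdict


def push_relabel(edges, s, t):
    """Generic push-relabel; returns the excess collected at t (the max flow)."""
    residual = defaultdict(lambda: defaultdict(int))
    vertices = {s, t}
    for u, v, c in edges:
        residual[u][v] += c
        residual[v][u] += 0  # make sure the reverse edge exists
        vertices.update((u, v))

    label = {v: 0 for v in vertices}   # l[v]
    excess = {v: 0 for v in vertices}  # e[v]
    label[s] = len(vertices)

    # Saturate all out-edges of s.
    for v in list(residual[s]):
        c = residual[s][v]
        residual[s][v] -= c
        residual[v][s] += c
        excess[v] += c
        excess[s] -= c

    # Active vertices: positive excess, neither source nor sink.
    active = [v for v in vertices if v not in (s, t) and excess[v] > 0]
    while active:
        u = active.pop()
        while excess[u] > 0:
            pushed = False
            for v in residual[u]:
                if residual[u][v] > 0 and label[u] == label[v] + 1:
                    d = min(excess[u], residual[u][v])  # push
                    residual[u][v] -= d
                    residual[v][u] += d
                    excess[u] -= d
                    excess[v] += d
                    if v not in (s, t) and excess[v] == d:
                        active.append(v)  # v just became active
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:
                # relabel: lift u just above its lowest residual neighbor
                label[u] = 1 + min(label[v] for v in residual[u] if residual[u][v] > 0)
    return excess[t]


# Hypothetical example graph (not the one drawn on the slide).
example = [("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3)]
print(push_relabel(example, "s", "t"))  # 5
```

The order in which `active` is consumed is exactly the kind of heuristic choice discussed under "Parallelism issues"; a stack is used here only because it is the shortest thing to write.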

SLIDE 12

PUSH-RELABEL

Workflow

[Figure: s (l=6); a (l=0, e=1); b (l=0, e=1); d (l=0, e=0); unit edge capacities]

SLIDE 13

PUSH-RELABEL

Workflow

[Figure: s (l=6); a (l=0, e=1); b (l=1, e=1); d (l=0, e=0); unit edge capacities]

SLIDE 14

PUSH-RELABEL

Workflow

[Figure: s (l=6); a (l=0, e=1); b (l=1, e=0); d (l=0, e=1); unit edge capacities]

SLIDE 15

PUSH-RELABEL

Parallelism issues

while there is an applicable push or relabel operation:
    execute the operation

[Figure: s (l=6); a (l=0, e=1); b (l=1, e=0); d (l=0, e=1); unit edge capacities]
At this step, we could relabel a or d. Which one?

Complexity of heuristics:

PRIORITY      LARGEST L    SMALLEST L    FIFO
Complexity    O(V²√E)      O(V²E)        O(V³)

Source of parallelism:

Order affects convergence. Massive parallelism yields random order

SLIDE 16

PUSH-RELABEL

Parallelism issues

Parallelism drops. Not enough to saturate the GPU

Source: The University of Texas at Austin

In theory, number of threads = number of vertices
In practice, number of active vertices << number of vertices

SLIDE 17

PUSH-RELABEL

Conclusion

[Figure: s (l=6); a (l=0, e=1); b (l=1, e=0); d (l=0, e=1); unit edge capacities]

Actual parallelism is low
Massive parallelism yields random order, which damages performance
We need graph traversals (BFS) for some critical heuristics
Push-relabel is not suited to a GPU implementation
road_usa: GPU does 20 BFS, CPU does only 3 BFS
CPU is faster since it requires fewer traversals

SLIDE 18

MPM

SLIDE 19

DINIC’S

Workflow

[Figure: graph on s, a, b, c, d, t with edge capacities 2, 1, 3, 3, 2, 1, 1; the edges on paths of length 3 are highlighted]

Two augmenting paths of length 3
They have been discovered using just one BFS
Avoid running BFS twice here
Main idea of Dinic’s: reuse BFS results

SLIDE 20

DINIC’S

Workflow

While s is still connected to t in G, do
    Create layer graph GL containing only shortest paths from s to t
    While s is still connected to t in GL, do
        Find augmenting path from s to t in GL
        Push corresponding flow in GL, update edges

[Figure: graph G on s, a, b, c, d, t with edge capacities 2, 1, 3, 3, 2, 1, 1]
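The two nested loops above can be sketched in Python as follows. This is a simplified illustration, not the implementation from the talk: the name `dinic` and the test graph are hypothetical, the layer graph is represented implicitly by BFS levels, and dead ends are pruned by invalidating their level instead of using the usual current-arc pointers.

```python
from collections import defaultdict, deque


def dinic(edges, s, t):
    """Return (maxflow, number of layer graphs built)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        residual[u][v] += c
        residual[v][u] += 0  # make sure the reverse edge exists

    def bfs_levels():
        # One BFS labels every vertex with its distance from s.
        level = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in level:
                    level[v] = level[u] + 1
                    queue.append(v)
        return level

    def dfs(u, limit, level):
        # Follow only edges that go exactly one layer forward.
        if u == t:
            return limit
        for v, cap in residual[u].items():
            if cap > 0 and level.get(v) == level[u] + 1:
                pushed = dfs(v, min(limit, cap), level)
                if pushed > 0:
                    residual[u][v] -= pushed
                    residual[v][u] += pushed
                    return pushed
        level[u] = -1  # dead end: never enter u again in this phase
        return 0

    maxflow, phases = 0, 0
    while True:
        level = bfs_levels()
        if t not in level:
            return maxflow, phases
        phases += 1
        while True:
            pushed = dfs(s, float("inf"), level)
            if pushed == 0:
                break
            maxflow += pushed


# Hypothetical example graph (not the one drawn on the slide).
example = [("s", "a", 3), ("s", "b", 2), ("a", "b", 1), ("a", "t", 2), ("b", "t", 3)]
print(dinic(example, "s", "t"))  # (5, 2)
```

On this toy graph the first layer graph yields two augmenting paths from a single BFS, which is the reuse the slide describes; the sequential DFS inside each phase is the part that maps poorly to a GPU.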

SLIDE 21

DINIC’S

Workflow

[Figure: layer graph GL(3) on s, a, b, c, d, t with edge capacities 2, 3, 3, 2, 1]

SLIDE 22

DINIC’S

Workflow

[Figure: layer graph GL(3) on s, a, b, c, d, t during the DFS; edge labels 2, 2, 1 and 1, 1, 1, 2, 1]

SLIDE 23

DINIC’S

Workflow

[Figure: layer graph GL(4) on s, a, b, c, d, t; edge labels 2, 1 and 1, 1, 1]

SLIDE 24

DINIC’S

Workflow

[Figure: layer graph GL(4) on s, a, b, c, d, t during the DFS; edge labels 1 and 1, 1, 1, 1, 1]

DFS traverses all vertices on the GPU
We lose all the advantages of Dinic’s

SLIDE 25

MPM

Workflow

While s is still connected to t in G, do
    Create layer graph GL containing only shortest paths from s to t
    While s is still connected to t in GL, do
        Find vertex u with minimum potential m, with potential(u) = min(degree_in(u), degree_out(u))
        Push m from u to t, pull m from s to u
        Remove all vertices with min(degree_in(u), degree_out(u)) = 0

Here degree_in(u) and degree_out(u) are the total residual capacities of the edges entering and leaving u in GL.

[Figure: graph G on s, a, b, c, d, t with edge capacities 2, 1, 3, 3, 2, 1, 1]
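The potential used to select a vertex can be computed as in the short Python sketch below. It covers only the potential computation and vertex selection, not the push/pull/prune steps. The function name `potentials`, the edge format and the example layer graph are hypothetical, and the convention that s uses only its outgoing capacity and t only its incoming capacity is an assumption of this sketch.

```python
def potentials(layer_edges, s, t):
    """Potential of each vertex of a layer graph.

    layer_edges: iterable of (u, v, residual_capacity) triples of the layer graph.
    """
    cap_in, cap_out = {}, {}
    for u, v, c in layer_edges:
        cap_out[u] = cap_out.get(u, 0) + c  # capacity-weighted out-degree
        cap_in[v] = cap_in.get(v, 0) + c    # capacity-weighted in-degree

    pot = {}
    for v in set(cap_in) | set(cap_out):
        if v == s:
            pot[v] = cap_out.get(v, 0)  # nothing enters the source
        elif v == t:
            pot[v] = cap_in.get(v, 0)   # nothing leaves the sink
        else:
            pot[v] = min(cap_in.get(v, 0), cap_out.get(v, 0))
    return pot


# Hypothetical layer graph (not the one drawn on the slide).
layer = [("s", "a", 3), ("s", "b", 2), ("a", "t", 1), ("b", "t", 3)]
pot = potentials(layer, "s", "t")
u = min(pot, key=pot.get)
print(u, pot[u])  # a 1
```

Because the selected vertex has the smallest potential, every other vertex can forward at least that much flow, so pushing m to t and pulling m from s never gets stuck and needs no path search.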

SLIDE 26

MPM

Workflow

[Figure: graph G on s, b, c, d, t with edge capacities 2, 3, 3, 2, 1]

SLIDE 27

MPM

Workflow

[Figure: layer graph GL(3) on s, b, c, d, t with edge capacities 2, 3, 3, 2, 1]

Vertex c is selected, so we know that 1 unit of flow will pass through it

SLIDE 28

MPM

Workflow

[Figure: layer graph GL(3) on s, b, c, d, t with edge capacities 1, 2, 3, 2]

Vertex c is selected, so we know that 1 unit of flow will pass through it

SLIDE 29

MPM

Workflow

[Figure: layer graph GL(3) on s, b, d, t with edge capacities 1, 3, 2]

SLIDE 30

MPM

Dinic’s vs MPM

Dinic’s - DFS
[Figure: graph on s, a, b, c, d, e, f, g, h, t with edge capacities; legend: processed but useless edges, processed and acceptable edges]

MPM – Push/Pull/Prune
[Figure: graph on s, c, d, h, t with edge capacities 7, 3, 1, 3, 4]
Across the graph, min potential = 1 (vertex h)
Pushing 1 to t, pulling 1 from s, using any edges

SLIDE 31

MPM

Dinic’s vs MPM

[Figure: graph on s, c, d, h, t with edge capacities 7, 3, 1, 3, 4]

Saturating one augmenting path on GPU:

MPM: push/pull/prune process, 30 µs
Edmonds-Karp/Dinic’s: one BFS, > 1 ms
Performance is bounded by kernel launch latency

Example: Wikipedia 2011

MPM: 5 BFS, 6000 augmenting paths
EK: 6000 BFS
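A back-of-the-envelope calculation shows what these counts imply. The sketch below is an estimate under stated assumptions, not a measurement: it takes the slide's "> 1 ms" as exactly 1 ms per BFS (a lower bound), 30 µs per push/pull/prune pass, and ignores every other cost. All names are hypothetical.

```python
BFS_US = 1000       # assumed cost of one BFS: the slide's "> 1 ms", used as a floor
PUSH_PULL_US = 30   # cost of one push/pull/prune pass, from the slide


def edmonds_karp_us(num_paths):
    # Edmonds-Karp runs one BFS per augmenting path.
    return num_paths * BFS_US


def mpm_us(num_bfs, num_paths):
    # MPM runs a few BFS plus one cheap push/pull/prune pass per path.
    return num_bfs * BFS_US + num_paths * PUSH_PULL_US


ek = edmonds_karp_us(6000)   # 6,000,000 us = 6 s
mpm = mpm_us(5, 6000)        # 5,000 + 180,000 = 185,000 us
print(ek, mpm, round(ek / mpm, 1))  # 6000000 185000 32.4
```

Under these assumptions the traversal cost alone differs by roughly a factor of 30 on the Wikipedia 2011 example, before any other effect is considered.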

SLIDE 32

MPM

GPU design

[Figure: graph on s, c, d, h, t with edge capacities 7, 3, 1, 3, 4]

The MPM paper gives a high-level implementation
Most of the work went into GPU implementation design (2 out of 3 months)

SLIDE 33

MAXIMUM FLOW RESULTS

GRAPH       N (vertices)   NNZ (edges)   SPEED-UP AVG   MIN    MAX
wiki03      455436         3811198       9.1            1.7    15.3
wiki11      3721339        121043107     22.5           19.8   28.9
road_usa    23947347       57708624      2              0.7    4.2
road CA     1971281        5533214       2.3            0.8    4.9

Galois on dual socket Haswell 16 cores vs NVIDIA Titan X (Pascal)

SLIDE 34

EFFICIENT MAXIMUM FLOW ALGORITHM

Takeaways

Black-box solver: a large variety of applications can be seen as a flow problem.
Data-dependent, irregular algorithm: how to create enough “real” parallelism and how to avoid latency issues on the GPU.
Order-of-magnitude speed-ups on wide graphs. Long graphs require a more efficient graph traversal implementation.

SLIDE 35

REFERENCES

An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision, Yuri Boykov, Vladimir Kolmogorov
Finding Web Communities by Maximum Flow Algorithm using Well-Assigned Edge Capacities, Noriko Imafuji, Masaru Kitsuregawa
An O(|V|³) algorithm for finding maximum flows in networks, V.M. Malhotra, M. Pramodh Kumar, S.N. Maheshwari
Parallelizing the Push-Relabel Max Flow Algorithm, Victoria Popic, Javier Vélez

SLIDE 36