Algorithm Engineering for Optimal Graph Bipartization (Falk Hüffner)


slide-1
SLIDE 1

Algorithm Engineering for Optimal Graph Bipartization

Falk Hüffner

Institut für Informatik, Friedrich-Schiller-Universität Jena

4th International Workshop on Efficient and Experimental Algorithms

slide-2
SLIDE 2

Outline

slide-3
SLIDE 3

DNA Sequence Assembly

Diploid cells have two copies of each chromosome

slide-4
SLIDE 4

DNA Sequence Assembly

Chromosome assignments of the fragments in shotgun assembly are initially unknown

slide-5
SLIDE 5

DNA Sequence Assembly

Pairwise conflicts indicate that two fragments are from different copies

slide-7
SLIDE 7

DNA Sequence Assembly

Reconstruction of chromosome assignment from the bipartite conflict graph

slide-8
SLIDE 8

Minimum Fragment Removal

In practice, contamination occurs.

slide-9
SLIDE 9

Minimum Fragment Removal

Contamination fragments will conflict with fragments from both copies.

slide-10
SLIDE 10

Minimum Fragment Removal

The task is to recognize contamination fragments.

slide-12
SLIDE 12

Formalization as Graph Bipartization

Graph Bipartization

Input: An undirected graph G = (V, E) and a nonnegative integer k.

Task: Find a subset C ⊆ V of vertices with |C| ≤ k such that G[V \ C] is bipartite.

Equivalent formulation: Odd Cycle Cover

Task: Find a subset C ⊆ V of vertices with |C| ≤ k such that C touches every odd cycle in G.
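The problem statement can be made concrete with a small sketch. It checks by breadth-first 2-coloring whether G[V \ C] is bipartite, and finds a solution by exhaustive search over all vertex subsets of size at most k. The adjacency-dictionary representation and the names `is_bipartite`, `brute_force_bipartization`, and `triangle` are assumptions chosen for illustration. The exhaustive search is exponential in n and is not the method of the talk.

```python
from collections import deque
from itertools import combinations

def is_bipartite(adj, removed=frozenset()):
    """Return True if the graph without the vertices in `removed` is 2-colorable."""
    color = {}
    for start in adj:
        if start in removed or start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in removed:
                    continue
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # an odd cycle survives the removal
    return True

def brute_force_bipartization(adj, k):
    """Try all vertex subsets of size 0..k; for tiny inputs only."""
    for size in range(k + 1):
        for cand in combinations(sorted(adj), size):
            if is_bipartite(adj, frozenset(cand)):
                return set(cand)
    return None  # no odd cycle cover of size at most k exists

# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(brute_force_bipartization(triangle, 1))
```

On this example no solution exists for k = 0, because the triangle is an odd cycle, while removing any one triangle vertex suffices for k = 1.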

slide-14
SLIDE 14

Graph Bipartization

◮ Graph Bipartization is NP-complete [Lewis and Yannakakis, JCSS 1980]; it has numerous applications, e.g. in VLSI design and register allocation

◮ Graph Bipartization is MaxSNP-hard [Papadimitriou and Yannakakis, JCSS 1991]. The best known polynomial-time approximation is by a factor of log |V| [Garg, Vazirani, and Yannakakis, SIAM J. Comput. 1996]

slide-17
SLIDE 17

Parameterization

Approach: For Minimum Fragment Removal, k ≪ n. Try to confine the combinatorial explosion to k.

Definition

For some parameter k of a problem, the problem is called fixed-parameter tractable with respect to k if there is an algorithm that solves it in $f(k) \cdot n^{O(1)}$ time, where f is a function of k alone and n is the input size.

Graph Bipartization is fixed-parameter tractable with respect to k [Reed, Smith, and Vetta, Oper. Res. Lett. 2004].

slide-18
SLIDE 18

Iterative Compression

Approach: use a compression routine iteratively.

Compression routine: Given a size-(k + 1) solution, either computes a size-k solution or proves that there is no size-k solution.
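One common way to apply a compression routine iteratively is to add the vertices one at a time: the current cover plus the new vertex is always a cover of the grown subgraph, and it is compressed whenever its size reaches k + 1. The sketch below follows that scheme, which is an assumption here because the slide does not spell out the loop. The compression step `compress_brute_force` is an exhaustive stand-in, not the cut-based routine of the talk, and all function names are illustrative.

```python
from itertools import combinations

def is_bipartite(adj, vertices):
    """2-color the subgraph induced by `vertices`; True if it has no odd cycle."""
    color = {}
    for s in vertices:
        if s in color:
            continue
        color[s] = 0
        stack = [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in vertices:
                    continue
                if v not in color:
                    color[v] = 1 - color[u]
                    stack.append(v)
                elif color[v] == color[u]:
                    return False
    return True

def compress_brute_force(adj, vertices, cover):
    """Stand-in compression: look for a cover with one vertex fewer than `cover`.
    Exhaustive search over subsets; only the size of `cover` is used."""
    for cand in combinations(sorted(vertices), len(cover) - 1):
        if is_bipartite(adj, vertices - set(cand)):
            return set(cand)
    return None

def iterative_compression(adj, k, compress=compress_brute_force):
    """Return an odd cycle cover of size at most k, or None if there is none."""
    vertices, cover = set(), set()
    for v in sorted(adj):
        vertices.add(v)
        cover.add(v)          # still a cover of the subgraph induced by `vertices`
        if len(cover) > k:    # size k + 1: compress or give up
            smaller = compress(adj, vertices, cover)
            if smaller is None:
                return None   # no size-k cover for a subgraph, hence none for G
            cover = smaller
    return cover

# Two vertex-disjoint triangles: 0-1-2 and 3-4-5.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(iterative_compression(two_triangles, 2))
```

The loop calls the compression routine at most n times, so the total running time is n times the cost of one compression. Any routine with the same signature can be passed in through the `compress` parameter.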

slide-19
SLIDE 19

Compression Routine for Graph Bipartization

Idea: Convert the covering problem to a cut problem.

slide-27
SLIDE 27

Valid Partitions

But: The resulting multi-cut problem is still NP-complete!

Definition

A valid partition divides the vertices into input vertices and output vertices such that for each pair one is input and one is output.

A cut between the input vertices and the output vertices of a valid partition provides a smaller bipartization solution.

Lemma [Reed, Smith, and Vetta 2004]

If there is a smaller bipartization solution, then there is a valid partition such that this solution is a cut between the input vertices and the output vertices.
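The definition can be illustrated by enumerating every valid partition for a given list of pairs. How the pairs arise from the graph is shown only in figures that are not reproduced in this transcript, so the sketch takes the pairs as given. The vertex names and the function name `valid_partitions` are hypothetical.

```python
from itertools import product

def valid_partitions(pairs):
    """Yield (inputs, outputs) for every way to pick one input vertex per pair."""
    for choice in product((0, 1), repeat=len(pairs)):
        inputs = [pair[c] for pair, c in zip(pairs, choice)]
        outputs = [pair[1 - c] for pair, c in zip(pairs, choice)]
        yield inputs, outputs

pairs = [("a1", "a2"), ("b1", "b2"), ("c1", "c2")]  # hypothetical vertex names
partitions = list(valid_partitions(pairs))
print(len(partitions))
```

Each pair contributes an independent binary choice, so a list of p pairs yields 2^p valid partitions.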

slide-28
SLIDE 28

Valid Partitions

slide-30
SLIDE 30

Compression Routine for Graph Bipartization

Compression Routine:

◮ Enumerate all $2^k$ valid partitions

◮ For each, find a vertex cut in O(k · m) time

Theorem

Graph Bipartization can be solved in $O(3^k \cdot kmn)$ time.

slide-31
SLIDE 31

Experimental Results

Run time in seconds for some Minimum Site Removal instances

Instance    n     m   k      ILP     Reed
A31        30    51   2     0.02     0.00
J24       142   387   4     0.97     0.00
A10        69   191   6     2.50     0.00
J18        71   296   9    47.86     0.05
A11       102   307  11  6248.12     0.79
A34       133   451  13        -    10.13
A22       167   641  16        -   350.00
A50       113   468  18        -  3072.82
A45        80   386  20        -        -
A40       136   620  22        -        -
A17       151   633  25        -        -
A28       167   854  27        -        -
A42       236  1110  30        -        -
A41       296  1620  40        -        -

[Data from Wernicke 2003]

slide-35
SLIDE 35

Using Gray Codes to Enumerate Valid Partitions

◮ The flow problems for different valid partitions are “similar” in such a way that we can “recycle” the flow networks for each problem

◮ Using a Gray code, we can enumerate valid partitions such that adjacent partitions differ in only one element

◮ Each step then takes only O(m) time, as opposed to O(km) time for solving a flow problem from scratch

◮ Worst-case speedup by a factor of k
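The enumeration order described above can be sketched with the binary reflected Gray code, in which the bit flipped at step i is the lowest set bit of i. The sketch only produces the order of the partitions. The incremental update of the flow network is not shown, and the pair list and function names are assumptions for illustration.

```python
def gray_code_flips(k):
    """Yield, for i = 1 .. 2^k - 1, the index of the single bit that changes
    between consecutive words of the binary reflected Gray code."""
    for i in range(1, 1 << k):
        yield (i & -i).bit_length() - 1  # position of the lowest set bit of i

def enumerate_partitions_gray(pairs):
    """Yield the input side of each valid partition so that consecutive
    partitions differ in exactly one pair."""
    side = [0] * len(pairs)  # side[j] selects which element of pair j is the input

    def current():
        return tuple(p[s] for p, s in zip(pairs, side))

    yield current()
    for j in gray_code_flips(len(pairs)):
        side[j] ^= 1         # swap input and output within pair j only
        yield current()

pairs = [("a1", "a2"), ("b1", "b2"), ("c1", "c2")]  # hypothetical vertex names
sequence = list(enumerate_partitions_gray(pairs))
for inputs in sequence:
    print(inputs)
```

Because only one pair changes between neighbours, an implementation that keeps the flow from the previous partition has to repair it for a single pair instead of recomputing it from scratch.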

slide-36
SLIDE 36

Experimental Results

Run time in seconds for some Minimum Site Removal instances

Instance    n     m   k      ILP     Reed     Gray
A31        30    51   2     0.02     0.00     0.00
J24       142   387   4     0.97     0.00     0.00
A10        69   191   6     2.50     0.00     0.00
J18        71   296   9    47.86     0.05     0.01
A11       102   307  11  6248.12     0.79     0.14
A34       133   451  13        -    10.13     1.04
A22       167   641  16        -   350.00    64.88
A50       113   468  18        -  3072.82   270.60
A45        80   386  20        -        -  2716.87
A40       136   620  22        -        -        -
A17       151   633  25        -        -        -
A28       167   854  27        -        -        -
A42       236  1110  30        -        -        -
A41       296  1620  40        -        -        -

[Data from Wernicke 2003]

slide-38
SLIDE 38

A Heuristic for Dense Graphs

◮ By examining the subgraph induced by the known odd cycle cover, we can omit many valid partitions from consideration

◮ No worst-case speedup for general graphs, but very effective in practice

slide-39
SLIDE 39

Experimental Results

Run time in seconds for some Minimum Site Removal instances

Instance    n     m   k      ILP     Reed     Gray  Enum2Col
A31        30    51   2     0.02     0.00     0.00      0.00
J24       142   387   4     0.97     0.00     0.00      0.00
A10        69   191   6     2.50     0.00     0.00      0.00
J18        71   296   9    47.86     0.05     0.01      0.00
A11       102   307  11  6248.12     0.79     0.14      0.00
A34       133   451  13        -    10.13     1.04      0.04
A22       167   641  16        -   350.00    64.88      0.08
A50       113   468  18        -  3072.82   270.60      0.05
A45        80   386  20        -        -  2716.87      0.14
A40       136   620  22        -        -        -      0.80
A17       151   633  25        -        -        -      5.68
A28       167   854  27        -        -        -      1.02
A42       236  1110  30        -        -        -     73.55
A41       296  1620  40        -        -        -    236.26

[Data from Wernicke 2003]

slide-40
SLIDE 40

Heuristic on Random Graphs

[Figure: run time in seconds (logarithmic axis, $10^{-2}$ to $10^{3}$) against the size of the odd cycle cover (6 to 24), for random graphs with n = 300 and average degree 3, 16, and 64]

slide-42
SLIDE 42

Conclusions

◮ Iterative compression is a superior method for solving Graph Bipartization in practice

◮ This makes the practical evaluation of iterative compression for other applications (such as Feedback Vertex Set) appealing

Future work and open questions:

◮ Combination with data reduction rules

◮ Application to Edge Bipartization

◮ Combination with heuristics