Algorithms for Compact Letter Displays: Comparison and Evaluation



SLIDE 1

Introduction Algorithms Experiments Summary

Algorithms for Compact Letter Displays: Comparison and Evaluation

Jens Gramm¹, Jiong Guo¹, Falk Hüffner¹, Rolf Niedermeier¹, Hans-Peter Piepho², Ramona Schmid³

¹ Friedrich-Schiller-Universität Jena, Institut für Informatik
² Universität Hohenheim, Institut für Pflanzenbau und Grünland
³ Universität Bielefeld, AG Praktische Informatik

Statistik unter einem Dach, 30 March 2007

Gramm et al. Algorithms for Compact Letter Displays: Comparison and Evaluation 1/23

SLIDE 2

Outline

1 Introduction
    All-pairwise comparisons
    Line displays
    Letter displays
    Clique Cover
2 Algorithms
    Insert-Absorb heuristic
    Clique-Growing heuristic
    Search-Tree algorithm
3 Experiments
    Real data
    Simulated data
4 Summary

SLIDE 3

All-pairwise comparisons

Multiple pairwise comparisons among all pairs in a set of n treatments: a common task in routine analyses based on analysis of variance (ANOVA) techniques

Need a way to visualize the $\sim n^2$ pairwise comparison results (significantly different or not significantly different)


SLIDE 5

Line displays

Line display

A pairwise comparison between two treatments is non-significant exactly when the two treatments are connected by a common line.

Example

Given treatments t1, ..., t5, let the comparison of t1 and t5 be significant and all other comparisons non-significant.

t1 t2 t3 t4 t5
[line display: one line spanning t1 to t4, a second spanning t2 to t5]

Disadvantage: not always possible to find a line display

[Piepho, Biometrical J. 2000]

SLIDE 6

Letter displays

Letter display

A pairwise comparison between two treatments is non-significant exactly when the two treatments share a common letter.

Example

Given treatments t1, ..., t5, let the significant comparisons be {{t1, t5}, {t1, t3}, {t2, t4}}.

t1  a b
t2  b d
t3  c d
t4  a c
t5  c d
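The defining property of a letter display can be checked mechanically. The following is a minimal sketch, assuming a display is stored as a dictionary from treatment name to a string of letters; the function name `is_valid_letter_display` and the data layout are illustrative choices, not part of the talk.

```python
from itertools import combinations


def is_valid_letter_display(letters, significant):
    """Return True if two treatments share a letter exactly when
    their comparison is NOT listed as significant."""
    significant = {frozenset(pair) for pair in significant}
    for s, t in combinations(letters, 2):
        share_letter = bool(set(letters[s]) & set(letters[t]))
        is_significant = frozenset((s, t)) in significant
        # A valid display never has both or neither of the two properties.
        if share_letter == is_significant:
            return False
    return True


# The five-treatment example from the slide.
display = {"t1": "ab", "t2": "bd", "t3": "cd", "t4": "ac", "t5": "cd"}
significant = [("t1", "t5"), ("t1", "t3"), ("t2", "t4")]
print(is_valid_letter_display(display, significant))  # True
```

The check looks at every pair once, so it needs on the order of n² letter-set intersections for n treatments.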

SLIDE 7

Line displays vs. letter displays

Letter displays generalize line displays

t1 t2 t3 t4 t5
[line display: one line spanning t1 to t4, a second spanning t2 to t5]

t1  a
t2  a b
t3  a b
t4  a b
t5  b


SLIDE 10

Letter display

Always possible to find? Yes: for each pair of not significantly different treatments, create a new column whose letter is given to exactly those two treatments.

Example

Given treatments t1, ..., t5, let the significant comparisons be {{t1, t5}, {t1, t3}, {t2, t4}}.

t1  a b
t2  a c d
t3  c e f
t4  b e g
t5  d f g

$\sim n^2$ columns: too large.
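The one-column-per-pair construction is easy to write down. Here is a minimal sketch under the same assumed data layout (treatment name mapped to a string of letters); `naive_letter_display` is a hypothetical name, and the sketch only supports up to 26 columns because it draws letters from the lowercase alphabet.

```python
from itertools import combinations
from string import ascii_lowercase


def naive_letter_display(treatments, significant):
    """Give every not significantly different pair its own fresh letter."""
    significant = {frozenset(pair) for pair in significant}
    display = {t: "" for t in treatments}
    letters = iter(ascii_lowercase)  # sketch only: at most 26 columns
    for s, t in combinations(treatments, 2):
        if frozenset((s, t)) not in significant:
            letter = next(letters)
            display[s] += letter
            display[t] += letter
    return display


treatments = ["t1", "t2", "t3", "t4", "t5"]
significant = [("t1", "t5"), ("t1", "t3"), ("t2", "t4")]
for name, row in naive_letter_display(treatments, significant).items():
    print(name, " ".join(row))
```

For the slide's example this reproduces the seven-column display shown above, one column for each of the seven non-significant pairs.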


SLIDE 12

Compact letter displays

Goal

Find a compact letter display (that is, one with a minimum number of columns).

Questions

How large can the letter display get? Unknown
How easy is it to calculate a letter display? Unknown
What is a good algorithm for calculating letter displays? Heuristic [Piepho, J. Comput. Graph. Stat. 2004]


SLIDE 17

Theoretical computer science

We approach these questions with the tools of theoretical computer science:

Focus on provable worst-case running time and provable solution guarantee
Asymptotic algorithm running time analysis
    Running time is stated not in absolute terms, but in relation to the input size n
    Constant factors are ignored
Classification into computational complexity classes captures “intrinsic difficulty”

SLIDE 18

Compact letter display: formal definition

Compact Letter Display

Input: A set T of n treatments, and a set H of m unordered pairs from T (the pairs that are not significantly different).
Task: Find a binary n × k matrix M with minimum k such that
$\{t_1, t_2\} \in H \iff \exists j : M_{t_1,j} = M_{t_2,j} = 1$.
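To make the matrix condition concrete, the sketch below recomputes the set of row pairs that share a 1 in some column and compares it with H. It assumes treatments are identified by row index and M is a list of 0/1 lists; both function names are illustrative. The example matrix encodes the four-column display from the letter display slide (columns a, b, c, d).

```python
from itertools import combinations


def pairs_sharing_a_column(M):
    """All row-index pairs whose rows both have a 1 in some common column."""
    return {
        frozenset((r1, r2))
        for r1, r2 in combinations(range(len(M)), 2)
        if any(a == b == 1 for a, b in zip(M[r1], M[r2]))
    }


def is_letter_display_matrix(M, H):
    """Check the 'if and only if' condition of the formal definition."""
    return pairs_sharing_a_column(M) == {frozenset(pair) for pair in H}


# Rows t1..t5 (indices 0..4), columns a, b, c, d.
M = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]
# Not significantly different pairs of the example, as row indices.
H = [(0, 1), (0, 3), (1, 2), (1, 4), (2, 3), (2, 4), (3, 4)]
print(is_letter_display_matrix(M, H))  # True
```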


SLIDE 20

Clique Cover

Clique Cover

Input: An undirected graph G = (V, E).
Task: Find a minimum number k of cliques (subgraphs with all edges present) such that each edge is contained in at least one clique.


SLIDE 22

Clique Cover

Also known as

Keyword Conflict [Kellerman, IBM 1973]
Intersection Graph Basis [Garey&Johnson 1979]

Applications

compiler optimization, computational geometry, ...

SLIDE 23

Equivalence of Compact Letter Display and Clique Cover

Compact Letter Display ≙ Clique Cover
treatment ≙ vertex
not significantly different ≙ edge
column ≙ clique

[Figure: example letter display for treatments a to h, shown next to the corresponding graph on vertices a to h]

There is a letter display with k columns ⇔ there is a clique cover with k cliques.
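The correspondence can be exercised in code: each clique of a cover becomes one column of the display. This is a minimal sketch with hypothetical helper names (`is_clique_cover`, `cover_to_letter_display`); it reuses the five-treatment example from the letter display slide, with the non-significant pairs as edges.

```python
from itertools import combinations
from string import ascii_lowercase


def is_clique_cover(edges, cliques):
    """True if every given vertex set is a clique and every edge is covered."""
    edges = {frozenset(e) for e in edges}
    covered = set()
    for clique in cliques:
        for pair in combinations(sorted(clique), 2):
            pair = frozenset(pair)
            if pair not in edges:
                return False  # two members without an edge: not a clique
            covered.add(pair)
    return covered == edges


def cover_to_letter_display(vertices, cliques):
    """One letter per clique (sketch only: at most 26 cliques)."""
    display = {v: "" for v in vertices}
    for letter, clique in zip(ascii_lowercase, cliques):
        for v in clique:
            display[v] += letter
    return display


vertices = ["t1", "t2", "t3", "t4", "t5"]
edges = [("t1", "t2"), ("t1", "t4"), ("t2", "t3"), ("t2", "t5"),
         ("t3", "t4"), ("t3", "t5"), ("t4", "t5")]
cliques = [{"t1", "t4"}, {"t1", "t2"}, {"t3", "t4", "t5"}, {"t2", "t3", "t5"}]
print(is_clique_cover(edges, cliques))
print(cover_to_letter_display(vertices, cliques))
```

With these four cliques the resulting display is the four-column one from the letter display slide.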

SLIDE 24

Known results on Clique Cover

Every graph has a clique cover of size at most $n^2/4$ (sharp) [Erdős et al., Canad. J. Math. 1966]
Heuristic [Kellerman, IBM 1973]
NP-hard [Garey&Johnson 1979]

Immediately transferable to Compact Letter Display!
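A standard way to see that the $n^2/4$ bound cannot be improved is the complete bipartite graph on two equal halves. The short derivation below is an illustration added here, not taken from the slides, and assumes n is even.

```latex
% Illustration (not from the slides): K_{n/2,n/2} for even n
\begin{align*}
|E(K_{n/2,\,n/2})| &= \frac{n}{2} \cdot \frac{n}{2} = \frac{n^2}{4}, \\
K_{n/2,\,n/2} \text{ has no triangles} &\implies \text{every clique covers at most one edge}, \\
\text{minimum number of cliques} &= \frac{n^2}{4}.
\end{align*}
```

In letter display terms this corresponds to two groups of n/2 treatments in which every pair within a group differs significantly and no pair across the groups does; for n = 6 that already forces 9 columns.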


SLIDE 29

NP-hardness of Clique Cover

Theoretical computer science equates “efficiently solvable” with “solvable in polynomial time” (that is, there is some constant c such that solving a problem of size n takes at most $n^c$ time)
For none of the several thousand known NP-hard problems has such an efficient algorithm been found
If we can solve one of them efficiently, we can solve all of them efficiently
So we probably cannot solve any of them efficiently...
... but there is no proof for this yet!


SLIDE 31

Insert-Absorb heuristic

Idea

Initially, consider all treatments as not significantly different, and then successively take significantly different pairs into account

[Figure: six treatments start in a single all-ones column; inserting the significant pair {1,2} splits it into two columns, and inserting {3,4} splits those into four]

Redundant columns are “absorbed”
Can produce very large letter displays
Can run very slowly (exponential time)
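The idea on this slide can be sketched in a few lines. This is a simplified reading of insert-and-absorb, assuming columns are stored as sets of treatments and that "absorbing" means dropping any column contained in another one; the published heuristic may differ in its details, and the function name is hypothetical.

```python
def insert_absorb(treatments, significant):
    """Sketch of insert-and-absorb; returns the columns as frozensets."""
    columns = [frozenset(treatments)]  # start: everyone shares one column
    for s, t in significant:
        new_columns = []
        for col in columns:
            if s in col and t in col:
                # Insert: replace the offending column by two copies,
                # one without s and one without t.
                new_columns += [col - {s}, col - {t}]
            else:
                new_columns.append(col)
        # Absorb: drop duplicates and columns contained in another column.
        unique = set(new_columns)
        columns = [c for c in unique if not any(c < other for other in unique)]
    return columns


# Six treatments with the significant pairs {1,2} and {3,4}.
for column in insert_absorb(range(1, 7), [(1, 2), (3, 4)]):
    print(sorted(column))
```

In this sketch, each further significant pair that is disjoint from the earlier ones doubles the number of columns (adding {5,6} to the example gives eight), which is consistent with the slide's warning about very large displays and exponential running time.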


SLIDE 33

Clique-Growing heuristic

Idea

Initially, consider all treatments as significantly different, and then successively take not significantly different pairs into account

Can produce very large letter displays
Provable running time bound: $n^3$
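The slide gives only the idea, so the following is a simple greedy stand-in rather than the heuristic evaluated in the talk: it walks through the non-significant pairs (edges) and, for each one not yet covered, grows a clique around it. The function name and the growth order are assumptions made for illustration; the sketch runs in polynomial time but is not tuned to meet the stated $n^3$ bound.

```python
from itertools import combinations


def greedy_clique_cover(vertices, edges):
    """Greedy sketch: grow one clique around each still-uncovered edge."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    uncovered = {frozenset(e) for e in edges}
    cliques = []
    for u, v in edges:
        if frozenset((u, v)) not in uncovered:
            continue
        clique = {u, v}
        candidates = adj[u] & adj[v]  # vertices adjacent to all clique members
        for w in vertices:
            if w in candidates:
                clique.add(w)
                candidates &= adj[w]
        for pair in combinations(sorted(clique), 2):
            uncovered.discard(frozenset(pair))
        cliques.append(clique)
    return cliques


vertices = ["t1", "t2", "t3", "t4", "t5"]
edges = [("t1", "t2"), ("t1", "t4"), ("t2", "t3"), ("t2", "t5"),
         ("t3", "t4"), ("t3", "t5"), ("t4", "t5")]
for clique in greedy_clique_cover(vertices, edges):
    print(sorted(clique))
```

Like any such greedy rule, the result depends on the order in which edges and vertices are visited, and there is no guarantee that the number of cliques is minimum.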


SLIDE 35

Search-Tree algorithm

Idea

Data reduction rules: replace the instance by a smaller equivalent one.
Enumerate all possibilities of adapting the letter display to a not significantly different pair, and branch accordingly.

Produces optimal letter displays
Can run very slowly (exponential time)
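The branching part of this idea can be sketched without the data reduction rules. The code below is an illustrative exact search, assuming vertex names are sortable and trying k = 0, 1, 2, ... until a cover with k cliques exists; for an uncovered edge it either adds the edge to an existing clique that remains a clique or opens a new one. Names and structure are hypothetical, and the running time is exponential in the worst case.

```python
from itertools import combinations


def exact_clique_cover(edges):
    """Return a minimum list of cliques covering all edges (sketch)."""
    edge_set = {frozenset(e) for e in edges}

    def is_clique(vs):
        return all(frozenset(p) in edge_set for p in combinations(vs, 2))

    def covered(cliques):
        return {frozenset(p) for c in cliques for p in combinations(c, 2)}

    def search(cliques, k):
        uncovered = edge_set - covered(cliques)
        if not uncovered:
            return cliques
        u, v = sorted(min(uncovered, key=sorted))  # deterministic choice
        # Branch 1: add the edge to an existing clique that stays a clique.
        for i, c in enumerate(cliques):
            grown = c | {u, v}
            if is_clique(grown):
                result = search(cliques[:i] + [grown] + cliques[i + 1:], k)
                if result is not None:
                    return result
        # Branch 2: open a new clique if the budget k allows it.
        if len(cliques) < k:
            return search(cliques + [frozenset((u, v))], k)
        return None

    for k in range(len(edge_set) + 1):
        result = search([], k)
        if result is not None:
            return result


edges = [("t1", "t2"), ("t1", "t4"), ("t2", "t3"), ("t2", "t5"),
         ("t3", "t4"), ("t3", "t5"), ("t4", "t5")]
print(len(exact_clique_cover(edges)))  # 4
```

For the five-treatment example the search confirms that four columns are the minimum, so the four-column display from the letter display slide is already compact.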

SLIDE 36

Algorithm analysis: summary

Algorithm        Runtime       Optimality
Insert-Absorb    exponential   no guarantee
Clique-Growing   polynomial    no guarantee
Search-Tree      exponential   guaranteed

SLIDE 37

Crop yield trials

                         Insert-Absorb    Clique-Growing   Search-Tree
Dataset       n    |H|   cols  time [s]   cols  time [s]   cols  time [s]
Triticale    17     86      5      0.00      5      0.00      5      0.00
Rapeseed     74   1758     29      0.15     27      0.03     25      0.35
Wheat       124   4847     56      1.93     50      0.20     49      4.00

Running time tolerable for all algorithms
Clique-Growing seems to give better results than Insert-Absorb

SLIDE 38

Simulated Data

Data generated for an arbitrary number of trials n by a simulation with parameters chosen to give results similar to the rapeseed data sets

[Figure: two plots over the number of treatments (50 to 400), one showing the number of letter display columns and one the runtime [s], each with curves for (1) Search-Tree, (2) Clique-Growing, and (3) Insert-Absorb]

SLIDE 39

Summary

Methods of theoretical computer science give insight into the problem of finding compact letter displays
Finding compact letter displays is hard (NP-hard)
An optimal algorithm (Search-Tree) is fast enough for small to medium size real-world instances
A heuristic initially developed for the Clique Cover problem (Clique-Growing) has a polynomial worst-case time bound and gives good results

SLIDE 40

Open question

It is also desirable to minimize the number of entries in the letter display. (In the Clique Cover model, this is the sum of the clique sizes.)

Question

Is there a solution that minimizes the number of entries in the letter display, but not the number of columns?