Anthropocentric visualization of optimal cover of association rules

SLIDE 1

Anthropocentric visualization of optimal cover of association rules

Amira Mouakher, Sadok Ben Yahia

High Institute of Computer Science, Tunisia
Faculty of Sciences of Tunis, Tunisia

Concept Lattices and Their Applications, CLA 2010

Mouakher, Ben Yahia Anthropocentric visualization of optimal cover of association rules CLA 2010 1 / 30

SLIDE 2

Outline

1. Introduction and motivation
2. Extraction of a minimal cover from a binary relation
3. Virtual reality based visualization of association rules
4. Conclusion and future work

SLIDE 6

Introduction Extraction of a minimal cover from a binary relation Virtual reality based visualization of association rules Conclusion and future work

KDD process

SLIDE 7

Extraction of association rules

Definition

Association rules:
  • Form: premise ⇒ conclusion (support, confidence)
  • Example: "cheese" ⇒ "bread" (50%, 90%)

Given: a dataset in which every transaction is described by an identifier and a list of items.

Goal: discover all the association rules that express correlations between two itemsets.

SLIDE 8

Statistical Metrics

Support: probability that a transaction contains both X and Y (frequency of appearance of XY).

Confidence: conditional probability that a transaction containing X also contains Y, i.e., support(XY) / support(X).

Confidence(R):
  • = 1 → exact association rule
  • < 1 → approximate association rule

Aim: extract all the association rules that satisfy the user-given thresholds minsup and minconf.

How to do that?

1. Collect the frequent itemsets (support ≥ minsup).
2. Extract the valid association rules (confidence ≥ minconf).

However, the extraction cost is exponential in the number of items, i.e., 2^|I|.
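
The two steps above can be sketched in a few lines of Python. This is only a brute-force illustration: the toy transactions and the names `support` and `mine_rules` are assumptions made for the example, not material from the talk. Enumerating every candidate itemset also makes the 2^|I| cost visible.

```python
from itertools import combinations

# Toy transactions (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"bread", "cheese"},
    {"bread", "cheese", "milk"},
    {"bread", "milk"},
    {"cheese"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def mine_rules(transactions, minsup, minconf):
    """Brute-force extraction: frequent itemsets first, then valid rules."""
    items = sorted(set().union(*transactions))
    # Step 1: collect the frequent itemsets (up to 2^|I| candidates).
    frequent = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if support(c, transactions) >= minsup
    ]
    # Step 2: split each frequent itemset into premise => conclusion.
    rules = []
    for itemset in frequent:
        supp = support(itemset, transactions)
        for k in range(1, len(itemset)):
            for premise in combinations(sorted(itemset), k):
                premise = frozenset(premise)
                conf = supp / support(premise, transactions)
                if conf >= minconf:
                    rules.append((premise, itemset - premise, supp, conf))
    return rules

rules = mine_rules(TRANSACTIONS, minsup=0.5, minconf=0.6)
for premise, conclusion, supp, conf in rules:
    print(sorted(premise), "=>", sorted(conclusion), f"({supp:.0%}, {conf:.0%})")
```

On this toy data, "milk" ⇒ "bread" is an exact rule (confidence 1), while the other rules are approximate.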

SLIDE 9

Association rules : Statistics

Dataset      minsup   Exact rules
T10I4D100K   0.5%
Mushroom     30%      7 476
C73D10K      90%      52 035

Dataset      minsup   minconf   Approximate rules
T10I4D100K   0.5%     70%       20 419
                      50%       21 686
Mushroom     30%      70%       37 671
                      50%       56 703
C73D10K      90%      95%       1 606 726
                      85%       2 053 936

SLIDE 10

Our approach

SLIDE 11

Concept lattice structure

Advantages:
  • Compact representation.
  • No loss of information.

Drawbacks:
  • High redundancy.
  • Complexity of the lattice’s construction.

Extraction of a minimal cover:
  • NP-hard problem.
  • Several related works, mainly based on heuristics.

SLIDE 12

Extraction of a minimal cover

Belkhiter et al. (1994): Optimal rectangular decomposition of a binary relation. An application to documentary databases.

The gain function of a formal concept C = (E, I):

gain(C) = (|E| × |I|) − (|E| + |I|)
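
One way to read the gain function: a concept stored as a rectangle needs |E| + |I| elements, whereas listing its couples one by one needs |E| × |I|. The short Python sketch below evaluates it on two concepts that appear in the illustrative example of this deck; the function name `gain` is simply chosen for the example.

```python
def gain(extent, intent):
    """Gain of a formal concept C = (E, I): |E| x |I| - (|E| + |I|)."""
    return len(extent) * len(intent) - (len(extent) + len(intent))

# Two concepts taken from the illustrative example of the deck.
wide = gain({1, 2, 3, 4, 5}, {"C"})         # 5 * 1 - (5 + 1)
compact = gain({1, 2}, {"C", "E", "F"})     # 2 * 3 - (2 + 3)
print(wide, compact)
```

A concept with a single attribute or a single object always has a negative gain, which is why a purely cardinality-based choice favours "square" concepts.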

SLIDE 13

Extraction of a minimal cover

Kcherif et al. (2000): a rectangular decomposition approach based on Riguet’s difunctional relation.

A set of isolated points allows the determination of the minimal set of formal concepts covering a given binary relation:

$R_d = R \circ R^{-1} \circ R \cap R$

SLIDE 14

Extraction of a minimal cover

Belohlavek & Vychodil (2009): a new method for decomposing an n × m binary matrix I into a Boolean product A ◦ B of an n × k binary matrix A and a k × m binary matrix B, with k as small as possible.

For a formal concept (E, I), the intent I can be expressed as

$I = \bigcup_{i \in I} \{\psi \circ \phi(i)\}$

The column y selected is the one that maximizes the following value:

$|I \oplus y| = |(\phi(I \cup \{y\}) \times \psi \circ \phi(I \cup \{y\})) \cap K|$

SLIDE 15

Criticism of these approaches

  • Static metrics: cardinality of the concept.
  • The cover is extracted regardless of the quality of the knowledge.

SLIDE 16

Proposed solution

  • Relies on the formal concept lattice representation.
  • A greedy algorithm for discovering a reduced cover of "pertinent" concepts.
  • The pertinence of a concept is based on the strength of the association rules.
  • The user is an actor of the extraction process and validates pertinent concepts.

SLIDE 17

Correlation measures

  • Many correlation measures have been proposed in the literature.
  • The discovery of frequent patterns helps reduce the number of association rules to those with high added value.
  • Only informative frequent patterns will be derived from the highly correlated patterns.
  • Some of the most used correlation measures follow.

SLIDE 18

Correlation measures

The confidence correlation measure

The ratio of the number of transactions that include all items in the consequent as well as the antecedent (namely, the support) to the number of transactions that include all items in the antecedent.

$\mathrm{Conf}(R) = \frac{\mathrm{Supp}(XY)}{\mathrm{Supp}(X)}$

SLIDE 19

Correlation measures

The lift correlation measure

The lift ratio of an association rule is defined as follows:

$\mathrm{Lift}(R) = \frac{\mathrm{Conf}(R)}{\mathrm{Pr}(R)}$

Pr(R) is called the expected confidence: the number of transactions having the consequent items divided by the total number of transactions.
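
As a small numeric illustration of the definition, the Python sketch below computes the lift of two rules. The toy transactions and the helper names `support` and `lift` are assumptions made for this example only.

```python
# Toy transactions (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"bread", "cheese"},
    {"bread", "cheese", "milk"},
    {"bread", "milk"},
    {"cheese"},
]

def support(itemset):
    """Fraction of transactions containing every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in TRANSACTIONS) / len(TRANSACTIONS)

def lift(premise, conclusion):
    """Confidence of premise => conclusion divided by the expected confidence."""
    confidence = support(set(premise) | set(conclusion)) / support(premise)
    expected_confidence = support(conclusion)  # Pr(R) on the slide
    return confidence / expected_confidence

print(lift({"milk"}, {"bread"}))    # above 1: positively correlated
print(lift({"bread"}, {"cheese"}))  # below 1: negatively correlated
```

A lift above 1 means the premise makes the conclusion more likely than its baseline frequency; a lift below 1 means the opposite.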

SLIDE 20

Correlation measures

The any-confidence correlation measure

An association is interesting if any rule that can be produced from that association has a confidence greater than or equal to our minimum any-confidence value.

The any-confidence measure of a non-empty pattern X ⊆ I is defined as follows:

$\text{any-conf}(X) = \frac{\mathrm{Supp}(\wedge X)}{\min\{\mathrm{Supp}(\wedge i) \mid i \in X\}}$

SLIDE 21

Correlation measures

The all-confidence correlation measure

An association is interesting if all rules that can be produced from that association have a confidence greater than or equal to our minimum all-confidence value.

The all-confidence measure of a non-empty pattern X ⊆ I is defined as follows:

$\text{all-conf}(X) = \frac{\mathrm{Supp}(\wedge X)}{\max\{\mathrm{Supp}(\wedge i) \mid i \in X\}}$

SLIDE 22

Correlation measures

The bond correlation measure

The bond measure computes the ratio between the conjunctive support and the disjunctive one.

The bond measure of a non-empty pattern X ⊆ I is defined as follows:

$\mathrm{bond}(X) = \frac{\mathrm{Supp}(\wedge X)}{\mathrm{Supp}(\vee X)}$
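
The three pattern-level measures (any-confidence, all-confidence, bond) differ only in their denominator, which the following Python sketch makes explicit. The toy transactions and function names are assumptions made for the example; supports are taken as absolute counts, which leaves the ratios unchanged.

```python
# Toy transactions (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b"},
    {"d"},
]

def conj_support(pattern):
    """Number of transactions containing every item of the pattern."""
    pattern = set(pattern)
    return sum(pattern <= t for t in TRANSACTIONS)

def disj_support(pattern):
    """Number of transactions containing at least one item of the pattern."""
    pattern = set(pattern)
    return sum(bool(pattern & t) for t in TRANSACTIONS)

def any_conf(pattern):
    return conj_support(pattern) / min(conj_support({i}) for i in pattern)

def all_conf(pattern):
    return conj_support(pattern) / max(conj_support({i}) for i in pattern)

def bond(pattern):
    return conj_support(pattern) / disj_support(pattern)

X = {"a", "c"}
print(any_conf(X), all_conf(X), bond(X))
```

Since the disjunctive support is at least the largest single-item support, which is at least the smallest one, bond(X) ≤ all-conf(X) ≤ any-conf(X) holds for every non-empty pattern.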

SLIDE 26

Algorithm’s description

Input:
  • A formal context K = (O, I, R).
  • The correlation measure H.
  • The minimum threshold minH.

Output:
  • A pertinent cover of formal concepts Fc.

Steps:
1. For each couple belonging to the formal context, compute its pseudo-concept.
2. From each pseudo-concept, extract its enclosed formal concepts.
3. Validation or rejection of the chosen formal concept by the user.
4. Elimination of the couples formed by the chosen concept.
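
The four steps can be sketched as a greedy loop in Python. This is a minimal interpretation under stated assumptions, not the authors' OptCover implementation: the context is the 7-object example of this deck as read off its pseudo-concepts, the measure H is applied to the intent of each candidate concept, the best-scoring candidate is proposed first, and the user's validation is modelled by a hypothetical `accept` callback. All function names are chosen for the example.

```python
from itertools import combinations

# Context of the deck's illustrative example, as read off its pseudo-concepts
# (object -> attributes). Treat it as an assumption.
CONTEXT = {
    1: set("CEF"), 2: set("CEFG"), 3: set("CEG"), 4: set("CF"),
    5: set("CDF"), 6: set("BFG"), 7: set("ABFG"),
}

def phi(attrs):
    """Objects having every attribute in `attrs`."""
    attrs = set(attrs)
    return frozenset(o for o, own in CONTEXT.items() if attrs <= own)

def psi(objs):
    """Attributes shared by every object in `objs` (non-empty)."""
    return frozenset(set.intersection(*(CONTEXT[o] for o in objs)))

def bond(attrs):
    """Conjunctive support of `attrs` divided by its disjunctive support."""
    attrs = set(attrs)
    disjunctive = sum(1 for own in CONTEXT.values() if own & attrs)
    return len(phi(attrs)) / disjunctive

def enclosed_concepts(a, b):
    """Formal concepts containing (a, b) that lie inside its pseudo-concept."""
    others = sorted(CONTEXT[a] - {b})
    found = set()
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            extent = phi({b, *extra})  # always contains object a
            found.add((extent, psi(extent)))
    return found

def greedy_cover(measure, min_h, accept=lambda concept: True):
    """Steps 1-4: pseudo-concept, candidates, validation, elimination."""
    uncovered = {(o, i) for o, own in CONTEXT.items() for i in own}
    cover = []
    for a, b in sorted(uncovered):
        if (a, b) not in uncovered:
            continue  # already eliminated by an earlier concept
        ranked = sorted(
            (c for c in enclosed_concepts(a, b) if measure(c[1]) >= min_h),
            key=lambda c: (-measure(c[1]), sorted(c[1])),
        )
        chosen = next((c for c in ranked if accept(c)), None)
        if chosen is not None:
            cover.append(chosen)
            uncovered -= {(o, i) for o in chosen[0] for i in chosen[1]}
    return cover, uncovered

cover, left = greedy_cover(bond, min_h=0.0)
for extent, intent in cover:
    print(sorted(extent), "".join(sorted(intent)))
```

With a threshold of 0 and an `accept` callback that always agrees, every couple ends up covered. With a stricter threshold or a user who rejects candidates, some couples may remain in the second returned set.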

SLIDE 29

Definition: Pseudo-concept

The pseudo-concept, denoted PFCab, is computed by restricting R to the set of examples described by b, i.e., φ(b), and the set of properties describing the object a, i.e., ψ(a), where (φ, ψ) are the Galois connection operators.

Example

     A  B  C  D  E  F  G
1    0  0  1  0  1  1  0
2    0  0  1  0  1  1  1
3    0  0  1  0  1  0  1
4    0  0  1  0  0  1  0
5    0  0  1  1  0  1  0
6    0  1  0  0  0  1  1
7    1  1  0  0  0  1  1

PFC(1,C) = (12345, CEF); PFC(1,F) = (124567, CEF)
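
The definition can be checked on the example with a short Python sketch. The context below is an assumption in the sense that it is read off the pseudo-concepts and concepts listed in the deck (the column positions of the original table are not preserved in this transcript); the names `phi`, `psi` and `pseudo_concept` are chosen for the example.

```python
# Example context, as read off the pseudo-concepts listed in the deck
# (object -> attributes). Treat it as an assumption.
CONTEXT = {
    1: set("CEF"), 2: set("CEFG"), 3: set("CEG"), 4: set("CF"),
    5: set("CDF"), 6: set("BFG"), 7: set("ABFG"),
}

def phi(attrs):
    """Galois operator: objects having every attribute in `attrs`."""
    attrs = set(attrs)
    return {o for o, own in CONTEXT.items() if attrs <= own}

def psi(objs):
    """Galois operator: attributes shared by every object in `objs`."""
    return set.intersection(*(CONTEXT[o] for o in objs))

def pseudo_concept(a, b):
    """Restriction of R to the objects phi(b) and the attributes psi(a)."""
    objs, attrs = phi({b}), psi({a})
    couples = {(o, i) for o in objs for i in attrs if i in CONTEXT[o]}
    return objs, attrs, couples

objs, attrs, couples = pseudo_concept(1, "C")
print(sorted(objs), "".join(sorted(attrs)), len(couples))
```

Note that a pseudo-concept is generally not a full rectangle: object 3 belongs to PFC(1,C) but lacks attribute F, which is why formal concepts still have to be extracted from it.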

SLIDE 30

Illustrative example

     A  B  C  D  E  F  G
1    0  0  1  0  1  1  0
2    0  0  1  0  1  1  1
3    0  0  1  0  1  0  1
4    0  0  1  0  0  1  0
5    0  0  1  1  0  1  0
6    0  1  0  0  0  1  1
7    1  1  0  0  0  1  1

Execution trace of the OptCover algorithm

Iteration 1: PFC(1,C) = (12345, CEF) ⇒ Lc = {(12, CEF), (12345, C), (1245, CF)}
Iteration 2: PFC(1,F) = (124567, CEF) ⇒ Lc = {(123, CE), (124567, F), (1245, CF)}
Iteration 3: PFC(2,G) = (2367, CEFG) ⇒ Lc = {(23, CEG), (2367, G), (267, FG)}
Iteration 4: PFC(4,C) = (12345, CF) ⇒ Lc = {(12345, C), (1245, CF)}
Iteration 5: PFC(4,F) = (124567, CF) ⇒ Lc = {(124567, F), (1245, CF)}
Iteration 6: PFC(5,D) = (5, CDF) ⇒ Lc = {(5, CDF)}
Iteration 7: PFC(6,F) = (67, BFG) ⇒ Lc = {(67, BFG)}
Iteration 8: PFC(7,A) = (7, ABFG) ⇒ Lc = {(7, ABFG)}

Result:

Fc = {(123, CE), (12, CEF), (23, CEG), (12345, C), (124567, F), (5, CDF), (67, BFG), (7, ABFG)}
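
As a sanity check on the final result, the Python sketch below verifies that every element of Fc is a formal concept and that together they cover every couple of the relation. The context is an assumption in the sense that it is read off the pseudo-concepts and concepts listed in the deck; the helper names are chosen for the example.

```python
# Example context, as read off the deck's pseudo-concepts (assumption).
CONTEXT = {
    1: set("CEF"), 2: set("CEFG"), 3: set("CEG"), 4: set("CF"),
    5: set("CDF"), 6: set("BFG"), 7: set("ABFG"),
}

# Final cover Fc as listed on the slide: (objects, attributes).
FC = [("123", "CE"), ("12", "CEF"), ("23", "CEG"), ("12345", "C"),
      ("124567", "F"), ("5", "CDF"), ("67", "BFG"), ("7", "ABFG")]

def phi(attrs):
    """Objects having every attribute in `attrs`."""
    return {o for o, own in CONTEXT.items() if set(attrs) <= own}

def psi(objs):
    """Attributes shared by every object in `objs`."""
    return set.intersection(*(CONTEXT[o] for o in objs))

def is_formal_concept(extent, intent):
    """(E, I) is a formal concept when phi(I) = E and psi(E) = I."""
    return phi(intent) == extent and psi(extent) == intent

concepts = [({int(ch) for ch in objs}, set(attrs)) for objs, attrs in FC]
relation = {(o, i) for o, own in CONTEXT.items() for i in own}
covered = {(o, i) for extent, intent in concepts for o in extent for i in intent}

all_concepts = all(is_formal_concept(e, i) for e, i in concepts)
covers_relation = covered == relation
print(all_concepts, covers_relation, len(relation))
```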

SLIDE 39

Experimental results

Dataset characteristics and the obtained covers

Dataset                   #items   #obj.   #f.concepts   #(Lc)   #(Cc)   #(Bc)
shuttle-landing-control   15       24      52            18      23      19
adult-stretch             20       10      89            14      16      9
lenses                    24       12      128           23      32      12
zoo                       101      28      377           52      40      25
hayes-roth                132      18      380           53      45      17
servo                     167      19      432           59      72      19
post-operative            90       25      1521          60      55      22

SLIDE 40

Experimental results

Comparison with approaches from the literature in terms of cover compactness

Dataset                   #f.concepts   # "Bond" cover   # Belkhiter’s cover   # Belohlavek’s cover
shuttle-landing-control   52            19               20                    22
adult-stretch             89            9                14                    10
lenses                    128           12               21                    12
zoo                       377           25               43                    26
hayes-roth                380           17               49                    17
servo                     432           19               60                    19
post-operative            1521          22               52                    22

SLIDE 41

Visual data mining (VDM)

  • Plays an important role in the knowledge discovery process.
  • Helps miners create and validate hypotheses about the data.
  • Helps track and understand the behavior of mining algorithms.
  • Allows faster data exploration and often provides better results.

SLIDE 42

Visualization of association rules

Two types of visualization:
  • Assisting the end user in exploring the extracted rule set: [Wong et al. (1999), Hofmann et al., Kuntz et al. (2000), Unwin et al. (2001), Ong et al. (2002), Blanchard et al. (2003), Couturier et al. (2005)]
  • Helping the end user during the execution of the mining algorithm: [Zaki & Phoophakdee, Yang (2003), Mahanti & Alhajj (2005), Liu & Salvendy (2006)]

SLIDE 43

Virtual reality based solution

  • Its goal is to edit the result of a data mining process.
  • The user gets a broad and clear view of the relevant data, interactively visualized in 3D.
  • Allows the user to navigate through and manipulate a scene in which the extracted knowledge is represented.

SLIDE 44

Demonstration of visualization tool

SLIDE 45

Conclusion

  • A new anthropocentric visualization of a cover of association rules.
  • This cover is obtained after the extraction of an optimal cover based on the assessment of the correlation.
  • The virtual reality based visualization tool permits a fully user-driven extraction process.

Future work

  • Extensive evaluation of the intuitiveness and ease of use of the visualization prototype.
  • Study of the derivation of generic bases of association rules from the induced covers.
  • Tackling "folksonomies" seen as triadic contexts.

SLIDE 46

Thanks for your attention.
