Presentation Slides. Scott : Structure Canonisation using Ordered-Tree Translation


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/337890015 Presentation Slides Scott : Structure Canonisation using Ordered-Tree Translation Presentation December 2019


SLIDE 1

— Presentation Slides — Scott : Structure Canonisation using Ordered-Tree Translation

Presentation · December 2019


3 authors:

Nicolas Bloyet, Université Bretagne Sud
Pierre-François Marteau, Université Bretagne Sud, Vannes, France
Emmanuel Frénod, Université Bretagne Sud

All content following this page was uploaded by Nicolas Bloyet on 11 December 2019.

SLIDE 2

A Method for Representing Graphs as Rooted Trees for Graph Canonization

Scott : Structure Canonisation using Ordered-Tree Translation

Nicolas Bloyet¹ ² ³, Pierre-François Marteau¹, Emmanuel Frénod² ³

¹ IRISA, Université Bretagne Sud, Vannes, France
² LMBA, Université Bretagne Sud, Vannes, France
³ See-d, Parc d'Innovation Bretagne Sud, Vannes, France

Complex Networks 2019, Lisbon, December 2019

SLIDE 3

Outline

SLIDE 4

Outline Graph Isomorphism Graph Canonization State of the Art Scott Implementation

Outline

We present Scott, a method for graph canonization (and hence graph isomorphism testing) on fully labeled graphs, i.e. graphs whose vertices and edges both carry labels. The method produces:

  • a string representation of the input graph (the trace), unique for an isomorphism class, which can be used to identify any (sub)structure (hash)
  • a canonical adjacency matrix derived from that trace, which can be used to standardize (up to an isomorphism) any graph computation

We provide an open-source Python implementation.

Scott : Structure Canonisation using Ordered-Tree Translation

  • N. Bloyet, P-F. Marteau, E. Frénod
SLIDE 5

Graph Isomorphism

SLIDE 6

Graph Isomorphism problem

Let G and H be two graphs of the set 𝒢 of graphs. They are said to be isomorphic if there exists a bijection f : V_G → V_H between their vertex sets that preserves edges.

[Figure: two isomorphic graphs, G with vertices a, b, c, d, e, g, h, i and H with vertices 1 to 8, related by the bijection f(a) = 1, f(b) = 6, f(c) = 7, f(d) = 4, f(e) = 5, f(g) = 3, f(h) = 2, f(i) = 8.]
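As a minimal sketch of this definition (not taken from the slides), the Python function below checks whether a given mapping f is an isomorphism between two undirected, unlabeled graphs stored as edge lists. The function name, the edge-list representation and the two small example graphs are assumptions made for illustration; isolated vertices are ignored because vertex sets are read from the edges.

```python
def is_isomorphism(edges_g, edges_h, f):
    """Return True if the dict f is an edge-preserving bijection from G to H.

    edges_g, edges_h: lists of 2-tuples (undirected edges).
    f: dict mapping each vertex of G to a vertex of H.
    """
    vertices_g = {v for edge in edges_g for v in edge}
    vertices_h = {v for edge in edges_h for v in edge}

    # f must be defined exactly on the vertices of G ...
    if set(f) != vertices_g:
        return False
    # ... and must be injective and onto the vertices of H.
    images = set(f.values())
    if len(images) != len(f) or images != vertices_h:
        return False

    # Edge preservation: the image of G's edge set must equal H's edge set.
    mapped = {frozenset((f[u], f[v])) for u, v in edges_g}
    return mapped == {frozenset(edge) for edge in edges_h}


# Hypothetical example: a path a - b - c and a path 2 - 1 - 3.
path_g = [("a", "b"), ("b", "c")]
path_h = [(1, 2), (1, 3)]

print(is_isomorphism(path_g, path_h, {"a": 2, "b": 1, "c": 3}))
print(is_isomorphism(path_g, path_h, {"a": 1, "b": 2, "c": 3}))
```

The first mapping sends the middle vertex b to the middle vertex 1 and is accepted; the second does not preserve edges and is rejected.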

SLIDE 7

Isomorphism class

Isomorphism is an equivalence relation, and so partitions the set of graphs into equivalence classes, where all graphs belonging to a class represent the same structure.

[Figure: the quotient set 𝒢/≃, partitioned into isomorphism classes C1, C2, ...]

In many graph-related tasks and applications, we want to treat graphs belonging to the same isomorphism class as equal.

SLIDE 8

Problem Complexity

While determining whether two graphs are isomorphic seems trivial for small graphs, no polynomial-time algorithm is known for the general problem, which makes it very costly. In applications that require many isomorphism tests, we would prefer a reusable method.

SLIDE 9

Graph Canonization

SLIDE 10

Graph Canonization Problem

Graph canonization is a related problem: for a given graph, find a canonical representative, usually itself a graph, that is unique for its isomorphism class. Two graphs are isomorphic if and only if they have the same canonical representative:

G ≃ H ⟺ Canon(G) = Canon(H)

This problem is at least as difficult¹ as graph isomorphism, since solving it answers the isomorphism question explicitly, but it provides a reusable result.

¹ and actually more difficult in many cases
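To make the equivalence concrete, here is a brute-force sketch in Python. It is not Scott's algorithm: it simply tries every vertex relabelling and keeps the smallest encoding, which costs n! steps and is only usable on tiny graphs. The function name, the input representation (a list of vertex labels plus labeled edge triples) and the example graphs are assumptions made for illustration.

```python
from itertools import permutations


def brute_force_canon(vertex_labels, edges):
    """Smallest (labels, edges) encoding over all vertex relabellings.

    vertex_labels: list where vertex_labels[i] is the label of vertex i.
    edges: list of (u, v, edge_label) triples, undirected.
    """
    n = len(vertex_labels)
    best = None
    for perm in permutations(range(n)):
        # perm[old] is the new index given to vertex `old`.
        labels = [None] * n
        for old, new in enumerate(perm):
            labels[new] = vertex_labels[old]
        relabelled = sorted(
            (min(perm[u], perm[v]), max(perm[u], perm[v]), label)
            for u, v, label in edges
        )
        candidate = (tuple(labels), tuple(relabelled))
        if best is None or candidate < best:
            best = candidate
    return best


# Two encodings of the same labeled path C -s- O -d- C, and a different graph.
g = (["C", "O", "C"], [(0, 1, "s"), (1, 2, "d")])
h = (["O", "C", "C"], [(1, 0, "d"), (0, 2, "s")])
k = (["C", "O", "C"], [(0, 1, "s"), (1, 2, "s")])

print(brute_force_canon(*g) == brute_force_canon(*h))
print(brute_force_canon(*g) == brute_force_canon(*k))
```

Once the canonical forms are computed, comparing graphs reduces to comparing tuples, so they can also serve as dictionary keys or set members.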

SLIDE 11

Graph Canonization Problem

For example, SMILES notation [Weininger, 1988] maps any molecular graph to a canonical string encoding.

[Figure: a molecular graph and its SMILES string, CN1C=NC2=C1C(=O)N(C(=O)N2C)C]

Once the canonical representative is computed, subsequent tasks such as comparisons are trivial.

SLIDE 12

State of the Art

SLIDE 13

State of the Art

Isomorphism testing

  • conauto [López-Presa et al., 2011]
  • saucy [Darga et al., 2008]

Canonization

  • nauty [McKay et al., 1981, McKay and Piperno, 2014]
  • bliss [Junttila and Kaski, 2007]
  • traces [Piperno, 2008, McKay and Piperno, 2014]

None of them can natively handle labeled edges other than by rewriting the graph into an edge-unlabeled form, because they rely on equitable vertex coloring, which is undefined for edge-labeled graphs.
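One common way to perform such a rewriting is to replace every labeled edge with an extra vertex that carries the edge label. The Python sketch below illustrates that general idea under stated assumptions; it is not taken from the slides or from any of the tools listed above, and the function name and data layout are hypothetical.

```python
def subdivide_labeled_edges(vertex_labels, edges):
    """Turn an edge-labeled graph into a vertex-labeled one.

    vertex_labels: dict vertex -> label.
    edges: list of (u, v, edge_label) triples.
    Returns (new_vertex_labels, new_unlabeled_edges).
    """
    new_labels = dict(vertex_labels)
    new_edges = []
    for index, (u, v, label) in enumerate(edges):
        # One fresh vertex per original edge; the tagged tuple keeps
        # edge labels from colliding with ordinary vertex labels.
        middle = ("edge", index)
        new_labels[middle] = ("edge-label", label)
        new_edges.append((u, middle))
        new_edges.append((middle, v))
    return new_labels, new_edges


labels, unlabeled = subdivide_labeled_edges(
    {"x": "C", "y": "O"},
    [("x", "y", "double")],
)
print(labels)
print(unlabeled)
```

The cost of this workaround is size: a graph with n vertices and m labeled edges becomes a graph with n + m vertices and 2m edges.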

SLIDE 14

Scott

SLIDE 15

Key Idea

We propose an algorithm based on graph rewriting rather than equitable coloring, so that edge-labeled graphs are handled natively. Scott runs in three main steps, listed below.

  • Levelling of vertices, according to an elected root
  • Re-writing of cycles without information loss
  • Exploitation of the resulting tree (as trace or as matrix)

SLIDE 16

Tree encoding

It is known [Neveu, 1986] that a tree can be formally encoded as a string of symbols. We propose a canonical notation σ_T : T → Σ*, encoding any tree into a string. We can thus compress any tree t into a single vertex with label σ_T(t).

[Figure: a tree t_r with root r, vertices A, B, C, D, E and edge labels a, b, c, d, e; siblings are sorted by an order relation ≤ at each level N.]

Neveu(t_r) = { r, A : a, B : b, BD : d, BE : e, C : c }

σ_T(t_r) = (A : a, (D : d, E : e) B : b, C : c) r
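The sketch below shows one way such a canonical tree-to-string encoding could be written in Python. It reproduces the notation of the example, reading each item as "vertex label : label of the edge to the parent", with children listed in parentheses before their parent. The `Node` class, the `encode` function and the sibling order used here (vertex label, then edge label, then sub-encoding) are assumptions: the slides do not spell out the exact order relation that Scott uses.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    edge: str = ""  # label of the edge to the parent, empty for the root
    children: list = field(default_factory=list)


def encode(node):
    """Encode a labeled tree as a string, independent of child order."""
    # Sort siblings on an assumed key so that any ordering of the
    # children in memory yields the same string.
    parts = sorted((c.label, c.edge, encode(c)) for c in node.children)
    inner = ", ".join(text for _, _, text in parts)
    head = f"({inner})" if parts else ""
    suffix = f":{node.edge}" if node.edge else ""
    return f"{head}{node.label}{suffix}"


# The example tree, with children deliberately given out of order.
tree = Node("r", children=[
    Node("C", "c"),
    Node("B", "b", [Node("E", "e"), Node("D", "d")]),
    Node("A", "a"),
])
print(encode(tree))  # (A:a, (D:d, E:e)B:b, C:c)r
```

Because the encoding of each subtree is itself a string, a whole subtree can be replaced by a single vertex labeled with that string, which is the compression step described above.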

SLIDE 17

Vertices levelling

The first step assigns each vertex to a level, according to its minimum distance from a vertex chosen as root. In the best case, the choice of root is obvious (label, degree, etc.); otherwise it is determined a posteriori.

[Figure: a graph with root r and the same graph after levelling, with vertices placed on levels 1, 2 and 3.]
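Levelling by minimum distance from the root corresponds to a breadth-first search. The following Python sketch shows this for an unweighted graph stored as an adjacency dictionary; the function name and the example graph are hypothetical, and root election is left out.

```python
from collections import deque


def level_vertices(adjacency, root):
    """Map every vertex reachable from root to its minimum distance."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        vertex = queue.popleft()
        for neighbour in adjacency[vertex]:
            if neighbour not in levels:  # first visit gives the shortest path
                levels[neighbour] = levels[vertex] + 1
                queue.append(neighbour)
    return levels


# Hypothetical example graph containing one cycle (r, a, c, b).
graph = {
    "r": ["a", "b"],
    "a": ["r", "c"],
    "b": ["r", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
print(level_vertices(graph, "r"))
```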

SLIDE 18

Cycle rewriting

We want to remove every cycle from the levelled graph, without any loss of information. Tree → vertex compression: if the first N levels have been computed, then any cycle encountered at level N + 1 falls into one of the following three cases: self-bound, co-bound or in-bound.

[Figure: the three rewriting rules, each shown as a left-hand side (LHS) and a right-hand side (RHS): p_s (self-bound), p_c (co-bound) and p_i (in-bound).]
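The sketch below detects the three cases on a levelled graph in Python. The reading of the case names is an assumption based on the figure: self-bound is taken to mean a self-loop, co-bound an edge between two vertices of the same level, and in-bound a vertex with more than one neighbour on the previous level. The function names are hypothetical, the rewriting itself is not implemented, and the graph is assumed to be connected.

```python
from collections import deque


def bfs_levels(adjacency, root):
    """Minimum distance of every vertex from root (connected graph)."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in levels:
                levels[w] = levels[v] + 1
                queue.append(w)
    return levels


def classify_cycle_closures(adjacency, root):
    """Group cycle-closing patterns by (assumed) bound type."""
    levels = bfs_levels(adjacency, root)
    found = {"self-bound": set(), "co-bound": set(), "in-bound": set()}
    for v, neighbours in adjacency.items():
        parents = [w for w in neighbours if levels[w] == levels[v] - 1]
        if len(parents) > 1:  # several parents on the previous level
            found["in-bound"].add(v)
        for w in neighbours:
            if w == v:  # self-loop
                found["self-bound"].add(v)
            elif levels[w] == levels[v]:  # edge inside one level
                found["co-bound"].add(frozenset((v, w)))
    return found


# Hypothetical graph: A-B share a level, C has parents A and B, C has a loop.
example = {
    "r": {"A", "B"},
    "A": {"r", "B", "C"},
    "B": {"r", "A", "C"},
    "C": {"A", "B", "C"},
}
print(classify_cycle_closures(example, "r"))
```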

SLIDE 19

Cycle rewriting

To ensure that the tree we produce is representative of the isomorphism class, rewritings must be applied in a rigorous order, which makes the process deterministic whatever the encoding of the input graph:

  • by level
  • by bound type
  • by lexicographic order of the tree implied in the rewriting
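These three criteria can be read as a composite sort key. The Python sketch below orders a list of pending rewritings by level, then bound type, then the string of the tree involved. The `Rewriting` record, the example entries and in particular the relative priority of the bound types in `BOUND_RANK` are hypothetical, since the slides do not state them.

```python
from typing import NamedTuple


class Rewriting(NamedTuple):
    level: int   # level at which the cycle is encountered
    bound: str   # "self-bound", "co-bound" or "in-bound"
    tree: str    # string encoding of the tree involved in the rewriting


# Assumed priority between bound types (not given in the slides).
BOUND_RANK = {"self-bound": 0, "co-bound": 1, "in-bound": 2}


def rewriting_key(rw):
    return (rw.level, BOUND_RANK[rw.bound], rw.tree)


pending = [
    Rewriting(2, "co-bound", "(E:i)D:f"),
    Rewriting(1, "in-bound", "C:c"),
    Rewriting(2, "co-bound", "(A:a)B:b"),
    Rewriting(1, "self-bound", "A:a"),
]
ordered = sorted(pending, key=rewriting_key)
for rw in ordered:
    print(rw)
```

Whatever order the rewritings are discovered in, sorting on this key yields the same sequence, which is what makes the result independent of the input encoding.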

[Figure: step-by-step rewriting of an example levelled graph (root r, vertices A to E, edge labels a to i) into a tree, introducing the virtual vertices §, *{1}, #{2} and *{3}.]

SLIDE 20

Encoding by trace

Finally, we can encode the tree representative of the input graph's isomorphism class. This trace can be used as an identifier, or it can be used to induce an order on the input graph's vertex set, which yields a canonical adjacency matrix.

[Figure: the rewritten tree of the example graph, its trace, and the vertex order induced on the levelled graph (levels 1 to 3).]

(((*{1}:g)C:d, ((E:i, *{1}:g, §:h)D:f)#{2}:f, *{3}:c)A:a, (((E:i, *{1}:g, §:h)D:f)#{2}:e, *{3}:c)B:b)r
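The Python sketch below illustrates both uses under stated assumptions. The canonical vertex order is taken as given (in Scott it would be induced by the trace; here it is hard-coded for two hypothetical encodings of the same small graph), and hashing the trace with SHA-256 to obtain a fixed-length identifier is a choice made for this example, not something the slides prescribe.

```python
import hashlib


def adjacency_matrix(order, edges):
    """Adjacency matrix of a labeled undirected graph for a vertex order.

    order: list of vertices in the chosen (canonical) order.
    edges: list of (u, v, edge_label) triples; 0 marks the absence of an edge.
    """
    index = {vertex: i for i, vertex in enumerate(order)}
    size = len(order)
    matrix = [[0] * size for _ in range(size)]
    for u, v, label in edges:
        matrix[index[u]][index[v]] = label
        matrix[index[v]][index[u]] = label
    return matrix


def trace_id(trace):
    """Fixed-length identifier derived from a trace string."""
    return hashlib.sha256(trace.encode("utf-8")).hexdigest()


# Two encodings of the same graph, each with its assumed canonical order.
edges_g, order_g = [("x", "y", "a"), ("y", "z", "b")], ["y", "x", "z"]
edges_h, order_h = [(3, 1, "b"), (1, 2, "a")], [1, 2, 3]

print(adjacency_matrix(order_g, edges_g))
print(adjacency_matrix(order_g, edges_g) == adjacency_matrix(order_h, edges_h))
print(trace_id("(A:a, (D:d, E:e)B:b, C:c)r"))
```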

SLIDE 21

Implementation

SLIDE 22

Python library

We provide a Python library² (MIT licence) implementing Scott. Time complexity is exponential in the worst case³, but much more reasonable on simple graphs such as molecules.

[Figure: running time (s) against number of vertices. (a) Shrunken multipedes (combinatorics graphs), 200 to 500 vertices, logarithmic time axis: scott, bliss, nauty, traces. (b) PubChem (chemical graphs), 200 to 1,000 vertices: scott, with LOESS smoothing.]

² github.com/theplatypus/scott
³ a lot of vertices may be created

SLIDE 23

Conclusion

Scott: an algorithm that provides a canonical representative for any fully labeled graph

  • canonical string representation
  • canonical adjacency matrix

A Python implementation is available at github.com/theplatypus/scott. It can be used for:

  • graph indexing and retrieval
  • graph fragmentation and embedding
  • classification/regression for graph data in Graph Neural Networks

SLIDE 24

Thanks for your attention

github.com/theplatypus/scott

SLIDE 25

References

SLIDE 26

References i

Babai, L. (2016). Graph isomorphism in quasipolynomial time. In Proceedings of the forty-eighth annual ACM symposium on Theory of Computing, pages 684–697. ACM.

Babai, L. and Luks, E. M. (1983). Canonical labeling of graphs. In Proceedings of the fifteenth annual ACM symposium on Theory of computing, pages 171–183. ACM.

Darga, P. T., Sakallah, K. A., and Markov, I. L. (2008). Faster symmetry discovery using sparsity of symmetries. In Design Automation Conference, 2008. DAC 2008. 45th ACM/IEEE, pages 149–154. IEEE.

SLIDE 27

References ii

Harary, F. and Palmer, E. M. (2014). Graphical enumeration. Elsevier.

Junttila, T. and Kaski, P. (2007). Engineering an efficient canonical labeling tool for large and sparse graphs. In 2007 Proceedings of the Ninth Workshop on Algorithm Engineering and Experiments (ALENEX), pages 135–149. SIAM.

López-Presa, J. L., Anta, A. F., and Chiroque, L. N. (2011). Conauto-2.0: Fast isomorphism testing and automorphism group computation. arXiv preprint arXiv:1108.1060.

McKay, B. D. et al. (1981). Practical graph isomorphism.

SLIDE 28

References iii

McKay, B. D. and Piperno, A. (2014). Practical graph isomorphism, II. Journal of Symbolic Computation, 60:94–112.

Neuen, D. and Schweitzer, P. (2017). Benchmark graphs for practical graph isomorphism. arXiv preprint arXiv:1705.03686.

Neveu, J. (1986). Arbres et processus de Galton-Watson. In Annales de l'IHP Probabilités et statistiques, volume 22, pages 199–207.

Olsen, G. (1990). Gary Olsen's interpretation of the "Newick's 8:45" tree format standard. URL http://evolution.genetics.washington.edu/phylip/newick_doc.html.

SLIDE 29

References iv

Piperno, A. (2008). Search space contraction in canonical labeling of graphs. arXiv preprint arXiv:0804.4881.

Velasco, P. P. P. (2008). Matrix Graph Grammars.

Velasco, P. P. P. and de Lara, J. (2006). Matrix Approach to Graph Transformation: Matching and Sequences. Lecture Notes in Computer Science, 4178:122.

Weininger, D. (1988). SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences, 28(1):31–36.
