[PPT] - Verified Decision Procedures for Equivalence of Regular Expressions PowerPoint Presentation

SLIDE 1

Verified Decision Procedures for Equivalence of Regular Expressions

Tobias Nipkow & Dmitriy Traytel

Fakultät für Informatik, Technische Universität München

SLIDE 2

Background

Recent series of papers presenting such decision procedures verified in Coq, Isabelle or Matita:

SLIDE 3

Background

Recent series of papers presenting such decision procedures verified in Coq, Isabelle or Matita: Braibant & Pous 2010, Krauss & Nipkow 2011, Coquand & Siles 2011, Asperti 2012, Moreira et al. 2013

SLIDE 4

Background

Recent series of papers presenting such decision procedures verified in Coq, Isabelle or Matita: Braibant & Pous 2010, Krauss & Nipkow 2011, Coquand & Siles 2011, Asperti 2012, Moreira et al. 2013

They all operate on regular expressions, not automata

SLIDE 5

Background

Recent series of papers presenting such decision procedures verified in Coq, Isabelle or Matita: Braibant & Pous 2010, Krauss & Nipkow 2011, Coquand & Siles 2011, Asperti 2012, Moreira et al. 2013

They all operate on regular expressions, not automata

They all look different but related . . .

SLIDE 6

This talk

Unified framework

SLIDE 7

This talk

Unified framework
Derivation of all previous procedures as instances

SLIDE 8

This talk

Unified framework
Derivation of all previous procedures as instances

Verification in Isabelle

SLIDE 9

1 The Unified Framework
2 Derivatives of Regular Expressions
3 Partial Derivatives of Regular Expressions
4 Marked Regular Expressions
5 Empirical Comparison

SLIDE 10

Regular expressions

datatype α rexp = 0 | 1 | Atom α | α rexp + α rexp | α rexp · α rexp | α rexp ∗

SLIDE 11

Regular expressions

datatype α rexp = 0 | 1 | Atom α | α rexp + α rexp | α rexp · α rexp | α rexp ∗

Semantics: L :: α rexp → α lang where α lang = α list set
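As an illustration of the datatype and its semantics, here is a minimal Python sketch. The tuple encoding, the constructor names and the length bound `n` in `lang` are assumptions made for this example (the slides define L as the full, possibly infinite language); atoms are taken to be single characters.

```python
# Regular expressions as nested tuples (an encoding chosen for this sketch).
ZERO, ONE = ("0",), ("1",)

def atom(a): return ("atom", a)
def plus(r, s): return ("+", r, s)
def times(r, s): return (".", r, s)
def star(r): return ("*", r)

def lang(r, n):
    """Words of L(r) of length <= n, as a set of strings."""
    tag = r[0]
    if tag == "0":
        return set()
    if tag == "1":
        return {""}
    if tag == "atom":
        return {r[1]} if n >= 1 else set()
    if tag == "+":
        return lang(r[1], n) | lang(r[2], n)
    if tag == ".":
        return {u + v for u in lang(r[1], n) for v in lang(r[2], n)
                if len(u + v) <= n}
    # Star: append non-empty words of the body until nothing new fits.
    body = lang(r[1], n) - {""}
    words, frontier = {""}, {""}
    while frontier:
        frontier = {u + v for u in frontier for v in body
                    if len(u + v) <= n} - words
        words |= frontier
    return words
```

Comparing `lang(r, n)` and `lang(s, n)` for some fixed `n` can only refute equivalence, never prove it, which is why a decision procedure is needed.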

SLIDE 12

How to prove r ≡ s

SLIDE 13

How to prove r ≡ s

1 Translate to DFAs A and B

SLIDE 14

How to prove r ≡ s

1 Translate to DFAs A and B
2 Compare A and B

SLIDE 15

How to prove r ≡ s

1 Translate to DFAs A and B
2 Compare A and B

Standard algorithm:

Minimize A and B, check isomorphism.

SLIDE 16

How to prove r ≡ s

1 Translate to DFAs A and B
2 Compare A and B

Standard algorithm:

Minimize A and B, check isomorphism.

Easy alternative:

Check for all reachable states (p, q) of A × B that p is final iff q is final.
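The "easy alternative" can be sketched in Python as a search over the reachable part of the product automaton. The DFA representation used here (a triple of start state, transition dictionary keyed by `(state, symbol)`, and set of final states) and the name `dfa_equiv` are assumptions of this sketch, not something the slides prescribe.

```python
from collections import deque

def dfa_equiv(a, b, alphabet):
    """Explore reachable pairs (p, q) of A x B; fail if finality disagrees."""
    (init_a, delta_a, final_a), (init_b, delta_b, final_b) = a, b
    seen = {(init_a, init_b)}
    todo = deque(seen)
    while todo:
        p, q = todo.popleft()
        if (p in final_a) != (q in final_b):
            return False
        for x in alphabet:
            nxt = (delta_a[p, x], delta_b[q, x])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

# Two DFAs for "even number of a's", one with 2 states and one with 4.
even2 = (0, {(0, "a"): 1, (1, "a"): 0}, {0})
even4 = (0, {(i, "a"): (i + 1) % 4 for i in range(4)}, {0, 2})
odd2 = (0, {(0, "a"): 1, (1, "a"): 0}, {1})
```

No minimization is needed: the search visits at most |A| · |B| pairs.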

SLIDE 17

Framework parameters

SLIDE 18

Framework parameters

Type σ

SLIDE 19

Framework parameters

Type σ
Init init :: α rexp → σ

SLIDE 20

Framework parameters

Type σ
Init init :: α rexp → σ
Transition δ :: α → σ → σ

SLIDE 21

Framework parameters

Type σ
Init init :: α rexp → σ
Transition δ :: α → σ → σ
Final fin :: σ → bool

SLIDE 22

Framework parameters

Type σ
Init init :: α rexp → σ
Transition δ :: α → σ → σ
Final fin :: σ → bool
Language L :: σ → α lang

SLIDE 23

Framework parameters

Type σ
Init init :: α rexp → σ
Transition δ :: α → σ → σ
Final fin :: σ → bool
Language L :: σ → α lang

Assumptions:
L(init(r)) = L(r)

SLIDE 24

Framework parameters

Type σ
Init init :: α rexp → σ
Transition δ :: α → σ → σ
Final fin :: σ → bool
Language L :: σ → α lang

Assumptions:
L(init(r)) = L(r)
L(δ x s) = {w | xw ∈ L(s)}

SLIDE 25

Framework parameters

Type σ
Init init :: α rexp → σ
Transition δ :: α → σ → σ
Final fin :: σ → bool
Language L :: σ → α lang

Assumptions:
L(init(r)) = L(r)
L(δ x s) = {w | xw ∈ L(s)}
fin(s) ⇔ [] ∈ L(s)

SLIDE 26

Equivalence checker

SLIDE 27

Equivalence checker

eqv :: α rexp → α rexp → bool

SLIDE 28

Equivalence checker

eqv :: α rexp → α rexp → bool
eqv r s = case closure (init(r), init(s)) of Some([], _) ⇒ True | _ ⇒ False

SLIDE 29

Equivalence checker

eqv :: α rexp → α rexp → bool
eqv r s = case closure (init(r), init(s)) of Some([], _) ⇒ True | _ ⇒ False

Theorem

eqv r s ⟹ L(r) = L(s)

SLIDE 30

Equivalence checker

eqv :: α rexp → α rexp → bool
eqv r s = case closure (init(r), init(s)) of Some([], _) ⇒ True | _ ⇒ False

Theorem

eqv r s ⟹ L(r) = L(s)

If the set of reachable states is finite:

Theorem

L(r) = L(s) ⟹ eqv r s
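To make the role of the framework parameters concrete, here is a hedged Python sketch of a generic checker. The worklist-based `closure` returning a `(worklist, visited)` pair is an assumption modelled on the `Some([], _)` pattern above; unlike the option-valued original, it simply loops forever if the reachable state space is infinite. The instance at the bottom is a toy stand-in (not from the talk) in which an "expression" `(k, finals)` denotes the unary language {aⁿ | n mod k ∈ finals}.

```python
def closure(delta, fin, alphabet, start):
    """Explore reachable state pairs; stop at the first pair disagreeing on fin.
    Returns (worklist, visited); an empty worklist means no disagreement."""
    todo, seen = [start], {start}
    while todo:
        p, q = todo[0]
        if fin(p) != fin(q):
            return todo, seen      # the offending pair stays in the worklist
        todo = todo[1:]
        for x in alphabet:
            nxt = (delta(x, p), delta(x, q))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return todo, seen

def eqv(init, delta, fin, alphabet, r, s):
    todo, _ = closure(delta, fin, alphabet, (init(r), init(s)))
    return todo == []

# Toy instance (hypothetical): states are counters modulo k.
def init(e):
    k, finals = e
    return (0, k, frozenset(finals))

def delta(x, s):
    i, k, finals = s
    return ((i + 1) % k, k, finals)

def fin(s):
    i, _, finals = s
    return i in finals
```

Any instance only has to supply `init`, `delta` and `fin`; the search itself never looks inside a state.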

SLIDE 31

1 The Unified Framework
2 Derivatives of Regular Expressions
3 Partial Derivatives of Regular Expressions
4 Marked Regular Expressions
5 Empirical Comparison

SLIDE 32

Derivatives (Brzozowski 1964)

d :: α → α rexp → α rexp

SLIDE 33

Derivatives (Brzozowski 1964)

d :: α → α rexp → α rexp

d x r is the derivative of r wrt x

SLIDE 34

Derivatives (Brzozowski 1964)

d :: α → α rexp → α rexp

d x r is the derivative of r wrt x
d x r = “what is left after x has been read”

SLIDE 35

Derivatives (Brzozowski 1964)

d :: α → α rexp → α rexp

d x r is the derivative of r wrt x
d x r = “what is left after x has been read”
Example: d a (Atom(a) · r) = 1 · r

SLIDE 36

Derivatives (Brzozowski 1964)

d :: α → α rexp → α rexp

d x r is the derivative of r wrt x
d x r = “what is left after x has been read”
Example: d a (Atom(a) · r) = 1 · r
Semantics is left-quotient:

L(d x r) = {w | xw ∈ L(r)}

SLIDE 37

d x 0 = 0

SLIDE 38

d x 0 = 0
d x 1 = 0

SLIDE 39

d x 0 = 0
d x 1 = 0
d x (Atom y) = if x = y then 1 else 0

SLIDE 40

d x 0 = 0
d x 1 = 0
d x (Atom y) = if x = y then 1 else 0
d x (r + s) = d x r + d x s

SLIDE 41

d x 0 = 0
d x 1 = 0
d x (Atom y) = if x = y then 1 else 0
d x (r + s) = d x r + d x s
d x (r · s) = if ε(r) then d x r · s + d x s else d x r · s

SLIDE 42

d x 0 = 0
d x 1 = 0
d x (Atom y) = if x = y then 1 else 0
d x (r + s) = d x r + d x s
d x (r · s) = if ε(r) then d x r · s + d x s else d x r · s
d x (r∗) = d x r · r∗
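The six equations translate almost line by line into code. The following Python sketch assumes a tuple encoding of expressions and single-character atoms; the helper names (`nullable` for ε, `matches`) are choices of this example.

```python
ZERO, ONE = ("0",), ("1",)

def atom(a): return ("atom", a)
def plus(r, s): return ("+", r, s)
def times(r, s): return (".", r, s)
def star(r): return ("*", r)

def nullable(r):
    """ε(r): does L(r) contain the empty word?"""
    tag = r[0]
    if tag in ("1", "*"):
        return True
    if tag == "+":
        return nullable(r[1]) or nullable(r[2])
    if tag == ".":
        return nullable(r[1]) and nullable(r[2])
    return False  # "0" and "atom"

def d(x, r):
    """Brzozowski derivative of r with respect to the letter x."""
    tag = r[0]
    if tag in ("0", "1"):
        return ZERO
    if tag == "atom":
        return ONE if r[1] == x else ZERO
    if tag == "+":
        return plus(d(x, r[1]), d(x, r[2]))
    if tag == ".":
        left = times(d(x, r[1]), r[2])
        return plus(left, d(x, r[2])) if nullable(r[1]) else left
    return times(d(x, r[1]), r)  # star

def matches(r, w):
    """w ∈ L(r): derive by each letter in turn, then test for ε."""
    for x in w:
        r = d(x, r)
    return nullable(r)
```

For `r = times(atom("a"), star(atom("a")))`, `d("a", r)` yields the encoding of 1 · a∗, in line with the example d a (Atom(a) · r) = 1 · r.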

SLIDE 43

Regular Expression DFA

a · a∗

SLIDE 44

Regular Expression DFA

a · a∗ --a--> 1 · a∗

SLIDE 45

Regular Expression DFA

a · a∗ --a--> 1 · a∗ --a--> 0 · a∗ + 1 · a∗

SLIDE 46

Regular Expression DFA

a · a∗ --a--> 1 · a∗ --a--> 0 · a∗ + 1 · a∗ (with an a-loop on the last state)

SLIDE 47

Finiteness

SLIDE 48

Finiteness

Let ≡ACI be the equivalence induced by ACI of +

SLIDE 49

Finiteness

Let ≡ACI be the equivalence induced by ACI of +

Theorem (Brzozowski 1964)
The set {fold d w r | w ∈ Σ∗}/≡ACI is finite.

SLIDE 50

Finiteness

Let ≡ACI be the equivalence induced by ACI of +

Theorem (Brzozowski 1964)
The set {fold d w r | w ∈ Σ∗}/≡ACI is finite.

How large?

SLIDE 51

Finiteness

Let ≡ACI be the equivalence induced by ACI of +

Theorem (Brzozowski 1964)
The set {fold d w r | w ∈ Σ∗}/≡ACI is finite.

How large? Brzozowski’s proof yields O(2^(...^(2^n)))

SLIDE 52

Instantiation of framework

SLIDE 53

Instantiation of framework

σ = α rexp

SLIDE 54

Instantiation of framework

σ = α rexp
init(r) = r

SLIDE 55

Instantiation of framework

σ = α rexp
init(r) = r
δ x r = normACI(d x r)

SLIDE 56

Instantiation of framework

σ = α rexp
init(r) = r
δ x r = normACI(d x r)
fin = ε

SLIDE 57

Instantiation of framework

σ = α rexp
init(r) = r
δ x r = normACI(d x r)
fin = ε
L = L

SLIDE 58

Instantiation of framework

σ = α rexp
init(r) = r
δ x r = normACI(d x r)
fin = ε
L = L

Finiteness:

SLIDE 59

Instantiation of framework

σ = α rexp
init(r) = r
δ x r = normACI(d x r)
fin = ε
L = L

Finiteness:

Not immediate from Brzozowski’s theorem

SLIDE 60

Instantiation of framework

σ = α rexp
init(r) = r
δ x r = normACI(d x r)
fin = ε
L = L

Finiteness:

Not immediate from Brzozowski’s theorem
Open for stronger normalization functions
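The instantiation can be sketched in Python as follows. The `norm` function here is one possible ACI normalization (recursively flatten sums, sort and deduplicate the summands, rebuild a right-nested sum); it is an assumption of this sketch and not necessarily the normACI of the verified development. The tuple encoding and single-character atoms are likewise assumptions.

```python
ZERO, ONE = ("0",), ("1",)

def atom(a): return ("atom", a)
def plus(r, s): return ("+", r, s)
def times(r, s): return (".", r, s)
def star(r): return ("*", r)

def nullable(r):
    tag = r[0]
    if tag in ("1", "*"):
        return True
    if tag == "+":
        return nullable(r[1]) or nullable(r[2])
    if tag == ".":
        return nullable(r[1]) and nullable(r[2])
    return False

def d(x, r):
    tag = r[0]
    if tag in ("0", "1"):
        return ZERO
    if tag == "atom":
        return ONE if r[1] == x else ZERO
    if tag == "+":
        return plus(d(x, r[1]), d(x, r[2]))
    if tag == ".":
        left = times(d(x, r[1]), r[2])
        return plus(left, d(x, r[2])) if nullable(r[1]) else left
    return times(d(x, r[1]), r)

def summands(r):
    """Set of summands of a (possibly nested) sum."""
    return summands(r[1]) | summands(r[2]) if r[0] == "+" else {r}

def norm(r):
    """Canonical representative modulo ACI of + (sketch)."""
    tag = r[0]
    if tag == "+":
        parts = sorted(summands(norm(r[1])) | summands(norm(r[2])))
        out = parts[-1]
        for p in reversed(parts[:-1]):
            out = plus(p, out)
        return out
    if tag == ".":
        return times(norm(r[1]), norm(r[2]))
    if tag == "*":
        return star(norm(r[1]))
    return r

def eqv(r, s, alphabet):
    """Framework instance: init(r) = r, δ x r = norm(d x r), fin = nullable."""
    start = (r, s)
    seen, todo = {start}, [start]
    while todo:
        p, q = todo.pop()
        if nullable(p) != nullable(q):
            return False
        for x in alphabet:
            nxt = (norm(d(x, p)), norm(d(x, q)))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True
```

Without `norm`, the search can fail to terminate even for a · a∗, whose derivatives keep accumulating duplicate summands such as 0 · a∗.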

SLIDE 61

1 The Unified Framework
2 Derivatives of Regular Expressions
3 Partial Derivatives of Regular Expressions
4 Marked Regular Expressions
5 Empirical Comparison

SLIDE 62

Antimirov 1996

SLIDE 63

Antimirov 1996

Idea: build some of ≡ into the data structure set:

SLIDE 64

Antimirov 1996

Idea: build some of ≡ into the data structure set:
d : α → α rexp → α rexp

SLIDE 65

Antimirov 1996

Idea: build some of ≡ into the data structure set:
D : α → α rexp → α rexp set

SLIDE 66

Antimirov 1996

Idea: build some of ≡ into the data structure set:
D : α → α rexp → α rexp set
d x (r + s) = d x r + d x s

SLIDE 67

Antimirov 1996

Idea: build some of ≡ into the data structure set:
D : α → α rexp → α rexp set
D x (r + s) = D x r ∪ D x s

SLIDE 68

Antimirov 1996

Idea: build some of ≡ into the data structure set:
D : α → α rexp → α rexp set
D x (r + s) = D x r ∪ D x s
d x (r · s) = if ε(r) then d x r · s + d x s else d x r · s

SLIDE 69

Antimirov 1996

Idea: build some of ≡ into the data structure set:
D : α → α rexp → α rexp set
D x (r + s) = D x r ∪ D x s
D x (r · s) = if ε(r) then D x r ⊙ s ∪ D x s else D x r ⊙ s

SLIDE 70

Antimirov 1996

Idea: build some of ≡ into the data structure set:
D : α → α rexp → α rexp set
D x (r + s) = D x r ∪ D x s
D x (r · s) = if ε(r) then D x r ⊙ s ∪ D x s else D x r ⊙ s
where {r1, . . . , rn} ⊙ s = {r1 · s, . . . , rn · s}

SLIDE 71

Antimirov 1996

Idea: build some of ≡ into the data structure set:
D : α → α rexp → α rexp set
D x (r + s) = D x r ∪ D x s
D x (r · s) = if ε(r) then D x r ⊙ s ∪ D x s else D x r ⊙ s
. . .
where {r1, . . . , rn} ⊙ s = {r1 · s, . . . , rn · s}
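A Python sketch of partial derivatives follows. The slide shows only the + and · cases; the remaining cases below (0, 1, Atom and ∗) are filled in with the standard definitions from Antimirov's work and should be read as this sketch's assumption, as should the tuple encoding and the use of `frozenset` for α rexp set.

```python
ZERO, ONE = ("0",), ("1",)

def atom(a): return ("atom", a)
def plus(r, s): return ("+", r, s)
def times(r, s): return (".", r, s)
def star(r): return ("*", r)

def nullable(r):
    tag = r[0]
    if tag in ("1", "*"):
        return True
    if tag == "+":
        return nullable(r[1]) or nullable(r[2])
    if tag == ".":
        return nullable(r[1]) and nullable(r[2])
    return False

def odot(rs, s):
    """{r1, ..., rn} ⊙ s = {r1 · s, ..., rn · s}"""
    return frozenset(times(r, s) for r in rs)

def D(x, r):
    """Partial derivative: a set of expressions whose union of languages
    is the left quotient of L(r) by x."""
    tag = r[0]
    if tag in ("0", "1"):
        return frozenset()
    if tag == "atom":
        return frozenset({ONE}) if r[1] == x else frozenset()
    if tag == "+":
        return D(x, r[1]) | D(x, r[2])
    if tag == ".":
        left = odot(D(x, r[1]), r[2])
        return left | D(x, r[2]) if nullable(r[1]) else left
    return odot(D(x, r[1]), r)  # star
```

Because results are sets, duplicate alternatives and summands of 0 disappear without any explicit normalization.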

SLIDE 72

Instantiation of framework

SLIDE 73

Instantiation of framework

σ = α rexp set

SLIDE 74

Instantiation of framework

σ = α rexp set
init(r) = {r}

SLIDE 75

Instantiation of framework

σ = α rexp set
init(r) = {r}
δ x R = ⋃_{r∈R} D x r

SLIDE 76

Instantiation of framework

σ = α rexp set
init(r) = {r}
δ x R = ⋃_{r∈R} D x r
fin(R) = ∃r ∈ R. ε(r)

SLIDE 77

Instantiation of framework

σ = α rexp set
init(r) = {r}
δ x R = ⋃_{r∈R} D x r
fin(R) = ∃r ∈ R. ε(r)
L(R) = ⋃_{r∈R} L(r)
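Plugging these parameters into a worklist search gives a complete checker. The Python sketch below repeats the partial-derivative function so that it runs on its own; the cases of `D` for 0, 1, Atom and ∗ are the standard Antimirov definitions (not shown on the slides), and the encoding and function names are assumptions of this example.

```python
ZERO, ONE = ("0",), ("1",)

def atom(a): return ("atom", a)
def plus(r, s): return ("+", r, s)
def times(r, s): return (".", r, s)
def star(r): return ("*", r)

def nullable(r):
    tag = r[0]
    if tag in ("1", "*"):
        return True
    if tag == "+":
        return nullable(r[1]) or nullable(r[2])
    if tag == ".":
        return nullable(r[1]) and nullable(r[2])
    return False

def D(x, r):
    tag = r[0]
    if tag in ("0", "1"):
        return frozenset()
    if tag == "atom":
        return frozenset({ONE}) if r[1] == x else frozenset()
    if tag == "+":
        return D(x, r[1]) | D(x, r[2])
    if tag == ".":
        left = frozenset(times(p, r[2]) for p in D(x, r[1]))
        return left | D(x, r[2]) if nullable(r[1]) else left
    return frozenset(times(p, r) for p in D(x, r[1]))

def delta(x, R):
    """δ x R = union of D x r over all r in R."""
    return frozenset(p for r in R for p in D(x, r))

def fin(R):
    """fin(R) = some r in R accepts the empty word."""
    return any(nullable(r) for r in R)

def eqv(r, s, alphabet):
    start = (frozenset({r}), frozenset({s}))  # init(r) = {r}
    seen, todo = {start}, [start]
    while todo:
        P, Q = todo.pop()
        if fin(P) != fin(Q):
            return False
        for x in alphabet:
            nxt = (delta(x, P), delta(x, Q))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True
```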

SLIDE 78

Finiteness

Theorem (Antimirov 1996)
Starting from a regular expression r

SLIDE 79

Finiteness

Theorem (Antimirov 1996)
Starting from a regular expression r, at most |r|_at + 1 regular expressions are reachable

SLIDE 80

Finiteness

Theorem (Antimirov 1996)
Starting from a regular expression r, at most |r|_at + 1 regular expressions are reachable, where |r|_at is the number of occurrences of atoms in r.

SLIDE 81

Finiteness

Theorem (Antimirov 1996)
Starting from a regular expression r, at most |r|_at + 1 regular expressions are reachable, where |r|_at is the number of occurrences of atoms in r.

⟹ at most 2^(|r|_at + 1) sets of regular expressions are reachable

SLIDE 82

1 The Unified Framework
2 Derivatives of Regular Expressions
3 Partial Derivatives of Regular Expressions
4 Marked Regular Expressions
5 Empirical Comparison

SLIDE 83

History

McNaughton & Yamada 1960, Glushkov 1961:

SLIDE 84

History

McNaughton & Yamada 1960, Glushkov 1961:

Translation of regular expression to N/DFA

SLIDE 85

History

McNaughton & Yamada 1960, Glushkov 1961:

Translation of regular expression to N/DFA
Atoms in regular expression are indexed, e.g.

a1 · a2 + a3 · b1

SLIDE 86

History

McNaughton & Yamada 1960, Glushkov 1961:

Translation of regular expression to N/DFA
Atoms in regular expression are indexed, e.g.

a1 · a2 + a3 · b1

States are (sets of) indexed atoms, e.g. {a1, a3}

SLIDE 87

History

McNaughton & Yamada 1960, Glushkov 1961:

Translation of regular expression to N/DFA
Atoms in regular expression are indexed, e.g.

a1 · a2 + a3 · b1

States are (sets of) indexed atoms, e.g. {a1, a3}

Functional implementation by Fischer, Huch & Wilke [ICFP 2009]:

Replace sets of positions by marked regular expressions: Atom(bool, α)

SLIDE 88

History

McNaughton & Yamada 1960, Glushkov 1961:

Translation of regular expression to N/DFA
Atoms in regular expression are indexed, e.g.

a1 · a2 + a3 · b1

States are (sets of) indexed atoms, e.g. {a1, a3}

Functional implementation by Fischer, Huch & Wilke [ICFP 2009]:

Replace sets of positions by marked regular expressions: Atom(bool, α)

Only matching, not ≡, no proofs

SLIDE 89

Example: (a · a + a · b)∗

SLIDE 90

Example: (a · a + a · b)∗

[Diagram: step-by-step construction of the marked-expression automaton for (a · a + a · b)∗, starting from state q0, with transitions labelled a and b]

SLIDE 96

Instantiation of framework

SLIDE 97

Instantiation of framework

σ = bool × (bool × α) rexp

SLIDE 98

Instantiation of framework

σ = bool × (bool × α) rexp
init(r) = (True, map (λa. (False, a)) r)

SLIDE 99

Instantiation of framework

σ = bool × (bool × α) rexp
init(r) = (True, map (λa. (False, a)) r)
δ x (m, r) = (False, read x (follow m r))

SLIDE 100

Instantiation of framework

σ = bool × (bool × α) rexp
init(r) = (True, map (λa. (False, a)) r)
δ x (m, r) = (False, read x (follow m r))
fin(m, r) = . . .
L(m, r) = . . .
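The slides elide the definitions of `follow`, `read` and `fin`. The Python sketch below fills them in with one plausible reading in the style of Fischer, Huch & Wilke, where a mark on an atom means "this atom was just read"; every definition here, including the helper `final`, is an assumption of this example rather than the talk's verified code. Atoms are encoded as `("atom", mark, letter)`, and atoms are built unmarked, so the `map (λa. (False, a))` step of init is implicit.

```python
ZERO, ONE = ("0",), ("1",)

def atom(a, m=False): return ("atom", m, a)
def plus(r, s): return ("+", r, s)
def times(r, s): return (".", r, s)
def star(r): return ("*", r)

def nullable(r):
    tag = r[0]
    if tag in ("1", "*"):
        return True
    if tag == "+":
        return nullable(r[1]) or nullable(r[2])
    if tag == ".":
        return nullable(r[1]) and nullable(r[2])
    return False

def final(r):
    """Is there a mark on an atom that can end a word of r?"""
    tag = r[0]
    if tag == "atom":
        return r[1]
    if tag == "+":
        return final(r[1]) or final(r[2])
    if tag == ".":
        return final(r[2]) or (nullable(r[2]) and final(r[1]))
    if tag == "*":
        return final(r[1])
    return False

def follow(m, r):
    """Move marks forward to the atoms that may be read next;
    m says whether a mark enters r from the left."""
    tag = r[0]
    if tag == "atom":
        return atom(r[2], m)
    if tag == "+":
        return plus(follow(m, r[1]), follow(m, r[2]))
    if tag == ".":
        enter_right = final(r[1]) or (m and nullable(r[1]))
        return times(follow(m, r[1]), follow(enter_right, r[2]))
    if tag == "*":
        return star(follow(final(r[1]) or m, r[1]))
    return r

def read(x, r):
    """Keep only the marks on atoms labelled x."""
    tag = r[0]
    if tag == "atom":
        return atom(r[2], r[1] and r[2] == x)
    if tag in ("+", "."):
        return (tag, read(x, r[1]), read(x, r[2]))
    if tag == "*":
        return star(read(x, r[1]))
    return r

def init(r): return (True, r)

def delta(x, state):
    m, r = state
    return (False, read(x, follow(m, r)))

def fin(state):
    m, r = state
    return final(r) or (m and nullable(r))

def accepts(r, w):
    state = init(r)
    for x in w:
        state = delta(x, state)
    return fin(state)
```

Since `delta` never changes the shape of the expression, only its marks, the number of reachable states is bounded by the number of ways to mark the atoms.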

SLIDE 101

Conceptually, the marks in McNaughton/Glushkov/Fischer are after the atoms

SLIDE 102

Marked regular expressions II

Asperti [ITP 2012]:

Verified ≡-checker via marked rexp in Matita

SLIDE 103

Marked regular expressions II

Asperti [ITP 2012]:

Verified ≡-checker via marked rexp in Matita
Says he has formalised McNaughton & Yamada

SLIDE 104

Marked regular expressions II

Asperti [ITP 2012]:

Verified ≡-checker via marked rexp in Matita
Says he has formalised McNaughton & Yamada
. . . but he invented his own variation:

SLIDE 105

Marked regular expressions II

Asperti [ITP 2012]:

Verified ≡-checker via marked rexp in Matita
Says he has formalised McNaughton & Yamada
. . . but he invented his own variation:

Puts the mark before the atom

SLIDE 106

Example: (a · a + a · b)∗

SLIDE 107

Example: (a · a + a · b)∗

[Diagram: automaton for (a · a + a · b)∗ with marks before the atoms; transitions labelled a and a, b]

SLIDE 110

Instantiation of framework

Similar but a bit more complicated

SLIDE 111

Before vs After

SLIDE 112

Before vs After

Transitions can be decomposed into two steps:

Before: read; follow

SLIDE 113

Before vs After

Transitions can be decomposed into two steps:

Before: read; follow
After: follow; read

SLIDE 114

Before vs After

Transitions can be decomposed into two steps:

Before: read; follow
After: follow; read

Theorem

The before-automaton is a homomorphic image of the after-automaton.

SLIDE 115

Before vs After

Transitions can be decomposed into two steps:

Before: read; follow
After: follow; read

Theorem

The before-automaton is a homomorphic image of the after-automaton.

Proof idea due to Helmut Seidl.

SLIDE 116

1 The Unified Framework
2 Derivatives of Regular Expressions
3 Partial Derivatives of Regular Expressions
4 Marked Regular Expressions
5 Empirical Comparison

SLIDE 117

[Plot: time (s) against n for checking (a^0 + · · · + a^(n−1)) · (a^n)∗ ≡ a∗; curves: Deriv., Marked, Part.Deriv.]

SLIDE 118

[Plot: time (s) against n for checking (a^0 + · · · + a^(n−1)) · (a^n)∗ ≡ a∗; curves: Deriv., Marked, Part.Deriv.]

For randomly generated examples:

Deriv. ≫ Part.Deriv. ≫ Fischer, Asperti

SLIDE 119

Extended regular expressions

SLIDE 120

Extended regular expressions

Complement and intersection:

SLIDE 121

Extended regular expressions

Complement and intersection:

Trivial for derivatives (Brzozowski)

SLIDE 122

Extended regular expressions

Complement and intersection:

Trivial for derivatives (Brzozowski)
Harder for partial derivatives (Champarnaud and Mignot)

SLIDE 123

Extended regular expressions

Complement and intersection:

Trivial for derivatives (Brzozowski)
Harder for partial derivatives (Champarnaud and Mignot)

Unclear for marked regular expressions

SLIDE 124

Extended regular expressions

Complement and intersection:

Trivial for derivatives (Brzozowski)
Harder for partial derivatives (Champarnaud and Mignot)

Unclear for marked regular expressions

. . . and projection:

SLIDE 125

Extended regular expressions

Complement and intersection:

Trivial for derivatives (Brzozowski)
Harder for partial derivatives (Champarnaud and Mignot)

Unclear for marked regular expressions

. . . and projection:

Traytel & N. [ICFP 13] extend derivatives

SLIDE 126

Extended regular expressions

Complement and intersection:

Trivial for derivatives (Brzozowski)
Harder for partial derivatives (Champarnaud and Mignot)

Unclear for marked regular expressions

. . . and projection:

Traytel & N. [ICFP 13] extend derivatives

decision procedure for MSO on finite strings

SLIDE 127

Summary

Equivalence-checkers for regular expressions can be defined purely functionally

SLIDE 128

Summary

Equivalence-checkers for regular expressions can be defined purely functionally via (partial) derivatives or marked regular expressions

SLIDE 129

Summary

Equivalence-checkers for regular expressions can be defined purely functionally via (partial) derivatives or marked regular expressions