SLIDE 1 Verified Decision Procedures for Equivalence of Regular Expressions
Tobias Nipkow & Dmitriy Traytel
Fakultät für Informatik, Technische Universität München
SLIDE 5
Background
Recent series of papers presenting such decision procedures verified in Coq, Isabelle or Matita:
Braibant & Pous 2010, Krauss & Nipkow 2011, Coquand & Siles 2011, Asperti 2012, Moreira et al. 2013
They all operate on regular expressions, not automata
They all look different but related . . .
SLIDE 7 This talk
- Unified framework
- Derivation of all previous procedures as instances
SLIDE 9
1 The Unified Framework
2 Derivatives of Regular Expressions
3 Partial Derivatives of Regular Expressions
4 Marked Regular Expressions
5 Empirical Comparison
SLIDE 11
Regular expressions
datatype α rexp = 0 | 1 | Atom α | α rexp + α rexp | α rexp · α rexp | α rexp ∗
Semantics: L :: α rexp → α lang
where α lang = α list set
SLIDE 16 How to prove r ≡ s
1 Translate to DFAs A and B
2 Compare A and B
- Minimize A and B, check isomorphism.
- Check for all reachable states (p, q) of A × B that p is final iff q is final.
SLIDE 25
Framework parameters
Type σ
Init init :: α rexp → σ
Transition δ :: α → σ → σ
Final fin :: σ → bool
Language L :: σ → α lang
Assumptions:
L(init(r)) = L(r)
L(δ x s) = {w | xw ∈ L(s)}
fin(s) ⇔ [] ∈ L(s)
SLIDE 30
Equivalence checker
eqv :: α rexp → α rexp → bool
eqv r s = case closure (init(r), init(s)) of Some([], _) ⇒ True | _ ⇒ False
Theorem
eqv r s ⟹ L(r) = L(s)
If the set of reachable states is finite:
Theorem
L(r) = L(s) ⟹ eqv r s
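The checker can be pictured as a worklist exploration of reachable state pairs. The following Python sketch is only an illustration under assumptions: it takes already-initialised states instead of calling init, returns a plain boolean instead of the closure result, and the toy instance (length counters) is hypothetical and not from the talk.

```python
from collections import deque

def eqv(s0, t0, delta, fin, alphabet):
    """Explore all reachable state pairs and compare finality.

    Terminates only if the set of reachable pairs is finite.
    """
    seen = {(s0, t0)}
    todo = deque([(s0, t0)])
    while todo:
        p, q = todo.popleft()
        if fin(p) != fin(q):
            return False  # some word is accepted from one state only
        for x in alphabet:
            nxt = (delta(x, p), delta(x, q))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

# Hypothetical toy instance: a state (modulus, count, divisor) counts the
# length of the word read so far modulo `modulus` and is final when the
# count is divisible by `divisor`.
def delta(x, state):
    modulus, count, divisor = state
    return (modulus, (count + 1) % modulus, divisor)

def fin(state):
    _, count, divisor = state
    return count % divisor == 0

even_a = (2, 0, 2)       # words of even length
even_b = (4, 0, 2)       # also words of even length, other encoding
multiple_of_4 = (4, 0, 4)

same = eqv(even_a, even_b, delta, fin, "a")
different = eqv(even_a, multiple_of_4, delta, fin, "a")
```

Here `same` compares two encodings of the same language, while `different` compares even-length words with words whose length is a multiple of four.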
SLIDE 36 Derivatives (Brzozowski 1964)
d :: α → α rexp → α rexp
- d x r is the derivative of r wrt x
- d x r = “what is left after x has been read”
- Example: d a (Atom(a) · r) = 1 · r
- Semantics is left-quotient:
L(d x r) = {w | xw ∈ L(r)}
SLIDE 42
d x 0 = 0
d x 1 = 0
d x (Atom y) = if x = y then 1 else 0
d x (r + s) = d x r + d x s
d x (r · s) = if ε(r) then d x r · s + d x s else d x r · s
d x (r∗) = d x r · r∗
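The equations above translate almost line by line into code. This Python sketch is an illustration only: the tuple encoding of expressions and the names `nullable` (for ε) and `matches` are assumptions made here, not part of the talk.

```python
# Regular expressions as tagged tuples (an encoding assumed for this sketch).
ZERO, ONE = ("zero",), ("one",)

def atom(a): return ("atom", a)
def plus(r, s): return ("plus", r, s)
def times(r, s): return ("times", r, s)
def star(r): return ("star", r)

def nullable(r):
    """epsilon(r): does L(r) contain the empty word?"""
    tag = r[0]
    if tag in ("one", "star"):
        return True
    if tag == "plus":
        return nullable(r[1]) or nullable(r[2])
    if tag == "times":
        return nullable(r[1]) and nullable(r[2])
    return False  # zero and atom

def d(x, r):
    """Brzozowski derivative of r with respect to the letter x."""
    tag = r[0]
    if tag in ("zero", "one"):
        return ZERO
    if tag == "atom":
        return ONE if r[1] == x else ZERO
    if tag == "plus":
        return plus(d(x, r[1]), d(x, r[2]))
    if tag == "times":
        left = times(d(x, r[1]), r[2])
        return plus(left, d(x, r[2])) if nullable(r[1]) else left
    return times(d(x, r[1]), r)  # star: d x (r*) = d x r . r*

def matches(r, word):
    """Fold d over the word, then test the result for nullability."""
    for x in word:
        r = d(x, r)
    return nullable(r)

a_then_a_star = times(atom("a"), star(atom("a")))
first_step = d("a", a_then_a_star)  # the expression 1 . a*
```

Folding `d` over a word and testing ε at the end gives a matcher; the equivalence checker instead explores all derivatives.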
SLIDE 46
Regular Expression DFA
a · a∗ --a--> 1 · a∗ --a--> 0 · a∗ + 1 · a∗ --a--> 0 · a∗ + 1 · a∗ (self-loop)
SLIDE 51
Finiteness
Let ≡ACI be the equivalence induced by associativity, commutativity and idempotence (ACI) of +
Theorem (Brzozowski 1964)
The set {fold d w r | w ∈ Σ∗}/≡ACI is finite.
How large? Brzozowski's proof yields O(2^(...^(2^n)))
SLIDE 60 Instantiation of framework
σ = α rexp
init(r) = r
δ x r = normACI(d x r)
fin = ε
L = L
Finiteness:
- Not immediate from Brzozowski's theorem
- Open for stronger normalization functions
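To see the effect of normalisation, one can enumerate the states reachable under δ x r = norm(d x r). The Python sketch below is an illustration under assumptions: `norm` is one possible ACI normal form (flatten sums, drop duplicates, sort by `repr`), not necessarily the normACI used by the authors, and the tuple encoding is chosen for this sketch.

```python
ZERO, ONE = ("zero",), ("one",)

def nullable(r):
    tag = r[0]
    if tag in ("one", "star"):
        return True
    if tag == "plus":
        return nullable(r[1]) or nullable(r[2])
    if tag == "times":
        return nullable(r[1]) and nullable(r[2])
    return False

def d(x, r):
    tag = r[0]
    if tag in ("zero", "one"):
        return ZERO
    if tag == "atom":
        return ONE if r[1] == x else ZERO
    if tag == "plus":
        return ("plus", d(x, r[1]), d(x, r[2]))
    if tag == "times":
        left = ("times", d(x, r[1]), r[2])
        return ("plus", left, d(x, r[2])) if nullable(r[1]) else left
    return ("times", d(x, r[1]), r)

def summands(r):
    """Flatten nested sums into the set of their normalised summands."""
    if r[0] == "plus":
        return summands(r[1]) | summands(r[2])
    return {norm(r)}

def norm(r):
    """One possible ACI normal form: sorted, duplicate-free, right-nested sums."""
    tag = r[0]
    if tag == "plus":
        parts = sorted(summands(r), key=repr)
        out = parts[-1]
        for p in reversed(parts[:-1]):
            out = ("plus", p, out)
        return out
    if tag == "times":
        return ("times", norm(r[1]), norm(r[2]))
    if tag == "star":
        return ("star", norm(r[1]))
    return r

def reachable(r, alphabet):
    """All states reachable from init(r) = r via delta x s = norm(d x s)."""
    start = norm(r)
    seen, todo = {start}, [start]
    while todo:
        s = todo.pop()
        for x in alphabet:
            t = norm(d(x, s))
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

a = ("atom", "a")
states = reachable(("times", a, ("star", a)), "a")
```

For a · a∗ over the one-letter alphabet, `states` contains the expressions of the small DFA shown earlier; without `norm` the third derivative would already be a syntactically new expression.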
SLIDE 71
Antimirov 1996
Idea: build some of ≡ into the data structure set:
D :: α → α rexp → α rexp set
D x (r + s) = D x r ∪ D x s
D x (r · s) = if ε(r) then D x r ⊙ s ∪ D x s else D x r ⊙ s
. . .
where {r1, . . . , rn} ⊙ s = {r1 · s, . . . , rn · s}
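A Python sketch of partial derivatives, for illustration only. The cases for 0, 1, Atom and ∗ are elided on the slide; the ones used here are the standard Antimirov definitions, filled in as an assumption. The names `pd`, `step`, `atoms` and `reachable_exprs` and the tuple encoding are hypothetical.

```python
ZERO, ONE = ("zero",), ("one",)

def atom(a): return ("atom", a)
def plus(r, s): return ("plus", r, s)
def times(r, s): return ("times", r, s)
def star(r): return ("star", r)

def nullable(r):
    tag = r[0]
    if tag in ("one", "star"):
        return True
    if tag == "plus":
        return nullable(r[1]) or nullable(r[2])
    if tag == "times":
        return nullable(r[1]) and nullable(r[2])
    return False

def pd(x, r):
    """Partial derivative D x r as a frozenset of expressions."""
    tag = r[0]
    if tag in ("zero", "one"):
        return frozenset()
    if tag == "atom":
        return frozenset([ONE]) if r[1] == x else frozenset()
    if tag == "plus":
        return pd(x, r[1]) | pd(x, r[2])
    if tag == "times":
        left = frozenset(times(p, r[2]) for p in pd(x, r[1]))
        return left | pd(x, r[2]) if nullable(r[1]) else left
    return frozenset(times(p, r) for p in pd(x, r[1]))  # star

def step(x, R):
    """Lift D to sets of expressions: the union of D x r over r in R."""
    return frozenset(q for r in R for q in pd(x, r))

def atoms(r):
    """Number of occurrences of atoms in r."""
    if r[0] == "atom":
        return 1
    return sum(atoms(c) for c in r[1:])

def reachable_exprs(r, alphabet):
    """Expressions reachable from r by repeated partial derivation."""
    seen, todo = {r}, [r]
    while todo:
        s = todo.pop()
        for x in alphabet:
            for t in pd(x, s):
                if t not in seen:
                    seen.add(t)
                    todo.append(t)
    return seen

r = star(plus(times(atom("a"), atom("a")), times(atom("a"), atom("b"))))
exprs = reachable_exprs(r, "ab")
within_bound = len(exprs) <= atoms(r) + 1
```

Because the results are sets, ACI of + needs no separate normalisation step; `within_bound` checks Antimirov's bound on the example (a · a + a · b)∗.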
SLIDE 77 Instantiation of framework
σ = α rexp set
init(r) = {r}
δ x R = ⋃ {D x r | r ∈ R}
fin(R) = ∃r ∈ R. ε(r)
L(R) = ⋃ {L(r) | r ∈ R}
SLIDE 81
Finiteness
Theorem (Antimirov 1996)
Starting from a regular expression r, at most |r|_at + 1 regular expressions are reachable, where |r|_at is the number of occurrences of atoms in r.
⟹ at most 2^(|r|_at + 1) sets of regular expressions are reachable
SLIDE 88 History
McNaughton & Yamada 1960, Glushkov 1961:
- Translation of regular expression to N/DFA
- Atoms in regular expression are indexed, e.g. a1 · a2 + a3 · b1
- States are (sets of) indexed atoms, e.g. {a1, a3}
Functional implementation by Fischer, Huch & Wilke [ICFP 2009]:
- Replace sets of positions by marked regular expressions: Atom(bool, α)
- Only matching, not ≡, no proofs
SLIDE 95
Example: (a · a + a · b)∗
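[Figure: automaton for (a · a + a · b)∗ with initial state q0; the other states are marked copies of the expression (marks lost in extraction), connected by transitions labelled a and b]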
SLIDE 100
Instantiation of framework
σ = bool × (bool × α) rexp
init(r) = (True, map (λa. (False, a)) r)
δ x (m, r) = (False, read x (follow m r))
fin(m, r) = . . .
L(m, r) = . . .
SLIDE 101
Conceptually, the marks in McNaughton/Glushkov/Fischer are after the atoms
SLIDE 105 Marked regular expressions II
Asperti [ITP 2012]:
- Verified ≡-checker via marked rexp in Matita
- Says he has formalised McNaughton & Yamada
- . . . but he invented his own variation:
Puts the mark before the atom
SLIDE 109
Example: (a · a + a · b)∗
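[Figure: automaton for (a · a + a · b)∗ with marks before the atoms; two states, both marked copies of the expression (marks lost in extraction), with transitions labelled a and a, b]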
SLIDE 110
Instantiation of framework
Similar but a bit more complicated
SLIDE 115 Before vs After
Transitions can be decomposed into two steps:
- Before: read; follow
- After: follow; read
Theorem
The before-automaton is a homomorphic image of the after-automaton.
Proof idea due to Helmut Seidl.
SLIDE 118
[Plot: time (s) against n for checking (a^0 + · · · + a^(n−1)) · (a^n)∗ ≡ a∗; curves: Deriv., Marked, Part.Deriv.]
For randomly generated examples:
- Deriv. ≫ Part.Deriv. ≫ Fischer, Asperti
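The benchmark family can be sanity-checked on short words, reading the exponents as powers of a (a^0 is the empty word). This Python sketch uses the standard `re` module purely as a bounded test; it is not one of the verified procedures, and `benchmark_lhs` and `agree_up_to` are hypothetical names. It assumes n ≥ 1.

```python
import re

def benchmark_lhs(n):
    """Python regex for (a^0 + ... + a^(n-1)) . (a^n)*, assuming n >= 1."""
    prefixes = "|".join("a" * i for i in range(n))
    return "(?:%s)(?:%s)*" % (prefixes, "a" * n)

def agree_up_to(n, max_len):
    """Compare both sides of the benchmark on the words a^0 .. a^max_len."""
    lhs = re.compile(benchmark_lhs(n))
    rhs = re.compile("a*")
    return all(
        bool(lhs.fullmatch("a" * k)) == bool(rhs.fullmatch("a" * k))
        for k in range(max_len + 1)
    )

ok = agree_up_to(3, 20)
```

A bounded test like this can only refute an equivalence; the procedures compared in the talk decide it.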
SLIDE 126 Extended regular expressions
Complement and intersection:
- Trivial for derivatives (Brzozowski)
- Harder for partial derivatives (Champarnaud and Mignot)
- Unclear for marked regular expressions
. . . and projection:
- Traytel & Nipkow [ICFP 13] extend derivatives, yielding a decision procedure for MSO on finite strings
SLIDE 129 Summary
Equivalence checkers for regular expressions can be defined purely functionally via (partial) derivatives or marked regular expressions
Perfect proof assistant fodder