Verified Decision Procedures for Equivalence of Regular Expressions


slide-1
SLIDE 1

Verified Decision Procedures for Equivalence of Regular Expressions

Tobias Nipkow & Dmitriy Traytel

Fakultät für Informatik, Technische Universität München

slide-5
SLIDE 5

Background

Recent series of papers presenting such decision procedures verified in Coq, Isabelle or Matita:

Braibant & Pous 2010, Krauss & Nipkow 2011, Coquand & Siles 2011, Asperti 2012, Moreira et al. 2013

They all operate on regular expressions, not automata

They all look different but are related . . .

slide-8
SLIDE 8

This talk

  • Unified framework
  • Derivation of all previous procedures as instances
  • Verification in Isabelle
slide-9
SLIDE 9

1 The Unified Framework
2 Derivatives of Regular Expressions
3 Partial Derivatives of Regular Expressions
4 Marked Regular Expressions
5 Empirical Comparison

slide-11
SLIDE 11

Regular expressions

datatype α rexp = 0 | 1 | Atom α | α rexp + α rexp | α rexp · α rexp | α rexp ∗

Semantics: L :: α rexp → α lang, where α lang = α list set
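The datatype and its semantics can be sketched in Python. This is only an illustration, not the Isabelle formalisation: the tagged-tuple encoding, the helper names (`atom`, `plus`, `times`, `star`) and the length-bounded function `lang_upto` are assumptions made here, since the full language L(r) is in general infinite.

```python
# Tagged tuples stand in for the datatype constructors (an encoding chosen here).
ZERO = ("0",)
ONE = ("1",)

def atom(a):
    return ("atom", a)

def plus(r, s):
    return ("+", r, s)

def times(r, s):
    return (".", r, s)

def star(r):
    return ("*", r)

def lang_upto(r, n):
    """Words of L(r) of length <= n, each word a tuple of symbols."""
    tag = r[0]
    if tag == "0":
        return set()
    if tag == "1":
        return {()}
    if tag == "atom":
        return {(r[1],)} if n >= 1 else set()
    if tag == "+":
        return lang_upto(r[1], n) | lang_upto(r[2], n)
    if tag == ".":
        return {u + v
                for u in lang_upto(r[1], n)
                for v in lang_upto(r[2], n)
                if len(u) + len(v) <= n}
    # Star: keep appending non-empty words of the body until nothing new fits.
    body = {w for w in lang_upto(r[1], n) if w}
    words, frontier = {()}, {()}
    while frontier:
        frontier = {u + v for u in frontier for v in body
                    if len(u) + len(v) <= n} - words
        words |= frontier
    return words
```

Comparing `lang_upto(r, n)` and `lang_upto(s, n)` for small n is a cheap sanity test, but it can only refute equivalence, never prove it.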

slide-16
SLIDE 16

How to prove r ≡ s

1 Translate to DFAs A and B
2 Compare A and B

  • Standard algorithm: minimize A and B, check isomorphism.
  • Easy alternative: check for all reachable states (p, q) of A × B that p is final iff q is final.
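The easy alternative amounts to a reachability search over the product automaton. A minimal sketch, assuming each DFA is given as a hypothetical triple `(start, delta, finals)` with `delta` a total dictionary keyed by `(state, symbol)`; this representation is chosen here and does not come from the talk.

```python
from collections import deque

def dfa_equivalent(dfa_a, dfa_b, alphabet):
    """Explore reachable pairs (p, q) of A x B; fail if finality disagrees."""
    start_a, delta_a, finals_a = dfa_a
    start_b, delta_b, finals_b = dfa_b
    start = (start_a, start_b)
    seen = {start}
    queue = deque([start])
    while queue:
        p, q = queue.popleft()
        if (p in finals_a) != (q in finals_b):
            return False  # some word is accepted by exactly one automaton
        for x in alphabet:
            nxt = (delta_a[p, x], delta_b[q, x])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True
```

No minimization is needed: the search only has to visit each reachable pair once.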

slide-25
SLIDE 25

Framework parameters

Type        σ
Init        init :: α rexp → σ
Transition  δ :: α → σ → σ
Final       fin :: σ → bool
Language    L :: σ → α lang

Assumptions:

L(init(r)) = L(r)
L(δ x s) = {w | xw ∈ L(s)}
fin(s) ⇔ [] ∈ L(s)

slide-30
SLIDE 30

Equivalence checker

eqv :: α rexp → α rexp → bool
eqv r s = case closure (init(r), init(s)) of Some([], _) ⇒ True | _ ⇒ False

Theorem

eqv r s ⟹ L(r) = L(s)

If the set of reachable states is finite:

Theorem

L(r) = L(s) ⟹ eqv r s
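The checker can be sketched generically over the framework parameters. This is a plain worklist search written for illustration, not the `closure` combinator from the Isabelle development; the explicit `alphabet` argument and the toy instance at the end (states `(c, n)` standing for the language of words a^k with k ≡ c mod n) are assumptions introduced here.

```python
def eqv(r, s, init, delta, fin, alphabet):
    """Search the pairs reachable from (init r, init s); states must be hashable."""
    start = (init(r), init(s))
    seen = {start}
    worklist = [start]
    while worklist:
        p, q = worklist.pop()
        if fin(p) != fin(q):
            return False  # a distinguishing word leads to this pair
        for x in alphabet:
            nxt = (delta(x, p), delta(x, q))
            if nxt not in seen:
                seen.add(nxt)
                worklist.append(nxt)
    return True  # worklist exhausted without finding a distinguishing pair

# Hypothetical toy instance: the "expression" n denotes {a^k | k mod n == 0}.
def toy_init(n):
    return (0, n)

def toy_delta(x, state):
    c, n = state
    return ((c - 1) % n, n)  # left quotient by one letter

def toy_fin(state):
    return state[0] == 0  # the empty word is in the language iff c == 0
```

The loop terminates only if finitely many pairs are reachable, which matches the finiteness side condition of the second theorem.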

slide-31
SLIDE 31

2 Derivatives of Regular Expressions

slide-36
SLIDE 36

Derivatives (Brzozowski 1964)

d :: α → α rexp → α rexp

  • d x r is the derivative of r wrt x
  • d x r = “what is left after x has been read”
  • Example: d a (Atom(a) · r) = 1 · r
  • Semantics is left-quotient:

L(d x r) = {w | xw ∈ L(r)}

slide-42
SLIDE 42

d x 0 = 0
d x 1 = 0
d x (Atom y) = if x = y then 1 else 0
d x (r + s) = d x r + d x s
d x (r · s) = if ε(r) then d x r · s + d x s else d x r · s
d x (r∗) = d x r · r∗

where ε(r) ⇔ [] ∈ L(r)
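These equations translate directly into a recursive function. A minimal sketch, assuming a tagged-tuple encoding of regular expressions and the names `nullable` (for ε), `deriv` (for d) and `matches`, all chosen here for illustration:

```python
ZERO = ("0",)
ONE = ("1",)

def atom(a): return ("atom", a)
def plus(r, s): return ("+", r, s)
def times(r, s): return (".", r, s)
def star(r): return ("*", r)

def nullable(r):
    """epsilon(r): does L(r) contain the empty word?"""
    tag = r[0]
    if tag in ("0", "atom"):
        return False
    if tag in ("1", "*"):
        return True
    if tag == "+":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # concatenation

def deriv(x, r):
    """Brzozowski derivative d x r, one clause per equation."""
    tag = r[0]
    if tag in ("0", "1"):
        return ZERO
    if tag == "atom":
        return ONE if r[1] == x else ZERO
    if tag == "+":
        return plus(deriv(x, r[1]), deriv(x, r[2]))
    if tag == ".":
        left = times(deriv(x, r[1]), r[2])
        return plus(left, deriv(x, r[2])) if nullable(r[1]) else left
    return times(deriv(x, r[1]), r)  # star

def matches(r, word):
    """Fold deriv over the word, then test for the empty word."""
    for x in word:
        r = deriv(x, r)
    return nullable(r)
```

For r = a · a∗ the first two derivatives with respect to a are 1 · a∗ and 0 · a∗ + 1 · a∗, the first two successor states in the DFA example for a · a∗.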

slide-46
SLIDE 46

Regular Expression DFA

[Figure: DFA built from the derivatives of a · a∗. States: a · a∗, 1 · a∗ and 0 · a∗ + 1 · a∗; all transitions are labelled a.]

slide-51
SLIDE 51

Finiteness

Let ≡ACI be the equivalence induced by ACI (associativity, commutativity, idempotence) of +

Theorem (Brzozowski 1964) The set {fold d w r | w ∈ Σ∗}/≡ACI is finite.

How large? Brzozowski's proof yields O(2^(…^(2^n)))

slide-60
SLIDE 60

Instantiation of framework

σ = α rexp
init(r) = r
δ x r = normACI(d x r)
fin = ε
L = L

Finiteness:

  • Not immediate from Brzozowski’s theorem
  • Open for stronger normalization functions
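The derivative instance can be sketched end to end as follows. The function `norm_aci` is one possible normalization (flatten nested sums, drop duplicates, sort, rebuild), written here as an assumption and not taken from the Isabelle sources; the tagged-tuple encoding and the helper `atoms` are likewise illustrative. Termination relies on finitely many normalized derivatives being reachable, which the sketch assumes and does not check.

```python
ZERO, ONE = ("0",), ("1",)
def atom(a): return ("atom", a)
def plus(r, s): return ("+", r, s)
def times(r, s): return (".", r, s)
def star(r): return ("*", r)

def nullable(r):
    tag = r[0]
    if tag in ("0", "atom"): return False
    if tag in ("1", "*"): return True
    if tag == "+": return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])

def deriv(x, r):
    tag = r[0]
    if tag in ("0", "1"): return ZERO
    if tag == "atom": return ONE if r[1] == x else ZERO
    if tag == "+": return plus(deriv(x, r[1]), deriv(x, r[2]))
    if tag == ".":
        left = times(deriv(x, r[1]), r[2])
        return plus(left, deriv(x, r[2])) if nullable(r[1]) else left
    return times(deriv(x, r[1]), r)

def norm_aci(r):
    """Canonical form modulo ACI of +, applied at every level of the term."""
    tag = r[0]
    if tag == "+":
        summands, stack = set(), [r]
        while stack:                      # flatten the sum (associativity)
            t = stack.pop()
            if t[0] == "+":
                stack += [t[1], t[2]]
            else:
                summands.add(norm_aci(t))  # the set removes duplicates (idempotence)
        ordered = sorted(summands, key=repr)  # fixed order (commutativity)
        result = ordered[-1]
        for t in reversed(ordered[:-1]):
            result = plus(t, result)
        return result
    if tag == ".": return times(norm_aci(r[1]), norm_aci(r[2]))
    if tag == "*": return star(norm_aci(r[1]))
    return r

def atoms(r):
    if r[0] == "atom": return {r[1]}
    return set().union(*(atoms(t) for t in r[1:])) if len(r) > 1 else set()

def eqv(r, s):
    """sigma = rexp, init = identity, delta = norm_aci after deriv, fin = nullable."""
    alphabet = atoms(r) | atoms(s)
    seen, worklist = {(r, s)}, [(r, s)]
    while worklist:
        p, q = worklist.pop()
        if nullable(p) != nullable(q):
            return False
        for x in alphabet:
            nxt = (norm_aci(deriv(x, p)), norm_aci(deriv(x, q)))
            if nxt not in seen:
                seen.add(nxt)
                worklist.append(nxt)
    return True
```

A stronger normalization (for example simplifying 0 · r to 0) would shrink the state space, at the cost of a separate finiteness argument.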
slide-61
SLIDE 61

3 Partial Derivatives of Regular Expressions

slide-71
SLIDE 71

Antimirov 1996

Idea: build some of ≡ into the data structure by using sets:

D :: α → α rexp → α rexp set
D x (r + s) = D x r ∪ D x s
D x (r · s) = if ε(r) then D x r ⊙ s ∪ D x s else D x r ⊙ s
. . .

where {r1, . . . , rn} ⊙ s = {r1 · s, . . . , rn · s}
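A sketch of D in Python. The slide elides the remaining cases; the clauses for 0, 1, atoms and star below follow the usual definition of Antimirov's partial derivatives and are filled in here as an assumption, as are the tagged-tuple encoding and the names `pderiv` and `odot`.

```python
ZERO, ONE = ("0",), ("1",)
def atom(a): return ("atom", a)
def plus(r, s): return ("+", r, s)
def times(r, s): return (".", r, s)
def star(r): return ("*", r)

def nullable(r):
    tag = r[0]
    if tag in ("0", "atom"): return False
    if tag in ("1", "*"): return True
    if tag == "+": return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])

def odot(rs, s):
    """{r1, ..., rn} (.) s = {r1 . s, ..., rn . s}"""
    return frozenset(times(r, s) for r in rs)

def pderiv(x, r):
    """Partial derivative D x r: a set of expressions instead of one sum."""
    tag = r[0]
    if tag in ("0", "1"):
        return frozenset()
    if tag == "atom":
        return frozenset([ONE]) if r[1] == x else frozenset()
    if tag == "+":
        return pderiv(x, r[1]) | pderiv(x, r[2])
    if tag == ".":
        left = odot(pderiv(x, r[1]), r[2])
        return (left | pderiv(x, r[2])) if nullable(r[1]) else left
    return odot(pderiv(x, r[1]), r)  # star
```

Because the result is a set, the associativity, commutativity and idempotence of + come for free at the top level and no separate normalization pass is needed there.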

slide-77
SLIDE 77

Instantiation of framework

σ = α rexp set
init(r) = {r}
δ x R = ⋃ r∈R. D x r
fin(R) = ∃r ∈ R. ε(r)
L(R) = ⋃ r∈R. L(r)
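With states as sets of expressions, the instance can be sketched as below. The encoding, the helper names and the worklist loop are assumptions made for illustration; only init, δ and fin mirror the definitions on the slide, and the cases of `pderiv` not shown in the talk follow the usual definition of partial derivatives.

```python
ZERO, ONE = ("0",), ("1",)
def atom(a): return ("atom", a)
def plus(r, s): return ("+", r, s)
def times(r, s): return (".", r, s)
def star(r): return ("*", r)

def nullable(r):
    tag = r[0]
    if tag in ("0", "atom"): return False
    if tag in ("1", "*"): return True
    if tag == "+": return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])

def pderiv(x, r):
    tag = r[0]
    if tag in ("0", "1"): return frozenset()
    if tag == "atom": return frozenset([ONE]) if r[1] == x else frozenset()
    if tag == "+": return pderiv(x, r[1]) | pderiv(x, r[2])
    body = r[1]
    rest = r[2] if tag == "." else r          # star continues with r itself
    left = frozenset(times(t, rest) for t in pderiv(x, body))
    if tag == "." and nullable(r[1]):
        return left | pderiv(x, r[2])
    return left

def atoms(r):
    if r[0] == "atom": return {r[1]}
    return set().union(*(atoms(t) for t in r[1:])) if len(r) > 1 else set()

def init(r):
    return frozenset([r])                     # init(r) = {r}

def delta(x, R):
    return frozenset().union(*(pderiv(x, t) for t in R))  # union of D x r over R

def fin(R):
    return any(nullable(t) for t in R)        # some member accepts the empty word

def eqv_pd(r, s):
    alphabet = atoms(r) | atoms(s)
    start = (init(r), init(s))
    seen, worklist = {start}, [start]
    while worklist:
        P, Q = worklist.pop()
        if fin(P) != fin(Q):
            return False
        for x in alphabet:
            nxt = (delta(x, P), delta(x, Q))
            if nxt not in seen:
                seen.add(nxt)
                worklist.append(nxt)
    return True
```

Using `frozenset` keeps states hashable, so the set of visited pairs can be an ordinary Python set.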

slide-81
SLIDE 81

Finiteness

Theorem (Antimirov 1996) Starting from a regular expression r, at most |r|_at + 1 regular expressions are reachable, where |r|_at is the number of occurrences of atoms in r.

⟹ at most 2^(|r|_at + 1) sets of regular expressions reachable

slide-82
SLIDE 82

4 Marked Regular Expressions

slide-88
SLIDE 88

History

McNaughton & Yamada 1960, Glushkov 1961:

  • Translation of regular expression to N/DFA
  • Atoms in regular expression are indexed, e.g. a1 · a2 + a3 · b1
  • States are (sets of) indexed atoms, e.g. {a1, a3}

Functional implementation by Fischer, Huch & Wilke [ICFP 2009]:

  • Replace sets of positions by marked regular expressions: Atom(bool, α)
  • Only matching, not ≡, no proofs
slide-95
SLIDE 95

Example: (a · a + a · b)∗

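[Figure: automaton for (a · a + a · b)∗. Besides the initial state q0 there are three states, each a copy of (a · a + a · b)∗ with different atoms marked (the marks were lost in extraction); transitions are labelled a and b.]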

slide-100
SLIDE 100

Instantiation of framework

σ = bool × (bool × α) rexp
init(r) = (True, map (λa. (False, a)) r)
δ x (m, r) = (False, read x (follow m r))
fin(m, r) = . . .
L(m, r) = . . .

slide-101
SLIDE 101

Conceptually, the marks in McNaughton/Glushkov/Fischer are after the atoms

slide-105
SLIDE 105

Marked regular expressions II

Asperti [ITP 2012]:

  • Verified ≡-checker via marked rexp in Matita
  • Says he has formalised McNaughton & Yamada
  • . . . but he invented his own variation:

Puts the mark before the atom

slide-109
SLIDE 109

Example: (a · a + a · b)∗

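[Figure: automaton for (a · a + a · b)∗ with the mark before the atom. Two states, each a copy of (a · a + a · b)∗ with different atoms marked (the marks were lost in extraction); transitions are labelled a and a, b.]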

slide-110
SLIDE 110

Instantiation of framework

Similar but a bit more complicated

slide-115
SLIDE 115

Before vs After

Transitions can be decomposed into two steps:

  • Before: read; follow
  • After: follow; read

Theorem

The before-automaton is a homomorphic image of the after-automaton.

Proof idea due to Helmut Seidl.

slide-116
SLIDE 116

5 Empirical Comparison

slide-118
SLIDE 118

[Plot: time in seconds (ticks at 2, 4, 6, 8) against n (ticks at 200, 400, 600) for checking (a^0 + · · · + a^(n−1)) · (a^n)∗ ≡ a∗. Series: Deriv., Marked, Part.Deriv.]

For randomly generated examples:

  • Deriv. ≫ Part.Deriv. ≫ Fischer, Asperti
slide-126
SLIDE 126

Extended regular expressions

Complement and intersection:

  • Trivial for derivatives (Brzozowski)
  • Harder for partial derivatives (Champarnaud and Mignot)
  • Unclear for marked regular expressions

. . . and projection:

  • Traytel & Nipkow [ICFP 2013] extend derivatives, yielding a decision procedure for MSO on finite strings

slide-129
SLIDE 129

Summary

Equivalence-checkers for regular expressions can be defined purely functionally via (partial) derivatives or marked regular expressions

Perfect proof assistant fodder