
SLIDE 1

Radu Grosu SUNY at Stony Brook

The Cayley-Hamilton Theorem For Finite Automata

SLIDE 2

How did I get interested in this topic?

SLIDE 3

Convergence of Theories

Hybrid Systems Computation and Control:
convergence between control and automata theory.
Hybrid Automata: an outcome of this convergence
modeling formalism for systems exhibiting both discrete and continuous behavior,
successfully used to model and analyze embedded and biological systems.

SLIDE 4

Lack of Common Foundation for HA

Mode dynamics:
Linear system (LS)
Mode switching:
Finite automaton (FA)
Different techniques:
LS reduction
FA minimization

[Figure: hybrid automaton with a stimulated mode; modes have linear dynamics of the form $\dot{x} = Ax + Bu$, $v = Cx$, with switching on voltage thresholds; plot of voltage (mV) against time (ms).]

LS & FA taught separately: No common foundation!

SLIDE 5

Main Conjecture

Finite automata can be conveniently regarded as time invariant linear systems over semimodules:
linear systems techniques generalize to automata.
Examples of such techniques include:
linear transformations of automata,
minimization and determinization of automata as observability and reachability reductions,
Z-transform of automata to compute the associated regular expression through Gaussian elimination.

SLIDE 6

Minimal DFA are Not Minimal NFA

(Arnold, Dicky and Nivat's Example) L = a (b* + c*)
[Figure: minimal DFA with four states x1, x2, x3, x4 and an equivalent NFA with three states x1, x2, x3; transitions labelled a, b, c.]

SLIDE 7

Minimal NFA: How are they Related?

(Arnold, Dicky and Nivat's Example)
L = ab + ac + ba + bc + ca + cb
[Figure: two minimal NFAs for L, each with five states x1, ..., x5 and transitions labelled a, b, c.]
No homomorphism of either automaton onto the other.

SLIDE 8

Minimal NFA: How are they Related?

(Arnold, Dicky and Nivat's Example)
[Figure: NFA with states x1, ..., x8 containing both minimal NFAs; transitions labelled a, b, c.]
Carrez's solution: Take both in a terminal NFA.
Is this the best one can do? No! One can use linear (similarity) transformations.

SLIDE 9

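Observability Reduction HSCC'09

(Arnold, Dicky and Nivat's Example)

Define linear transformation $\bar{x}^t = x^t T$:
T: 0/1 matrix with rows and columns indexed by x1, ..., x5 (extracted row entries: x1: 1; x2: 0 1 1; x3: 0 1 1; x4: 0 1 1; x5: 0 1; column positions lost)

$\bar{A} = [AT]_T$  ($T^{-1}AT$)
$\bar{x}_0^t = x_0^t T$
$\bar{C} = [C]_T$  ($T^{-1}C$)

[Figure: automaton $A$ with states x1, x2, x3, x4, x5 and transformed automaton $\bar{A}$ with states x1, x23, x24, x34, x5; transitions labelled a, b, c.]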

SLIDE 10

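Reachability Reduction HSCC'09

(Arnold, Dicky and Nivat's Example)

Define linear transformation $\bar{x}^t = x^t T$:
T: 0/1 matrix with rows and columns indexed by x1, ..., x5 (extracted row entries: x1: 1; x2: 0 1 1; x3: 0 1 1; x4: 0 1 1; x5: 0 1; column positions lost)

$\bar{A}^t = [A^t T]_T$  ($T^{-1}A^t T$)
$\bar{x}_0^t = x_0^t T$
$\bar{C} = [C]_T$  ($T^{-1}C$)

[Figure: automaton $A$ with states x1, x2, x3, x4, x5 and transformed automaton $\bar{A}$ with states x1, x23, x24, x34, x5; transitions labelled a, b, c.]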

SLIDE 11

SLIDE 12

SLIDE 13

SLIDE 14

Observability and minimization

SLIDE 16

Finite Automata as Linear Systems

• Consider a finite automaton $M = (X, \Sigma, \delta, S, F)$ with:
finite set of states $X$, finite input alphabet $\Sigma$,
transition relation $\delta \subseteq X \times \Sigma \times X$,
starting and final sets of states $S, F \subseteq X$

• Let $X$ denote row and column indices. Then:
$\delta$ defines a matrix $A$,
$S$ and $F$ define corresponding vectors
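As a minimal sketch of this construction, the snippet below builds the matrix and the two vectors for a small automaton. The state names, the transitions, and the choice of S and F are illustrative assumptions, and the helper names `transition_matrix` and `indicator` are hypothetical. Entries of the matrix are sets of letters, with the empty set playing the role of 0.

```python
# Hypothetical three-state automaton over the alphabet {a, b} (an assumption).
states = ["x1", "x2", "x3"]
delta = {("x1", "a", "x2"), ("x1", "b", "x3"),
         ("x2", "a", "x2"), ("x3", "b", "x3")}
S = {"x1"}          # assumed start states
F = {"x2", "x3"}    # assumed final states

def transition_matrix(states, delta):
    """A[i][j] is the set of letters labelling transitions from state i to state j."""
    index = {x: i for i, x in enumerate(states)}
    A = [[set() for _ in states] for _ in states]
    for (src, letter, dst) in delta:
        A[index[src]][index[dst]].add(letter)
    return A

def indicator(states, subset):
    """0/1 vector marking which states belong to the given subset."""
    return [1 if x in subset else 0 for x in states]

A = transition_matrix(states, delta)
x0 = indicator(states, S)
C = indicator(states, F)
```

Here the states index both the rows and the columns of `A`, as on the slide.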

SLIDE 18

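Finite Automata as Linear Systems

• Now define the linear system $L_M = [S, A, C]$:
$x^t(n+1) = x^t(n)A, \quad x_0 = S$
$y^t(n) = x^t(n)C, \quad C = F$

• Example: consider the following automaton:
[Figure: automaton with states x1, x2, x3 and transitions labelled a, a, b, b.]
$A = \begin{pmatrix} 0 & a & b \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}$
$x_0$, $C$: [vector entries not recoverable from the source]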

SLIDE 19

Semimodule of Languages

• $\wp(\Sigma^*)$ is an idempotent semiring (quantale):
$(\wp(\Sigma^*), +, 0)$ is a commutative idempotent monoid (union),
$(\wp(\Sigma^*), \cdot, 1)$ is a monoid (concatenation),
multiplication distributes over addition,
0 is an annihilator: $0 \cdot a = 0$
• $(\wp(\Sigma^*))^n$ is a semimodule over scalars in $\wp(\Sigma^*)$:
r(x+y) = rx + ry, (r+s)x = rx + sx, (rs)x = r(sx),
1x = x, 0x = 0
• Note: No additive and multiplicative inverses!

SLIDE 22

Observability

• Let $L = [S, A, C]$. Observe its output up to $n-1$:
$[y(0) \;\; y(1) \;\; \ldots \;\; y(n-1)] = x_0^t\,[C \;\; AC \;\; \ldots \;\; A^{n-1}C] = x_0^t O$   (1)
• If L operates on a vector space:
L is observable if $x_0$ is uniquely determined by (1),
Observability matrix O has rank n,
n outputs suffice: $A^n C = s_1 A^{n-1}C + s_2 A^{n-2}C + \ldots + s_n C$ (Cayley-Hamilton Theorem)
• If L operates on a semimodule:
L is observable if $x_0$ is uniquely determined by (1)
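For the vector-space case, a small numeric sketch may help. It builds the observability matrix $O = [C \;\; AC]$ for an illustrative two-state system, checks that it has rank 2, and checks the Cayley-Hamilton consequence $A^2 C = s_1 AC + s_2 C$. The matrices are assumptions chosen for the example, and the helpers `matmul` and `rank` are hypothetical names.

```python
from fractions import Fraction

def matmul(P, Q):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Illustrative 2-state system (values are assumptions, not from the slides).
A = [[0, 1], [-2, -3]]
C = [[1], [0]]

AC = matmul(A, C)
O = [C[i] + AC[i] for i in range(2)]          # observability matrix [C AC]

# Cayley-Hamilton for n = 2: A^2 = s1*A + s2*I with s1 = trace(A), s2 = -det(A).
s1 = A[0][0] + A[1][1]
s2 = -(A[0][0] * A[1][1] - A[0][1] * A[1][0])
A2C = matmul(A, AC)
predicted = [[s1 * AC[i][0] + s2 * C[i][0]] for i in range(2)]
```

Since `A2C` equals `predicted`, the third output adds no new column to O, which is why n outputs suffice in this setting.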

SLIDE 24

The Cayley-Hamilton Theorem

( $A^n = s_1 A^{n-1} + s_2 A^{n-2} + \ldots + s_n I$ )

SLIDE 25

Permutations

• Permutations are bijections of {1,...,n}:
Example: $\pi$ = {(1,2),(2,3),(3,4),(4,1),(5,7),(6,6),(7,5)}
• The graph $G(\pi)$ of a permutation $\pi$:
$G(\pi)$ decomposes into elementary cycles
• The sign of a permutation:
Pos/Neg: even/odd number of even-length cycles,
$P_n^{+}$ / $P_n^{-}$: all positive/negative permutations.
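The cycle decomposition and the sign rule can be sketched in a few lines. The permutation is the example from the slide, stored as a dictionary; the function names `cycles` and `sign` are hypothetical.

```python
def cycles(perm):
    """Decompose a permutation, given as a dict i -> perm[i], into its cycles."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        result.append(cycle)
    return result

def sign(perm):
    """+1 for an even number of even-length cycles, -1 for an odd number."""
    even_length = sum(1 for c in cycles(perm) if len(c) % 2 == 0)
    return 1 if even_length % 2 == 0 else -1

pi = dict([(1, 2), (2, 3), (3, 4), (4, 1), (5, 7), (6, 6), (7, 5)])
pi_cycles = cycles(pi)
pi_sign = sign(pi)
```

For this example the cycles are (1 2 3 4), (5 7) and (6). Two of them have even length, so the permutation is positive.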

SLIDE 28

Eigenvalues in Vector Spaces

• The eigenvalues of a square matrix A:
Eigenvector equation: $x^t A = x^t s$ (s: eigenvalue, x: eigenvector)
• The characteristic equation of A:
The characteristic polynomial: $cp_A(s) = |sI - A|$
The characteristic equation: $cp_A(s) = 0$
• The determinant of A:
The determinant: $|A| = \sum_{\pi \in P_n^{+}} \pi(A) - \sum_{\pi \in P_n^{-}} \pi(A)$,
Permutation application: $\pi(A) = \prod_{i=1}^{n} A(i, \pi(i))$
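The permutation form of the determinant can be checked on a small matrix. The sketch below keeps the positive and negative sums separate, which is the form that matters once subtraction is unavailable. The numeric entries are arbitrary assumptions, and the function names are hypothetical.

```python
from itertools import permutations

def sign(p):
    """+1 / -1 for an even / odd number of even-length cycles (p is a 0-based tuple)."""
    seen, even_cycles = set(), 0
    for start in range(len(p)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        if length > 0 and length % 2 == 0:
            even_cycles += 1
    return 1 if even_cycles % 2 == 0 else -1

def weight(A, p):
    """Product of the entries A[i][p[i]] selected by the permutation p."""
    w = 1
    for i, j in enumerate(p):
        w *= A[i][j]
    return w

def determinant_parts(A):
    """Return the sums of weights over positive and over negative permutations."""
    n = len(A)
    pos = sum(weight(A, p) for p in permutations(range(n)) if sign(p) == 1)
    neg = sum(weight(A, p) for p in permutations(range(n)) if sign(p) == -1)
    return pos, neg

A = [[0, 2, 0], [3, 0, 5], [7, 0, 11]]   # arbitrary illustrative values
pos, neg = determinant_parts(A)
det = pos - neg
```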

SLIDE 31

Matrix-Eigenspaces in Vector Spaces

• The eigenvalues of a square matrix A:
Eigenvector equation: $x^t(sI - A) = 0$
• The characteristic equation of A:
The characteristic polynomial: $cp_A(s) = |sI - A|$
The characteristic equation: $cp_A(s) = 0$
• The determinant of A:
The determinant: $|A| = \sum_{\pi \in P_n^{+}} \pi(A) - \sum_{\pi \in P_n^{-}} \pi(A)$,
Weight of a permutation: $\pi(A) = \prod_{i=1}^{n} A(i, \pi(i))$

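SLIDE 32

The Cayley-Hamilton Theorem (CHT)

• A satisfies its characteristic equation: $cp_A(A) = 0$

$A = \begin{pmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}$, $sI - A = \begin{pmatrix} s & -a_{12} & 0 \\ -a_{21} & s & -a_{23} \\ -a_{31} & 0 & s - a_{33} \end{pmatrix}$

$|sI - A| = s^3 - a_{33}s^2 - a_{12}a_{21}s + a_{12}a_{21}a_{33} - a_{12}a_{23}a_{31} = 0$
$s^3 + a_{12}a_{21}a_{33} = a_{33}s^2 + a_{12}a_{21}s + a_{12}a_{23}a_{31}$
$A^3 + a_{12}a_{21}a_{33}I = a_{33}A^2 + a_{12}a_{21}A + a_{12}a_{23}a_{31}I$

[Figure: graph of A on nodes 1, 2, 3 with edges a12, a21, a23, a31, a33.]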
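The subtraction-free form of the identity can be checked numerically. The sketch below plugs arbitrary integer values (an assumption made only for illustration) into the 3x3 matrix of this slide and compares both sides; the helper names are hypothetical.

```python
def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(c, P):
    return [[c * p for p in row] for row in P]

a12, a21, a23, a31, a33 = 2, 3, 5, 7, 11     # arbitrary test values
A = [[0, a12, 0], [a21, 0, a23], [a31, 0, a33]]
I = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

A2 = matmul(A, A)
A3 = matmul(A2, A)

# A^3 + a12 a21 a33 I  =  a33 A^2 + a12 a21 A + a12 a23 a31 I
lhs = add(A3, scale(a12 * a21 * a33, I))
rhs = add(add(scale(a33, A2), scale(a12 * a21, A)), scale(a12 * a23 * a31, I))
```

Both sides use only addition and multiplication, so the same check can be repeated in structures without subtraction.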

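SLIDE 36

The Cayley-Hamilton Theorem (CHT)

• A satisfies its characteristic equation: $cp_A(A) = 0$

$A = \begin{pmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}$, $sI - A = \begin{pmatrix} s & -a_{12} & 0 \\ -a_{21} & s & -a_{23} \\ -a_{31} & 0 & s - a_{33} \end{pmatrix}$

$|sI - A| = s^3 - a_{33}s^2 - a_{12}a_{21}s + a_{12}a_{21}a_{33} - a_{12}a_{23}a_{31} = 0$
$s^3 + a_{12}a_{21}a_{33} = a_{33}s^2 + a_{12}a_{21}s + a_{12}a_{23}a_{31}$
$A^3 + a_{12}a_{21}a_{33}I = a_{33}A^2 + a_{12}a_{21}A + a_{12}a_{23}a_{31}I$

[Figure: graph of A on nodes 1, 2, 3 with edges a12, a21, a23, a31, a33; the coefficient factors a33, a12a21 and a12a23a31 are each labelled "cycle".]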

SLIDE 37

The Cayley-Hamilton Theorem (CHT)

• A satisfies its characteristic equation: $cp_A(A) = 0$
• Implicit assumptions in CHT:
Subtraction is available
Multiplication is commutative
• Does CHT hold in semirings?
Subtraction not indispensable (Rutherford, Straubing)
Commutativity still problematic
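As a hedged illustration of the claim that subtraction is not indispensable, the sketch below checks the subtraction-free identity $A^3 + a_{12}a_{21}a_{33}I = a_{33}A^2 + a_{12}a_{21}A + a_{12}a_{23}a_{31}I$ in the (max,+) semiring, a commutative semiring without subtraction. The choice of (max,+), the numeric entries, and the helper names are assumptions made for this example. It is a single numeric check, not a proof.

```python
NEG_INF = float("-inf")      # the zero of the (max,+) semiring; its one is 0

def mp_matmul(P, Q):
    """Matrix product with + replaced by max and * replaced by +."""
    n = len(P)
    return [[max(P[i][k] + Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mp_add(P, Q):
    """Semiring addition of matrices: entrywise max."""
    return [[max(p, q) for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def mp_scale(c, P):
    """Semiring scalar multiplication: add c to every entry."""
    return [[c + p for p in row] for row in P]

a12, a21, a23, a31, a33 = 2, 3, 5, 7, 11     # arbitrary test values
A = [[NEG_INF, a12, NEG_INF], [a21, NEG_INF, a23], [a31, NEG_INF, a33]]
I = [[0 if i == j else NEG_INF for j in range(3)] for i in range(3)]

A2 = mp_matmul(A, A)
A3 = mp_matmul(A2, A)

# Semiring products of scalars become ordinary sums in (max,+).
lhs = mp_add(A3, mp_scale(a12 + a21 + a33, I))
rhs = mp_add(mp_add(mp_scale(a33, A2), mp_scale(a12 + a21, A)),
             mp_scale(a12 + a23 + a31, I))
```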

SLIDE 39

CHT in Commutative Semirings

(Straubing's Proof)

• Lift original semiring to the semiring of paths:
Matrix A is lifted to a matrix $G_A$ of paths

$A = \begin{pmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}$, $G_A = \begin{pmatrix} 0 & (1,2) & 0 \\ (2,1) & 0 & (2,3) \\ (3,1) & 0 & (3,3) \end{pmatrix}$

SLIDE 40

CHT in Commutative Semirings

(Straubing's Proof)

• Lift original semiring to the semiring of paths:
Matrix A is lifted to a matrix $G_A$ of paths
Permutation cycles $\pi$ are lifted to cyclic paths $\bar\pi$
Example: $\pi = \{(1,2),(2,1)\}$, $\bar\pi = (1,2)(2,1)$

SLIDE 41

CHT in Commutative Semirings

(Straubing's Proof)

• Lift original semiring to the semiring of paths:
Matrix A is lifted to a matrix $G_A$ of paths
Permutation cycles $\pi$ are lifted to cyclic paths $\bar\pi$
• Prove CHT in the semiring of paths:
$\sum_{q=0}^{n} G_A^{n-q} P_q^{+} = \sum_{q=0}^{n} G_A^{n-q} P_q^{-}$   (CHT holds?)

SLIDE 42

CHT in Commutative Semirings

(Straubing's Proof)

• Lift original semiring to the semiring of paths:
Matrix A is lifted to a matrix $G_A$ of paths
Permutation cycles $\pi$ are lifted to cyclic paths $\bar\pi$
• Prove CHT in the semiring of paths:
Show bijection between pos/neg products
Example (n = 3): the product (3,3)(1,2)(2,1) occurs on both sides, once in the $G_A^{0} P_3$ term and once in the $G_A^{2} P_1$ term.
[Figure: two copies of the graph on nodes 1, 2, 3 with edges (1,2), (2,1), (2,3), (3,1), (3,3).]

SLIDE 43

CHT in Commutative Semirings

(Straubing's Proof)

• Lift original semiring to the semiring of paths:
Matrix A is lifted to a matrix $G_A$ of paths
Permutation cycles $\pi$ are lifted to cyclic paths $\bar\pi$
• Prove CHT in the semiring of paths:
Show bijection between pos/neg products
• Port results back to the original semiring:
Apply products: $\bar\pi(A)$
Path application: $(\sigma_1 \ldots \sigma_n)(A) = A(\sigma_1) \ldots A(\sigma_n)$
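The lifting and the path application can be sketched concretely. In the snippet below each entry of $G_A$ is a list of paths, a path is a tuple of edges (i, j), and applying a path to A multiplies the corresponding entries. The numeric values of A and the helper names `path_product` and `apply_path` are assumptions for illustration.

```python
A = [[0, 2, 0], [3, 0, 5], [7, 0, 11]]   # arbitrary illustrative values
n = len(A)

# G[i][j]: list of paths from i+1 to j+1; a single one-edge path where A[i][j] != 0.
G = [[[((i + 1, j + 1),)] if A[i][j] else [] for j in range(n)] for i in range(n)]

def path_product(P, Q):
    """Matrix product in the semiring of paths: sum = list union, product = concatenation."""
    return [[[p + q for k in range(n) for p in P[i][k] for q in Q[k][j]]
             for j in range(n)] for i in range(n)]

def apply_path(path, A):
    """(s1 ... sk)(A) = A(s1) ... A(sk) for a path given as a tuple of edges."""
    w = 1
    for (i, j) in path:
        w *= A[i - 1][j - 1]
    return w

G2 = path_product(G, G)

# Porting back: applying every path in G^2 to A and summing gives A^2.
A2 = [[sum(apply_path(p, A) for p in G2[i][j]) for j in range(n)] for i in range(n)]
```

For instance, the (1,1) entry of `G2` holds the single cyclic path (1,2)(2,1), whose weight is a12 a21.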

SLIDE 45

CHT in Idempotent Semirings

• Lift original semiring to the semiring of paths:
Matrix A: order in paths $\sigma$ important
Permutation cycles: rotations are distinct
Example: $\pi = \{(1,2),(2,1)\}$, $\bar\pi$ has diagonal entries (1,2)(2,1) and (2,1)(1,2), and 0 elsewhere

SLIDE 47

CHT in Idempotent Semirings

• Lift original semiring to the semiring of paths:
Matrix A: order in paths $\sigma$ important
Permutation cycles: rotations are distinct
• Prove CHT in the semiring of paths:
Products $\bar\pi G^{n-|\pi|}$: cycles to be properly inserted
$\bar\pi ⧢ G^{n-|\pi|} = \bar\pi G^{n-|\pi|} + G \bar\pi G^{n-|\pi|-1} + \ldots + G^{n-|\pi|} \bar\pi$

SLIDE 48

CHT in Idempotent Semirings

• Lift original semiring to the semiring of paths:
Matrix A: order in paths $\sigma$ important
Permutation cycles: rotations are distinct
• Prove CHT in the semiring of paths:
Products $\bar\pi G^{n-|\pi|}$: cycles to be properly inserted
• Port results back to the original semiring:
Apply products: $(\bar\pi ⧢ G^{n-|\pi|})(A)$

SLIDE 49

CHT in Idempotent Semirings

• Theorem: $G_A^{n} = \sum_{q=1}^{n} \sum_{\pi \in P_q} \bar\pi ⧢ G_A^{n-|\pi|}$

Proof:

LHS ⊆ RHS: Let $\sigma$ ∈ LHS
Pigeonhole: $\sigma$ has at least one cycle $\pi$ in s
Structural: $\pi$ is a simple cycle of length k
Remove $\pi$ in $\sigma$: the remaining path is in $G^{n-|\pi|}$
Shuffle-product: $\bar\pi ⧢ G^{n-|\pi|}$ reinserts $\pi$

RHS ⊆ LHS: Let $\sigma$ ∈ RHS
No wrong path: The shuffle is sound
Idempotence: Takes care of multiple copies
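The pigeonhole step of the first direction can be sketched as a small routine. A path with n edges over n states visits n + 1 states, so some state repeats; cutting at the first repetition yields a simple cycle and a shorter remaining path. Paths are lists of edges (i, j), and the function name `split_simple_cycle` is hypothetical.

```python
def split_simple_cycle(path):
    """Split a non-empty path into (prefix, simple cycle, suffix).

    Returns None if no state is visited twice."""
    states = [path[0][0]] + [dst for (_, dst) in path]
    first_seen = {}
    for pos, s in enumerate(states):
        if s in first_seen:
            start = first_seen[s]
            # States strictly between start and pos are all distinct,
            # because pos is the first position at which any state repeats.
            return path[:start], path[start:pos], path[pos:]
        first_seen[s] = pos
    return None

# A path of length 3 over 3 states must contain a cycle.
prefix, cycle, suffix = split_simple_cycle([(1, 2), (2, 1), (1, 2)])
shorter = prefix + suffix     # path of length n - |cycle|
```

Reinserting `cycle` into `shorter` at the position where it was cut gives back the original path, which is the role the shuffle product plays in the proof.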

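SLIDE 52

CHT in Idempotent Semirings

• Define: [case definition of a diagonal entry (i,i): one value if a given (i,i) entry is 0, and 0 otherwise; symbols lost in extraction]
• Theorem: classic CHT can be derived by using:
a split of $\sum \bar\pi ⧢ G^{n-|\pi|}$ into two sums of the same form [distinguishing symbols lost in extraction],
application of CHT to the $G^{n-|\pi|}$ factors of both sums
• Matrix CHT: can be regarded as a constructive version of the pumping lemma.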

SLIDE 55

SLIDE 56

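Finite Automata as Linear Systems

• Now define the linear system $L_M = [S, A, C]$:
$x^t(n+1) = x^t(n)A, \quad x_0 = S(\varepsilon)$
$y^t(n) = x^t(n)C, \quad C = F(\varepsilon)$
• Example: consider the following automaton:
[Figure: automaton (labelled L1) with states x1, x2, x3 and transitions labelled a, a, b, b.]
$A(a) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $A(b) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, $x_0(\varepsilon)$, $C(\varepsilon)$: [vector entries garbled in the source]
A = A(a)a + A(b)b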
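The decomposition into letter matrices suggests a direct acceptance test: multiply the start vector by A(w1), ..., A(wk) and then by C over the Boolean semiring. In the sketch below the letter matrices are read off the three-state example with $A = \begin{pmatrix} 0 & a & b \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}$, while the start vector and the final vector are assumptions, and the helper names are hypothetical.

```python
A_letter = {
    "a": [[0, 1, 0], [0, 1, 0], [0, 0, 0]],
    "b": [[0, 0, 1], [0, 0, 0], [0, 0, 1]],
}
x0 = [1, 0, 0]   # assumed: x1 is the only start state
C = [0, 1, 1]    # assumed: x2 and x3 are the final states

def step(x, M):
    """Row vector times matrix over the Boolean semiring (or, and)."""
    n = len(M)
    return [int(any(x[i] and M[i][j] for i in range(n))) for j in range(n)]

def accepts(word):
    """True if x0^t A(w1) ... A(wk) C is 1, i.e. the automaton accepts the word."""
    x = x0
    for letter in word:
        x = step(x, A_letter[letter])
    return any(xi and ci for xi, ci in zip(x, C))
```

Under these assumptions the automaton accepts the non-empty words that consist of a single repeated letter.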

SLIDE 57

Observability

Let L = [S,A,C] be an n-state automaton. Its output:
$[y(0) \;\; y(1) \;\; \ldots \;\; y(n-1)] = x_0^t\,[C \;\; AC \;\; \ldots \;\; A^{n-1}C] = x_0^t O$   (1)
L is observable if $x_0$ is uniquely determined by (1).
Example: observability matrix O of the three-state automaton with states x1, x2, x3 [entries over a, b and ε garbled in the source].
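A hedged sketch of how such an observability matrix can be computed over the semiring of languages: entries are finite sets of words, addition is union, and multiplication is concatenation. The matrix A follows the three-state example $A = \begin{pmatrix} 0 & a & b \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}$; the vectors C and x0 are assumptions, since their entries are not legible in the slides, and the helper names are hypothetical.

```python
def concat(L1, L2):
    """Concatenation of two finite languages."""
    return {u + v for u in L1 for v in L2}

def mat_vec(A, v):
    """(A v)[i] = union over j of A[i][j] . v[j]."""
    return [set().union(*[concat(A[i][j], v[j]) for j in range(len(v))])
            for i in range(len(A))]

EMPTY, EPS = set(), {""}          # semiring zero and one

A = [[EMPTY, {"a"}, {"b"}],
     [EMPTY, {"a"}, EMPTY],
     [EMPTY, EMPTY, {"b"}]]
C = [EMPTY, EPS, EPS]             # assumed: x2 and x3 are final
x0 = [EPS, EMPTY, EMPTY]          # assumed: x1 is the start state

# Columns C, AC, A^2 C of the observability matrix.
columns = [C]
for _ in range(len(A) - 1):
    columns.append(mat_vec(A, columns[-1]))

# Row i of O lists, for state x_{i+1}, the words of length 0, 1, 2 it accepts.
O = [[col[i] for col in columns] for i in range(len(A))]

# Output y(k) = x0^t O restricted to column k.
y = [set().union(*[concat(x0[i], O[i][k]) for i in range(len(A))])
     for k in range(len(columns))]
```

With these assumptions the three rows of `O` differ, so the three states produce distinct outputs.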