The Cayley-Hamilton Theorem For Finite Automata
Radu Grosu, SUNY at Stony Brook
How did I get interested in this topic?
- Hybrid Systems Computation and Control:
- convergence between control and automata theory.
- Hybrid Automata: an outcome of this convergence
- modeling formalism for systems exhibiting both
discrete and continuous behavior,
- successfully used to model and analyze embedded
and biological systems.
Convergence of Theories
Lack of Common Foundation for HA
- Mode dynamics:
- Linear system (LS)
- Mode switching:
- Finite automaton (FA)
- Different techniques:
- LS reduction
- FA minimization
[Figure: hybrid automaton with a mode labeled "Stimulated" and linear mode dynamics of the form $\dot{x} = Ax + Bu$; plot of voltage (mV) against time (ms)]
- LS & FA taught separately: No common foundation!
- Finite automata can be conveniently regarded as
time invariant linear systems over semimodules:
- linear systems techniques generalize to automata
- Examples of such techniques include:
- linear transformations of automata,
- minimization and determinization of automata as
observability and reachability reductions,
- Z-transform of automata to compute associated
regular expression through Gaussian elimination.
Main Conjecture
Minimal DFA are Not Minimal NFA
(Arnold, Dicky and Nivat’s Example) L = a (b* + c*)
[Figure: a DFA for L with states x1, ..., x4 and an NFA for L with states x1, x2, x3]
Minimal NFA: How are they Related?
(Arnold, Dicky and Nivat’s Example) L = ab+ac + ba+bc + ca+cb
[Figure: two NFAs accepting L, each over states x1, ..., x5]
- No homomorphism of either automaton onto the other.
- Carrez’s solution: Take both in a terminal NFA.
- Is this the best one can do? No! One can use linear (similarity) transformations.
Observability Reduction HSCC’09
(Arnold, Dicky and Nivat’s Example)
Define linear transformation $\bar{x}^t = x^t T$:
[Table: 0/1 matrix T, indexed by x1, ..., x5]
$\bar{A} = [AT]_T$ $(T^{-1}AT)$
$\bar{x}_0^t = x_0^t T$
$\bar{C} = [C]_T$ $(T^{-1}C)$
[Figure: the automaton over states x1, ..., x5 and the automaton over states x1, x23, x24, x34, x5, related by T]
Reachability Reduction HSCC’09
(Arnold, Dicky and Nivat’s Example)
Define linear transformation $\bar{x}^t = x^t T$:
[Table: 0/1 matrix T, indexed by x1, ..., x5]
$\bar{A}^t = [A^t T]_T$ $(T^{-1}A^tT)$
$\bar{x}_0^t = x_0^t T$
$\bar{C} = [C]_T$ $(T^{-1}C)$
[Figure: the automaton over states x1, ..., x5 and the automaton over states x1, x23, x24, x34, x5, related by T]
Observability and minimization
Finite Automata as Linear Systems
Consider a finite automaton $M = (X, \Sigma, \delta, S, F)$ with:
- finite set of states $X$, finite input alphabet $\Sigma$,
- transition relation $\delta \subseteq X \times \Sigma \times X$,
- starting and final sets of states $S, F \subseteq X$
Let $X$ denote row and column indices. Then:
- $\delta$ defines a matrix $A$,
- $S$ and $F$ define corresponding vectors
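The construction above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `automaton_to_linear_system` is hypothetical, matrix entries are represented as sets of input symbols (the empty set playing the role of 0), and the start and final sets used in the example are assumptions chosen for illustration, not values taken from the slides.

```python
def automaton_to_linear_system(states, delta, start, final):
    """Build (x0, A, C) from an automaton (hypothetical helper).

    A[i][j] is the set of symbols labeling transitions from state i to
    state j; x0 and C are 0/1 indicator vectors of the start and final sets.
    """
    index = {q: i for i, q in enumerate(states)}
    n = len(states)
    A = [[set() for _ in range(n)] for _ in range(n)]
    for (p, sym, q) in delta:
        A[index[p]][index[q]].add(sym)
    x0 = [1 if q in start else 0 for q in states]
    C = [1 if q in final else 0 for q in states]
    return x0, A, C


states = ["x1", "x2", "x3"]
delta = {("x1", "a", "x2"), ("x1", "b", "x3"),
         ("x2", "a", "x2"), ("x3", "b", "x3")}
# Assumed start/final sets, for illustration only.
x0, A, C = automaton_to_linear_system(states, delta,
                                      start={"x1"}, final={"x2", "x3"})
```

With these transitions, `A` has the symbol `a` in positions (1,2) and (2,2) and the symbol `b` in positions (1,3) and (3,3), and is empty elsewhere.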
Finite Automata as Linear Systems
Now define the linear system $L_M = [S, A, C]$:
$x^t(n+1) = x^t(n)A$, $x_0 = S$
$y^t(n) = x^t(n)C$, $C = F$
Example: consider the following automaton:
[Figure: automaton with states x1, x2, x3 and transitions labeled a, b]
$A = \begin{pmatrix} 0 & a & b \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}$
x0 =
C =
Semimodule of Languages
$\wp(\Sigma^*)$ is an idempotent semiring (quantale):
- $(\wp(\Sigma^*), +, 0)$ is a commutative idempotent monoid (union),
- $(\wp(\Sigma^*), \cdot, 1)$ is a monoid (concatenation),
- multiplication distributes over addition,
- 0 is an annihilator: $0 \cdot a = 0$
$(\wp(\Sigma^*))^n$ is a semimodule over scalars in $\wp(\Sigma^*)$:
- $r(x+y) = rx + ry$, $(r+s)x = rx + sx$, $(rs)x = r(sx)$,
- $1x = x$, $0x = 0$
Note: No additive and multiplicative inverses!
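The semiring laws listed above can be checked on small finite languages. The sketch below is an illustration only: it models languages as Python `frozenset`s of strings (so it covers finite languages, not all of $\wp(\Sigma^*)$), and the names `plus`, `times`, `ZERO` and `ONE` are hypothetical.

```python
def plus(l1, l2):
    """Semiring addition: union of languages."""
    return l1 | l2


def times(l1, l2):
    """Semiring multiplication: concatenation of languages."""
    return frozenset(u + v for u in l1 for v in l2)


ZERO = frozenset()        # empty language, additive unit
ONE = frozenset({""})     # language containing only the empty word

a = frozenset({"a"})
b = frozenset({"b"})
c = frozenset({"c"})

idempotent = plus(a, a) == a
annihilates = times(ZERO, a) == ZERO and times(a, ZERO) == ZERO
unit = times(ONE, a) == a and times(a, ONE) == a
distributes = times(a, plus(b, c)) == plus(times(a, b), times(a, c))
commutative = times(a, b) == times(b, a)
```

Here `commutative` evaluates to `False`, since `ab` and `ba` are different words; this is the noncommutativity that matters later for the Cayley-Hamilton theorem.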
Observability
Let $L = [S,A,C]$. Observe its output up to $n-1$:
$[y(0)\ y(1)\ \ldots\ y(n-1)] = x_0^t\,[C\ \ AC\ \ \ldots\ \ A^{n-1}C] = x_0^t O$ (1)
If L operates on a vector space:
- L is observable if $x_0$ is uniquely determined by (1),
- Observability matrix $O$ has rank $n$,
- n outputs suffice: $A^nC = s_1A^{n-1}C + s_2A^{n-2}C + \ldots + s_nC$ (Cayley-Hamilton Theorem)
If L operates on a semimodule:
- L is observable if $x_0$ is uniquely determined by (1)
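For the vector-space case, the rank condition can be tried out directly. The following is a minimal sketch over the rationals, assuming a small hand-picked 2-state system; the helper names `observability_matrix` and `rank` and the example matrices are illustrative assumptions, not taken from the slides.

```python
from fractions import Fraction


def mat_vec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def observability_matrix(A, C):
    """Return O = [C, AC, ..., A^(n-1) C] as an n x n list of rows."""
    n = len(A)
    cols, v = [], list(C)
    for _ in range(n):
        cols.append(v)
        v = mat_vec(A, v)
    # Transpose so that column k of the result holds A^k C.
    return [[cols[k][i] for k in range(n)] for i in range(n)]


def rank(M):
    """Rank by Gauss-Jordan elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r


A = [[1, 1], [0, 1]]
observable = rank(observability_matrix(A, [0, 1])) == len(A)
unobservable = rank(observability_matrix(A, [1, 0])) < len(A)
```

With `C = [0, 1]` the matrix O has full rank, so $x_0$ is determined by the first n outputs; with `C = [1, 0]` the second row of O is zero and the second state component cannot be recovered.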
The Cayley-Hamilton Theorem
$( A^n = s_1A^{n-1} + s_2A^{n-2} + \ldots + s_nI )$
Permutations
Permutations are bijections of $\{1,\ldots,n\}$:
- Example: $\pi = \{(1,2),(2,3),(3,4),(4,1),(5,7),(6,6),(7,5)\}$
The graph $G(\pi)$ of a permutation $\pi$:
- $G(\pi)$ decomposes into elementary cycles
[Figure: graph $G(\pi)$ of the example permutation on nodes 1, ..., 7]
The sign of a permutation $\pi$:
- Pos/Neg: even/odd number of even-length cycles,
- $P_n^+$ / $P_n^-$: all positive/negative permutations.
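The cycle decomposition and the sign rule can be computed for the example permutation with a short sketch. The function names `cycles` and `is_positive` are hypothetical; the permutation is the one given on the slide.

```python
def cycles(perm):
    """Decompose a permutation (dict i -> perm(i)) into its cycles."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = perm[i]
        result.append(cyc)
    return result


def is_positive(perm):
    """Positive iff the number of even-length cycles is even."""
    even_length = sum(1 for c in cycles(perm) if len(c) % 2 == 0)
    return even_length % 2 == 0


pi = dict([(1, 2), (2, 3), (3, 4), (4, 1), (5, 7), (6, 6), (7, 5)])
pi_cycles = cycles(pi)
pi_positive = is_positive(pi)
```

The example decomposes into the cycles (1 2 3 4), (5 7) and (6). Two of them have even length, so the permutation is positive.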
Matrix-Eigenspaces in Vector Spaces
The eigenvalues of a square matrix A:
- Eigenvector equation: $x^tA = x^ts$, i.e. $x^t(sI-A) = 0$ (s: eigenvalue, x: eigenvector)
The characteristic equation of A:
- The characteristic polynomial: $cp_A(s) = |sI-A|$
- The characteristic equation: $cp_A(s) = 0$
The determinant of A:
- The determinant: $|A| = \sum_{\pi \in P_n^+} \pi(A) - \sum_{\pi \in P_n^-} \pi(A)$,
- Weight of a permutation: $\pi(A) = \prod_{i=1}^{n} A(i,\pi(i))$
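The permutation form of the determinant can be evaluated literally for small matrices. This is a brute-force sketch for illustration (it enumerates all n! permutations); the function names are hypothetical and indices are 0-based.

```python
from itertools import permutations


def is_positive(p):
    """p[i] is the image of i; positive iff an even number of even-length cycles."""
    seen, even_cycles = set(), 0
    for start in range(len(p)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        if length % 2 == 0:
            even_cycles += 1
    return even_cycles % 2 == 0


def weight(A, p):
    """Weight of a permutation: product of A[i][p[i]] over all i."""
    w = 1
    for i, j in enumerate(p):
        w *= A[i][j]
    return w


def det(A):
    """Sum of positive weights minus sum of negative weights."""
    perms = list(permutations(range(len(A))))
    pos = sum(weight(A, p) for p in perms if is_positive(p))
    neg = sum(weight(A, p) for p in perms if not is_positive(p))
    return pos - neg
```

Keeping the positive and negative sums separate, as in `det`, is what allows the identity to be restated without subtraction in a semiring: the two sums are simply placed on opposite sides of an equation.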
The Cayley-Hamilton Theorem (CHT)
A satisfies its characteristic equation: $cp_A(A) = 0$
$A = \begin{pmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}$, $sI-A = \begin{pmatrix} s & -a_{12} & 0 \\ -a_{21} & s & -a_{23} \\ -a_{31} & 0 & s-a_{33} \end{pmatrix}$
[Figure: graph of A on nodes 1, 2, 3 with edges a12, a21, a23, a31, a33]
$|sI-A| = s^3 - a_{33}s^2 - a_{12}a_{21}s + a_{12}a_{21}a_{33} - a_{12}a_{23}a_{31} = 0$
$s^3 + a_{12}a_{21}a_{33} = a_{33}s^2 + a_{12}a_{21}s + a_{12}a_{23}a_{31}$
$A^3 + a_{12}a_{21}a_{33}I = a_{33}A^2 + a_{12}a_{21}A + a_{12}a_{23}a_{31}I$
- The factors $a_{33}$, $a_{12}a_{21}$ and $a_{12}a_{23}a_{31}$ are cycles in the graph of A.
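The subtraction-free form of the identity for this 3x3 matrix can be checked numerically. The sketch below uses arbitrary integer values for the entries (an assumption for illustration) and plain nested lists for matrices.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def scale(c, X):
    return [[c * x for x in row] for row in X]


# Arbitrary sample values for the nonzero entries (illustration only).
a12, a21, a23, a31, a33 = 2, 3, 5, 7, 11
A = [[0, a12, 0], [a21, 0, a23], [a31, 0, a33]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)

# A^3 + a12 a21 a33 I  =  a33 A^2 + a12 a21 A + a12 a23 a31 I
lhs = mat_add(A3, scale(a12 * a21 * a33, I))
rhs = mat_add(mat_add(scale(a33, A2), scale(a12 * a21, A)),
              scale(a12 * a23 * a31, I))
cht_holds = lhs == rhs
```

Both sides use only addition and multiplication, which is why this form is the natural starting point for asking whether the theorem survives in a semiring.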
The Cayley-Hamilton Theorem (CHT)
A satisfies its characteristic equation: $cp_A(A) = 0$
Implicit assumptions in CHT:
- Subtraction is available
- Multiplication is commutative
Does CHT hold in semirings?
- Subtraction is not indispensable (Rutherford, Straubing)
- Commutativity is still problematic
CHT in Commutative Semirings
(Straubing’s Proof)
Lift original semiring to the semiring of paths:
- Matrix A is lifted to a matrix $G_A$ of paths:
$A = \begin{pmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}$, $G_A = \begin{pmatrix} 0 & (1,2) & 0 \\ (2,1) & 0 & (2,3) \\ (3,1) & 0 & (3,3) \end{pmatrix}$
- Permutation cycles are lifted to cyclic paths: $\pi = \{(1,2),(2,1)\}$ is lifted to $\bar{\pi} = (1,2)(2,1)$
Prove CHT in the semiring of paths:
- $\sum_{q=0}^{n} G_A^{n-q} P_q^+ = \sum_{q=0}^{n} G_A^{n-q} P_q^-$ (CHT holds?)
- Show bijection between pos/neg products
[Figure: graph of A with edges (1,2), (2,1), (2,3), (3,1), (3,3); the product (3,3)(1,2)(2,1) occurs once as a term arising from $P_3$ and once as a term of $G_A^2 P_1$]
Port results back to the original semiring:
- Apply products: $\bar{\pi}(A)$
- Path application: $(\sigma_1 \ldots \sigma_n)(A) = A(\sigma_1) \ldots A(\sigma_n)$
CHT in Idempotent Semirings
Lift original semiring to the semiring of paths:
- Matrix A: order in paths is important
- Permutation cycles: rotations are distinct, e.g. $\pi = \{(1,2),(2,1)\}$ lifts to both $(1,2)(2,1)$ and $(2,1)(1,2)$
Prove CHT in the semiring of paths:
- Products $G^{n-|\pi|} \mathbin{\sqcup\!\sqcup} \bar{\pi}$: cycles to be properly inserted
$G^{n-|\pi|} \mathbin{\sqcup\!\sqcup} \bar{\pi} = \bar{\pi}G^{n-|\pi|} + G\bar{\pi}G^{n-|\pi|-1} + \ldots + G^{n-|\pi|}\bar{\pi}$
Port results back to the original semiring:
- Apply products: $(G^{n-|\pi|} \mathbin{\sqcup\!\sqcup} \bar{\pi})(A)$
CHT in Idempotent Semirings
Theorem: $G_A^n = \sum_{q=1}^{n} \sum_{\pi \in P_q} G_A^{n-|\pi|} \mathbin{\sqcup\!\sqcup} \bar{\pi}$
Proof:
LHS $\subseteq$ RHS: Let $\sigma \in$ LHS
- Pigeonhole: $\sigma$ has at least one cycle
- Structural: $\sigma$ then also has a simple cycle $\bar{\pi}$, of some length k
- Remove $\bar{\pi}$ in $\sigma$: the remaining path is in $G^{n-|\pi|}$
- Shuffle product: $G^{n-|\pi|} \mathbin{\sqcup\!\sqcup} \bar{\pi}$ reinserts $\bar{\pi}$
RHS $\subseteq$ LHS: Let $\sigma \in$ RHS
- No wrong path: the shuffle is sound
- Idempotence: takes care of multiple copies
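The pigeonhole and removal steps of the proof can be illustrated on a concrete path. The sketch below is an illustration under assumptions: a path is represented as the list of states it visits, and the hypothetical helper `split_off_cycle` cuts out the first simple cycle it finds. It is not the shuffle-product construction itself, only the removal step it relies on.

```python
def split_off_cycle(path):
    """Split a path into (simple cycle, remaining path).

    `path` lists the visited states, so a path with n edges has n + 1
    entries. By the pigeonhole principle, a path with n edges over n
    states repeats a state. The segment between the first repeated state
    and its earlier occurrence is a simple cycle.
    """
    last_seen = {}
    for j, q in enumerate(path):
        if q in last_seen:
            i = last_seen[q]
            cycle = path[i:j + 1]            # starts and ends in state q
            remainder = path[:i] + path[j:]  # path with the cycle cut out
            return cycle, remainder
        last_seen[q] = j
    return None, path                        # no state repeats


cycle1, rest1 = split_off_cycle([1, 2, 3, 3])
cycle2, rest2 = split_off_cycle([1, 2, 1, 2])
```

In the first call the self-loop at state 3 is removed and the path 1, 2, 3 remains; in the second the cycle 1, 2, 1 is removed and the path 1, 2 remains. Reinserting the cycle at any position where its start state occurs is the role played by the shuffle product in the theorem.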
CHT in Idempotent Semirings
Define: (i,i) =
if (i,i) = 0 0 if (i,i) =
Theorem: classic CHT can be derived by using:
- Gn-|| = G
n-|| + G n-||
- application of CHT to G
n-|| and G n-||
Matrix CHT: can be regarded as a constructive
version of the pumping lemma.
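Finite Automata as Linear Systems
Now define the linear system $L_M = [S, A, C]$:
$x^t(n+1) = x^t(n)A$, $x_0 = S(\epsilon)$
$y^t(n) = x^t(n)C$, $C = F(\epsilon)$
Example: consider the following automaton $L_1$:
[Figure: automaton $L_1$ with states x1, x2, x3 and transitions labeled a, b, shown with 0/1 coefficient matrices A(a), A(b) and vectors $x_0(\epsilon)$, $C(\epsilon)$]
$A = A(a)a + A(b)b$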
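The decomposition of A into one 0/1 matrix per letter suggests a direct way to run the automaton as a linear system over the Boolean semiring. In the sketch below, the matrices A(a) and A(b) are read off from the matrix A of the earlier example; the start vector `x0` and output vector `C` are assumptions made for illustration, and the names `vec_mat` and `accepts` are hypothetical.

```python
def vec_mat(x, M):
    """Row vector times matrix over the Boolean semiring (or / and)."""
    n = len(x)
    return [int(any(x[i] and M[i][j] for i in range(n)))
            for j in range(len(M[0]))]


def accepts(word, x0, A_of, C):
    """Iterate x^t(n+1) = x^t(n) A(symbol), then output y = x^t C."""
    x = list(x0)
    for sym in word:
        x = vec_mat(x, A_of[sym])
    return any(xi and ci for xi, ci in zip(x, C))


# Coefficient matrices of A = A(a)a + A(b)b for A = [[0,a,b],[0,a,0],[0,0,b]].
A_of = {
    "a": [[0, 1, 0], [0, 1, 0], [0, 0, 0]],
    "b": [[0, 0, 1], [0, 0, 0], [0, 0, 1]],
}
x0 = [1, 0, 0]   # assumed: x1 is the only start state
C = [0, 1, 1]    # assumed: x2 and x3 are final states
```

Under these assumptions `accepts("aa", x0, A_of, C)` is true and `accepts("ab", x0, A_of, C)` is false, since the state reached after `a` has no outgoing `b` transition.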
Observability
Let $L = [S,A,C]$ be an n-state automaton. Its output:
$[y(0)\ y(1)\ \ldots\ y(n-1)] = x_0^t\,[C\ \ AC\ \ \ldots\ \ A^{n-1}C] = x_0^t O$ (1)
L is observable if $x_0$ is uniquely determined by (1).
Example: the observability matrix $O$ of $L_1$
[Matrix: $O = [C\ \ AC\ \ A^2C]$ for $L_1$, with rows x1, x2, x3 and entries built from $\epsilon$, a and b]