A Simple and Efficient Solution of the Identifiability Problem for Hidden Markov Models and Quantum Random Walks

SLIDE 1

Guideline Introduction String Functions Solution of the Identifiability Problem

A Simple and Efficient Solution of the Identifiability Problem for Hidden Markov Models and Quantum Random Walks

Alexander Schönhuth

Pacific Institute for the Mathematical Sciences, School of Computing Science, Simon Fraser University

February 2009

Alexander Schönhuth Identifiability Problem

SLIDE 2

Guideline

1. Introduction
   Identifiability Problem
   Hidden Markov Processes (HMPs)
   Quantum Random Walks (QRWs)

2. String Functions
   Stochastic Processes as String Functions
   Hankel Matrices and Dimension of String Functions
   Observable Operators
   Dimension of HMPs and QRWs
   Minimal Representations

3. Solution of the Identifiability Problem
   Computational Bottleneck
   Key Insight
   Algorithm

SLIDE 3

Identifiability Problem

Situation: $\Phi : \mathcal{P} \to \mathcal{S}$, where $\mathcal{P}$ is a set of parameterizations and $\mathcal{S}$ is the corresponding set of stochastic processes.

Definition: A stochastic process $\Phi(P)$ induced by the parameterization $P \in \mathcal{P}$ is said to be identifiable iff
\[ \Phi^{-1}(\Phi(P)) = \{P\}. \tag{1} \]

SLIDE 5

Hidden Markov Processes (HMPs)

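[Figure: two-state HMM. START enters state 1 with probability 0.8 and state 2 with probability 0.2. State 1 emits a, b, c with probabilities 0.25, 0.5, 0.25; state 2 emits a, b, c with probabilities 0.25, 0.3, 0.45. The arrows between the states carry the transition probabilities listed in M.]

Initial probabilities: $\pi = (0.8, 0.2)^T$

Transition probabilities:
\[ M = (m_{ij} := P(i \to j))_{i,j=1,2} = \begin{pmatrix} 0.3 & 0.7 \\ 0.5 & 0.5 \end{pmatrix} \]

Emission probabilities, e.g. $e_{1b} = 0.5$, $e_{2c} = 0.45$.

Random source $(X_t)$ with values in $\Sigma = \{a, b, c\}$, e.g.:
\[ P_X(X_1 = a, X_2 = b) = \pi_1 e_{1a}(m_{11} e_{1b} + m_{12} e_{2b}) + \pi_2 e_{2a}(m_{21} e_{1b} + m_{22} e_{2b}) \]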
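As a small check of the example probability, the sketch below evaluates P(X1 = a, X2 = b) for the two-state HMM of this slide, once with a forward recursion and once with the two-term formula. The names `prob` and `prob_ab_closed_form` are illustrative, and the full emission table is an assumption read off the figure values (it is consistent with the two emission probabilities quoted on the slide).

```python
from fractions import Fraction as F
from itertools import product

# Parameters from the slide. The full emission table E is an assumption
# read off the figure values (it agrees with e_1b = 0.5 and e_2c = 0.45).
PI = [F(4, 5), F(1, 5)]
M = [[F(3, 10), F(7, 10)],
     [F(1, 2), F(1, 2)]]
E = [{"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},
     {"a": F(1, 4), "b": F(3, 10), "c": F(9, 20)}]


def prob(word):
    """P(X_1 ... X_n = word), computed with the forward recursion."""
    alpha = list(PI)  # alpha[i]: weight of the prefix read so far, in state i
    for sym in word:
        emitted = [alpha[i] * E[i][sym] for i in range(2)]
        # Transition after emitting; rows of M sum to 1, so the total is unchanged.
        alpha = [sum(emitted[i] * M[i][j] for i in range(2)) for j in range(2)]
    return sum(alpha)


def prob_ab_closed_form():
    """The two-term formula for P(X_1 = a, X_2 = b)."""
    first = PI[0] * E[0]["a"] * (M[0][0] * E[0]["b"] + M[0][1] * E[1]["b"])
    second = PI[1] * E[1]["a"] * (M[1][0] * E[0]["b"] + M[1][1] * E[1]["b"])
    return first + second


print(prob("ab"), prob_ab_closed_form())  # 23/250 23/250
```

Exact fractions are used so that the two values can be compared with `==` rather than up to rounding.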

SLIDE 8

Quantum Random Walks (QRWs)

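A QRW $Q = (G, U, \psi_0)$ consists of
- a directed graph $G = (V, E)$,
- a unitary operator $U : \mathbb{C}^{|E|} \to \mathbb{C}^{|E|}$ and
- a wave function $\psi_0 \in \mathbb{C}^{|E|}$.

Classical random source associated with the QRW $Q = (G, U, \psi_0)$:
- Sequences of symbols $v_0 \ldots v_t v_{t+1} \ldots$ from $V$
- Underlying sequences of states $\psi_0 \ldots \psi_t \psi_{t+1} \ldots$ from $\mathbb{C}^{|E|}$

Mechanism:
- Generate symbol $v_t \in V$ with probability
\[ \sum_{e \in E,\, e = (v_t, u)} |(U\psi_t)_e|^2. \]
- Set
\[ \psi_{t+1} = \frac{1}{\sqrt{\sum_{e \in E,\, e = (v_t, x)} |(U\psi_t)_e|^2}} \cdot \sum_{e \in E,\, e = (v_t, u)} (U\psi_t)_e \, e \;\in \mathbb{C}^{|E|}, \]
  where $e$ also denotes the basis vector of $\mathbb{C}^{|E|}$ belonging to the edge $e$.
- Return to the first step.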
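The mechanism can be sketched in a few lines. Everything concrete here is hypothetical and chosen only for illustration: the two-vertex graph, the Hadamard-type matrix `U`, and the function names. The sketch also assumes that the state is renormalised by the square root of the symbol probability, so that the new wave function has norm 1.

```python
import math
import random

# Hypothetical toy instance: two vertices, all four directed edges (tail, head).
EDGES = [(0, 0), (0, 1), (1, 0), (1, 1)]
# A real 4x4 Hadamard-type matrix scaled by 1/2; its rows are orthonormal.
U = [[0.5 * s for s in row] for row in
     [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]]


def apply(matrix, psi):
    """Matrix-vector product."""
    return [sum(matrix[r][c] * psi[c] for c in range(len(psi)))
            for r in range(len(matrix))]


def symbol_probabilities(psi):
    """Return U*psi and, per vertex v, the sum of |(U psi)_e|^2 over edges e = (v, u)."""
    phi = apply(U, psi)
    probs = {}
    for amp, (tail, _head) in zip(phi, EDGES):
        probs[tail] = probs.get(tail, 0.0) + abs(amp) ** 2
    return phi, probs


def step(psi, rng):
    """Emit one symbol and return it with the renormalised next state."""
    phi, probs = symbol_probabilities(psi)
    vertices = sorted(probs)
    v = rng.choices(vertices, weights=[probs[u] for u in vertices])[0]
    norm = math.sqrt(probs[v])
    # Keep only the amplitudes on edges leaving v, then rescale to norm 1.
    psi_next = [amp / norm if tail == v else 0.0
                for amp, (tail, _head) in zip(phi, EDGES)]
    return v, psi_next


rng = random.Random(0)
psi = [1.0, 0.0, 0.0, 0.0]
symbols = []
for _ in range(5):
    v, psi = step(psi, rng)
    symbols.append(v)
print(symbols)
```

The emitted symbols depend on the random seed; only the normalisation of each state and the fact that the symbol probabilities sum to 1 are guaranteed.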

SLIDE 9

Identifiability Problem

Identifiability Problem: Given the parameterizations of two HMPs $M_1, M_2$ or two QRWs $Q_1, Q_2$, decide whether the associated random processes $p_1, p_2$ are equivalent.

Input: Parameterizations of two HMPs $M_1, M_2$ or two QRWs $Q_1, Q_2$.
Output: Yes if $p_1 = p_2$, no otherwise.

Solution for HMPs: Ito, Amari and Kobayashi, IEEE Trans. Inf. Theory, 1992. The algorithm is exponential in the number of hidden states.

No solution for QRWs known!

SLIDE 10

String Functions

Let $\Sigma^* := \bigcup_{t \ge 0} \Sigma^t$ be the set of all strings of finite length over an alphabet $\Sigma$.

Treat random processes $(X_t)$ with values in $\Sigma$ as string functions $p_X : \Sigma^* \to \mathbb{R}$ by
\[ p_X(v = v_0 v_1 \ldots v_t) := P(X_0 = v_0, X_1 = v_1, \ldots, X_t = v_t). \]

By standard arguments:
\[ (X_t) = (Y_t) \iff \forall v \in \Sigma^* : p_X(v) = p_Y(v). \]

SLIDE 13

Dimension of String Functions

The Hankel Matrix

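Let $wv = w_1 \ldots w_m v_1 \ldots v_n \in \Sigma^{m+n}$ be the concatenation of two strings $w = w_1 \ldots w_m \in \Sigma^m$, $v = v_1 \ldots v_n \in \Sigma^n$. Consider the (infinite-dimensional) Hankel matrix
\[ P_p := [p(wv)]_{v,w \in \Sigma^*} \in \mathbb{R}^{\Sigma^* \times \Sigma^*} \cong \mathbb{R}^{\mathbb{N} \times \mathbb{N}} \]
for a string function $p : \Sigma^* \to \mathbb{R}$.

Example: Let $\Sigma = \{0, 1\}$ and let $\varepsilon$ denote the empty string. Then
\[ P_p = \begin{pmatrix}
p(\varepsilon) & p(0) & p(1) & \cdots \\
p(0) & p(00) & p(10) & \cdots \\
p(1) & p(01) & p(11) & \cdots \\
p(00) & p(000) & p(100) & \cdots \\
p(01) & p(001) & p(101) & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix} \]

We define the dimension of $p$ to be
\[ \dim p := \operatorname{rk} P_p \in \mathbb{N} \cup \{\infty\}. \]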
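To make the definition concrete, the following sketch builds a finite block of the Hankel matrix for the two-state HMM from the introduction and computes its rank exactly. The emission table is the same assumption as before (read off the figure), and `rank` and `words_up_to` are illustrative helpers. A finite block only gives a lower bound on dim p in general; here it already reaches 2, the bound stated later for an HMP on two hidden states.

```python
from fractions import Fraction as F
from itertools import product

ALPHABET = "abc"
PI = [F(4, 5), F(1, 5)]
M = [[F(3, 10), F(7, 10)], [F(1, 2), F(1, 2)]]
# Assumed emission table (read off the HMM figure).
E = [{"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},
     {"a": F(1, 4), "b": F(3, 10), "c": F(9, 20)}]


def p(word):
    """String function of the HMM: probability of emitting `word`."""
    alpha = list(PI)
    for sym in word:
        emitted = [alpha[i] * E[i][sym] for i in range(2)]
        alpha = [sum(emitted[i] * M[i][j] for i in range(2)) for j in range(2)]
    return sum(alpha)


def rank(rows):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def words_up_to(n):
    """All strings over ALPHABET of length at most n, shortest first."""
    return ["".join(t) for k in range(n + 1) for t in product(ALPHABET, repeat=k)]


WORDS = words_up_to(2)
# Row index v, column index w, entry p(wv), as in the definition of P_p.
H = [[p(w + v) for w in WORDS] for v in WORDS]
print(len(WORDS), rank(H))  # 13 2
```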

SLIDE 16

Observable Operators

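Let $p_v$ resp. $p^w$ be the row resp. column vector of $P_p$ referring to the strings $v$ resp. $w$, that is, $p_v(w) = p^w(v) = p(wv)$.

Definition: The linear operators
\[ \rho_v, \tau_w : \mathbb{R}^{\Sigma^*} \to \mathbb{R}^{\Sigma^*}, \qquad p \mapsto p_v, \; p^w \]
for $v, w \in \Sigma^*$ are called observable operators.

Observation: Let $v_1, \ldots, v_t, w_1, \ldots, w_s \in \Sigma$ be single letters. Then it holds that
\[ \rho_{v_1 \ldots v_t} = \rho_{v_1} \circ \cdots \circ \rho_{v_t} \]
and, in the reverse order on the letters,
\[ \tau_{w_1 \ldots w_s} = \tau_{w_s} \circ \cdots \circ \tau_{w_1}. \]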
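The composition rules can be checked mechanically by modelling string functions as Python callables. This is only an illustration: `rho`, `tau` and the position-sensitive toy function `toy` are made-up names, with rho_v(p)(w) = p(wv) and tau_w(p)(v) = p(wv) as read off the Hankel matrix.

```python
def rho(v):
    """rho_v maps p to the row p_v, where p_v(w) = p(wv)."""
    return lambda p: (lambda w: p(w + v))


def tau(w):
    """tau_w maps p to the column p^w, where p^w(v) = p(wv)."""
    return lambda p: (lambda v: p(w + v))


def toy(s):
    """A toy string function that is sensitive to the order of letters."""
    return sum((i + 1) * ord(ch) for i, ch in enumerate(s))


for arg in ["", "x", "xy"]:
    # rho_{ab} = rho_a o rho_b
    assert rho("ab")(toy)(arg) == rho("a")(rho("b")(toy))(arg) == toy(arg + "ab")
    # tau_{ab} = tau_b o tau_a (reverse order on the letters)
    assert tau("ab")(toy)(arg) == tau("b")(tau("a")(toy))(arg) == toy("ab" + arg)

# Composing in the other order gives a different function in general.
print(rho("ab")(toy)(""), rho("b")(rho("a")(toy))(""))  # 293 292
```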

SLIDE 17

Dimension of Hidden Markov Processes and Quantum Random Walks

Lemma: Let $p : \Sigma^* \to \mathbb{R}$ be associated with a hidden Markov process on $d$ hidden states resp. a quantum random walk on a graph with $|E|$ edges. Then there are string functions $g_i : \Sigma^* \to \mathbb{R}$, $i = 1, \ldots, N$, where $N = d$ resp. $N = |E|^2$, such that
\[ \operatorname{span}\{p^w \mid w \in \Sigma^*\} \subset \operatorname{span}\{g_i \mid i = 1, \ldots, N\} \]
and computation of $g_i(v = v_1 \ldots v_k)$ is efficient.

Corollary: The lemma straightforwardly implies $\dim p \le N$.

SLIDE 20

Finite-dimensional Processes

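Theorem (AS, Jaeger, 2007): Let $p : \Sigma^* \to \mathbb{R}$. Then the following conditions are equivalent.
(i) $\dim p = \operatorname{rk} P_p \le d$.
(ii) There exist vectors $x, y \in \mathbb{R}^d$ as well as matrices $T_a \in \mathbb{R}^{d \times d}$ for all $a \in \Sigma$ such that
\[ \forall v \in \Sigma^* : \; p(v = v_1 \ldots v_n) = \langle y | T_{v_n} \cdots T_{v_1} | x \rangle. \]

Definition: An ensemble $((T_a)_{a \in \Sigma}, x, y)$ as in (ii) with $d = \dim p$ is called a minimal representation of $p$.

Idea: Given two stochastic processes $p_1, p_2$, compare their minimal representations.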

SLIDE 21

Computation of Minimal Representations

Let $d = \dim p$.

1. Determine words $v_1, \ldots, v_d$ and $w_1, \ldots, w_d$ such that for $V := [p(w_j v_i)]_{1 \le i,j \le d}$: $\operatorname{rk} V = \dim p$.
2. Define $x = (x_1, \ldots, x_d)^T := (p(v_1), \ldots, p(v_d))^T$ and $y = (y_1, \ldots, y_d)^T := (V^T)^{-1}(p(w_1), \ldots, p(w_d))^T$.
3. For each $a \in \Sigma$, compute matrices $W_a := [p(w_j a v_i)]_{1 \le i,j \le d} \in \mathbb{R}^{d \times d}$.
4. A minimal representation of $p$ is then given by $((W_a V^{-1})_{a \in \Sigma}, x, y)$.
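The four steps can be tried out on the two-state HMM from the introduction. In this sketch the words v = (ε, b) and w = (ε, c) are a hand-picked, hypothetical choice for which V is invertible, the emission table is again the assumption read off the figure, and y is obtained by solving V^T y = (p(w_1), ..., p(w_d))^T, which is the condition that makes ⟨y| T_{v_n} ... T_{v_1} |x⟩ reproduce p. The helper names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

ALPHABET = "abc"
PI = [F(4, 5), F(1, 5)]
M = [[F(3, 10), F(7, 10)], [F(1, 2), F(1, 2)]]
# Assumed emission table (read off the HMM figure).
E = [{"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},
     {"a": F(1, 4), "b": F(3, 10), "c": F(9, 20)}]


def p(word):
    alpha = list(PI)
    for sym in word:
        emitted = [alpha[i] * E[i][sym] for i in range(2)]
        alpha = [sum(emitted[i] * M[i][j] for i in range(2)) for j in range(2)]
    return sum(alpha)


def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# Step 1: hypothetical hand-picked words ("" is the empty string).
ROW_WORDS = ["", "b"]   # v_1, v_2
COL_WORDS = ["", "c"]   # w_1, w_2
V = [[p(w + v) for w in COL_WORDS] for v in ROW_WORDS]
V_inv = inverse_2x2(V)

# Step 2: x holds p on the v_i; y solves V^T y = (p(w_1), p(w_2))^T.
x = [p(v) for v in ROW_WORDS]
pw = [p(w) for w in COL_WORDS]
y = [sum(V_inv[j][i] * pw[j] for j in range(2)) for i in range(2)]

# Steps 3 and 4: T_a = W_a V^{-1}.
T = {a: matmul([[p(w + a + v) for w in COL_WORDS] for v in ROW_WORDS], V_inv)
     for a in ALPHABET}


def evaluate(word):
    """<y| T_{v_n} ... T_{v_1} |x> for word = v_1 ... v_n."""
    vec = x
    for sym in word:
        vec = [sum(T[sym][i][k] * vec[k] for k in range(2)) for i in range(2)]
    return sum(y[i] * vec[i] for i in range(2))


all_words = ["".join(t) for n in range(4) for t in product(ALPHABET, repeat=n)]
print(all(evaluate(w) == p(w) for w in all_words))  # True
```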

SLIDE 22

Identification of Finite-Dimensional Processes

Generic Algorithm
1: Determine matrices $V_1, V_2$ of maximal rank for $p_1, p_2$.
2: If $\operatorname{rk} V_1 \ne \operatorname{rk} V_2$ ($\Leftrightarrow \dim p_1 \ne \dim p_2$) then output 'NOT IDENTICAL'.
3: if $d = \operatorname{rk} V_1 = \operatorname{rk} V_2$ then
4:   Compute $V_3 := [p_2(w_j v_i)]_{1 \le i,j \le d}$, where $v_i, w_j$ are from $V_1$.
5:   If $V_1 \ne V_3$, output 'NOT IDENTICAL'.
6:   Compute matrices $W_{1a}, W_{2a}$ for all $a \in \Sigma$ and vectors $x_1, x_2, y_1, y_2$, all referring to the strings of $V_1$.
7:   If $W_{1a} = W_{2a}$ for all $a$ and $x_1 = x_2$, $y_1 = y_2$ then output 'IDENTICAL'.
8:   Else, output 'NOT IDENTICAL'.
9: end if
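A minimal sketch of the generic algorithm, under loudly stated assumptions: step 1 is replaced by a brute-force search over a small, fixed list of candidate strings (enough for the two-state examples used here, not in general), and the comparison of x and y is done by comparing p1 and p2 directly on the chosen strings, since both vectors are computed from V and from those values. All function names and the three test processes are illustrative.

```python
from fractions import Fraction as F


def make_process(pi, M, E):
    """String function of an HMM with initial vector pi, transitions M, emissions E."""
    n = len(pi)

    def p(word):
        alpha = list(pi)
        for s in word:
            em = [alpha[i] * E[i][s] for i in range(n)]
            alpha = [sum(em[i] * M[i][j] for i in range(n)) for j in range(n)]
        return sum(alpha)
    return p


def rank(rows):
    """Exact rank via Gauss-Jordan elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def independent_indices(vectors):
    """Greedily pick indices of linearly independent vectors."""
    chosen = []
    for k, vec in enumerate(vectors):
        if rank([vectors[i] for i in chosen] + [vec]) > len(chosen):
            chosen.append(k)
    return chosen


def basis_strings(p, cands):
    """Brute-force stand-in for step 1, restricted to the candidate strings."""
    block = [[p(w + v) for w in cands] for v in cands]
    rows = independent_indices(block)
    cols = independent_indices([[block[i][j] for i in rows] for j in range(len(cands))])
    return [cands[i] for i in rows], [cands[j] for j in cols]


def identical(p1, p2, alphabet, cands):
    vs, ws = basis_strings(p1, cands)
    vs2, _ = basis_strings(p2, cands)
    if len(vs) != len(vs2):                                   # step 2
        return False

    def table(p, mid=""):
        return [[p(w + mid + v) for w in ws] for v in vs]
    if table(p1) != table(p2):                                # steps 4-5: V1 vs V3
        return False
    if any(table(p1, a) != table(p2, a) for a in alphabet):   # W_1a vs W_2a
        return False
    # x and y are computed from V and from p on the chosen strings.
    return all(p1(s) == p2(s) for s in vs + ws)


Ea = {"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)}
Eb = {"a": F(1, 4), "b": F(3, 10), "c": F(9, 20)}
p1 = make_process([F(4, 5), F(1, 5)], [[F(3, 10), F(7, 10)], [F(1, 2), F(1, 2)]], [Ea, Eb])
# Same process with the two hidden states relabelled.
p2 = make_process([F(1, 5), F(4, 5)], [[F(1, 2), F(1, 2)], [F(7, 10), F(3, 10)]], [Eb, Ea])
# Different initial distribution.
p3 = make_process([F(1, 2), F(1, 2)], [[F(3, 10), F(7, 10)], [F(1, 2), F(1, 2)]], [Ea, Eb])
CANDS = ["", "a", "b", "c"]
print(identical(p1, p2, "abc", CANDS), identical(p1, p3, "abc", CANDS))  # True False
```

The relabelled model shows why comparing parameterizations directly is not enough: the two parameter sets differ, yet they induce the same process.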

SLIDE 23

Computational Bottleneck

Computational bottleneck of the identifiability problem: determination of bases for the row and the column space of $P_p$.

SLIDE 24

Hidden Markov Processes and Quantum Random Walks

Situation ($\Sigma = \{0, 1\}$):
\[ \begin{pmatrix}
g_1(\varepsilon) & \cdots & g_N(\varepsilon) & p(\varepsilon) & p(0) & p(1) & \cdots \\
g_1(0) & \cdots & g_N(0) & p(0) & p(00) & p(10) & \cdots \\
g_1(1) & \cdots & g_N(1) & p(1) & p(01) & p(11) & \cdots \\
g_1(00) & \cdots & g_N(00) & p(00) & p(000) & p(100) & \cdots \\
g_1(01) & \cdots & g_N(01) & p(01) & p(001) & p(101) & \cdots \\
\vdots & & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix} \]

SLIDE 26

Hidden Markov Processes and Quantum Random Walks

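Situation ($\Sigma = \{0, 1\}$):
\[ \begin{pmatrix}
g_1(\varepsilon) & \cdots & g_N(\varepsilon) & p(\varepsilon) & p^0(\varepsilon) & p^1(\varepsilon) & \cdots \\
g_1(0) & \cdots & g_N(0) & p(0) & p^0(0) & p^1(0) & \cdots \\
g_1(1) & \cdots & g_N(1) & p(1) & p^0(1) & p^1(1) & \cdots \\
g_1(00) & \cdots & g_N(00) & p(00) & p^0(00) & p^1(00) & \cdots \\
g_1(01) & \cdots & g_N(01) & p(01) & p^0(01) & p^1(01) & \cdots \\
\vdots & & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix} \]
where for all $w \in \Sigma^*$: $p^w \in \operatorname{span}\{g_i, i = 1, \ldots, N\}$.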

SLIDE 27

Key Insight

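Lemma: Let $p : \Sigma^* \to \mathbb{R}$ such that for all $w \in \Sigma^*$
\[ p^w \in \operatorname{span}\{g_i, i = 1, \ldots, N\} \]
for suitable $g_i : \Sigma^* \to \mathbb{R}$, $i = 1, \ldots, N$ (hence $\dim p \le N$). Then it holds that
\[ (g_1(v_0), \ldots, g_N(v_0)) \in \operatorname{span}\{(g_1(v_j), \ldots, g_N(v_j)) \mid j = 1, \ldots, m\} \;\Longrightarrow\; \forall u \in \Sigma^* : \; p_{uv_0} \in \operatorname{span}\{p_{uv_1}, \ldots, p_{uv_m}\}. \]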

SLIDE 30

Key Insight

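Proof: Choose $\beta_1, \ldots, \beta_m$ and $\alpha_1, \ldots, \alpha_N$ such that
\[ (g_1(v_0), \ldots, g_N(v_0)) = \sum_{j=1}^{m} \beta_j (g_1(v_j), \ldots, g_N(v_j)), \qquad p^w = \sum_{i=1}^{N} \alpha_i g_i. \]
It follows, for arbitrary $w \in \Sigma^*$,
\[ p_{v_0}(w) = p(wv_0) = p^w(v_0) = \sum_{j=1}^{m} \beta_j \sum_{i=1}^{N} \alpha_i g_i(v_j) = \sum_{j=1}^{m} \beta_j p^w(v_j) = \sum_{j=1}^{m} \beta_j p_{v_j}(w), \]
meaning that $p_{v_0} = \sum_{j=1}^{m} \beta_j p_{v_j}$. Applying $\rho_u$ yields
\[ p_{uv_0} = \rho_u(p_{v_0}) = \sum_{j=1}^{m} \beta_j \rho_u(p_{v_j}) = \sum_{j=1}^{m} \beta_j p_{uv_j}. \qquad \diamond \]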

SLIDE 32

Solution of the Identifiability Problem

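Theorem: Let $p : \Sigma^* \to \mathbb{R}$ such that for all $w \in \Sigma^*$
\[ p^w \in \operatorname{span}\{g_i, i = 1, \ldots, N\} \]
for suitable $g_i : \Sigma^* \to \mathbb{R}$, $i = 1, \ldots, N$. Then one can determine strings $v_i, w_j$, $i, j = 1, \ldots, \dim p$, such that
\[ \operatorname{rk}\left([p(w_j v_i)]_{1 \le i,j \le \dim p}\right) = \dim p \]
in time linear in $N$.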

SLIDE 34

Algorithm

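Collect strings $v$ into $A_{row}$ such that the $p_v$, $v \in A_{row}$, span the row space ($\varepsilon$ denotes the empty string).

1: $h(v) := (g_1(v), \ldots, g_N(v)) \in \mathbb{R}^N$
2: $A_{row} \leftarrow \{\varepsilon\}$, $B_{row} \leftarrow \{h(\varepsilon)\}$, $C_{row} \leftarrow \Sigma$
3: while $C_{row} \ne \emptyset$ do
4:   Choose $v \in C_{row}$ and remove it from $C_{row}$.
5:   if $h(v)$ is linearly independent of $B_{row}$ then
6:     $A_{row} \leftarrow A_{row} \cup \{v\}$, $B_{row} \leftarrow B_{row} \cup \{h(v)\}$, $C_{row} \leftarrow C_{row} \cup \{av \mid a \in \Sigma\}$
7:   end if
8: end while

Collect strings $w$ into $A_{col}$ such that the $p^w$, $w \in A_{col}$, span the column space.

1: $q(w) := (p(wv), v \in A_{row}) \in \mathbb{R}^{|A_{row}|}$
2: $A_{col} \leftarrow \{\varepsilon\}$, $B_{col} \leftarrow \{q(\varepsilon)\}$, $C_{col} \leftarrow \Sigma$
3: while $C_{col} \ne \emptyset$ do
4:   Choose $w \in C_{col}$ and remove it from $C_{col}$.
5:   if $q(w)$ is linearly independent of $B_{col}$ then
6:     $A_{col} \leftarrow A_{col} \cup \{w\}$, $B_{col} \leftarrow B_{col} \cup \{q(w)\}$, $C_{col} \leftarrow C_{col} \cup \{wa \mid a \in \Sigma\}$
7:   end if
8: end while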
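The two loops can be sketched for the two-state HMM from the introduction. Several things here are assumptions rather than part of the slides: the emission table (read off the figure), the choice g_i(v) = probability of emitting v when starting in hidden state i, the convention that both collections start from the empty string, the first-in-first-out order in which candidates are chosen, and all function names.

```python
from collections import deque
from fractions import Fraction as F

ALPHABET = "abc"
PI = [F(4, 5), F(1, 5)]
M = [[F(3, 10), F(7, 10)], [F(1, 2), F(1, 2)]]
# Assumed emission table (read off the HMM figure).
E = [{"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},
     {"a": F(1, 4), "b": F(3, 10), "c": F(9, 20)}]


def h(word):
    """h(v) = (g_1(v), g_2(v)), with g_i(v) = P(emit v | start in state i)."""
    beta = [F(1), F(1)]
    for sym in reversed(word):
        beta = [E[i][sym] * sum(M[i][j] * beta[j] for j in range(2)) for i in range(2)]
    return beta


def p(word):
    return sum(PI[i] * gi for i, gi in enumerate(h(word)))


def rank(rows):
    """Exact rank via Gauss-Jordan elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def collect(feature, extend):
    """Shared loop: keep strings whose feature vector enlarges the span."""
    chosen, basis = [""], [feature("")]
    queue = deque(ALPHABET)
    while queue:
        s = queue.popleft()          # choose a candidate and remove it
        vec = feature(s)
        if rank(basis + [vec]) > rank(basis):
            chosen.append(s)
            basis.append(vec)
            queue.extend(extend(s, a) for a in ALPHABET)
    return chosen


# Rows: test h(v), extend accepted strings on the left (av).
A_row = collect(h, lambda v, a: a + v)
# Columns: test q(w) = (p(wv))_{v in A_row}, extend on the right (wa).
A_col = collect(lambda w: [p(w + v) for v in A_row], lambda w, a: w + a)
print(A_row, A_col)  # ['', 'b'] ['', 'a']
```

Each accepted string adds only |Σ| new candidates and at most N strings can be accepted in the row phase, so the number of independence tests stays proportional to N for a fixed alphabet.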

SLIDE 36

Conclusion

Identifiability problem for hidden Markov processes and quantum random walks presented.

Solution efficient in the parameterizations.

Core idea also applicable to efficiently test HMMs and QRWs for ergodicity:

Theorem: Let $M := \left[\sum_{a \in \Sigma} W_a\right] V^{-1}$. A finite-dimensional process $p$ is ergodic iff
\[ \dim \operatorname{Eig}(M; 1) = 1. \]
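The criterion can be evaluated exactly for the two-state HMM from the introduction. This is a sketch under assumptions: the emission table is read off the figure, the words v = (ε, b) and w = (ε, a) are a hypothetical choice for which V is invertible, and the dimension of the eigenspace for eigenvalue 1 is computed as d minus the rank of M - I. The operator is named `M_op` in the code to keep it apart from the transition matrix `TRANS`.

```python
from fractions import Fraction as F

ALPHABET = "abc"
PI = [F(4, 5), F(1, 5)]
TRANS = [[F(3, 10), F(7, 10)], [F(1, 2), F(1, 2)]]
# Assumed emission table (read off the HMM figure).
E = [{"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},
     {"a": F(1, 4), "b": F(3, 10), "c": F(9, 20)}]


def p(word):
    alpha = list(PI)
    for sym in word:
        emitted = [alpha[i] * E[i][sym] for i in range(2)]
        alpha = [sum(emitted[i] * TRANS[i][j] for i in range(2)) for j in range(2)]
    return sum(alpha)


def rank(rows):
    """Exact rank via Gauss-Jordan elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


ROW_WORDS = ["", "b"]   # v_1, v_2 (hypothetical choice)
COL_WORDS = ["", "a"]   # w_1, w_2
V = [[p(w + v) for w in COL_WORDS] for v in ROW_WORDS]
# Sum over a of W_a, with W_a = [p(w_j a v_i)].
W_sum = [[sum(p(w + a + v) for a in ALPHABET) for w in COL_WORDS] for v in ROW_WORDS]
M_op = matmul(W_sum, inverse_2x2(V))

d = len(ROW_WORDS)
shifted = [[M_op[i][j] - (1 if i == j else 0) for j in range(d)] for i in range(d)]
eig_dim = d - rank(shifted)   # dimension of the eigenspace for eigenvalue 1
print(eig_dim)  # 1
```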

SLIDE 37

Thanks for the attention!
