SLIDE 1

Loops and Overloops for Tree Walking Automata

Pierre-Cyrille Héam, Vincent Hugot, Olga Kouchnarenko

{pcheam,vhugot,okouchnarenko}@lifc.univ-fcomte.fr

University of Franche-Comté DGA & LIFC-INRIA/CASSIS, project ACCESS

July 13, 2011

1/24 CIAA’11 Vincent HUGOT Loops & Overloops for TWA

SLIDE 2

Tree Walking Automata

  • Old formalism (≈1970, Aho & Ullman)
  • Sequential model, as opposed to branching tree automata
  • Less extensively studied model for a long while...
  • ...but long-standing questions solved in recent years
  • Recent surge in interest, due mostly to the connection to XML (fragments of Core XPath, streaming, etc.)
  • Research focused on fundamental problems (expressiveness...)
  • Our focus: practical, efficient algorithms
  • Goal: efficient XML queries; compact model-checking
  • Starting point: transformation from TWA to bottom-up tree automata (BUTA)

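SLIDE 3

Preliminaries

Definition of Tree Walking Automata

A Tree-Walking Automaton is a tuple $A = \langle \Sigma, Q, I, F, \Delta \rangle$ with

$\Delta \subseteq \Sigma \times Q \times T \times M \times Q$

  • $T = \{\star, 0, 1\}$: types (root, left son, right son)
  • $M = \{\uparrow, \circlearrowleft, \swarrow, \searrow\}$: moves (up, stay, down-left, down-right)

Notations
  • "$\langle f, p, \tau \to \mu, q \rangle$" stands for the tuple $(f, p, \tau, \mu, q) \in \Delta$.
  • $\langle \Sigma_2, p, T \to \circlearrowleft, q \rangle \overset{\mathrm{def}}{=} \{ (\sigma, p, \tau, \circlearrowleft, q) \mid \sigma \in \Sigma_2, \tau \in T \}$

Remarks
  • Ranked (binary) vs. unranked alphabet
  • $\langle \Sigma_0, Q, T \to \{\swarrow, \searrow\}, Q \rangle \cup \langle \Sigma, Q, \star \to \uparrow, Q \rangle$: invalid transitions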

SLIDE 4

Preliminaries

Example Tree Walking Automaton

A very simple TWA: $X = \langle \Sigma, Q, I, F, \Delta \rangle$

  • $\Sigma_0 = \{a, b, c\}$ and $\Sigma_2 = \{f, g, h\}$
  • $Q = \{q_\ell, q_u\}$, $I = \{q_\ell\}$, $F = \{q_u\}$
  • $\Delta = \langle a, q_\ell, \{\star, 0\} \to \circlearrowleft, q_u \rangle \cup \langle \Sigma, q_u, 0 \to \uparrow, q_u \rangle \cup \langle \Sigma_2, q_\ell, \{\star, 0\} \to \swarrow, q_\ell \rangle$

X accepts exactly the trees whose left-most leaf is labelled by a, including the tree a itself.

SLIDE 5

Preliminaries

Example Tree Walking Automaton

$Q = \{q_\ell, q_u\}$, $I = \{q_\ell\}$, $F = \{q_u\}$

$\Delta = \langle a, q_\ell, \{\star, 0\} \to \circlearrowleft, q_u \rangle \cup \langle \Sigma, q_u, 0 \to \uparrow, q_u \rangle \cup \langle \Sigma_2, q_\ell, \{\star, 0\} \to \swarrow, q_\ell \rangle$

Run of X on the tree $t = f(h(a, b), a)$, with the current state shown in brackets after the current node:

  1. $f[q_\ell](h(a, b), a)$
  2. $f(h[q_\ell](a, b), a)$
  3. $f(h(a[q_\ell], b), a)$
  4. $f(h(a[q_u], b), a)$
  5. $f(h[q_u](a, b), a)$
  6. $f[q_u](h(a, b), a)$
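The run above can be replayed mechanically. What follows is a minimal Python sketch, not the authors' implementation. The tree encoding (nested tuples), the type strings "*", "0", "1", the move names "up", "stay", "left", "right", the state names "ql", "qu" and all function names are assumptions made for this illustration.

```python
from collections import deque

# Hypothetical encoding: a leaf is ("a",), an inner node is ("f", left, right).
# A transition <sigma, p, tau -> move, q> is the tuple (sigma, p, tau, move, q).
SIGMA0, SIGMA2 = {"a", "b", "c"}, {"f", "g", "h"}
X_DELTA = (
    {("a", "ql", tau, "stay", "qu") for tau in ("*", "0")}
    | {(s, "qu", "0", "up", "qu") for s in SIGMA0 | SIGMA2}
    | {(s, "ql", tau, "left", "ql") for s in SIGMA2 for tau in ("*", "0")}
)

def subtree(t, pos):
    """Subtree of t at position pos (a tuple of 0/1 choices)."""
    for i in pos:
        t = t[1 + i]
    return t

def successors(delta, t, pos, state):
    """All configurations reachable in one step from (pos, state)."""
    node = subtree(t, pos)
    tau = "*" if not pos else str(pos[-1])
    for (sym, p, ty, move, q) in delta:
        if (sym, p, ty) != (node[0], state, tau):
            continue
        if move == "stay":
            yield pos, q
        elif move == "up" and pos:
            yield pos[:-1], q
        elif move in ("left", "right") and len(node) == 3:
            yield pos + (0 if move == "left" else 1,), q

def accepts(delta, initial, final, t):
    """True iff some run leads from (root, initial state) to (root, final state)."""
    seen = {((), q) for q in initial}
    queue = deque(seen)
    while queue:
        pos, q = queue.popleft()
        if pos == () and q in final:
            return True
        for conf in successors(delta, t, pos, q):
            if conf not in seen:
                seen.add(conf)
                queue.append(conf)
    return False

T = ("f", ("h", ("a",), ("b",)), ("a",))
print(accepts(X_DELTA, {"ql"}, {"qu"}, T))
```

The breadth-first search explores configurations (position, state), so it also covers nondeterministic automata; on the deterministic X it simply follows the six configurations listed above.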

SLIDE 12

Given a TWA A, build an equivalent BUTA B.

Known solution outlined in the literature [Bojańczyk, Samuelides]
  • Based on the idea of tree loops
  • Resulting states for B: $T \times 2^{Q^2}$ (or, deterministic, $(2^{Q^2})^T$)
  • Only proof sketches; no explicit algorithm is given.

We argue that things are slightly less straightforward:
  • Needed state space: $\Sigma \times T \times 2^{Q^2}$ (or, deterministic, $\Sigma \times (2^{Q^2})^T$)
  • Because of this, some existing implementations are only almost correct [dtwa-tools]

We introduce tree overloops
  • This time we really have $T \times 2^{Q^2}$ (or, deterministic, $(2^{Q^2})^T$)
  • Lower upper bound if A is deterministic: $|T| \cdot 2^{|Q| \log_2(|Q|+1)}$

SLIDE 13

The Idea of Tree Loops

$(p, q) \in Q^2$ is a loop of A on $t|_\alpha$ if there exists a run which starts in p, ends in q (at the root $\alpha$), and always stays in the subtree.

Loops of X on the subtrees of $t = f(h(a, b), a)$. Following the run of X, a pair $(q_\ell, ?)$ is opened at each node visited in $q_\ell$ and completed to $(q_\ell, q_u)$ once the run is at that node in $q_u$. The resulting sets are:

  • $t$: $\{(q_\ell, q_u), (q_\ell, q_\ell), (q_u, q_u)\}$
  • $t|_0$: $\{(q_\ell, q_u), (q_\ell, q_\ell), (q_u, q_u)\}$
  • $t|_{0.0}$: $\{(q_\ell, q_u), (q_\ell, q_\ell), (q_u, q_u)\}$
  • $t|_{0.1}$: $\{(q_\ell, q_\ell), (q_u, q_u)\}$
  • $t|_1$: $\{(q_\ell, q_\ell), (q_u, q_u)\}$

SLIDE 21

Computing Tree Loops

Simple Loops, Computation for Leaves

A loop is a simple loop on $t|_\alpha$ if there is a run which forms it and reaches $\alpha$ exactly twice (simple looping run).

Proposition: Loops Decomposition
If $S \subseteq Q^2$ is the set of all simple loops of A on a given subtree $u = t|_\alpha$, then $S^*$ (the reflexive-transitive closure of S) is the set of all loops of A on u.

We denote by $\mho^\tau(u)$ the set of all loops of A on a subtree u, where $\tau$ is the type of the root of u.

Compute loops on $u = a \in \Sigma_0$. Simple looping run: run of the form $(\varepsilon, p) \twoheadrightarrow (\varepsilon, q)$ only.

$H^\tau_\sigma \overset{\mathrm{def}}{=} \{ (p, q) \mid \langle \sigma, p, \tau \to \circlearrowleft, q \rangle \in \Delta \}$.

So we have $\mho^\tau(a) = (H^\tau_a)^*$.

SLIDE 23

Computing Tree Loops

Computation for Inner Nodes

Let $f \in \Sigma_2$, and $u = f(u_0, u_1)$; root of type $\tau$. Compute $\mho^\tau(u)$.

First Move of a Simple Looping Run
  • $\uparrow$: impossible, as it leaves the subtree u
  • $\circlearrowleft$: all computed in $H^\tau_f$
  • $\swarrow$: $(\varepsilon, p), (0, p_0), (\beta_1, s_1), \dots, (\beta_n, s_n), (0, q_0), (\varepsilon, q)$, with every $\beta_k$ at or below position 0. So $(p_0, q_0) \in \mho^0(u_0)$.
  • $\searrow$: $(\varepsilon, p), (1, p_1), (\beta_1, s_1), \dots, (\beta_n, s_n), (1, q_1), (\varepsilon, q)$, with every $\beta_k$ at or below position 1. So $(p_1, q_1) \in \mho^1(u_1)$.

SLIDE 24

Computing Tree Loops

Computation for Inner Nodes

Let $f \in \Sigma_2$, and $u = f(u_0, u_1)$; root of type $\tau$. Compute $\mho^\tau(u)$.

First Move of a Simple Looping Run
  • $\swarrow$: $(\varepsilon, p), (0, p_0), (\beta_1, s_1), \dots, (\beta_n, s_n), (0, q_0), (\varepsilon, q)$, with every $\beta_k$ at or below position 0. So $(p_0, q_0) \in \mho^0(u_0)$.

To build a simple loop $(p, q)$ on the subtree u, we need to...
  1. choose a side: $\theta \in \mathbb{S} \overset{\mathrm{def}}{=} \{0, 1\}$
  2. find an existing loop on that side: $(p_\theta, q_\theta) \in \mho^\theta(u_\theta)$
  3. such that one can connect beginning and end:
     a. $\langle f, p, \tau \to \chi(\theta), p_\theta \rangle \in \Delta$ and
     b. $\langle u_\theta(\varepsilon), q_\theta, \theta \to \uparrow, q \rangle \in \Delta$

Here $\chi : \mathbb{S} \to \{\swarrow, \searrow\}$ is such that $\chi(0) = \swarrow$ and $\chi(1) = \searrow$.

SLIDE 25

Computing Tree Loops

Overview of the Computation

Loops on Leaves
Let $a \in \Sigma_0$; we have $\mho^\tau(a) = (H^\tau_a)^*$.

Loops on Inner Nodes
Let $f \in \Sigma_2$, and $u = f(u_0, u_1)$; root of type $\tau$. We have

$$\mho^\tau(u) = \Bigl( H^\tau_f \cup \bigl\{ (p, q) \bigm| \exists \theta \in \mathbb{S} : \exists (p_\theta, q_\theta) \in \mho^\theta(u_\theta) : \langle f, p, \tau \to \chi(\theta), p_\theta \rangle \in \Delta \text{ and } \langle u_\theta(\varepsilon), q_\theta, \theta \to \uparrow, q \rangle \in \Delta \bigr\} \Bigr)^*$$
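The two equations above translate directly into a recursive procedure. The following Python sketch is an illustration under assumptions, not the authors' code: trees are nested tuples, transitions are 5-tuples, types are the strings "*", "0", "1", and the names `closure`, `loops` and `CHI` are hypothetical. It recomputes the sets of loops of the example automaton X on $t = f(h(a, b), a)$.

```python
SIGMA0, SIGMA2 = {"a", "b", "c"}, {"f", "g", "h"}
Q = {"ql", "qu"}
# Transitions of the example automaton X, as tuples (sigma, p, tau, move, q).
X_DELTA = (
    {("a", "ql", tau, "stay", "qu") for tau in ("*", "0")}
    | {(s, "qu", "0", "up", "qu") for s in SIGMA0 | SIGMA2}
    | {(s, "ql", tau, "left", "ql") for s in SIGMA2 for tau in ("*", "0")}
)
CHI = {"0": "left", "1": "right"}  # chi(0) = down-left, chi(1) = down-right

def closure(pairs, states):
    """Reflexive-transitive closure R* of a relation R over `states`."""
    rel = set(pairs) | {(q, q) for q in states}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            for (q2, r) in list(rel):
                if q == q2 and (p, r) not in rel:
                    rel.add((p, r))
                    changed = True
    return rel

def loops(delta, states, u, tau):
    """Set of loops on the tree u whose root has type tau."""
    sym = u[0]
    # H^tau_sym: simple loops made of a single stay-transition.
    simple = {(p, q) for (s, p, ty, m, q) in delta if (s, ty, m) == (sym, tau, "stay")}
    if len(u) == 3:  # inner node: add the simple loops built on the sons
        for theta in ("0", "1"):
            child = u[1 + int(theta)]
            for (p_t, q_t) in loops(delta, states, child, theta):
                for p in states:
                    if (sym, p, tau, CHI[theta], p_t) not in delta:
                        continue
                    for q in states:
                        # Closing the loop needs the son's root symbol child[0].
                        if (child[0], q_t, theta, "up", q) in delta:
                            simple.add((p, q))
    return closure(simple, states)

T = ("f", ("h", ("a",), ("b",)), ("a",))
for name, u, tau in [("t", T, "*"), ("t|0", T[1], "0"), ("t|0.0", T[1][1], "0"),
                     ("t|0.1", T[1][2], "1"), ("t|1", T[2], "1")]:
    print(name, sorted(loops(X_DELTA, Q, u, tau)))
```

Note that closing a loop reads the root symbol of the son (`child[0]`), which is the reason the loops-based construction has to remember that symbol.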

SLIDE 26

Transformation Into BUTA

Using Loops

Transformation Into BUTA Using Loops

  0. Input: a TWA $A = \langle \Sigma, Q, I, F, \Delta \rangle$
  1. Initialise States and Rules to $\emptyset$
  2. for each $a \in \Sigma_0$, $\tau \in T$ do
       let $P = (a, \tau, (H^\tau_a)^*)$
       add $a \to P$ to Rules and P to States
  3. repeat until Rules remain unchanged
       for each $f \in \Sigma_2$, $\tau \in T$ do
         add every $f(P_0, P_1) \to P$ to Rules and P to States,
         where $P_0, P_1 \in$ States are such that $P_0 = (\sigma_0, 0, S_0)$ and $P_1 = (\sigma_1, 1, S_1)$,
         and $P = (f, \tau, (H^\tau_f \cup S)^*)$, with S the set of simple loops built on the sons.
  4. Output: a BUTA B equivalent to A:
       $B = \langle \Sigma, \text{States}, \{ (\sigma, \star, L) \in \text{States} \mid L \cap (I \times F) \neq \emptyset \}, \text{Rules} \rangle$

SLIDE 28

Transformation Into BUTA

Using Loops

add every $f(P_0, P_1) \to P$ to Rules and P to States, where $P_0, P_1 \in$ States are such that $P_0 = (\sigma_0, 0, S_0)$ and $P_1 = (\sigma_1, 1, S_1)$, and $P = (f, \tau, (H^\tau_f \cup S)^*)$, with S the set of simple loops built on the sons.

Set of Simple Loops Built on the Sons

$$S = \bigl\{ (p, q) \bigm| \exists \theta \in \mathbb{S} : \exists (p_\theta, q_\theta) \in S_\theta \text{ s.t. } \langle f, p, \tau \to \chi(\theta), p_\theta \rangle \in \Delta \text{ and } \langle \sigma_\theta, q_\theta, \theta \to \uparrow, q \rangle \in \Delta \bigr\}$$

The son's symbol $\sigma_\theta$ is needed to close the end of the loop.

SLIDE 30

Transformation Into BUTA

Using Loops

Important Remark
In the construction, sets of loops cannot be considered independently from the symbol they are rooted in.

Counter-Example
Consider $a, b \in \Sigma_0$, $f \in \Sigma_2$, with only the transitions

$$\langle \{a, b\}, p, \theta \to \circlearrowleft, q \rangle \cup \{ \langle b, q, \theta \to \uparrow, s' \rangle, \langle f, s, \tau \to \chi(\theta), p \rangle \} \subseteq \Delta$$

Then $\mho^\theta(a) = \mho^\theta(b) = \{(p, q)\}^*$, but $\mho^\tau(f(a, a)) \neq \mho^\tau(f(b, b))$: only b has the $\uparrow$-transition that closes the loop $(s, s')$.

Thus the loops-based construction has $\Sigma \times T \times 2^{Q^2}$ instead of only $T \times 2^{Q^2}$ states (storing the symbol).

SLIDE 31

From Tree Loops to Tree Overloops

Advantages and Definition

Tree overloops: a slight alteration of the notion of tree loop.

Advantages with respect to the transformation into BUTA
  • Straightforward
  • $T \times 2^{Q^2}$ instead of $\Sigma \times T \times 2^{Q^2}$ states
  • DTWA A: smaller $|T| \cdot 2^{|Q| \log_2(|Q|+1)}$ upper bound on states
  • 2 to 100 times smaller BUTA on average (random tests)

$(p, q) \in Q^2$ is an overloop of A on $t|_\alpha$ if there exists a run which starts in p, ends in q at the parent of the root $\alpha$, and always stays in the subtree, except for the last configuration. The parent of $\varepsilon$ is $\varepsilon$.

A TWA A must be escaped into $A' = \langle \Sigma, Q \uplus \{\checkmark\}, I, F, \Delta \uplus \langle \Sigma, F, \star \to \uparrow, \checkmark \rangle \rangle$.

SLIDE 33

From Tree Loops to Tree Overloops

Computing Tree Overloops

Idea: compute loops, then check for $\uparrow$-transitions.

Definition: Up-Closure
Let $L \subseteq Q^2$, $\tau \in T$ and $\sigma \in \Sigma$:

$$U^\tau_\sigma[L] \overset{\mathrm{def}}{=} \bigl\{ (p, q) \bigm| \exists p' : (p, p') \in L \text{ and } \langle \sigma, p', \tau \to \uparrow, q \rangle \in \Delta \bigr\}.$$

Theorem: Up-Closure
Let A be a TWA. If L is the set of all loops of A on a subtree $u = t|_\alpha$, then $U^{\natural\alpha}_{t(\alpha)}[L]$ is the set of all overloops of A on u.
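As a small illustration of the up-closure, the following Python sketch applies the definition to the loops of the example automaton X on a leaf a of type 0. The tuple encoding of transitions and the name `up_closure` are assumptions made here, not notation from the slides.

```python
def up_closure(delta, sym, tau, L):
    """U^tau_sym[L]: extend each pair of L by one up-transition taken at the root."""
    return {(p, q)
            for (p, p1) in L
            for (s, r, ty, move, q) in delta
            if (s, r, ty, move) == (sym, p1, tau, "up")}

# Loops of X on a leaf a of type 0, and the only up-transition of X
# that applies to that leaf: <a, qu, 0 -> up, qu>.
loops_on_a = {("ql", "ql"), ("qu", "qu"), ("ql", "qu")}
delta = {("a", "qu", "0", "up", "qu")}
overloops_on_a = up_closure(delta, "a", "0", loops_on_a)
print(sorted(overloops_on_a))
```

Both loops ending in qu are extended by the up-transition; the loop (ql, ql) is dropped because X cannot move up in ql.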

SLIDE 34

From Tree Loops to Tree Overloops

Computing Tree Overloops

Overloops on Leaves
Let $a \in \Sigma_0$; we have $\mho_\uparrow^\tau(a) = U^\tau_a[(H^\tau_a)^*]$.

Overloops on Inner Nodes
Let $f \in \Sigma_2$, and $u = f(u_0, u_1)$; root of type $\tau$. To build a loop $(p, q_\theta)$ on a subtree, we need to...
  1. choose a side: $\theta \in \mathbb{S} \overset{\mathrm{def}}{=} \{0, 1\}$
  2. find an existing overloop on that side: $(p_\theta, q_\theta) \in \mho_\uparrow^\theta(u_\theta)$
  3. such that one can connect the beginning:
     a. $\langle f, p, \tau \to \chi(\theta), p_\theta \rangle \in \Delta$
     b. unlike loops, the end is already connected!

Here $\chi : \mathbb{S} \to \{\swarrow, \searrow\}$ is such that $\chi(0) = \swarrow$ and $\chi(1) = \searrow$.

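SLIDE 35

From Tree Loops to Tree Overloops

Computing Tree Overloops

Overloops on Leaves
Let $a \in \Sigma_0$; we have $\mho_\uparrow^\tau(a) = U^\tau_a[(H^\tau_a)^*]$.

Overloops on Inner Nodes
Let $f \in \Sigma_2$, and $u = f(u_0, u_1)$; root of type $\tau$. We have

$$\mho_\uparrow^\tau(u) = U^\tau_f\Bigl[ \Bigl( H^\tau_f \cup \bigl\{ (p, q_\theta) \bigm| \exists \theta \in \mathbb{S} : \exists p_\theta \in Q : \langle f, p, \tau \to \chi(\theta), p_\theta \rangle \in \Delta \text{ and } (p_\theta, q_\theta) \in \mho_\uparrow^\theta(u_\theta) \bigr\} \Bigr)^* \Bigr]$$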
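The overloop equations can be sketched in Python as follows. This is an illustration under assumptions rather than the authors' implementation: the fresh state added by escaping is called `CHECK` here, trees are nested tuples, and the helper names `closure`, `escape` and `overloops` are hypothetical.

```python
SIGMA0, SIGMA2 = {"a", "b", "c"}, {"f", "g", "h"}
X_DELTA = (
    {("a", "ql", tau, "stay", "qu") for tau in ("*", "0")}
    | {(s, "qu", "0", "up", "qu") for s in SIGMA0 | SIGMA2}
    | {(s, "ql", tau, "left", "ql") for s in SIGMA2 for tau in ("*", "0")}
)
CHECK = "check"  # hypothetical name for the fresh state added by escaping
CHI = {"0": "left", "1": "right"}

def closure(pairs, states):
    """Reflexive-transitive closure R* of a relation R over `states`."""
    rel = set(pairs) | {(q, q) for q in states}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            for (q2, r) in list(rel):
                if q == q2 and (p, r) not in rel:
                    rel.add((p, r))
                    changed = True
    return rel

def escape(delta, sigma, final):
    """Add the transitions <Sigma, F, * -> up, CHECK> of the escaped automaton."""
    return set(delta) | {(s, qf, "*", "up", CHECK) for s in sigma for qf in final}

def overloops(delta, states, u, tau):
    """Set of overloops on the tree u whose root has type tau."""
    sym = u[0]
    simple = {(p, q) for (s, p, ty, m, q) in delta if (s, ty, m) == (sym, tau, "stay")}
    if len(u) == 3:
        for theta in ("0", "1"):
            sub = overloops(delta, states, u[1 + int(theta)], theta)
            for (s, p, ty, m, p_t) in delta:
                if (s, ty, m) == (sym, tau, CHI[theta]):
                    # The son's overloop already ends back at this node:
                    # only the beginning has to be connected.
                    simple |= {(p, q_t) for (p2, q_t) in sub if p2 == p_t}
    loops_here = closure(simple, states)
    # Up-closure: append one up-transition taken at the root of u.
    return {(p, q) for (p, p1) in loops_here
            for (s, r, ty, m, q) in delta if (s, r, ty, m) == (sym, p1, tau, "up")}

ESC = escape(X_DELTA, SIGMA0 | SIGMA2, {"qu"})
STATES = {"ql", "qu", CHECK}
T = ("f", ("h", ("a",), ("b",)), ("a",))
print(sorted(overloops(ESC, STATES, T, "*")))
```

In contrast with the loops-based recursion, the root symbol of a son is never read when combining results: it is only used inside the call that computes the son's own overloops.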

SLIDE 36

Transformation Into BUTA

Using Overloops

Transformation Into BUTA Using Overloops
Almost the same as the loops-based version, but:
  1. We compute sets of overloops instead of loops
  2. Symbols are not stored in the states (not needed): thus $\Sigma \times T \times 2^{Q^2}$ becomes $T \times 2^{Q^2}$ again.
  3. Final states are $\{ (\star, O) \in \text{States} \mid O \cap (I \times \{\checkmark\}) \neq \emptyset \}$

Acceptance criterion: Final Loops & Overloops
A term t is accepted iff $\mho^\tau(t) \cap (I \times F) \neq \emptyset$ or, equivalently, iff $\mho_\uparrow^\tau(t) \cap (I \times \{\checkmark\}) \neq \emptyset$ (because of $A' = \langle \Sigma, Q \uplus \{\checkmark\}, I, F, \Delta \uplus \langle \Sigma, F, \star \to \uparrow, \checkmark \rangle \rangle$).

SLIDE 38

Transformation Into BUTA in the Deterministic Case

Deterministic TWA: Definition
A TWA $A = \langle \Sigma, Q, I, F, \Delta \rangle$ is deterministic (i.e. a DTWA) if, for all $\sigma \in \Sigma$, $p \in Q$, $\tau \in T$, $|\langle \sigma, p, \tau \to M, Q \rangle \cap \Delta| \le 1$. (We do not need the usual, stronger definition, where I is a singleton.)

Theorem: Deterministic Upper-Bound
In general, the overloops-based BUTA B has up to $|T| \times 2^{|Q|^2}$ states. However, it has at most $|T| \cdot 2^{|Q| \log_2(|Q|+1)}$ states if A is a DTWA.

Proof Idea (Full Proof in Appendix)
Sets of overloops on a given subterm are functional (i.e. right-unique). Each computed state stores the set of overloops on a given subterm; thus there are at most $(|Q|+1)^{|Q|}$ of them, as opposed to $2^{|Q|^2}$ in the general case.
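The two ways of writing the deterministic bound denote the same quantity. The short derivation below makes the rewriting explicit and compares both bounds for one sample size; the choice $|Q| = 3$ is only an illustration.

```latex
\[
(|Q|+1)^{|Q|} = \bigl(2^{\log_2(|Q|+1)}\bigr)^{|Q|} = 2^{|Q| \log_2(|Q|+1)}
\]
\[
|Q| = 3: \quad (3+1)^3 = 2^{3 \cdot 2} = 64
\quad \text{versus} \quad 2^{|Q|^2} = 2^{9} = 512 .
\]
```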

SLIDE 41

An Overloops-Based Polynomial Approximation

Introduction

Testing emptiness of a TWA is an ExpTime-complete problem.

Practical problems:
  • XML queries
  • Satisfiability of some XPath fragments
  • But also model-checking...

Standard approach: transform into BUTA, then test emptiness.

We propose another approach:
  • An "over-approximation"; may detect emptiness
  • Executes in polynomial time and space
  • Very (surprisingly) accurate in our random tests

SLIDE 42

An Overloops-Based Polynomial Approximation

The Algorithm

Over-Approximation of the Emptiness Problem, Using Overloops

  0. Input: an escaped TWA $A = \langle \Sigma, Q, I, F, \Delta \rangle$
  1. Initialise $L_0, L_1, L_\star$ to $\emptyset$
  2. for each $a \in \Sigma_0$, $\tau \in T$ do
       $L_\tau \leftarrow L_\tau \cup U^\tau_a[(H^\tau_a)^*]$
  3. repeat until $L_0, L_1, L_\star$ remain unchanged
       for each $f \in \Sigma_2$, $\tau \in T$ do
         $L_\tau \leftarrow L_\tau \cup U^\tau_f[(H^\tau_f \cup S)^*]$,
         with S the set of simple loops built on $L_0$ and $L_1$.
  4. Output: Empty if $L_\star \cap (I \times \{\checkmark\}) = \emptyset$, else Unknown

SLIDE 44

An Overloops-Based Polynomial Approximation

The Algorithm

S is the set of simple loops built on $L_0$ and $L_1$.

Set of Simple Loops Built on the Sons (From Overloops)

$$S = \bigl\{ (p, q_\theta) \bigm| \exists \theta \in \mathbb{S} : \exists p_\theta \in Q : \langle f, p, \tau \to \chi(\theta), p_\theta \rangle \in \Delta \text{ and } (p_\theta, q_\theta) \in L_\theta \bigr\}$$
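The approximation algorithm is short enough to sketch in full. The Python below is an illustrative reading of the steps above, not the authors' implementation: the function escapes its input itself, the fresh state is called `CHECK`, and the encoding of transitions as 5-tuples as well as all names are assumptions.

```python
TYPES = ("*", "0", "1")
CHI = {"0": "left", "1": "right"}
CHECK = "check"  # hypothetical name for the fresh state added by escaping

def closure(pairs, states):
    """Reflexive-transitive closure R* of a relation R over `states`."""
    rel = set(pairs) | {(q, q) for q in states}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            for (q2, r) in list(rel):
                if q == q2 and (p, r) not in rel:
                    rel.add((p, r))
                    changed = True
    return rel

def approx_emptiness(sigma0, sigma2, states, initial, final, delta):
    """Return "Empty" (certain) or "Unknown" for the language of the TWA."""
    all_states = set(states) | {CHECK}
    esc = set(delta) | {(s, qf, "*", "up", CHECK)
                        for s in sigma0 | sigma2 for qf in final}

    def H(sym, tau):
        return {(p, q) for (s, p, ty, m, q) in esc if (s, ty, m) == (sym, tau, "stay")}

    def up(sym, tau, rel):
        return {(p, q) for (p, p1) in rel
                for (s, r, ty, m, q) in esc if (s, r, ty, m) == (sym, p1, tau, "up")}

    L = {tau: set() for tau in TYPES}
    for a in sigma0:
        for tau in TYPES:
            L[tau] |= up(a, tau, closure(H(a, tau), all_states))
    changed = True
    while changed:
        changed = False
        for f in sigma2:
            for tau in TYPES:
                S = {(p, q_t)
                     for (s, p, ty, m, p_t) in esc
                     for theta in ("0", "1")
                     if (s, ty, m) == (f, tau, CHI[theta])
                     for (p2, q_t) in L[theta]
                     if p2 == p_t}
                new = up(f, tau, closure(H(f, tau) | S, all_states))
                if not new <= L[tau]:
                    L[tau] |= new
                    changed = True
    hit = any((qi, CHECK) in L["*"] for qi in initial)
    return "Unknown" if hit else "Empty"

SIGMA0, SIGMA2 = {"a", "b", "c"}, {"f", "g", "h"}
X_DELTA = (
    {("a", "ql", tau, "stay", "qu") for tau in ("*", "0")}
    | {(s, "qu", "0", "up", "qu") for s in SIGMA0 | SIGMA2}
    | {(s, "ql", tau, "left", "ql") for s in SIGMA2 for tau in ("*", "0")}
)
print(approx_emptiness(SIGMA0, SIGMA2, {"ql", "qu"}, {"ql"}, {"qu"}, X_DELTA))
```

Each $L_\tau$ only grows and is bounded by the number of state pairs, so the fixpoint loop stops after polynomially many rounds. Merging all overloops of a given type into one set is what makes the answer an over-approximation.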

SLIDE 45

An Overloops-Based Polynomial Approximation

Discussion

  • The approach can be made coarser or finer: from a variant with no type information (i.e. $L = L_0 \cup L_1 \cup L_\star$) to something equivalent to the transformation to BUTA
  • The presented variant is polynomial in time and space
  • Astonishing accuracy in random tests: out of ≈ 20 000 TWA ($2 \le |Q| \le 20$), 75% of which had empty languages, only two were reported Unknown instead of Empty.

Caveat
Our generation scheme was simplistic. Trivial instances? More tests are to be made using a uniform generation scheme.

SLIDE 46

Conclusion and Perspectives

What We Have Seen
  • Two TWA membership algorithms (loops & overloops)
  • Two transformations from TWA into BUTA (idem)
  • Overloops-based BUTA have the expected states
  • Overloops-based BUTA are smaller (2 to 100 times...)
  • DTWA: lower upper bound on BUTA states with overloops
  • Approximation: polynomial, accurate in random tests

What Is Left To Do
  • Test the approximation with a uniform DTWA generation scheme
  • Characterise classes of TWA using overloops (deterministic, etc.)
  • Significant size reductions on TWA (optimise queries, etc.)

SLIDE 47

Some References

[Comon et al., 2007, Samuelides, 2007, Héam et al., 2009, Bojańczyk, 2008]

  • Bojańczyk, M. (2008). Tree-Walking Automata. LATA’08 (tutorial), LNCS, 5196.
  • Comon, H., Dauchet, M., Gilleron, R., Löding, C., Jacquemard, F., Lugiez, D., Tison, S., and Tommasi, M. (2007). Tree automata techniques and applications.
  • Héam, P.-C., Nicaud, C., and Schmitz, S. (2009). Random Generation of Deterministic Tree (Walking) Automata. In CIAA’09, volume 5642 of LNCS, pages 115–124.
  • Samuelides, M. (2007). Automates d’arbres à jetons. PhD thesis, Université Paris-Diderot - Paris VII.

SLIDE 48

Transformation Into BUTA

Using Overloops

Transformation Into BUTA Using Overloops

  0. Input: an escaped TWA $A = \langle \Sigma, Q, I, F, \Delta \rangle$
  1. Initialise States and Rules to $\emptyset$
  2. for each $a \in \Sigma_0$, $\tau \in T$ do
       let $P = (\tau, U^\tau_a[(H^\tau_a)^*])$
       add $a \to P$ to Rules and P to States
  3. repeat until Rules remain unchanged
       for each $f \in \Sigma_2$, $\tau \in T$ do
         add every $f(P_0, P_1) \to P$ to Rules and P to States,
         where $P_0, P_1 \in$ States are such that $P_0 = (0, S_0)$ and $P_1 = (1, S_1)$,
         and $P = (\tau, U^\tau_f[(H^\tau_f \cup S)^*])$, with S the set of simple loops built on the sons.
  4. Output: a BUTA B equivalent to A:
       $B = \langle \Sigma, \text{States}, \{ (\star, O) \in \text{States} \mid O \cap (I \times \{\checkmark\}) \neq \emptyset \}, \text{Rules} \rangle$

SLIDE 50

Transformation Into BUTA

Using Overloops

add every $f(P_0, P_1) \to P$ to Rules and P to States, where $P_0, P_1 \in$ States are such that $P_0 = (0, S_0)$ and $P_1 = (1, S_1)$, and $P = (\tau, U^\tau_f[(H^\tau_f \cup S)^*])$, with S the set of simple loops built on the sons.

Set of Simple Loops Built on the Sons (From Overloops)

$$S = \bigl\{ (p, q_\theta) \bigm| \exists \theta \in \mathbb{S} : \exists p_\theta \in Q \text{ s.t. } \langle f, p, \tau \to \chi(\theta), p_\theta \rangle \in \Delta \text{ and } (p_\theta, q_\theta) \in S_\theta \bigr\}$$

No need to store the son's symbol anywhere.
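The overloops-based construction can be prototyped in a few dozen lines. The sketch below is one possible reading of the algorithm, not the authors' tool: it escapes the input itself, calls the fresh state `CHECK`, represents a BUTA rule as a pair (left-hand side, target state), and all names are hypothetical. It is checked on the example automaton X.

```python
TYPES = ("*", "0", "1")
CHI = {"0": "left", "1": "right"}
CHECK = "check"  # hypothetical name for the fresh state added by escaping

def closure(pairs, states):
    """Reflexive-transitive closure R* of a relation R over `states`."""
    rel = set(pairs) | {(q, q) for q in states}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            for (q2, r) in list(rel):
                if q == q2 and (p, r) not in rel:
                    rel.add((p, r))
                    changed = True
    return rel

def twa_to_buta(sigma0, sigma2, states, initial, final, delta):
    """Build (States, Final, Rules); a state is (type, frozenset of overloops)."""
    all_states = set(states) | {CHECK}
    esc = set(delta) | {(s, qf, "*", "up", CHECK)
                        for s in sigma0 | sigma2 for qf in final}

    def state_for(sym, tau, simple):
        H = {(p, q) for (s, p, ty, m, q) in esc if (s, ty, m) == (sym, tau, "stay")}
        loops = closure(H | simple, all_states)
        over = {(p, q) for (p, p1) in loops
                for (s, r, ty, m, q) in esc if (s, r, ty, m) == (sym, p1, tau, "up")}
        return (tau, frozenset(over))

    rules = {((a,), state_for(a, tau, set())) for a in sigma0 for tau in TYPES}
    changed = True
    while changed:
        changed = False
        known = {P for (_, P) in rules}  # snapshot of the states found so far
        for f in sigma2:
            for tau in TYPES:
                for P0 in (P for P in known if P[0] == "0"):
                    for P1 in (P for P in known if P[0] == "1"):
                        sons = {"0": P0[1], "1": P1[1]}
                        S = {(p, q_t)
                             for (s, p, ty, m, p_t) in esc
                             for theta in ("0", "1")
                             if (s, ty, m) == (f, tau, CHI[theta])
                             for (p2, q_t) in sons[theta]
                             if p2 == p_t}
                        rule = ((f, P0, P1), state_for(f, tau, S))
                        if rule not in rules:
                            rules.add(rule)
                            changed = True
    buta_states = {P for (_, P) in rules}
    finals = {P for P in buta_states
              if P[0] == "*" and any((qi, CHECK) in P[1] for qi in initial)}
    return buta_states, finals, rules

def buta_accepts(rules, finals, t):
    """Nondeterministic bottom-up evaluation of the tree t."""
    def reach(u):
        if len(u) == 1:
            return {P for (lhs, P) in rules if lhs == (u[0],)}
        r0, r1 = reach(u[1]), reach(u[2])
        return {P for (lhs, P) in rules
                if len(lhs) == 3 and lhs[0] == u[0] and lhs[1] in r0 and lhs[2] in r1}
    return bool(reach(t) & finals)

SIGMA0, SIGMA2 = {"a", "b", "c"}, {"f", "g", "h"}
X_DELTA = (
    {("a", "ql", tau, "stay", "qu") for tau in ("*", "0")}
    | {(s, "qu", "0", "up", "qu") for s in SIGMA0 | SIGMA2}
    | {(s, "ql", tau, "left", "ql") for s in SIGMA2 for tau in ("*", "0")}
)
B_STATES, B_FINALS, B_RULES = twa_to_buta(
    SIGMA0, SIGMA2, {"ql", "qu"}, {"ql"}, {"qu"}, X_DELTA)
T = ("f", ("h", ("a",), ("b",)), ("a",))
print(len(B_STATES), buta_accepts(B_RULES, B_FINALS, T))
```

The states carry only a type and a set of overloops; no symbol is stored, in line with the remark above.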

SLIDE 51

Transformation Into BUTA in the Deterministic Case

Theorem: Deterministic Upper-Bound
In general, the overloops-based BUTA B has up to $|T| \times 2^{|Q|^2}$ states. However, it has at most $|T| \cdot 2^{|Q| \log_2(|Q|+1)}$ states if A is a DTWA.

  • A TWA $A = \langle \Sigma, Q, I, F, \Delta \rangle$ is deterministic (i.e. a DTWA) if, for all $\sigma \in \Sigma$, $p \in Q$, $\tau \in T$, $|\langle \sigma, p, \tau \to M, Q \rangle \cap \Delta| \le 1$. (We do not need the usual, stronger definition, where I is a singleton.)
  • A relation $R \subseteq Q^2$ is functional (or right-unique, or a partial function) if, for all $p, q, q' \in Q$, $p R q$ and $p R q' \implies q = q'$.
  • There are $2^{|Q|^2}$ binary relations on Q, of which $(|Q|+1)^{|Q|}$ are partial functions, of which $|Q|^{|Q|}$ are total functions.
  • If a relation R is functional, then so is $R^k$, for any $k \in \mathbb{N}$.
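The three counts can be checked by brute force for small state sets. The Python sketch below enumerates every binary relation on a set of n elements; the function name `count_relations` is hypothetical, and the enumeration is exponential, so it is only meant for very small n.

```python
from itertools import combinations, product

def count_relations(n):
    """Count (all relations, partial functions, total functions) on n elements."""
    pairs = list(product(range(n), repeat=2))
    total = partial = total_functions = 0
    for k in range(len(pairs) + 1):
        for rel in combinations(pairs, k):
            total += 1
            image = {}
            # Functional: no element is related to two different elements.
            functional = all(image.setdefault(p, q) == q for (p, q) in rel)
            if functional:
                partial += 1
                if len(image) == n:  # defined everywhere: a total function
                    total_functions += 1
    return total, partial, total_functions

for n in (1, 2, 3):
    print(n, count_relations(n))
```

For n = 3 the enumeration visits $2^{9} = 512$ relations, of which $(3+1)^3 = 64$ are partial functions and $3^3 = 27$ are total functions.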

SLIDE 52

Transformation Into BUTA in the Deterministic Case

Idea: each state built is the set of overloops (resp. loops) on some tree. We show that the set of overloops on a tree is functional.

Lemma: Deterministic TWA
If A is a deterministic TWA, then $\twoheadrightarrow_A$ is functional.

Not enough to make sets of loops functional. Consider the following run on $t = f(h(a, b), a)$, with the current state shown in brackets after the current node:

  1. $f[q_\ell](h(a, b), a)$
  2. $f(h[q_\ell](a, b), a)$
  3. $f(h(a[q_\ell], b), a)$
  4. $f(h(a[q_u], b), a)$
  5. $f(h[q_u](a, b), a)$
  6. $f[q_u](h(a, b), a)$
  7. $f[q_r](h(a, b), a)$
  8. $f(h(a, b), a[q_r])$
  9. $f(h(a, b), a[q_f])$
  10. $f[q_f](h(a, b), a)$

Loops on this tree: $\{ (q_\ell, q_\ell), (q_\ell, q_u), (q_\ell, q_f), \dots \}$. Not functional.

SLIDE 64

Transformation Into BUTA in the Deterministic Case

Lemma: Hidden Loops
Let $p, q, q' \in Q$, $q \neq q'$, such that $(p, q)$ and $(p, q')$ are loops of a TWA A on a given subtree $t|_\alpha$. Then if A is deterministic, either $(q, q')$ or $(q', q)$ must be a loop of A on $t|_\alpha$.

Proof. By definition, there exist two runs $c_0, \dots, c_n$ and $d_0, \dots, d_m$ such that $c_0 = d_0 = (\alpha, p)$, $c_n = (\alpha, q)$ and $d_m = (\alpha, q')$. If $n = m$ then $c_0 \twoheadrightarrow^n c_n$ and $c_0 \twoheadrightarrow^n d_n$. It follows that $c_n = d_m$, since $\twoheadrightarrow$ is functional. But this contradicts $q \neq q'$, so we must have $n \neq m$. Say that $n < m$. Then $c_n = d_n$, and $(\alpha, q) = d_n, \dots, d_m = (\alpha, q')$ forms a run. Therefore $(q, q')$ is a loop. If $n > m$, then by the same arguments $(q', q)$ is a loop.

SLIDE 66

Transformation Into BUTA in the Deterministic Case

Lemma: Functional Overloops
Let $p, q, q' \in Q$, such that $(p, q)$ and $(p, q')$ are overloops of a TWA A on a given subtree $t|_\alpha$. Then if A is deterministic, $q = q'$.

Proof. Writing $\mathrm{p}(\alpha)$ for the parent of $\alpha$, we have two runs

  • $(\alpha, p), \dots, (\alpha, s), (\mathrm{p}(\alpha), q)$
  • $(\alpha, p), \dots, (\alpha, s'), (\mathrm{p}(\alpha), q')$

Thus $(p, s)$ and $(p, s')$ are loops. If $s = s'$, then $q = q'$, because $\twoheadrightarrow$ is functional. If $s \neq s'$, then, by the Hidden Loops lemma, say $(s, s')$ is a loop. So there exist $s_1, \dots, s_n \in Q$ and positions $\beta_1, \dots, \beta_n$ in the subtree at $\alpha$ such that $(\alpha, s), (\beta_1, s_1), \dots, (\beta_n, s_n), (\alpha, s')$ is a run. Thus $(\alpha, s) \twoheadrightarrow (\mathrm{p}(\alpha), q)$ and $(\alpha, s) \twoheadrightarrow (\beta_1, s_1)$. It follows that $\mathrm{p}(\alpha) = \beta_1$, which lies in the subtree at $\alpha$: contradiction.

SLIDE 68

Transformation Into BUTA in the Deterministic Case

Theorem: Deterministic Upper-Bound
In general, the overloops-based BUTA B has up to $|T| \times 2^{|Q|^2}$ states. However, it has at most $|T| \cdot 2^{|Q| \log_2(|Q|+1)}$ states if A is a DTWA.

Proof. By construction, for every state $P = (\tau, L)$ generated for B by the overloops-based algorithm, there exists at least one subtree t such that L is the set of overloops of A on t. Thus, by the previous lemma, L is functional. Therefore, there are at most $|T| \cdot (|Q|+1)^{|Q|}$ states (or, equivalently, $|T| \cdot 2^{|Q| \log_2(|Q|+1)}$).
