Shared Loss Topology Discovery (SLTD)



slide-1
SLIDE 1

Background Spatial Dependence JI Models Identifiable JI Finding Siblings SLTD2 Class Size Finite Data Conc

TOPOLOGY INFERENCE: ESCAPING THE SPATIAL INDEPENDENCE STRAIGHTJACKET

Rhys Bowden, Darryl Veitch

Department of Electrical & Electronic Engineering
The University of Melbourne

slide-2
SLIDE 2

MULTICAST TOMOGRAPHY

slide-3
SLIDE 3

DEFINITIONS AND NOTATION

  • Tree T = (V, L).
  • Nodes V labelled 0, . . . , n.
  • m receivers R at leaves of tree.

slide-4
SLIDE 4

PROBING

slide-7
SLIDE 7

PROBING

  • View as vector-valued stochastic process

$Z(i) = [Z_1(i), \ldots, Z_n(i)]$.

  • Tree-geometry: node/path state fixed by states of ancestor links:

$X_k(i) = \prod_{j \in 0 \to k} Z_j(i)$.

GOAL: TOPOLOGY FROM TOMOGRAPHY

  • Deduce the topology T from the distribution of $X_R = (X_k(i))_{k \in R}$.
  • First assume infinite data, address identifiability.
  • Then consider inference with finite data.
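
The tree-geometry relation can be illustrated with a minimal sketch. The tree, the `parent` map and the function name `path_states` are assumptions made for this example only; they do not come from the slides.

```python
from math import prod

def path_states(parent, z):
    """Return X_k for every node k != 0, given link states Z_k in {0, 1}.

    parent[k] is the father of node k; node 0 is the root.
    Link k is the link from parent[k] down to k.
    """
    x = {}
    for k in parent:
        path = []
        node = k
        while node != 0:          # walk up the ancestor links to the root
            path.append(node)
            node = parent[node]
        x[k] = prod(z[j] for j in path)   # X_k is the product of Z_j on 0 -> k
    return x

# Hypothetical tree: 0 -> 1, 1 -> {2, 3}; receivers are nodes 2 and 3.
parent = {1: 0, 2: 1, 3: 1}
z = {1: 1, 2: 0, 3: 1}            # link 2 drops the probe
print(path_states(parent, z))     # {1: 1, 2: 0, 3: 1}
```

A probe reaches node k only if every link on the path from the root passes it, which is why a single 0 anywhere on the path forces the path state to 0.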

slide-9
SLIDE 9

PREVIOUSLY

SPATIAL AND TEMPORAL INDEPENDENCE (CLASSICAL ASSUMPTIONS)

  • Link processes Zk(i) mutually independent.
  • Each an i.i.d. random sequence: Pr(Zk(i) = 1) = lk.
  • Assume lk < 1, else unidentifiable.

slide-10
SLIDE 10

SHARED PATH TO BRANCH POINT

slide-12
SLIDE 12

SHARED TRANSMISSION

  • Function of two nodes, i, j:

$S(i, j) = \dfrac{\Pr(X_i = 1)\,\Pr(X_j = 1)}{\Pr(X_i = 1, X_j = 1)}$.

  • Under spatial independence

$S(i, j) = \Pr(X_b = 1)$, where b = b(i, j) is the branch point of i and j.

  • Use/need pairwise only ⇒ still feasible with finite data.
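
As a check on the spatial-independence identity, the sketch below computes the shared transmission exactly by enumerating link states on a small tree. The tree, the value 9/10 for every link, and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical tree: 0 -> 1, 1 -> {2, 3}, 3 -> {4, 5}; receivers 2, 4, 5.
parent = {1: 0, 2: 1, 3: 1, 4: 3, 5: 3}
l = {k: Fraction(9, 10) for k in parent}   # assumed link passage probabilities

def path(k):
    """Links on the path 0 -> k."""
    links = []
    while k != 0:
        links.append(k)
        k = parent[k]
    return links

def pr_all_pass(nodes):
    """Pr(X_k = 1 for all k in nodes) under independent links, by exact enumeration."""
    ks = sorted(parent)
    total = Fraction(0)
    for bits in product((0, 1), repeat=len(ks)):
        z = dict(zip(ks, bits))
        if all(all(z[j] for j in path(k)) for k in nodes):
            p = Fraction(1)
            for k in ks:
                p *= l[k] if z[k] else 1 - l[k]
            total += p
    return total

def shared_transmission(i, j):
    return pr_all_pass([i]) * pr_all_pass([j]) / pr_all_pass([i, j])

print(shared_transmission(4, 5) == pr_all_pass([3]))  # True: branch point is node 3
print(shared_transmission(2, 4) == pr_all_pass([1]))  # True: branch point is node 1
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.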

slide-13
SLIDE 13

CHOOSING SIBLINGS

SHARED TRANSMISSION DECREASES DOWN THE TREE

  • If b(i, j) under b(i, k) then S(i, j) < S(i, k).

slide-14
SLIDE 14

CERTAIN PATERNITY

  • Pair(s) of nodes in B with lowest shared transmission are siblings.
  • If J ⊂ B has S(i, j) minimal for each pair i, j ∈ J, then the nodes in J are siblings.

slide-17
SLIDE 17

SHARED TRANSMISSION FOR VIRTUAL NODES

  • Nodes created by merging siblings are “virtual”.
  • Will correspond to real nodes if algorithm successful.
  • But how to calculate Shared Transmission for j ∈ B\R?
  • Define “virtual” losses for j as the sequence

$\tilde{X}_j = \begin{cases} 1 & \text{if } X_k = 1 \text{ for any } k \in d(j) \cap R \\ 0 & \text{otherwise,} \end{cases}$

since we know $X_j = 1$ if a transmission is seen at any descendant.

  • Shared transmission defined analogously:

$\tilde{S}(i, j) = \dfrac{\Pr(\tilde{X}_i = 1)\,\Pr(\tilde{X}_j = 1)}{\Pr(\tilde{X}_i = 1, \tilde{X}_j = 1)}$.

  • $\tilde{S}(i, j) = S(i, j)$ under classical assumptions!
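
The virtual state of a merged node is the logical OR of the receiver observations beneath it. A minimal sketch, assuming per-probe 0/1 lists for each receiver; the function name and the data are hypothetical:

```python
def virtual_losses(x_receivers, descendants):
    """Per-probe virtual state of an internal node.

    The state is 1 if any receiver below the node saw the probe, else 0.
    """
    n_probes = len(next(iter(x_receivers.values())))
    return [int(any(x_receivers[k][p] for k in descendants))
            for p in range(n_probes)]

# Hypothetical observations for receivers 4 and 5 over five probes.
x_receivers = {4: [1, 0, 0, 1, 0], 5: [1, 1, 0, 0, 0]}
print(virtual_losses(x_receivers, [4, 5]))  # [1, 1, 0, 1, 0]
```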

slide-18
SLIDE 18

ITERATIVE BOTTOM-UP TOPOLOGY INFERENCE

Red nodes are the working set B.

slide-19
SLIDE 19

SHARED LOSS TOPOLOGY DISCOVERY – SLTD

Input: set of receivers R; distribution $f_R$, $X_R(i)$.
Variables: nodes V, links L, root nodes B, $\tilde{X}(i)$.
Initialize: V ← R; L ← ∅; B ← R; $\tilde{X}_R(i) \leftarrow X_R(i)$.
while |B| > 1 do
    Calculate $S^* = \min_{\{j,k\} \subset B} \tilde{S}_{j,k}$;
    Find largest J ⊂ B such that ∀{j, k} ⊂ J, $\tilde{S}_{j,k} = S^*$;
    if there exist i ∈ J, j ∉ J with $\tilde{S}_{i,j} = S^*$ then
        return ∅;    # sibling set not transitive!
    else
        Create new node v, set $\tilde{X}_v = \bigvee_{j \in J} \tilde{X}_j$;
        V ← V ∪ {v};
        L ← L ∪ $\bigcup_{j \in J} \{(v, j)\}$;
        B ← (B\J) ∪ {v};
    end if
end while
Create root node 0;
V ← V ∪ {0};
L ← L ∪ {(0, B)};    # |B| = 1 here
Output: T = (V, L);
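
The listing can be exercised with a rough Python sketch. It is not the authors' implementation. It assumes the receiver distribution is given exactly as a list of (outcome, probability) pairs, selects the pair with the lowest shared transmission (as on the CERTAIN PATERNITY slide), grows J greedily rather than searching for the largest set, and labels new nodes with fresh integers. The example tree and all names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def sltd(receivers, dist):
    """Sketch of SLTD. dist: list of (x, p), x maps receiver -> 0/1, p = Pr(X_R = x)."""
    below = {r: frozenset([r]) for r in receivers}  # node -> receivers beneath it
    B, links, next_id = set(receivers), [], max(receivers) + 1

    def pr(nodes):
        # Probability that the (virtual) state of every node in `nodes` is 1.
        return sum(p for x, p in dist
                   if all(any(x[r] for r in below[n]) for n in nodes))

    def S(j, k):
        return pr([j]) * pr([k]) / pr([j, k])

    while len(B) > 1:
        s = {frozenset(pair): S(*pair) for pair in combinations(sorted(B), 2)}
        s_star = min(s.values())
        J = set(min(s, key=s.get))            # start from one minimising pair
        for k in sorted(B - J):               # grow J greedily
            if all(s[frozenset((k, j))] == s_star for j in J):
                J.add(k)
        # A node outside J matching s_star with a member of J breaks transitivity.
        if any(s[frozenset((i, j))] == s_star for i in J for j in B - J):
            return None
        v, next_id = next_id, next_id + 1
        below[v] = frozenset().union(*(below[j] for j in J))
        links += [(v, j) for j in sorted(J)]
        B = (B - J) | {v}
    links.append((0, B.pop()))
    return links

# Hypothetical classical model: 0 -> 1, 1 -> {2, 3}, 3 -> {4, 5}, all l_k = 9/10.
parent = {1: 0, 2: 1, 3: 1, 4: 3, 5: 3}
receivers = [2, 4, 5]
l = Fraction(9, 10)

def up(k):
    while k != 0:
        yield k
        k = parent[k]

dist = {}
for bits in product((0, 1), repeat=len(parent)):
    z = dict(zip(sorted(parent), bits))
    p = Fraction(1)
    for b in bits:
        p *= l if b else 1 - l
    x = tuple(int(all(z[j] for j in up(r))) for r in receivers)
    dist[x] = dist.get(x, 0) + p
dist = [(dict(zip(receivers, x)), p) for x, p in dist.items()]

print(sltd(receivers, dist))  # [(6, 4), (6, 5), (7, 2), (7, 6), (0, 7)]
```

In this run the virtual nodes 6 and 7 play the roles of the true internal nodes 3 and 1, so the recovered tree matches the assumed one up to relabelling.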

slide-21
SLIDE 21

CHARACTERIZING THE LINK PROCESSES

SPATIAL STRUCTURE

  • Assume Z(i) = [Z1(i), . . . , Zn(i)] stationary and ergodic.
  • Spatial dependency captured by the marginal Z = [Z1, . . . , Zn].
  • Induces the path-passage marginal X = [X1, . . . , Xn].
  • We are interested in fXR.

LINK JOINT DISTRIBUTION

  • Characterise joint distribution fZ using probabilities

$\Pr(Z = r) = \Pr(Z_1 = r_1, Z_2 = r_2, \ldots, Z_n = r_n)$,

one for each link passage pattern $r = [r_1, \ldots, r_n] \in \{0, 1\}^n$.

  • These sum to 1, so $2^n - 1$ degrees of freedom.
  • In contrast: classical case is much simpler, n degrees of freedom.

slide-22
SLIDE 22

MODELS VERSUS TOPOLOGY

MODELS

  • A topology T with a joint distribution fZ is a model M = (T, fZ).
  • A model M induces a joint distribution fR(M) on the vector observable XR.
  • T(M) is the tree component of the model M.
  • Goal: to determine T(M) from fR(M).

slide-23
SLIDE 23

MEASUREMENT EQUIVALENCE

Two models M1 and M2 are measurement equivalent if fR(M1) = fR(M2).

EXAMPLE 1:

Classical with lk = 0.9 for all k ∈ V.

slide-24
SLIDE 24

MEASUREMENT EQUIVALENCE

Two models M1 and M2 are measurement equivalent if fR(M1) = fR(M2).

EXAMPLE 1:

Both models have

  • $\Pr(X_1 = 1) = 0.9^3$
  • $\Pr(X_2 = 1) = 0.9^3$
  • $\Pr(X_3 = 1) = 0.9^2$
  • $\Pr([X_1, X_2] = \mathbf{1}_2) = 0.9^4$
  • $\Pr([X_1, X_3] = \mathbf{1}_2) = 0.9^4$
  • $\Pr([X_2, X_3] = \mathbf{1}_2) = 0.9^4$
  • $\Pr(X_R = \mathbf{1}_3) = 0.9^5$.

slide-25
SLIDE 25

MEASUREMENT EQUIVALENCE

Two models M1 and M2 are measurement equivalent if fR(M1) = fR(M2).

EXAMPLE 1:

Classical with lk = 0.9 for all k ∈ V.

$\Pr(Z = z) = \begin{cases} 0 & [z_1, z_2, z_3] = [1, 1, 0] \\ 0.9^3\,0.1^2 + 0.9^2 - 0.9^3 & z = [1, 0, 1, 0, 1] \\ 0.9^{\sum_i z_i}\,0.1^{5 - \sum_i z_i} & \text{otherwise.} \end{cases}$

slide-28
SLIDE 28

TOPOLOGY IDENTIFIABILITY

EXAMPLE 1 LESSONS

  • Example 1 gave two models with same fR(M), different T(M).
  • So in that case, T is not identifiable.
  • Must restrict M if we hope to identify T for each M ∈ M.

TOPOLOGICAL DETERMINISM

  • A class M is Topologically Determinate if ∄ M1, M2 ∈ M with

fR(M1) = fR(M2) and T(M1) ≠ T(M2).

  • i.e., models with same fR have same T.

slide-30
SLIDE 30

GOALS (INFINITE DATA CASE)

  • Find “large”, natural Topologically Determinate class(es) M.
  • Find algorithm guaranteed to recover T(M) for all M ∈ M.

EXAMPLE: CLASSICAL MODELS MC

  • Classical models are Topologically Determinate.
  • SLTD works for them.
  • In fact, one model per fR(M), so one model per T.

slide-31
SLIDE 31

NEW CLASSES

[Figure: diagram of the model classes MC, MCE, MJI, MAJI and MAJIE.]

slide-32
SLIDE 32

DIMENSIONS OF NEW CLASSES

Tree T              1      2      3       4           5
dim(MC,T)           4      6      9      14          29
dim(MCE,T)         12     54    489   14350   536805405
dim(MJI,T)         15     56    478   14133   536613988
dim(MAJI,T)        15     56    478   14133   536613988
dim(MAJIE,T)       15     57    489   14395   536805415
dim(MT)            15     63    511   16383   536870911

TABLE: Examples of model class dimensions.
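
The dim(MT) row can be cross-checked against the degrees-of-freedom count for a general joint distribution over n links. The sketch assumes that the dim(MC,T) row gives the number of links n of each example tree, since a classical model has one parameter per link.

```python
# Link counts n read off the dim(MC,T) row (assumption: one parameter per link).
n_links = [4, 6, 9, 14, 29]

# A general joint distribution over n binary links has 2^n - 1 free parameters.
dim_all = [2**n - 1 for n in n_links]
print(dim_all)  # [15, 63, 511, 16383, 536870911]
```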

slide-35
SLIDE 35

CLASSICALLY EQUIVALENT MODELS: MCE

DEFINITION

M1 ∈ MCE iff ∃M2 ∈ MC with fR(M1) = fR(M2) and T(M1) = T(M2). These are models that appear classical.

SLTD STILL WORKS!

  • SLTD returns T(M) correctly for every M ∈ MCE.
  • Returns topology as though M is classical.
  • ∴ Returns correct topology.
  • So MCE is Topologically Determinate.

EXTENSION TRICK WORKS IN GENERAL

  • Can apply for any algorithm and class it works on.

slide-38
SLIDE 38

COMMENTS ON MCE

STRENGTHS

  • MC ⊂ MCE.
  • Much larger than MC.
  • Can contain complex spatial dependencies.

DRAWBACKS

  • Not constructive.
  • Depends on receiver positions.
  • Need a model class that:

– Is not based on receiver positions.

– Reflects properties of real networks.

slide-41
SLIDE 41

PAINLESS GENERALITY

RECALL

  • $X_k = \prod_{i \in (0 \to k)} Z_i$.

DEPENDENCY OF HIDDEN Z

  • If Xi = 0 then for all k below i, Xk = 0.
  • If Xf(i) = 0 then changing the value of Zi won’t change the output.
  • This suggests a way of adding dependency without affecting fR(M).

slide-43
SLIDE 43

MODEL PRINCIPLES

HOW DOES DEPENDENCY ARISE?

  • Links touch at routers, influenced by router traffic and dynamics

– suggests dependencies between siblings.

  • Distant links unlikely to affect each other except via tree.

– suggests ruling out ‘action at a distance’.

TRANSLATION TO MODEL PRINCIPLES

  • Locally: most general possible dependency between adjacent links.
  • Globally: only necessary dependency over non-adjacent links.

slide-44
SLIDE 44

JUMP INDEPENDENCE

DEFINITION (JUMP INDEPENDENT MODELS)

A model with links L and receivers R is Jump Independent if ∀k ∈ V\R, ∀J ⊂ V with J ∩ d(k) = ∅, $X_{c(k)}$ is conditionally independent of $X_J$ given $X_k = 1$.

slide-45
SLIDE 45

DEFINITIONS

DEFINITION (SUBTREE INDUCED BY U)

Let M(T, fZ) ∈ MJI with T = (V, L). Let U ⊂ V. Then define the subtree induced by U as $T(U) = \bigcup_{i \in U} \{0 \to i\}$, and R(U) as the leaves of T(U).

DEFINITION (ρ-VALUES)

Define sibling passage probabilities: $\rho_D = \Pr\big(\cap_{j \in D} \{X_j = 1\} \mid X_{f(D)} = 1\big)$ for each set of siblings D.

slide-46
SLIDE 46

FUNDAMENTAL PROPERTY OF JI MODELS

LEMMA (FUNDAMENTAL PROPERTY OF JI MODELS)

Let M(T, fZ) ∈ MJI. Then $\Pr\big(\bigcap_{k \in U} \{X_k = 1\}\big) = \prod_{i \in T(U) \setminus R(U)} \rho_{c(i) \cap T(U)}$ for every U ⊂ V.

slide-47
SLIDE 47

FUNDAMENTAL PROPERTY OF JI MODELS

Example: U = {2, 5, 6}

$\Pr(X_2 = 1, X_5 = 1, X_6 = 1) = \rho_1 \cdot \rho_{2,3} \cdot \rho_{4,5} \cdot \rho_6$
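
The factorisation in this example can be reproduced mechanically. The sketch below lists the index sets of the ρ factors for a given U. The tree is an assumption inferred from the factors shown (0 → 1, 1 → {2, 3}, 3 → {4, 5}, 4 → 6), since the figure is not available, and the function name is hypothetical.

```python
def rho_factors(parent, U):
    """Index sets of the rho factors in Pr(X_k = 1 for all k in U) for a JI model."""
    nodes = set()
    for k in U:                      # T(U): union of the paths 0 -> k
        while k != 0:
            nodes.add(k)
            k = parent[k]
    children = {}
    for k in nodes:                  # group the nodes of T(U) by their father
        children.setdefault(parent[k], []).append(k)
    # One factor per non-leaf node i of T(U), indexed by c(i) intersected with T(U).
    return sorted(tuple(sorted(c)) for c in children.values())

# Assumed tree for the example: 0 -> 1, 1 -> {2, 3}, 3 -> {4, 5}, 4 -> 6.
parent = {1: 0, 2: 1, 3: 1, 4: 3, 5: 3, 6: 4}
print(rho_factors(parent, [2, 5, 6]))  # [(1,), (2, 3), (4, 5), (6,)]
```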

slide-49
SLIDE 49

SHARED TRANSMISSION IN JI MODELS

  • For i, j ∈ V,

$S_{i,j} = \Pr(X_b = 1) \cdot \dfrac{\rho_1 \rho_2}{\rho_{1,2}} = \Big(\prod_{k \in 0 \to b} \rho_k\Big) \cdot \dfrac{\rho_1 \rho_2}{\rho_{1,2}}$

  • Shared Transmission a function of the shared path and the two children at the branch point.
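
One way to see where this expression comes from is to apply the fundamental property of JI models to U = {i}, U = {j} and U = {i, j}. In the sketch below, 1 and 2 denote the children of b on the paths to i and j, and $\pi_i$, $\pi_j$ are shorthand (introduced here, not in the slides) for the products of single-node ρ factors strictly below those children.

```latex
\begin{align*}
\Pr(X_i = 1) &= \Pr(X_b = 1)\,\rho_1\,\pi_i, \\
\Pr(X_j = 1) &= \Pr(X_b = 1)\,\rho_2\,\pi_j, \\
\Pr(X_i = 1, X_j = 1) &= \Pr(X_b = 1)\,\rho_{1,2}\,\pi_i\,\pi_j, \\
S_{i,j} &= \frac{\Pr(X_i = 1)\,\Pr(X_j = 1)}{\Pr(X_i = 1, X_j = 1)}
         = \Pr(X_b = 1)\,\frac{\rho_1 \rho_2}{\rho_{1,2}}.
\end{align*}
```

The factors below the branch point cancel, which is why only the shared path and the two children at b remain.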

slide-51
SLIDE 51

BINARY JI MODELS

MEASUREMENT EQUIVALENCE

  • Assume M1 ∈ MJI and M2 ∈ MC with T(M1) = T(M2).
  • Solve for li from M2 in terms of ρJ from M1.

$l_i = \begin{cases} \dfrac{\rho_{i,s(i)}}{\rho_{s(i)}} & \text{if } i \in R \\[2ex] \rho_i \cdot \dfrac{\rho_{c_1(i)}\,\rho_{c_2(i)}}{\rho_{c_1(i),c_2(i)}} & \text{if } i = 1 \\[2ex] \dfrac{\rho_{i,s(i)}}{\rho_{s(i)}} \cdot \dfrac{\rho_{c_1(i)}\,\rho_{c_2(i)}}{\rho_{c_1(i),c_2(i)}} & \text{otherwise.} \end{cases}$

OBTAIN (BINARY) EXAMPLES OF MODELS IN CE

  • If li < 1, must be the marginal link passage parameter of the CE model.
  • Insight: sibling dependencies compensated by change in transmission on path to father.
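
The case formula can be checked numerically on the smallest binary tree. The sketch below uses a hypothetical JI model on the tree 0 → 1, 1 → {2, 3} with made-up ρ values, and confirms that the resulting classical parameters give the same receiver distribution.

```python
from fractions import Fraction as F

# Hypothetical binary JI model on the tree 0 -> 1, 1 -> {2, 3}; receivers 2 and 3.
rho = {"1": F(9, 10), "2": F(4, 5), "3": F(4, 5), "2,3": F(7, 10)}

# Classical link parameters from the case formula.
l1 = rho["1"] * rho["2"] * rho["3"] / rho["2,3"]   # case i = 1
l2 = rho["2,3"] / rho["3"]                          # case i in R, s(2) = 3
l3 = rho["2,3"] / rho["2"]                          # case i in R, s(3) = 2

# JI receiver probabilities (fundamental property) versus classical products of l.
assert l1 * l2 == rho["1"] * rho["2"]            # Pr(X2 = 1)
assert l1 * l3 == rho["1"] * rho["3"]            # Pr(X3 = 1)
assert l1 * l2 * l3 == rho["1"] * rho["2,3"]     # Pr(X2 = 1, X3 = 1)
print(l1, l2, l3)  # 144/175 7/8 7/8
```

All three values are below 1 here, so this particular JI model is measurement equivalent to a valid classical model on the same tree.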

slide-52
SLIDE 52

IDENTIFIABILITY FAILURE: INVISIBLE PATHS

LEMMA

Let i, j, k be three distinct receivers in a Jump Independent model such that b(i, k) is below b(i, j). Then S(i, k) = S(j, k) if and only if b(i, j) → b(i, k) is invisible.

slide-53
SLIDE 53

IDENTIFIABILITY FAILURE: INVISIBLE PATHS

AUGMENTED PATH

  • An augmented path $g(g_1, g_2) \to h(h_1, h_2)$ is a path g → h together with $g_1, g_2 \in c(g)$, $h_1, h_2 \in c(h)$ such that $g_1 \in g \to h$.

slide-55
SLIDE 55

IDENTIFIABILITY FAILURE: INVISIBLE PATHS

INVISIBLE PATH

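  • An augmented path is invisible if

$\dfrac{\rho_{g_1}\,\rho_{g_2}}{\rho_{g_1,g_2}} = \Big(\prod_{i \in g \to h} \rho_i\Big)\,\dfrac{\rho_{h_1}\,\rho_{h_2}}{\rho_{h_1,h_2}}$.

  • For Binary models this reduces to: $\prod_{i \in g \to h} l_i = 1$.
  • Analogue of $l_k = 1$ from classical.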

slide-57
SLIDE 57

IDENTIFIABILITY FAILURE: LOCAL STRUCTURE

LOCAL LIMITATIONS ON ANY SIBLING SET J

  • Internally agreeing if $S_{i,j} = S_{k,l}$ ∀i, j, k, l ∈ J with i ≠ j, k ≠ l.
  • Internally disagreeing if $S_{i,j} \neq S_{k,l}$ ∀i, j, k, l ∈ J with {i, j} ≠ {k, l}.

ROLES

  • Disagreeing is the generic/general case.
  • Agreeing includes classical.

slide-60
SLIDE 60

AGREEABLE JI MODELS

DEFINITION (AGREEABLE JI MODELS (MAJI))

An AJI model is a model M ∈ MJI which satisfies:
i) (internally consistent) Each sibling set J is agreeing or disagreeing.
ii) (no invisible paths) No augmented paths in M are invisible.

ROLE OF RESTRICTIONS

  • Condition (i) prevents sibling sets from looking like they aren’t.
  • Condition (ii) prevents groups of non-siblings from looking like they are siblings.

Including ‘agreeing’ in (i) is a big headache, but important!

slide-61
SLIDE 61

A PROPERTY OF SIBLINGS IN JI MODELS

LEMMA (SIBLINGS AGREE EXTERNALLY)

Let M ∈ MJI. If two nodes i, j are members of a sibling set J, and k ∈ R such that (0 → k) ∩ J = ∅, then Si,k = Sj,k.

slide-62
SLIDE 62

SEEKING CERTAIN PATERNITY

TRY TO INVERT SIBLING PROPERTY

  • Define agreement set of i, j ∈ V

$A_{i,j} = \{k \in R : S(i, k) = S(j, k),\ k \neq i, j\}$.

  • Agreement sets used to compare ‘world view’ of candidate siblings.
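
Agreement sets are simple to compute once the pairwise shared transmissions are known. A minimal sketch, assuming the values are stored in a dict keyed by unordered pairs; the data, the function name and the use of `math.isclose` to compare values are choices made for this example.

```python
from math import isclose

def agreement_set(S, receivers, i, j):
    """A_{i,j}: receivers k (other than i, j) with S(i, k) equal to S(j, k)."""
    return {k for k in receivers
            if k not in (i, j)
            and isclose(S[frozenset((i, k))], S[frozenset((j, k))])}

# Hypothetical shared-transmission values for receivers 2, 4, 5 (4 and 5 siblings).
S = {frozenset((4, 5)): 0.81, frozenset((2, 4)): 0.9, frozenset((2, 5)): 0.9}
print(agreement_set(S, [2, 4, 5], 4, 5))  # {2}
print(agreement_set(S, [2, 4, 5], 2, 4))  # set()
```

Receivers 4 and 5 see receiver 2 identically, while 2 and 4 disagree about 5, which is the ‘world view’ comparison described above.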

slide-64
SLIDE 64

FINDING COMPLETE SIBLING SETS

DEFINITION (EXTERNALLY-AGREEING SETS)

Call D ⊂ R an externally-agreeing set (EAS) if |D| ≥ 3 and Ai,j = R\D for all i, j ∈ D.

DEFINITION (ALL-AGREEING SETS)

Call D ⊂ R with |D| ≥ 2 an all-agreeing set (AAS) if Ai,j = R\{i, j} for all i, j ∈ D. Subsets of an all-agreeing set are also all-agreeing. Call an all-agreeing set D a maximal all-agreeing set (MAAS) if it is not a proper subset of another one.
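
Both definitions translate directly into set comparisons. The sketch below assumes the agreement sets are already available in a dict `A` keyed by unordered pairs; the function names and the example data are hypothetical.

```python
from itertools import combinations

def is_eas(D, R, A):
    """Externally-agreeing set: |D| >= 3 and A[i, j] == R minus D for all pairs in D."""
    D, R = set(D), set(R)
    return len(D) >= 3 and all(A[frozenset(p)] == R - D
                               for p in combinations(D, 2))

def is_aas(D, R, A):
    """All-agreeing set: |D| >= 2 and A[i, j] == R minus {i, j} for all pairs in D."""
    D, R = set(D), set(R)
    return len(D) >= 2 and all(A[frozenset(p)] == R - set(p)
                               for p in combinations(D, 2))

# Hypothetical agreement sets for receivers 1..4, where {1, 2, 3} agree only about 4.
R = {1, 2, 3, 4}
A = {frozenset(p): set() for p in combinations(R, 2)}
for p in combinations((1, 2, 3), 2):
    A[frozenset(p)] = {4}

print(is_eas({1, 2, 3}, R, A))  # True
print(is_aas({1, 2, 3}, R, A))  # False
```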

slide-66
SLIDE 66

FINDING COMPLETE SIBLING SETS

LEMMA (FINDING DISAGREEING SIBLING SETS)

Consider M ∈ MAJI with receiver nodes R. A set D ⊂ R with |D| ≥ 3 is a disagreeing sibling set if and only if it is an EAS.

LEMMA (FINDING AGREEING SIBLING SUBSETS)

Consider M ∈ MAJI with receiver nodes R. A set D ⊂ R with |D| ≥ 2 is a subset of an agreeing sibling set if and only if it is an AAS.

  • The MAAS are the maximal agreeing sibling subsets.
  • Some/all of these may still have hidden siblings.

slide-67
SLIDE 67

PROPOSITION (CERTAIN PATERNITY II)

Assume an M ∈ MAJI model. Then at least one available sibling set can be identified without error.

PROOF

  • Find all the EASs and AASs.

slide-68
SLIDE 68

CASE 1: AT LEAST ONE EAS EXISTS

Select any of them.

slide-69
SLIDE 69

CASE 2: NO EAS EXISTS

Select a MAAS which is a sibling set (can test if one below another).

slide-74
SLIDE 74

SLTD2

Similar to SLTD, but agreement set based.

THEOREM (CORRECTNESS OF SLTD2 ON MAJI)

Let M = (T, fZ) ∈ MAJI. Then SLTD2 returns T.

PROOF

  • Find sibling set using Certain Paternity.
  • $S(i, j) = \tilde{S}(i, j)$ for M ∈ MJI.

  • So each iteration will be correct.
  • Hence recover T at termination.

slide-77
SLIDE 77

AJIE MODELS

  • Defined analogously to MCE, but start with MAJI instead of MC.
  • MCE ⊂ MAJIE, since MC ⊂ MAJI.
  • SLTD2 succeeds on all topologies in MAJIE.

slide-78
SLIDE 78

RELATIONSHIPS BETWEEN CLASSES

[Figure: diagram of the model classes MC, MCE, MJI, MAJI and MAJIE.]

slide-79
SLIDE 79

DIMENSIONS OF CLASSES

Tree T              1      2      3       4           5
dim(MC,T)           4      6      9      14          29
dim(MCE,T)         12     54    489   14350   536805405
dim(MJI,T)         15     56    478   14133   536613988
dim(MAJI,T)        15     56    478   14133   536613988
dim(MAJIE,T)       15     57    489   14395   536805415
dim(MT)            15     63    511   16383   536870911

TABLE: Examples of model class dimensions.

slide-85
SLIDE 85

INFINITE DATA SUMMARY

PREVIOUS WORK

  • Classical model: full spatial independence of tree loss process.
  • Algorithm SLTD to recover topology in this case.

OUR WORK

  • Break spatial independence assumptions.
  • Define more general class MCE such that SLTD still works.
  • General result for extending class while keeping algorithm.
  • Define class MJI with physically motivated structure.
  • Find TD class MAJI with dim(MAJI) = dim(MJI).
  • New algorithm SLTD2 recovers topology for all M ∈ MAJI.
  • Also recovers topology for all M ∈ MAJIE.

85 / 121

slide-86
SLIDE 86

CHALLENGES FOR FINITE DATA

  • Underlying Sij not known, only estimated.
  • Failure of exact Sij equality underlying agreement set definition.
  • Random topology selection in MAJI, with degree constraints.
  • Random model selection, with loss constraints.
  • Sensible error metric on trees.

86 / 121

slide-87
SLIDE 87

A SLTD BASED ALGORITHM

MODIFIED ITERATION

  • Estimate shared transmission over all pairs:
    $\hat{S}_{ij} = \dfrac{\left(\sum X_i / n_p\right)\left(\sum X_j / n_p\right)}{\sum X_i X_j / n_p}$.
  • Merge i, j into J∗ = (ij) with minimal $\hat{S}_{ij}$.
  • Merge additional receivers k into J∗ obeying (we use β = 0.002)
    $\hat{S}_{(ij)k} \le (1 + \beta)\,\hat{S}^{*}$.

Straightforward because key steps are based on inequality of Sij.
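
The first two steps of the iteration can be sketched as below. This is only an illustration under stated assumptions: probes are taken to be tuples of 0/1 reception indicators indexed by receiver, n_p is taken to be the number of probes, and the function names `shared_transmission` and `best_pair` are hypothetical. The merge rule with β is not covered.

```python
from itertools import combinations


def shared_transmission(probes, i, j):
    """Estimate S_ij = (mean X_i)(mean X_j) / mean(X_i X_j) from 0/1 outcomes.

    Assumption: probes is a non-empty list of tuples, probes[t][r] in {0, 1}.
    """
    n_p = len(probes)
    mean_i = sum(x[i] for x in probes) / n_p
    mean_j = sum(x[j] for x in probes) / n_p
    mean_ij = sum(x[i] * x[j] for x in probes) / n_p
    if mean_ij == 0:
        # No joint receptions: the ratio is undefined, so make sure this
        # pair is never selected as the minimum.
        return float("inf")
    return mean_i * mean_j / mean_ij


def best_pair(probes, receivers):
    """Return the pair (i, j) with minimal estimated S_ij, and that minimum."""
    scores = {
        (i, j): shared_transmission(probes, i, j)
        for i, j in combinations(receivers, 2)
    }
    pair = min(scores, key=scores.get)
    return pair, scores[pair]
```

Returning infinity for pairs with no joint receptions is a design choice of this sketch, not something stated on the slide.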

87 / 121

slide-88
SLIDE 88

MEASURING APPROXIMATE AGREEMENT

THREE STEPS TO MEASURE AGREEMENT OF J TO A

(i) shared passage measure pk;ij (|J| = 2 and |A| = 1);
(ii) agreement set measure gij(A) (|J| = 2 and |A| ≥ 1);
(iii) sibling set measure r_A(J) (|J| ≥ 2 and |A| ≥ 1).

88 / 121

slide-91
SLIDE 91

MEASURING APPROXIMATE AGREEMENT

STEP (I): SHARED PASSAGE MEASURE pk;ij (|J| = 2 AND |A| = 1)

Let pk|i = Pr(Xk = 1|Xi = 1). From the definition, Sik = Sjk is equivalent to pk|i = pk|j.

Estimate pk|i by $\hat{p}_{k|i} = \sum (X_k X_i) / n_i$.

Null hypothesis H0: pk|i = pk|j. Under H0, the pooled estimate is
$\hat{p}_{k|\cdot} = (n_i \hat{p}_{k|i} + n_j \hat{p}_{k|j}) / (n_i + n_j)$.

Test statistic:
$T_{ij}(k) = \dfrac{\hat{p}_{k|i} - \hat{p}_{k|j}}{\sqrt{\dfrac{n_i + n_j}{n_i n_j}\,\hat{p}_{k|\cdot}\,(1 - \hat{p}_{k|\cdot})}}$
with corresponding (Gaussian based) p-value pij ∈ [0, 1]. Higher pij ⇒ higher agreement.
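
A minimal sketch of this test follows, under assumptions the slide does not spell out: n_i is taken to be the number of probes received at i, the p-value is taken to be two-sided, and the function name `passage_agreement_pvalue` and the probe layout (tuples of 0/1 indicators) are hypothetical.

```python
import math


def passage_agreement_pvalue(probes, i, j, k):
    """Two-proportion Gaussian test of H0: p_{k|i} = p_{k|j}.

    Assumptions: n_i counts probes received at i; the test is two-sided.
    """
    n_i = sum(x[i] for x in probes)
    n_j = sum(x[j] for x in probes)
    if n_i == 0 or n_j == 0:
        raise ValueError("receivers i and j must each see at least one probe")
    p_ki = sum(x[k] * x[i] for x in probes) / n_i
    p_kj = sum(x[k] * x[j] for x in probes) / n_j
    # Pooled estimate of the common conditional probability under H0.
    pooled = (n_i * p_ki + n_j * p_kj) / (n_i + n_j)
    variance = (n_i + n_j) / (n_i * n_j) * pooled * (1 - pooled)
    if variance == 0:
        # Degenerate case (pooled estimate is 0 or 1): no evidence against H0.
        return 1.0
    t = (p_ki - p_kj) / math.sqrt(variance)
    # Two-sided tail probability of a standard Gaussian.
    return math.erfc(abs(t) / math.sqrt(2))
```

A value near 1 indicates that i and j see k in the same way, which is the sense of "higher agreement" used above.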

91 / 121

slide-94
SLIDE 94

MEASURING APPROXIMATE AGREEMENT

STEP (II): AGREEMENT SET MEASURE gij(A) (|J| = 2 AND |A| ≥ 1)

Let A ⊂ B\{i, j}, and select a significance level α.

Note the good proportion, gp, of the p(k) obeying p(k) > α, k ∈ A. (Avoids using p-value as a weight, which is a bad idea.)

Note worst agreement: gw = min_{k∈A} p(k). (For gp and gw, higher values ⇒ closer agreement.)

Define gij(A) = gp, using gw to break ties. In other words, agreement follows the worst case in A.
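
One way to realise "gp with gw as tie-breaker" is to return the pair (gp, gw) and rely on lexicographic tuple comparison. The sketch below assumes the p-values p(k) for k ∈ A have already been computed; the function name `agreement_set_measure` and the dictionary input are hypothetical.

```python
def agreement_set_measure(pvalues, alpha):
    """Return (g_p, g_w) for a pair {i, j} against an agreement set A.

    pvalues: dict mapping each k in A to its p-value p(k).
    Tuples compare lexicographically, so g_w only matters when g_p ties.
    """
    if not pvalues:
        raise ValueError("A must contain at least one receiver")
    values = list(pvalues.values())
    g_p = sum(1 for p in values if p > alpha) / len(values)  # good proportion
    g_w = min(values)                                        # worst agreement
    return (g_p, g_w)
```

For example, two pairs with the same good proportion are ranked by their worst p-value, so `(1.0, 0.2) > (1.0, 0.1)`.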

94 / 121

slide-98
SLIDE 98

MEASURING APPROXIMATE AGREEMENT

STEP (III): SIBLING SET MEASURE r_A(J) (|J| ≥ 2 AND |A| ≥ 1)

Assume A ⊂ B \ J. To define r_A(J), must combine the values of gij(A) for all pairs {i, j} ⊂ J.

Per-leaf noise reduction: for each k ∈ J, average the g(A) values involving k.

Define r_A(J) ∈ [0, 1] as the smallest such average. (signature of bad leaves won’t be diluted)

Notes:
– r_A(J) = g(A) whenever |J| = 2, such as in binary trees.
– Typically A = B \ J, in which case we write simply r(J).
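
The min-of-averages construction can be sketched as follows. This assumes each gij(A) has been reduced to a single number in [0, 1] (ignoring the gw tie-breaker) and is supplied in a dictionary keyed by unordered pair; the function name `sibling_set_measure` is hypothetical.

```python
def sibling_set_measure(J, g):
    """Return r_A(J): the smallest per-leaf average of pairwise g values.

    J: iterable of at least two leaves.
    g: dict mapping frozenset({i, j}) to g_ij(A) in [0, 1].
    """
    leaves = list(J)
    if len(leaves) < 2:
        raise ValueError("J must contain at least two leaves")
    averages = []
    for k in leaves:
        # Average agreement over all pairs in J that involve leaf k.
        values = [g[frozenset((k, other))] for other in leaves if other != k]
        averages.append(sum(values) / len(values))
    # Taking the minimum keeps one badly agreeing leaf from being diluted.
    return min(averages)
```

With |J| = 2 each leaf has a single pair to average over, so the function returns that pair's g value, matching the first note above.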

98 / 121

slide-103
SLIDE 103

DEFINING TRUETREE

Inspired by SLTD2, tries to use r(J) to identify the MAAS and EAS.

LOCATING AN EAS

Infeasible to search for highest r(J) at each iteration: too many J. Greedy alternative: construct a subset of J’s likely to contain the EASes.

  • Set seed J1 = {i, j}, record r(J1).
  • J2 = J1 ∪ {kd}, where kd ∈ B \ J1 is the leaf that minimizes $r_{\{k\}}(J_1)$
    (invite most disagreeable member outside of J to join J).
  • For each of the $\binom{m}{2}$ seeds, get a sequence of |B| − 1 candidate EAS sets J.
  • Select J∗ with the highest agreement r(J).
  • Termination step (needed since above gives |J| ≤ |B| − 1):
    set J = B if |J∗| = |B| − 1, AND if r(J) > α for all J of size |J| = |B| − 1.

LOCATING A COMPLETE MAAS

Try to assemble set with highest internal agreement based on sibling transitivity.

  • Order all seed J’s according to their r(J): r(J1) ≤ r(J2) ≤ r(J3) ≤ etc.
  • Initialize J∗ = J1.
  • If J∗ ∩ J2 = ∅, set J∗ = J∗ ∪ J2 and continue, else stop.

Finally: from candidate EAS and MAAS, select one with highest r(J).

103 / 121

slide-106
SLIDE 106

RANDOM TOPOLOGY GENERATION

Want to constrain maximum node degree dmax:
– gives spectrum of error modes.
– includes binary special case.

Generation Method:
– Pseudo-uniform bottom up algorithm with dmax constraint.
– Working on fast approach for true uniform generation.

TEST CASES

TABLE: The (m, dmax) used in model generation. Rows: m = 3, 5, 9; columns: dmax = 2, ..., 9. Five dmax values are unused (—) for m = 3, three for m = 5, and none for m = 9.

106 / 121

slide-110
SLIDE 110

RANDOM SPATIAL DEPENDENCY GENERATION

Want to sample from MAJI(T).

Main task is to select the joint sibling distributions. Need to add constraints to allow scenario control.

Generation Method:
– Select loss marginal targets for each sibling set.
– Express constraints as a matrix equation defining a subset of MJI.
– Use MCMC (R.L.Smith ’84) method to sample uniformly.

Compose sibling set samples according to global JI model rules. Resulting model-sample is in MAJI(T) with probability 1.

110 / 121

slide-115
SLIDE 115

TREE DISTANCE / ERROR

Want to define distance between T1 = (V1, L1) and T2 = (V2, L2), sharing the same labelled receivers R, with m = |R|.

Define R(v) to be set of receivers below v. (leaf based equivalent tree description)

Let V1\2 = {v ∈ V1 | ∄u ∈ V2 with R(v) = R(u)}. (nodes in T1 that do not appear in T2)

Definition: dist(T1, T2) = |V1\2| + |V2\1|

This is a true distance metric, taking values in {0, 1, . . . , 2(m − 2)}.

Error: eT = dist(T, T̂), where T̂ is the estimated tree.
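
The distance can be computed directly from the receiver sets R(v). The sketch below assumes each tree is given as a dictionary from a node to its list of children, with leaves labelled by receiver and internal labels distinct from receiver labels; the names `receiver_sets` and `tree_distance` are hypothetical.

```python
def receiver_sets(children, root):
    """Map every node v to R(v), the frozenset of receivers (leaves) below v."""
    below = {}

    def visit(v):
        kids = children.get(v, [])
        if not kids:
            below[v] = frozenset([v])  # a leaf is its own receiver set
        else:
            below[v] = frozenset().union(*(visit(c) for c in kids))
        return below[v]

    visit(root)
    return below


def tree_distance(children1, root1, children2, root2):
    """dist(T1, T2) = |V_{1\\2}| + |V_{2\\1}|, matching nodes by R(v)."""
    r1 = receiver_sets(children1, root1)
    r2 = receiver_sets(children2, root2)
    sets1, sets2 = set(r1.values()), set(r2.values())
    only_in_1 = sum(1 for s in r1.values() if s not in sets2)
    only_in_2 = sum(1 for s in r2.values() if s not in sets1)
    return only_in_1 + only_in_2
```

For m = 3, the two binary trees that group (a, b) and (b, c) respectively differ in one internal node each, giving the maximum distance 2(m − 2) = 2.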

115 / 121

slide-117
SLIDE 117

PERFORMANCE UNDER ‘GENTLE MODELS’

Low Loss Regime: ρi ∈ [0.9, 0.99] for each node i.

[Figure: number of errors (y-axis) against number of receivers, 3 to 9 (x-axis), for SLTD and TrueTree.]

(errors averaged over 200 random models for each fixed T, and 6000 probes)

Binary trees: samples Classically Equivalent ⇒ SLTD, TrueTree legal.
If dmax > 2: TrueTree legal (model in MAJI), but SLTD behaviour undefined.

117 / 121

slide-118
SLIDE 118

PERFORMANCE UNDER ‘GENTLE MODELS’

Low Loss Regime: ρi ∈ [0.9, 0.99] for each node i.

[Figure: number of errors (y-axis) against maximum node degree, 2 to 9 (x-axis), for SLTD and TrueTree with 3, 5, and 9 receivers each.]

118 / 121

slide-120
SLIDE 120

PERFORMANCE ON DISRUPTIVE MODELS

Low Hot Spot scenario: single model with negative dependency.

[Figure: average errors for SLTD and TrueTree.]

(errors averaged over 200 random models for each fixed T, and 6000 probes)

SLTD has worst possible eT = 2 in 100% of cases. TrueTree has eT = 0 in 100% of cases.

120 / 121

slide-121
SLIDE 121

FINITE CONCLUSION

  • Agreement sets great in theory, tricky in practice, but can be done.
  • TrueTree
    – gives comparable results to SLTD on gentle loss.
    – can handle disruptive loss.
    – outperforms SLTD when loss higher.
  • More work to be done, but promise of SLTD2 on rich class of spatial models can be realized.

121 / 121