Confidence Sets for Persistent Diagrams Aleksandr Popov Eindhoven - - PowerPoint PPT Presentation


slide-1
SLIDE 1

Confidence Sets for Persistent Diagrams

Aleksandr Popov

Eindhoven University of Technology

12th June 2018

1 / 31

slide-2
SLIDE 2

What to expect

  • 1. Introduction and motivation
  • 2. Formal definition
  • 3. Computation methods

2 / 31

slide-3
SLIDE 3

Introduction and motivation

3 / 31

slide-4
SLIDE 4

Back to the origins

Main idea of TDA: determine topology of underlying space, based on a point cloud. Point cloud can be viewed as a sample of the true space.

Random sample of 10 points.

4 / 31

slide-6
SLIDE 6

Čech complex

Čech complex on the 10 points.

5 / 31

slide-12
SLIDE 12

Homology

Betti numbers: $\beta_0 = 1$, $\beta_1 = 1$. These will change if we take a different radius for the balls!

6 / 31

slide-14
SLIDE 14

A different Čech complex

A different Čech complex on the 10 points: $\beta_0 = 6$, $\beta_1 = 0$.

7 / 31

slide-19
SLIDE 19

Persistent homology

[Figure: persistence diagram; axes Birth and Death, ticks 1 to 4; legend: component, loop.]

Persistence diagram from Vietoris–Rips complex for the 10 points.

8 / 31

slide-20
SLIDE 20

So what’s wrong with this?

What here is ‘noise’ and what is ‘signal’? How well do we represent the homology of the space of interest?

9 / 31

slide-21
SLIDE 21

What do we want?

[Figure: persistence diagram; axes Birth and Death, ticks 1 to 4; legend: component, loop.]

Persistence diagram from Vietoris–Rips complex for the 10 points with the confidence band.

10 / 31

slide-22
SLIDE 22

Formal definition

11 / 31

slide-26
SLIDE 26

Statistical model and basic definitions

Space of interest: manifold $M$.
Sample set $S_n = \{X_1, \dots, X_n\}$ following distribution $P$.
$P$ is concentrated on $M$.
Persistence diagram: multiset of points in the plane. Denote by $\mathcal{P}$.

[Figure: persistence diagram; axes Birth and Death, ticks 1 to 4; legend: component, loop.]

Persistence diagram $\mathcal{P}$ for the example.

12 / 31

slide-30
SLIDE 30

Čech filtrations; distance

Čech filtration: a sequence of Čech complexes with gradually increasing radius.
Distance function for a set $A \subset \mathbb{R}^D$:

$$d_A(x) = \inf_{y \in A} \lVert y - x \rVert_2 ,$$

i.e. the distance to the closest point in $A$.
The Čech filtration corresponds to the lower level set filtration of the distance function:

$$\{x : d_{S_n}(x) \le \varepsilon\} ,$$

for $\varepsilon$ increasing from 0 to $\infty$.

Regions for increasing ε.
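
As a minimal sketch of the distance function and its lower level sets, assuming the sample is a finite list of coordinate tuples (the names `dist_to_set` and `in_sublevel` are hypothetical and not from the talk):

```python
import math

def dist_to_set(x, A):
    """d_A(x): Euclidean distance from x to the closest point of the finite set A."""
    return min(math.dist(x, y) for y in A)

def in_sublevel(x, A, eps):
    """True if x lies in the lower level set {x : d_A(x) <= eps}."""
    return dist_to_set(x, A) <= eps

# Two illustrative sample points (made up for this example).
S = [(0.0, 0.0), (2.0, 0.0)]
midpoint = (1.0, 0.0)
print(dist_to_set(midpoint, S))       # distance to either sample point
print(in_sublevel(midpoint, S, 0.5))  # radius 0.5: the balls do not reach the midpoint
print(in_sublevel(midpoint, S, 1.0))  # radius 1.0: the balls touch at the midpoint
```

For a finite sample the level set at ε is the union of closed balls of radius ε around the sample points, which is why letting ε grow traces out the filtration.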

13 / 31

slide-33
SLIDE 33

What are we trying to get?

We can construct the persistence diagram of $S_n$, $\hat{\mathcal{P}}$.
We use it to estimate the persistence diagram of $M$, $\mathcal{P}$.
Can we guarantee with high probability that they are close for all the points?
We need to express ‘closeness’ somehow.

14 / 31

slide-35
SLIDE 35

Bottleneck distance

Perfect matching $M$ of sets $A$ and $B$ (bijection).
Cost between points $a \in A$, $b \in B$:

$$d_\infty(a, b) = \max(|a_x - b_x|, |a_y - b_y|) .$$

Bottleneck distance:

$$W_\infty(A, B) = \min_{M} \max_{(a,b) \in M} d_\infty(a, b) .$$

The bottleneck distance minimises the maximum pairwise cost.

[Figure: points a and b, with lengths 2 and 3 marked.]

$d_\infty(a, b) = 3$.

15 / 31

slide-38
SLIDE 38

Bottleneck distance

Perfect matching $M$ of sets $A$ and $B$ (bijection).
Cost between points $a \in A$, $b \in B$:

$$d_\infty(a, b) = \max(|a_x - b_x|, |a_y - b_y|) .$$

Bottleneck distance:

$$W_\infty(A, B) = \min_{M} \max_{(a,b) \in M} d_\infty(a, b) .$$

The bottleneck distance minimises the maximum pairwise cost.

[Figure: example points (0, 0), (0, 0.8), (4, 2), (4.8, 2.8), (2, 4), (2.8, 4).]

$W_\infty(A, B) = 0.8$.
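
The example value can be checked with a brute-force sketch that tries every bijection. The transcript does not say which of the six points belong to $A$ and which to $B$; the split below (each point paired with its shifted neighbour) is an assumption, and the function names are hypothetical. For actual persistence diagrams the matching may also send points to the diagonal, which this sketch leaves out.

```python
from itertools import permutations

def d_inf(a, b):
    """Max-norm cost between two points in the plane."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def bottleneck_bruteforce(A, B):
    """Minimum over all bijections of the maximum pairwise cost.

    Tries all n! matchings, so it is only usable for tiny, non-empty sets.
    """
    if len(A) != len(B):
        raise ValueError("a perfect matching needs sets of equal size")
    return min(
        max(d_inf(a, b) for a, b in zip(A, perm))
        for perm in permutations(B)
    )

# Assumed split of the six example points into two sets of three.
A = [(0, 0), (4, 2), (2, 4)]
B = [(0, 0.8), (4.8, 2.8), (2.8, 4)]
print(round(bottleneck_bruteforce(A, B), 6))  # 0.8
```

Every pair in the natural matching costs 0.8, and any other matching pairs points that are much further apart, so the minimum of the maxima is 0.8.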

15 / 31

slide-41
SLIDE 41

What are we trying to get: take 2

We can construct the persistence diagram of $S_n$, $\hat{\mathcal{P}}$.
We use it to estimate the persistence diagram of $M$, $\mathcal{P}$.
Can we guarantee with high probability that they are close for all the points?
Given confidence $\alpha \in (0, 1)$, find $c_n$ such that

$$\limsup_{n \to \infty} \mathbb{P}(W_\infty(\hat{\mathcal{P}}, \mathcal{P}) > c_n) \le \alpha .$$

If a point in $\hat{\mathcal{P}}$ is further than $c_n$ from the diagonal, then with probability at least $1 - \alpha$ it is also not on the diagonal in $\mathcal{P}$, so it is significant.
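
A small sketch of this rule, assuming the distance to the diagonal is measured in the same max-norm as the bottleneck cost, so that a point (birth, death) lies at distance (death − birth) / 2 from the diagonal. The function name, the diagram and the value of `c_n` are made up for illustration.

```python
def split_by_significance(diagram, c_n):
    """Split (birth, death) points into signal and noise for a given c_n."""
    signal, noise = [], []
    for birth, death in diagram:
        # In the max-norm the closest diagonal point is the midpoint
        # ((birth + death) / 2, (birth + death) / 2), at distance (death - birth) / 2.
        if (death - birth) / 2 > c_n:
            signal.append((birth, death))
        else:
            noise.append((birth, death))
    return signal, noise

diagram = [(0.0, 1.0), (0.5, 3.0)]
signal, noise = split_by_significance(diagram, c_n=0.6)
print(signal)  # [(0.5, 3.0)]
print(noise)   # [(0.0, 1.0)]
```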

16 / 31

slide-42
SLIDE 42

Bottleneck stability

$L_\infty$ distance between functions $f, g : X \to \mathbb{R}$:

$$\lVert f - g \rVert_\infty = \sup_{x \in X} |f(x) - g(x)| .$$

[Figure: plots of $\sin x$ and $\sin(x + \pi)$.]
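
For the two sine curves, the distance can be approximated numerically. This is a sketch under the assumption that $X = [0, 2\pi]$ and that a finite grid is fine enough; `sup_distance` is a hypothetical helper.

```python
import math

def sup_distance(f, g, xs):
    """Approximate sup |f(x) - g(x)| by the maximum over the finite grid xs."""
    return max(abs(f(x) - g(x)) for x in xs)

# 1001 evenly spaced grid points on [0, 2*pi]; the grid contains pi/2.
xs = [2 * math.pi * k / 1000 for k in range(1001)]
d = sup_distance(math.sin, lambda x: math.sin(x + math.pi), xs)
print(round(d, 6))  # 2.0
```

Since $\sin(x + \pi) = -\sin x$, the difference is $2 \sin x$, whose largest absolute value is 2.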

17 / 31

slide-44
SLIDE 44

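Bottleneck stability

$L_\infty$ distance between functions $f, g : X \to \mathbb{R}$:

$$\lVert f - g \rVert_\infty = \sup_{x \in X} |f(x) - g(x)| .$$

Bottleneck stability:

$$W_\infty(\mathcal{P}(f), \mathcal{P}(g)) \le \lVert f - g \rVert_\infty ,$$

where $\mathcal{P}(f)$ denotes the persistence diagram of $f$.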

17 / 31

slide-48
SLIDE 48

Hausdorff distance

For two sets $A, B \subset \mathbb{R}^D$:

$$H(A, B) = \max\left\{ \max_{x \in A} \min_{y \in B} \lVert x - y \rVert ,\ \max_{x \in B} \min_{y \in A} \lVert x - y \rVert \right\} .$$

This is the maximum Euclidean distance from a point in one set to the closest point in the other set.

[Figure: example points (0, 0), (0, 0.8), (3, 2), (3, 3), (2, 4), (2.8, 4).]

$H(A, B) = 3\sqrt{2}$.
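
A direct sketch of the definition for finite point sets follows. The transcript does not record which example points belong to which set; the split below is an assumption, chosen because it reproduces the stated value $3\sqrt{2}$ (the distance from (0, 0) to (3, 3)). Other splits of the same six points give other values.

```python
import math

def hausdorff(A, B):
    """Hausdorff distance between two finite, non-empty point sets."""
    def directed(P, Q):
        # Largest distance from a point of P to its nearest point in Q.
        return max(min(math.dist(p, q) for q in Q) for p in P)
    return max(directed(A, B), directed(B, A))

# Assumed split of the six example points.
A = [(0, 0), (0, 0.8), (3, 2)]
B = [(3, 3), (2, 4), (2.8, 4)]
h = hausdorff(A, B)
print(h, 3 * math.sqrt(2))  # the two values agree up to rounding
```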

18 / 31

slide-49
SLIDE 49

Bringing this all together

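For two sets $A, B \subset \mathbb{R}^D$:

$$H(A, B) = \max\left\{ \max_{x \in A} \min_{y \in B} \lVert x - y \rVert ,\ \max_{x \in B} \min_{y \in A} \lVert x - y \rVert \right\} .$$

$L_\infty$ distance between functions $f, g : X \to \mathbb{R}$:

$$\lVert f - g \rVert_\infty = \sup_{x \in X} |f(x) - g(x)| .$$

Therefore,

$$\lVert d_{S_n} - d_M \rVert_\infty = H(S_n, M) .$$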
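
The step from the two definitions to this identity is not spelled out on the slide. A short justification, assuming both sets are non-empty and compact (an assumption not stated in the talk) and writing $A = S_n$, $B = M$, could run as follows:

```latex
% Lower bound: for x in A we have d_A(x) = 0, hence
\[
  \lVert d_A - d_B \rVert_\infty \ge \max_{x \in A} d_B(x)
  = \max_{x \in A} \min_{y \in B} \lVert x - y \rVert ,
\]
% and symmetrically with A and B swapped, so
% \lVert d_A - d_B \rVert_\infty \ge H(A, B).
%
% Upper bound: fix x, let b in B attain d_B(x) and let a in A be closest to b.
% By the triangle inequality,
\[
  d_A(x) \le \lVert x - a \rVert
  \le \lVert x - b \rVert + \lVert b - a \rVert
  \le d_B(x) + H(A, B) ,
\]
% and symmetrically d_B(x) \le d_A(x) + H(A, B), so
% \lVert d_A - d_B \rVert_\infty \le H(A, B).
```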

19 / 31

slide-51
SLIDE 51

Bringing this all together (pt. 2)

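Using the stability theorem, we get

$$W_\infty(\mathcal{P}_{S_n}, \mathcal{P}_M) \le \lVert d_{S_n} - d_M \rVert_\infty = H(S_n, M) .$$

Earlier we said we want to have

$$\limsup_{n \to \infty} \mathbb{P}(W_\infty(\hat{\mathcal{P}}, \mathcal{P}) > c_n) \le \alpha .$$

Since the bottleneck distance is bounded by the Hausdorff distance, given confidence $\alpha$ it suffices to find $c_n$ such that

$$\limsup_{n \to \infty} \mathbb{P}(H(S_n, M) > c_n) \le \alpha .$$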

20 / 31

slide-52
SLIDE 52

Final definition

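Given confidence $\alpha$, we need to find $c_n$ such that

$$\limsup_{n \to \infty} \mathbb{P}(H(S_n, M) > c_n) \le \alpha .$$

The confidence set $C_n$ is the set of persistence diagrams that are plausible as true persistence diagrams, given our estimate $\hat{\mathcal{P}}$:

$$C_n = \{ \tilde{\mathcal{P}} : W_\infty(\hat{\mathcal{P}}, \tilde{\mathcal{P}}) \le c_n \} .$$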

21 / 31

slide-53
SLIDE 53

Illustration

[Figure: persistence diagram; axes Birth and Death, ticks 1 to 4; legend: component, loop.]

Persistence diagram from Vietoris–Rips complex for the 10 points with the confidence band.

22 / 31

slide-54
SLIDE 54

Computation methods

23 / 31

slide-55
SLIDE 55

Overview

Multiple methods.
Simplest: subsampling.
Better: concentration of measure.
Different, more robust: density estimation.

24 / 31

slide-57
SLIDE 57

Subsampling

Let $b = b_n$ be such that $b = o\left(\frac{n}{\log n}\right)$ and $b_n \to \infty$ as $n \to \infty$.
We have $n$ data points. Define $N = \binom{n}{b}$.

  • 1. Draw $N$ subsamples from $S_n$, each of size $b$: $S^1_{b,n}, \dots, S^N_{b,n}$.
  • 2. Compute for each subsample $T_j = H(S^j_{b,n}, S_n)$.
  • 3. Define $L_b(t) = \frac{1}{N} \sum_{j=1}^{N} \mathbb{1}(T_j > t)$, i.e. count the proportion of subsamples with Hausdorff distance to the original above $t$.
  • 4. Let $c_b = 2 L_b^{-1}(\alpha)$.
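
The four steps can be sketched in plain Python. Two things here are assumptions rather than part of the slide: the sketch draws a fixed number of random subsamples instead of all $\binom{n}{b}$, and it reads $L_b^{-1}(\alpha)$ as the smallest $t$ with $L_b(t) \le \alpha$. All function and parameter names are hypothetical.

```python
import math
import random

def hausdorff(A, B):
    """Hausdorff distance between two finite, non-empty point sets."""
    def directed(P, Q):
        return max(min(math.dist(p, q) for q in Q) for p in P)
    return max(directed(A, B), directed(B, A))

def tail_inverse(values, alpha):
    """Smallest t among the values with (fraction of values > t) <= alpha."""
    n = len(values)
    for t in sorted(values):
        if sum(v > t for v in values) / n <= alpha:
            return t

def subsampling_radius(sample, b, alpha, num_subsamples=200, seed=0):
    """Steps 1 to 4 with random subsamples standing in for all C(n, b) of them."""
    rng = random.Random(seed)
    # Steps 1 and 2: draw subsamples of size b and record T_j = H(subsample, sample).
    T = [hausdorff(rng.sample(sample, b), sample) for _ in range(num_subsamples)]
    # Steps 3 and 4: invert the empirical tail function and double the result.
    return 2 * tail_inverse(T, alpha)

# Illustrative data: 40 evenly spaced points on the unit circle.
points = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
          for k in range(40)]
c_b = subsampling_radius(points, b=10, alpha=0.05)
print(c_b)
```

The fixed seed only makes the sketch reproducible; the resulting `c_b` is the half-width that would be compared against each point's distance to the diagonal.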

25 / 31

slide-59
SLIDE 59

Why does this work?

Just as $S_n$ is sampled from $M$, our subsamples are sampled from $S_n$.
We tune $c_b$ so that most of our subsamples are close to $S_n$.
This way $S_n$ will be close to $M$.
Formally, for all large $n$:

$$\mathbb{P}(W_\infty(\hat{\mathcal{P}}, \mathcal{P}) > c_b) \le \mathbb{P}(H(S_n, M) > c_b) \le \alpha + O\left(\left(\frac{b}{n}\right)^{1/4}\right) .$$

26 / 31

slide-60
SLIDE 60

Other methods

Subsampling method is conservative.
Concentration of measure-based method is better, but complicated.
Both use Čech filtration.
Both are not robust to outliers. Density estimation-based methods are.
Other methods require too much statistics to present here.

27 / 31

slide-61
SLIDE 61

Some experimental results

28 / 31

slide-62
SLIDE 62

Some experimental results (pt. 2)

29 / 31

slide-63
SLIDE 63

Summary

We introduced the concept of confidence sets. Goal: distinguish ‘noise’ from ‘signal’ in persistence diagrams built on point sets. Implementation: confidence interval-like band around the diagonal. Points in the band are noise. Computation: various methods, subsampling the simplest.

30 / 31

slide-64
SLIDE 64

References

David Cohen-Steiner, Herbert Edelsbrunner and John Harer. ‘Stability of Persistence Diagrams’. In: Discrete & Computational Geometry 37.1 (Jan. 2007), pp. 103–120. issn: 1432-0444. doi: 10.1007/s00454-006-1276-5.

Brittany Terese Fasy et al. ‘Confidence Sets for Persistence Diagrams’. Version 3. In: Annals of Statistics 42.6 (20th Nov. 2014), pp. 2301–2339. doi: 10.1214/14-AOS1252. arXiv: 1303.7117 [math.ST].

Brittany Terese Fasy et al. ‘Supplement to: Confidence Sets for Persistence Diagrams’. In: Annals of Statistics 42.6 (20th Nov. 2014). doi: 10.1214/14-AOS1252SUPP.

31 / 31