Combinatorial entropy and succinct data structures Gilles Schaeffer - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Gilles Schaeffer

Analysis of Algorithms, 2009

Combinatorial entropy and succinct data structures

based in part on joint work with

  • L. Castelli Aleardi, O. Devillers,
  • E. Fusy and D. Poulalhon
slide-2
SLIDE 2

Before we start... Geometric data: meshes

Among data structures for geometric data, I pick meshes...

  • Geographic information systems
  • Surface reconstruction from sampling
  • Surface modelling

slide-3
SLIDE 3

Before we start... ∃ very large geometric data

  • St. Matthew (Stanford’s Digital Michelangelo Project, 2000): 186 million vertices, 6 gigabytes (for storing on disk), minutes for loading the model from disk
  • David statue (Stanford’s Digital Michelangelo Project, 2000): 2 billion polygons, 32 gigabytes (without compression)

No existing algorithm or data structure can deal with the entire model

slide-5
SLIDE 5

Before we start... What we are aiming at

  • Mesh compression: disk storage, transmission
  • Geometric data structures

Merge into: compact representations of geometric data structures

slide-8
SLIDE 8

Starter: the encoding of plane trees

ordered tree with n edges ↔ balanced parenthesis word of length 2n

1 1 1 0 1 0 0 0 1 0 1 1 0 1 0 0 1 1 1 0 1 0 0 0 1 0 1 1 0 1 0 0

⇒ 2n bits for encoding an ordered tree with n edges

Compare to the standard explicit representation: 3n pointers ≈ 96 bits, 3n log n in theory

slide-13
SLIDE 13

Starter: the encoding of plane trees

ordered tree with n edges ↔ balanced parenthesis word of length 2n

1 1 1 0 1 0 0 0 1 0 1 1 0 1 0 0 1 1 1 0 1 0 0 0 1 0 1 1 0 1 0 0

⇒ 2n bits for encoding an ordered tree with n edges

enumeration: $B_n = \frac{1}{n+1}\binom{2n}{n} \approx 2^{2n}\, n^{-3/2}$

$\log_2 B_n = 2n + O(\lg n)$, that is, 2 bits per vertex (bpv) asymptotically

This is an optimal encoding!

it matches asymptotically the information-theory lower bound

exponential growth rate ⇔ combinatorial entropy
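As a small illustration of the encoding and of the counting bound, here is a minimal Python sketch. The nested-list tree format, the function names `encode` and `catalan`, and the example tree are assumptions made for this example only; they do not come from the slides.

```python
from math import comb, log2


def encode(children):
    """Parenthesis word of a plane tree given as nested lists of children.

    A depth-first walk writes 1 when it goes down an edge and 0 when it
    comes back up, so a tree with n edges yields 2n bits.
    """
    return "".join("1" + encode(child) + "0" for child in children)


def catalan(n):
    """B_n = binom(2n, n) / (n + 1), the number of plane trees with n edges."""
    return comb(2 * n, n) // (n + 1)


# Hypothetical example: the root has two children, the first of which has two leaves.
tree = [[[], []], []]
word = encode(tree)                          # 4 edges, hence 8 bits

# Information-theoretic lower bound per edge; it approaches 2 as n grows.
bits_per_edge = log2(catalan(1000)) / 1000
```

The gap between `bits_per_edge` and 2 is the O(lg n)/n term of the formula above.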

slide-24
SLIDE 24

Starter: linear space data structures for plane trees?

1 1 1 0 1 0 0 0 1 0 1 1 0 1 0 0 1 1 1 0 1 0 0 0 1 0 1 1 0 1 0 0

ordered tree with n edges ↔ balanced parenthesis word of length 2n

Navigation in the tree: handlers

  • move the handler to first son
  • move the handler to father
  • move the handler to next brother

Constant time with the standard (pointer) representation, but the pointer-based representation uses Θ(n log n) bits

On the parenthesis word, handler = index of opening bracket:

  • first son: index → index+1
  • next brother: index → matching(index)+1
  • father: index → outer(index)

up to linear time!
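The three handler moves can be sketched directly on the bit string. This is a naive version, assuming 0-based indices, bit 1 for an opening bracket and bit 0 for a closing one; the function names mirror the slide's `matching` and `outer` but the implementation is only an illustration, and each call may scan a linear number of bits.

```python
def matching(word, i):
    """Index of the closing bracket that matches the opening bracket at i."""
    depth = 0
    for j in range(i, len(word)):
        depth += 1 if word[j] == "1" else -1
        if depth == 0:
            return j
    raise ValueError("word is not balanced")


def first_son(word, i):
    """index -> index + 1, when the next bit opens a bracket."""
    j = i + 1
    return j if j < len(word) and word[j] == "1" else None


def next_brother(word, i):
    """index -> matching(index) + 1, when that bit opens a bracket."""
    j = matching(word, i) + 1
    return j if j < len(word) and word[j] == "1" else None


def father(word, i):
    """index -> outer(index): the nearest enclosing opening bracket."""
    depth = 0
    for j in range(i - 1, -1, -1):
        depth += 1 if word[j] == "0" else -1
        if depth < 0:
            return j
    return None  # the node hangs directly below the (implicit) root


# The 32-bit word shown on the slide.
word = "11101000101101001110100010110100"
```

For instance, `next_brother(word, 0)` skips the whole first subtree and lands on index 8, while `father(word, 13)` walks back to index 10.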

slide-32
SLIDE 32

Starter: linear space data structures for plane trees (Jacobson, Focs89)

[Figure: the balanced parenthesis word (2n bits), cut into blocks b1, b2, b3, b4, b5]

Decompose into m small blocks of size ε

matching(index): go slowly inside block; if border reached: interblock

  • encode interblock explicitly: up to n edges ⇒ space n log n
  • encode only the pioneers (outermost between blocks), O(m) of them ⇒ space m log n

(1, 22)(2, 9)(3, 6)(10, 19)(15, 16)(20, 21)

the explicit representation must allow navigation...

slide-36
SLIDE 36

Starter: linear space data structures for plane trees (Jacobson, Focs89)

[Figure: the balanced parenthesis word (2n bits), cut into blocks b1, b2, b3, b4, b5, with the two arrays B and T]

B = 1100000001000010000100
T = 22 9 19 16 21

Decompose into m small blocks of size ε

matching(index): go slowly inside block; if border reached: interblock

  • encode interblock explicitly: up to n edges ⇒ space n log n
  • encode only the pioneers (outermost between blocks), O(m) of them ⇒ space m log n

low weight bit vectors, m log n bits, select/rank queries

matching(3): 3,4,5, interblock, rank_B(3) = 2, T(2) = 9, 9,8,7,6.

Taking ε = Θ(log n): space m log n = O(n), that is O(n) extra bits, queries in O(log n)

succinct data structures: want space 2n + o(n) and queries in O(1)
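The block decomposition with pioneers can be sketched as follows. This is a simplified Python illustration, not Jacobson's actual structure: rank and select are done by naive scans, indices are 0-based, and the class name `BlockedParens` is hypothetical. The test word is also an assumption: it was chosen to be consistent with the pairs, the bit vector B and the table T quoted on the slide, with blocks of size 5.

```python
def naive_matches(word):
    """match[i] for every position, computed with a stack (preprocessing only)."""
    match, stack = [0] * len(word), []
    for i, c in enumerate(word):
        if c == "(":
            stack.append(i)
        else:
            j = stack.pop()
            match[i], match[j] = j, i
    return match


class BlockedParens:
    def __init__(self, word, block):
        self.word, self.block = word, block
        match = naive_matches(word)
        self.B = [0] * len(word)  # B[i] = 1 iff position i is an opening pioneer
        self.T = []               # T[k] = match of the pioneer of rank k + 1
        prev = None               # (block, block of match) of the previous far opener
        for i, c in enumerate(word):
            if c != "(" or i // block == match[i] // block:
                continue          # closers and pairs inside one block are not stored
            key = (i // block, match[i] // block)
            if key != prev:       # outermost pair between these two blocks
                self.B[i] = 1
                self.T.append(match[i])
            prev = key

    def matching(self, i):
        """Position of the closer matching the opener at position i."""
        w, b = self.word, self.block
        depth = 0
        for j in range(i, min((i // b + 1) * b, len(w))):  # go slowly inside the block
            depth += 1 if w[j] == "(" else -1
            if depth == 0:
                return j
        # Border reached: find the last pioneer p at or before i (naive rank/select).
        p = max(j for j in range(i + 1) if self.B[j])
        rank = sum(self.B[: p + 1])
        # Number of openers after p, up to i, that are still open at i.
        d = sum(1 if c == "(" else -1 for c in w[p + 1 : i + 1])
        end = self.T[rank - 1]    # match of the pioneer
        # Closers of the target block, up to `end`, whose opener lies in an earlier block.
        pending, unmatched = 0, []
        for j in range(end // b * b, end + 1):
            if w[j] == "(":
                pending += 1
            elif pending:
                pending -= 1
            else:
                unmatched.append(j)
        return unmatched[-1 - d]


# Assumed example word (22 symbols, five blocks of size at most 5).
word = "(((()" ")())(" "()()(" ")())(" "))"
structure = BlockedParens(word, block=5)
```

With 0-based indices, `structure.matching(2)` reproduces the slide's matching(3) = 6: the scan stops at the block border, the pioneer of rank 2 sends it to position 9 (1-based), and a short scan of that block finds the answer. Only the pioneers are stored explicitly, which is what keeps the extra space at m log n bits.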

slide-40
SLIDE 40

Combinatorial entropy and succinct data structures

An: structures of size n, with log2 |An| = αn + o(n), but large explicit representation (using O(n) pointers of size log n)

  • Aim 1 (compression): find an encoding with α bits per size unit, with linear time encoding/decoding procedures
  • Aim 2 (succinct data structure): idem + efficient query support: answer natural queries in constant time (logtime if not constant)
  • Aim 3 (dynamic succinct data structure): idem + update of the structure: update the structure in logtime (amortized if not worst case)

Aim 0: understand and deal with entropy reduction...

slide-48
SLIDE 48

Entropy reduction and parametrized classes

  • ordered trees with n vertices: entropy 2 bpv
  • degree 2 and 0 only: complete binary trees (2n + 1 vertices: n nodes, n + 1 leaves): 1 bpv
  • degree 3 and 0 only: complete ternary trees (3n + 1 vertices: n nodes, 2n + 1 leaves): $\frac{1}{3}\log_2\frac{27}{4} \approx 0.92$ bpv
  • more generally, $n_i$ vertices of degree i

Old Thm: $|T(n_0, \ldots, n_k)| = \frac{1}{n}\binom{n}{n_0, n_1, \ldots, n_k}$ if $n = \sum_i n_i = 1 + \sum_i i\, n_i$

$\frac{1}{n}\log_2\binom{n}{n_0, n_1, \ldots, n_k} \sim \log_2 \prod_i \alpha_i^{-\alpha_i}$ if $n_i = \alpha_i n$

encode tree by degree list in prefix order

observe that: entropy(trees) = entropy of text; compress optimally with arithmetic coder

Question: what is the maximum entropy, for which degrees?
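The counting theorem and the resulting entropy can be checked numerically with a short Python sketch. The function names and the nested-list tree format are assumptions made for this illustration; the formulas are the ones stated above.

```python
from math import factorial, log2


def count_trees(degree_counts):
    """(1/n) * multinomial(n; n_0, ..., n_k), where degree_counts[i] = n_i."""
    n = sum(degree_counts)
    # Condition of the theorem: n = sum n_i = 1 + sum i * n_i.
    assert n == 1 + sum(i * c for i, c in enumerate(degree_counts))
    multinomial = factorial(n)
    for c in degree_counts:
        multinomial //= factorial(c)
    return multinomial // n


def degree_entropy(alphas):
    """sum_i alpha_i * log2(1 / alpha_i), in bits per vertex."""
    return sum(a * log2(1 / a) for a in alphas if a > 0)


def degree_list(children):
    """Degrees of the vertices in prefix order; a tree is a nested list of children."""
    out = [len(children)]
    for child in children:
        out += degree_list(child)
    return out


binary = degree_entropy([1 / 2, 0, 1 / 2])        # half leaves, half binary nodes
ternary = degree_entropy([2 / 3, 0, 0, 1 / 3])    # two thirds leaves, one third ternary nodes
```

The degree list is an ordinary text over the alphabet of degrees, so an arithmetic coder driven by the frequencies $\alpha_i$ spends about `degree_entropy(alphas)` bits per vertex on it.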

slide-67
SLIDE 67

Entropy quiz

For each class, in order: entropy; compression; succinct d.s.; dynamic

  • ordered trees: 4; yes; yes; yes
  • given degree distribution: $\sum_i \alpha_i \log_2 (1/\alpha_i)$; yes; yes (soda’07); ?
  • bipartite, p black and q white: 4 if $p = n/2 + O(\sqrt{n})$, $\binom{p+q}{p}^{2/n}$ otherwise (use basic result); yes; probably; ?
  • height h: known; ?
  • positive natural embedding: 4 (use basic result)
  • all leaves at same depth: known?; ?
  • ordinary decomposable structures (multitype ordered trees): computable?; use frequencies?; link with multivariable Lagrange inversion?

entropy measures diversity of local structure

slide-71
SLIDE 71

Geometric information vs combinatorial information

  • Geometric information: vertex coordinates. Geometry: between 30 and 96 bits/vertex
  • Combinatorial information: adjacency relations between triangles, vertices. ”Connectivity”: the underlying triangulation

Explicit representation of the connectivity:

  • vertex: 1 reference to a triangle
  • triangle: 3 references to vertices, 3 references to triangles

13n log n or 416n bits

$\#\{\text{triangulations}\} = \frac{2(4n+1)!}{(3n+2)!\,(n+1)!} \approx \frac{16}{27}\sqrt{\frac{3}{2\pi}}\; n^{-5/2} \left(\frac{256}{27}\right)^{n}$

⇒ entropy = $\log_2 \frac{256}{27} \approx 3.24$ bpv. Room for improvement!
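The gap between the explicit representation and the entropy can be made concrete with a few lines of Python. The function name is hypothetical, and the 32-bit reference size is an assumption (it is the value that turns 13n references into 416n bits).

```python
from math import factorial, log2


def triangulations(n):
    """2(4n+1)! / ((3n+2)! (n+1)!), the count quoted above."""
    return 2 * factorial(4 * n + 1) // (factorial(3 * n + 2) * factorial(n + 1))


entropy = log2(256 / 27)                        # about 3.245 bits per vertex
measured = log2(triangulations(2000)) / 2000    # finite-n value, slightly below the limit
explicit_bits_per_vertex = 13 * 32              # 13 references per vertex, 32 bits each (assumed)
```

The finite-n value sits just below the limit because of the $n^{-5/2}$ factor, and both are two orders of magnitude below the 416 bits per vertex of the pointer-based structure.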

slide-74
SLIDE 74

Triangulation encodings: tree decompositions

  • Edgebreaker, Rossignac (’99): CCCRCCRCCRECRRELCRE, 3.67n bits
  • Canonical orderings, Chiang et al. (’98): ( [ [ [ ) ( ] ( ] ( ] [ [ [ ) [ ) ( ] ] [ ) . . ., 4n bits
  • Degree encoding, Touma-Gotsman (’98): V5V5V6V5V4V5V8V5V5V4S4V3V4, ? but efficient
  • Leftmost tree in minimal canonical ordering, Poulalhon, S. (’03): 1101000110000010010000011001000000000, 3.24n bits, ”optimal”

Common visual framework (Isenburg Snoeyink’05)

better?!

slide-76
SLIDE 76

Triangulation encodings: tree decompositions

  • Degree encoding, Touma-Gotsman (’98): V5V5V6V5V4V5V8V5V5V4S4V3V4, ? but efficient
  • Leftmost tree in minimal canonical ordering, Poulalhon, S. (’03): 1101000110000010010000011001000000000, 3.24n bits, ”optimal”

Common visual framework (Isenburg Snoeyink’05)

better?!

The (non-optimal) degree encoder gives much better codes for low entropy triangulations! Patch of triangular grids ⇒ 6,6,6,6,6,6,5,6,6,6,6,5,6,6,6,7. . .

Alliez Desbrun (Eurographics ’01): could a degree encoder be optimal?

Gotsman (’06): No. Under constraints $\sum_i p_i = 1$ and $\sum_i i\, p_i = 6$ on the proportion $p_i$ of vertices of degree i, the max entropy of degree sequence is 3.236 bpv < 3.245 bpv!

slide-77
SLIDE 77

Mesh compression Graph encoding Succinct representations

Jacobson (Focs89); Munro and Raman (Focs97); Chiang et al. (Soda01); Castelli Aleardi, Devillers and S. (Wads05, CCCG05, SoCG06); Barbay et al. (Isaac07); Nakano et al. (2008); Poulalhon S. (Icalp03); Fusy et al. (Soda05); Blandford Blelloch (Soda03); Castelli Aleardi, Fusy, Lewiner (SoCG08); Turan (’84); Keeler Westbrook (’95); Computer graphics; Graph theory / combinatorics; He et al. (’99); Edgebreaker; Valence (degree); Rossignac (’99); Touma and Gotsman (’98); Alliez and Desbrun; Algorithms and DS; Cut-border machine; Isenburg; Khodakovsky; Gumhold et al. (Siggraph ’98); Gumhold (Soda ’05); Lope et al. (’03); Lewiner et al. (’04); . . . (many many others); . . . (many others); Chuang et al. (Icalp98)

slide-78
SLIDE 78

A more generic approach?

slide-79
SLIDE 79

First idea (following Luca Castelli Aleardi)

Decomposition of quadrangulations... by the French artist Léon Gischia (1903-1991)

slide-80
SLIDE 80

Literary digression

2nd idea (following Luca Castelli Aleardi)

During a private lesson, a very young student, preparing herself for the total doctorate, talks about arithmetic with her teacher. (The young student cannot understand how to subtract integers.)

Teacher: Listen to me. If you cannot deeply understand these principles, these arithmetic archetypes, you will never perform correctly a ”polytechnicien” job... you will never obtain a teaching position at ”Ecole Polytechnique”. For example, what is 3.755.918.261 multiplied by 5.162.303.508?

Student (very quickly): the result is 193891900145...

Teacher (very astonished): yes ... the product is really... But how have you computed it, if you do not know the principles of arithmetic reasoning?

Student: it is simple: I have learned by heart all possible results of all possible different multiplications.

(La leçon, Eugène Ionesco, 1951)

slide-81
SLIDE 81

1 2 3 . . . . . . . . .

Level 1:

  • Θ(

n log2 n) regions of size Θ(log2 n),

represented by pointers to level 2 Level 2: in each of the

n log2 n regions

  • Θ(log n) regions of size C log n,

represented by pointers to level 3

A hierarchical approach, with a dictionary at bottom.

Level 3: exhaustive catalog of all different regions of size i < C log n:

  • complete explicit representation.
slide-85
SLIDE 85

A hierarchical approach, with a dictionary at bottom.

Level 1:

  • Θ(n / log² n) regions of size Θ(log² n), represented by pointers to level 2

Level 2: in each of the n / log² n regions

  • Θ(log n) regions of size C log n, represented by pointers to level 3

Level 3: exhaustive catalog of all different regions of size i < C log n:

  • complete explicit representation.

Space:

  • global pointers of size log n: space O((n / log² n) · log n) = o(n)
  • local pointers of size log log n: space O((n / log n) · log log n) = o(n)
  • Dictionary space is o(n) if C is small enough.
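The two pointer bounds can be checked numerically. The following is only a minimal sketch: it assumes base-2 logarithms and sets every constant hidden in Θ to 1, and the function name `pointer_overhead` is hypothetical, not something from the talk.

```python
import math

def pointer_overhead(n):
    """Return (global, local) pointer cost in bits per element.

    Assumptions (not from the slides): logarithms are base 2 and all
    Theta constants are taken equal to 1.
    """
    lg = math.log2(n)
    lglg = math.log2(lg)
    # Level 1: n / log^2 n regions, each addressed by a pointer of log n bits.
    global_bits = (n / lg**2) * lg
    # Level 2: n / log n small regions, each addressed by a pointer of log log n bits.
    local_bits = (n / lg) * lglg
    return global_bits / n, local_bits / n

for exp in (10, 20, 40, 80):
    g, l = pointer_overhead(2.0**exp)
    print(f"n = 2^{exp}: global {g:.4f} bits/elt, local {l:.4f} bits/elt")
```

Both ratios shrink as n grows, which is what the o(n) bounds state, but the local term decreases only like log log n / log n, so the convergence is slow.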

slide-89
SLIDE 89

A hierarchical approach, with a dictionary at bottom.

Dominant term?

The dominant term is given by the sum of references to the dictionary:

  Σ_j 2.175 k_j = 2.175 m bits, where region j has k_j triangles and m = Σ_j k_j

[Figure labels: r; k triangles]

  • 2.175 bpt is the entropy of triangulations with a boundary
  • references to objects of T_k have size log₂ |T_k| ∼ 2.175 k as k → ∞
  • larger than the previous (1/2) · 3.24 bpt
  • we should take all k s.t. (1/12) log n < k < (1/2) log n

Adaptive to "reasonable" entropy reduction
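The way dictionary references approach the entropy can be illustrated on a simpler family. The sketch below swaps triangulations with a boundary for plane trees, because the slides give no exact counts for T_k; plane trees with k edges are counted by Catalan numbers and their entropy is 2 bits per edge instead of 2.175 bits per triangle. All function names are hypothetical.

```python
import math

def catalan(k):
    """Number of plane trees with k edges (stand-in for |T_k|)."""
    return math.comb(2 * k, k) // (k + 1)

def reference_bits(k):
    """Bits needed to index one object among all plane trees with k edges."""
    # (N - 1).bit_length() equals ceil(log2(N)) for every integer N >= 1.
    return (catalan(k) - 1).bit_length()

def catalog_entries(n, c):
    """Number of catalog entries over all sizes k with 1 <= k < c * log2(n)."""
    k_max = math.ceil(c * math.log2(n))
    return sum(catalan(k) for k in range(1, k_max))

# The cost per edge of a reference tends to the entropy (2 bits per edge here).
for k in (8, 64, 512):
    print(k, reference_bits(k) / k)

# With a small constant c, the catalog has far fewer than n entries.
n = 2**40
print(catalog_entries(n, 0.25), "entries versus n =", n)
```

Summing `reference_bits(k_j)` over all regions gives at most 2 bits per edge in total for this stand-in family, which is the analogue of the 2.175 m bound. Small regions pay proportionally more per edge, which suggests why a lower bound on k is also imposed.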

slide-90
SLIDE 90

A word of conclusion

  • A relatively generic method to get adaptive s.d.s: triangulations with boundary, trees, polyhedral maps...
    but complex hierarchical structure, impractical subleading terms...

  • Some examples of nice optimal encodings
    but not so adaptive and no query support

⇒ find an optimal adaptive encoder for triangulations with given degrees
⇒ develop "elegant" succinct data structures: a non-asymptotic 2n + O(log n) bits sds for plane trees with n vertices?
⇒ find other parameters of trees or maps that allow for simple adaptive compression or sds (depth?)
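As a reminder of what an optimal encoding without query support looks like, here is a minimal sketch of the balanced parenthesis code for plane trees. Representing a tree as nested Python lists of children is an assumption made for the example, and the names `encode` and `decode` are hypothetical.

```python
def encode(tree):
    """Encode a plane tree (nested lists of children) as a 0/1 string.

    Each edge contributes a '1' when the traversal goes down and a '0'
    when it comes back up, so a tree with n edges uses exactly 2n bits.
    """
    return "".join("1" + encode(child) + "0" for child in tree)

def decode(bits):
    """Inverse of encode; assumes bits is a well-formed balanced word."""
    root = []
    stack = [root]
    for b in bits:
        if b == "1":
            child = []
            stack[-1].append(child)  # attach a new child to the current node
            stack.append(child)      # and descend into it
        else:
            stack.pop()              # go back up to the parent
    return root

tree = [[[], []], [], [[]]]  # root with 3 children, 6 edges in total
word = encode(tree)
print(word, len(word))
```

The word is compact, but finding the parent or the i-th child of a node means scanning it; closing that gap with only lower-order extra space is what a succinct data structure adds.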