slide-1
SLIDE 1

Ordered Set Problems

Giulio Ermanno Pibiri

07/06/2019

giulio.pibiri@di.unipi.it http://pages.di.unipi.it/pibiri

slide-5
SLIDE 5

The Static Ordered Set Problem

Given a set of n items and an order relation defined on them,
 we are asked to design a data structure that supports
 Access, Contains, Successor, Predecessor efficiently.

Let us assume our items are integers
 drawn from some universe of size u ≥ n.

If the integers are not to be compressed:
 use an array.
 Operations are made efficient
 by binary search with loop unrolling
 with cut-off to SSE/AVX (SIMD) linear search
 on small segments.
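A minimal sketch of the cut-off idea, assuming a plain sorted array of 32-bit keys. The function name `successor_index` and the threshold `kCutoff` are illustrative choices, and the final scan is scalar here; a tuned version would replace it with SSE/AVX comparisons.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first element >= x, or a.size() if there is none.
// kCutoff is an illustrative threshold, not a tuned value.
std::size_t successor_index(const std::vector<uint32_t>& a, uint32_t x) {
    constexpr std::size_t kCutoff = 16;
    std::size_t lo = 0, hi = a.size();   // the answer lies in [lo, hi]
    while (hi - lo > kCutoff) {          // binary search while the range is large
        std::size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < x) lo = mid + 1;
        else hi = mid;
    }
    // Small segment left: linear scan (scalar stand-in for the SIMD search).
    while (lo < hi && a[lo] < x) ++lo;
    return lo;
}
```

Contains and Predecessor follow from the same index: compare `a[i]` with `x`, or step back by one position.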

If the keys are uniformly distributed, interpolation search can help:
 O(log log n) time with high probability.
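As a rough illustration of interpolation search, the sketch below guesses the position of the key by linear interpolation between the two ends of the current range. The name `interpolation_contains` is hypothetical, and the code assumes the array holds fewer than 2^32 keys so that the 64-bit product cannot overflow.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if x occurs in the sorted array a.
bool interpolation_contains(const std::vector<uint32_t>& a, uint32_t x) {
    if (a.empty()) return false;
    std::size_t lo = 0, hi = a.size() - 1;
    while (a[lo] < a[hi] && a[lo] <= x && x <= a[hi]) {
        // Estimate where x would sit if keys were evenly spread in [lo, hi].
        uint64_t num = uint64_t(x - a[lo]) * (hi - lo);
        std::size_t pos = lo + std::size_t(num / (a[hi] - a[lo]));
        if (a[pos] == x) return true;
        if (a[pos] < x) lo = pos + 1;   // x is to the right of the guess
        else hi = pos - 1;              // x is to the left of the guess
    }
    return a[lo] == x;
}
```

On skewed data the guesses degrade, which is why the uniformity assumption matters.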

Let us also assume n is so big that we
 must compress the set.

slide-6
SLIDE 6

Sorted integer sets are ubiquitous

  • Inverted indexes
  • Databases
  • Semantic data
  • Geospatial data
  • Graph compression
  • E-Commerce

slide-7
SLIDE 7

The Static Compressed Ordered Set Problem

A large body of research describes different space/time trade-offs.

  • Elias’ Gamma and Delta
  • Elias-Fano
  • Variable-Byte Family
  • Binary Interpolative Coding
  • Simple Family
  • PForDelta
  • QMX
  • Quasi-Succinct
  • Partitioned Elias-Fano
  • Clustered Elias-Fano
  • Optimal Variable-Byte
  • DINT

(from ~1970 to 2019)

+ set intersection, union and decode

slide-13
SLIDE 13

Partitioning by Cardinality

The problem of (almost all) such representations is that
 Access, Contains, Predecessor/Successor
 are not natively supported: we can only
 decode sequentially.

Solution 1
 Introduce some redundancy to accelerate queries:
 the so-called skip pointers.

[Figure: the sequence 3 9 10 14 23 24 25 34 38 42 44 49 50 65 71 98 is split into blocks of B = 4 integers; the arrays Upperbounds (14 34 49 98) and Offsets index into the compressed Bits.]
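The skip-pointer layout can be sketched as follows. This is only an illustration under stated assumptions: gaps are coded with a simple variable-byte scheme (any compressor from the previous list could be swapped in), the block size `B = 4` matches the example sequence, and the type name `BlockedSequence` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

struct BlockedSequence {
    static constexpr std::size_t B = 4;   // block size (assumption)
    std::vector<uint8_t> bytes;           // variable-byte coded gaps
    std::vector<uint32_t> upperbounds;    // last value of each block
    std::vector<std::size_t> offsets;     // byte offset where each block starts

    explicit BlockedSequence(const std::vector<uint32_t>& sorted) {
        assert(std::is_sorted(sorted.begin(), sorted.end()));
        uint32_t prev = 0;
        for (std::size_t i = 0; i < sorted.size(); ++i) {
            if (i % B == 0) offsets.push_back(bytes.size());
            uint32_t gap = sorted[i] - prev;
            while (gap >= 128) { bytes.push_back(uint8_t(gap & 127)); gap >>= 7; }
            bytes.push_back(uint8_t(gap | 128));   // high bit marks the last byte
            prev = sorted[i];
            if (i % B == B - 1 || i + 1 == sorted.size()) upperbounds.push_back(prev);
        }
    }

    // Smallest stored value >= x, if any.
    std::optional<uint32_t> successor(uint32_t x) const {
        // Skip to the only block that can contain the answer.
        auto it = std::lower_bound(upperbounds.begin(), upperbounds.end(), x);
        if (it == upperbounds.end()) return std::nullopt;
        std::size_t block = std::size_t(it - upperbounds.begin());
        uint32_t value = block == 0 ? 0 : upperbounds[block - 1];
        std::size_t pos = offsets[block];
        for (;;) {   // terminates because this block's upper bound is >= x
            uint32_t gap = 0;
            int shift = 0;
            while (bytes[pos] < 128) { gap |= uint32_t(bytes[pos++]) << shift; shift += 7; }
            gap |= uint32_t(bytes[pos++] & 127) << shift;
            value += gap;
            if (value >= x) return value;
        }
    }
};
```

Only one block of at most B integers is decoded per query; the price is the extra space for the two auxiliary arrays.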

Solution 2
 Redesign the data structure.

slide-17
SLIDE 17

Partitioning by Universe

Does this remind you of something?

[Elias-Fano 1971-1975]
[van Emde Boas 1974-1975]

[Figure: the bitmap 1 1 0 1 1 0 0 1 1 1 0 1 0 0 0 0 0 1 1 1, with labels √u (twice) and summary.]

slide-24
SLIDE 24

Partitioning by Universe

Assume a slice size of 2^3.

Contains(x):
 i = x >> 3
 search for x - (i << 3) in the i-th slice

Successor(x):
 i = x >> 3
 search for successor of x - (i << 3) in the i-th slice
 (if the i-th slice is empty or x - (i << 3) > max_value in the i-th slice,
 then return the first value on the right)

Example: x = 010101 (21), so i = 010 (2) and x - 16 = 5 (101).

Intersection between lists has to intersect only the slices in common between the lists.
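The two operations can be sketched as follows, assuming slices of 2^3 values stored as small sorted arrays of offsets. The type name `SlicedSet` and the linear scan for the next non-empty slice are illustrative simplifications, not the layout used in the referenced sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

struct SlicedSet {
    static constexpr unsigned kShift = 3;          // slice size 2^3 = 8
    std::vector<std::vector<uint32_t>> slices;     // slices[i]: sorted offsets in [0, 8)

    explicit SlicedSet(const std::vector<uint32_t>& sorted) {
        assert(std::is_sorted(sorted.begin(), sorted.end()));
        if (!sorted.empty()) slices.resize((sorted.back() >> kShift) + 1);
        for (uint32_t x : sorted) {
            uint32_t i = x >> kShift;
            slices[i].push_back(x - (i << kShift));
        }
    }

    bool contains(uint32_t x) const {
        uint32_t i = x >> kShift;
        if (i >= slices.size()) return false;
        return std::binary_search(slices[i].begin(), slices[i].end(), x - (i << kShift));
    }

    // Smallest stored value >= x, if any.
    std::optional<uint32_t> successor(uint32_t x) const {
        uint32_t i = x >> kShift;
        if (i >= slices.size()) return std::nullopt;
        const auto& s = slices[i];
        auto it = std::lower_bound(s.begin(), s.end(), x - (i << kShift));
        if (it != s.end()) return (i << kShift) + *it;
        // Slice i is empty or x is past its maximum: first value on the right.
        for (uint32_t j = i + 1; j < slices.size(); ++j)
            if (!slices[j].empty()) return (j << kShift) + slices[j].front();
        return std::nullopt;
    }
};
```

With x = 21 the code computes i = 2 and searches for offset 5 in slice 2, matching the example on the slide.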

slide-28
SLIDE 28

Bitmaps

Good old data structure for storing dense sets:
 x-th bit is set if integer x is in the set.

1 1 0 0 0 1 0 1 1 0 1 1 0 0 1 0 0 0 1 0 0 1 1 0 0 0 0 0 1 0 1 0

S = {0,1,5,7,8,10,11,14,18,21,22,28,30}

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Contains: testing a bit
 Successor/Predecessor: __builtin_ctzll
 Select: __builtin_ctzll
 Max: __builtin_clzll
 Min: __builtin_ctzll
 Decode: __builtin_ctzll
 Insertion: setting a bit
 Deletion: clearing a bit
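A minimal sketch of some of the operations listed above, assuming 64-bit words and the GCC/Clang builtin `__builtin_ctzll` named on the slide. The struct name `Bitmap` and its interface are illustrative.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

struct Bitmap {
    std::vector<uint64_t> words;   // bit x lives in words[x / 64]

    explicit Bitmap(uint64_t universe) : words((universe + 63) / 64, 0) {}

    void insert(uint64_t x) {      // Insertion: setting a bit
        assert((x >> 6) < words.size());
        words[x >> 6] |= uint64_t(1) << (x & 63);
    }
    void erase(uint64_t x) {       // Deletion: clearing a bit
        assert((x >> 6) < words.size());
        words[x >> 6] &= ~(uint64_t(1) << (x & 63));
    }
    bool contains(uint64_t x) const {   // Contains: testing a bit
        return (x >> 6) < words.size() && ((words[x >> 6] >> (x & 63)) & 1) != 0;
    }

    // Successor: smallest element >= x, found by counting trailing zeros.
    std::optional<uint64_t> successor(uint64_t x) const {
        uint64_t w = x >> 6;
        if (w >= words.size()) return std::nullopt;
        uint64_t word = words[w] & (~uint64_t(0) << (x & 63));   // drop bits below x
        while (word == 0) {
            if (++w == words.size()) return std::nullopt;
            word = words[w];
        }
        return (w << 6) + uint64_t(__builtin_ctzll(word));
    }

    // Decode: extract set bits one by one, clearing the lowest each time.
    std::vector<uint64_t> decode() const {
        std::vector<uint64_t> out;
        for (uint64_t w = 0; w < words.size(); ++w)
            for (uint64_t word = words[w]; word != 0; word &= word - 1)
                out.push_back((w << 6) + uint64_t(__builtin_ctzll(word)));
        return out;
    }
};
```

Every operation touches whole machine words, which is a large part of why bitmaps are hard to beat on dense sets.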

Nothing is better than a bitmap for dense sets.

slide-30
SLIDE 30

Roaring

[Lemire et al. 2013]

Assume u = 2^32: the universe is split into ≤ 2^16 spans of 2^16 values each.

Dense: cardinality > 4096
Sparse: otherwise

Dense spans are represented with bitmaps of 2^16 bits.
Sparse spans are represented with sorted arrays of 16-bit integers.
This ensures at most 16 bits per key (excluding overhead).
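The choice between the two span representations can be sketched as below. This is not the CRoaring API: `Span`, `make_span` and `span_contains` are hypothetical names, and the sketch only applies the 4096 threshold stated above to the low 16 bits of the keys that fall in one span.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Span {
    std::vector<uint16_t> array;    // used when sparse: sorted 16-bit offsets
    std::vector<uint64_t> bitmap;   // used when dense: 2^16 bits in 1024 words
    bool dense() const { return !bitmap.empty(); }
};

// Builds one span from the sorted, duplicate-free low 16 bits of its keys.
Span make_span(const std::vector<uint16_t>& low) {
    assert(std::is_sorted(low.begin(), low.end()));
    Span s;
    if (low.size() <= 4096) {       // sparse: exactly 16 bits per key
        s.array = low;
        return s;
    }
    s.bitmap.assign(65536 / 64, 0); // dense: 65536 bits for > 4096 keys, < 16 bits per key
    for (uint16_t v : low) s.bitmap[v >> 6] |= uint64_t(1) << (v & 63);
    return s;
}

bool span_contains(const Span& s, uint16_t v) {
    if (s.dense()) return ((s.bitmap[v >> 6] >> (v & 63)) & 1) != 0;
    return std::binary_search(s.array.begin(), s.array.end(), v);
}
```

The threshold is the break-even point: 4096 keys at 16 bits each take the same 65536 bits as the bitmap.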

slide-31
SLIDE 31

Slicing

Assume u = 2^32.

≤ 2^16 slices of 2^16 values each.
 Dense: cardinality > 2^16/2 (ensure at most 2 bits per key)
 Sparse: otherwise

Sparse slices of 2^16 values are split again into ≤ 2^8 slices of 2^8 values each.
 Dense: cardinality ≥ 31 (ensure at most 8 bits per key)
 Sparse: otherwise

Dense slices are represented with bitmaps of 2^16 or 2^8 bits.
Sparse slices are represented with sorted arrays of 8-bit integers.

slide-32
SLIDE 32

Intersection

  • Dense vs. Dense (Bitmap vs. Bitmap):
    bitwise AND operations + (usually) automatic compiler vectorization
  • Dense vs. Sparse (Bitmap vs. Array):
    given the array A, check if bit A[i] is set in the bitmap
  • Sparse vs. Sparse (Array vs. Array):
    vectorized processing using _mm_cmpestrm and _mm_shuffle_epi8 SIMD instructions

Intersection between lists has to intersect only the slices in common between the lists.
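The three cases can be sketched with scalar code as follows. The aliases `SliceBitmap` and `SliceArray` and the function names are hypothetical, and the array-vs-array case uses a plain merge as a stand-in for the `_mm_cmpestrm` / `_mm_shuffle_epi8` kernel, which is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using SliceBitmap = std::vector<uint64_t>;   // bit x of the slice lives in word x / 64
using SliceArray = std::vector<uint16_t>;    // sorted offsets inside the slice

// Dense vs. Dense: word-wise AND over two bitmaps of the same length.
SliceBitmap and_bitmaps(const SliceBitmap& a, const SliceBitmap& b) {
    assert(a.size() == b.size());
    SliceBitmap out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] & b[i];
    return out;
}

// Dense vs. Sparse: keep the array elements whose bit is set in the bitmap.
SliceArray filter_array(const SliceBitmap& bits, const SliceArray& arr) {
    SliceArray out;
    for (uint16_t v : arr) {
        assert(std::size_t(v >> 6) < bits.size());
        if ((bits[v >> 6] >> (v & 63)) & 1) out.push_back(v);
    }
    return out;
}

// Sparse vs. Sparse: scalar merge of two sorted arrays.
SliceArray merge_arrays(const SliceArray& a, const SliceArray& b) {
    SliceArray out;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) ++i;
        else if (b[j] < a[i]) ++j;
        else { out.push_back(a[i]); ++i; ++j; }
    }
    return out;
}
```

The simple loop in `and_bitmaps` is the kind of code compilers usually vectorize on their own at `-O3`.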

slide-33
SLIDE 33

Summing up

2 different paradigms:

  • Partitioning by Cardinality (PC)
  • Partitioning by Universe (PU)

slide-34
SLIDE 34

Experimental Comparison — Setting

C++ sources
 https://github.com/jermp/s_indexes
 https://github.com/jermp/dint
 https://github.com/ot/ds2i
 https://github.com/RoaringBitmap/CRoaring

Machine: Intel i7-4790K CPU @4GHz, 32 GiB RAM, Linux 4.13.0
Compiler: gcc 7.2.0 (with all optimizations: -march=native and -O3)

[Table: Datasets]

slide-35
SLIDE 35

Experimental Comparison — Setting

[Tables: Datasets, Configurations]

slide-36
SLIDE 36

Experimental Comparison — Compression Effectiveness

[Plot: bits per integer]

PC-based methods, such as BIC and PEF, are best for space usage.
 Slicing (PU-based) stands in a trade-off position.

slide-37
SLIDE 37

Experimental Comparison — Sequential Decoding Time

[Plot: ns per integer]

PU-based methods are as fast as the fastest (vectorized) PC-based methods.

slide-38
SLIDE 38

Experimental Comparison — Intersection Time

[Plot: µs per intersection]

PU-based methods outperform PC-based methods.

slide-39
SLIDE 39

Experimental Comparison — Point Queries

[Plots: Access, ns per query; Successor, ns per query]

slide-40
SLIDE 40

Experimental Comparison — The Trade-Off Curve

Density = 1/1000

slide-42
SLIDE 42

Future Research Directions

The Static Ordered Set Problem + insertions / deletions: The Dynamic Ordered Set Problem

Theory
  • Fusion Trees
  • van Emde Boas Trees
  • Exponential Search Trees
  • Y-Fast Tries
  • Dynamic Elias-Fano

Practice
  • Red-Black Trees
  • B-Trees

Memory management is the challenge.

slide-43
SLIDE 43

The Dynamic Ordered Set Problem — On-going Work

Insert
 n = 1,000,000 32-bit keys uniformly distributed

slide-44
SLIDE 44

The Dynamic Ordered Set Problem — On-going Work

Successor
 n = 1,000,000 32-bit keys uniformly distributed

slide-45
SLIDE 45

The Dynamic Ordered Set Problem — On-going Work

Heap usage

slide-46
SLIDE 46

Any questions?

Thanks for your attention, time, patience!