slide-1
SLIDE 1

Dimension Reduction and Nearest Neighbor Search

Advanced Algorithms Nanjing University, Fall 2018

slide-3
SLIDE 3

Dimension reduction: why do we care?

  • High-dimensional data are common, yet working on them directly is expensive.

  • Or, we just want a map!

Low-distortion metric embedding

slide-5
SLIDE 5

Dimension reduction: what do we want?

  • Usually we want $k \ll d$, where $d$ is the original dimension and $k$ is the target dimension.
  • How small can $k$ be?
  • For what distance $\|\cdot\|$?
  • The embedding should be efficiently constructible.
slide-6
SLIDE 6

The JLT (Johnson-Lindenstrauss Theorem)

"In Euclidean space, it is always possible to embed a set of $n$ points in arbitrary dimension into $O(\log n)$ dimensions with constant distortion."

Theorem (Johnson-Lindenstrauss 1984):

$\forall\, 0 < \epsilon < 1$, for any set $S$ of $n$ points from $\mathbb{R}^d$, there is a $\phi: \mathbb{R}^d \to \mathbb{R}^k$ with $k \in O(\epsilon^{-2} \cdot \log n)$, such that $\forall x_i, x_j \in S$:

$$(1 - \epsilon)\,\|x_i - x_j\|_2^2 \;\le\; \|\phi(x_i) - \phi(x_j)\|_2^2 \;\le\; (1 + \epsilon)\,\|x_i - x_j\|_2^2$$


Images extracted from https://graphics.stanford.edu/courses/cs468-06-fall/Slides/aneesh-michael.pdf

slide-11
SLIDE 11

"Even better, it is very easy to find such a $\phi$: just sample a random $k \times d$ matrix $A$."

How to construct (sample) $A$?

  • Project onto a uniformly random $k$-dimensional subspace (Johnson-Lindenstrauss; Dasgupta-Gupta);
  • Independent Gaussian entries (Indyk-Motwani);
  • Simply i.i.d. +1/-1 entries (Achlioptas).
slide-14
SLIDE 14

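Choose some suitable $k \in O(\epsilon^{-2} \cdot \log n)$.
For each entry $a_{i,j}$ of $A \in \mathbb{R}^{k \times d}$: independently sample $a_{i,j}$ from the Gaussian distribution $\mathcal{N}(0, 1/k)$.
For each $l \in [n]$: let $y_l = A x_l$.

Gaussian distribution (a.k.a. normal distribution) $\mathcal{N}(\mu, \sigma^2)$: $\mathbb{E}[X] = \mu$, $\mathrm{Var}[X] = \sigma^2$, and

$$\Pr[X \le t] = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/(2\sigma^2)}\,\mathrm{d}x$$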
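The sampling procedure above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names (`sample_gaussian_matrix`, `project`, `distortion`) are hypothetical, and the choice of `k` is left to the caller rather than derived from $\epsilon$ and $n$.

```python
import math
import random

def sample_gaussian_matrix(k, d, rng):
    """Return a k x d matrix (list of rows) with i.i.d. N(0, 1/k) entries."""
    sigma = 1.0 / math.sqrt(k)  # standard deviation, so the variance is 1/k
    return [[rng.gauss(0.0, sigma) for _ in range(d)] for _ in range(k)]

def project(A, x):
    """Compute the matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def squared_distance(u, v):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def distortion(A, x1, x2):
    """Ratio ||A x1 - A x2||^2 / ||x1 - x2||^2 (ideally within [1-eps, 1+eps])."""
    return squared_distance(project(A, x1), project(A, x2)) / squared_distance(x1, x2)
```

Calling `distortion(A, x1, x2)` on a few pairs of points is a quick way to see how the ratio concentrates around 1 as `k` grows.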

slide-19
SLIDE 19

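Norm Preservation

$\forall\, 0 < \epsilon < 1$: let $A$ be a $k \times d$ matrix with $k \in O(\epsilon^{-2} \cdot \log n)$. If each entry of $A$ is drawn independently from $\mathcal{N}(0, 1/k)$, then with high probability, for all $x_i, x_j \in S$:

$$(1 - \epsilon)\,\|x_i - x_j\|_2^2 \;\le\; \|A x_i - A x_j\|_2^2 \;\le\; (1 + \epsilon)\,\|x_i - x_j\|_2^2$$

Since $A$ is linear, this is equivalent to

$$1 - \epsilon \;\le\; \left\| A\,\frac{x_i - x_j}{\|x_i - x_j\|_2} \right\|_2^2 \;\le\; 1 + \epsilon,$$

where $(x_i - x_j)/\|x_i - x_j\|_2$ is a unit vector!

So it suffices to show that for any unit vector $u \in \mathbb{R}^d$:

$$\Pr\left[\,\left|\,\|Au\|_2^2 - 1\,\right| > \epsilon\,\right] < 1/n^3,$$

and then take a union bound over the $O(n^2)$ pairs $x_i, x_j \in S$.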
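The union bound step can be written out explicitly. A short worked version, assuming the per-vector failure probability $1/n^3$ stated above:

```latex
\Pr\bigl[\exists\, i < j:\ \text{the pair } (x_i, x_j) \text{ is distorted by more than } \epsilon \bigr]
  \;\le\; \binom{n}{2} \cdot \frac{1}{n^3}
  \;=\; \frac{n-1}{2n^2}
  \;<\; \frac{1}{2n}.
```

So all pairwise distances are preserved simultaneously with probability greater than $1 - 1/(2n)$.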

slide-26
SLIDE 26

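Let $k \in O(\epsilon^{-2} \cdot \log n)$ and $A \in \mathbb{R}^{k \times d}$, with each entry of $A$ chosen i.i.d. from $\mathcal{N}(0, 1/k)$. Then for any unit vector $u \in \mathbb{R}^d$:

$$\Pr\left[\,\left|\,\|Au\|_2^2 - 1\,\right| > \epsilon\,\right] < 1/n^3$$

  • Each $A_{ij}$ is chosen i.i.d. from $\mathcal{N}(0, 1/k)$.
  • A linear combination of independent Gaussian r.v.s is also Gaussian:
    if $X \sim \mathcal{N}(\mu_X, \sigma_X^2)$ and $Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2)$ are independent, then $aX + bY \sim \mathcal{N}(a\mu_X + b\mu_Y,\; a^2\sigma_X^2 + b^2\sigma_Y^2)$.
  • $u$ is a unit vector, so $\sum_j u_j^2 = 1$.

Moreover, the coordinates $(Au)_i$ are mutually independent!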
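Putting the three bullets together gives the distribution of each coordinate of $Au$ and the expectation of its squared norm. A short worked derivation:

```latex
(Au)_i \;=\; \sum_{j=1}^{d} A_{ij}\, u_j
  \;\sim\; \mathcal{N}\!\Bigl(0,\ \sum_{j=1}^{d} \frac{u_j^2}{k}\Bigr)
  \;=\; \mathcal{N}\!\Bigl(0,\ \frac{1}{k}\Bigr),
\qquad
\mathbb{E}\bigl[\|Au\|_2^2\bigr]
  \;=\; \sum_{i=1}^{k} \mathbb{E}\bigl[(Au)_i^2\bigr]
  \;=\; k \cdot \frac{1}{k} \;=\; 1 .
```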

slide-30
SLIDE 30

In terms of expectation we are fine, but how likely is $\|Au\|_2^2$ to deviate from its expectation?

slide-35
SLIDE 35

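Goal: for any unit vector $u \in \mathbb{R}^d$, $\Pr\left[\,\left|\,\|Au\|_2^2 - 1\,\right| > \epsilon\,\right] < 1/n^3$.

Chernoff bound for the $\chi^2$-distribution: for i.i.d. $X_1, X_2, \ldots, X_k \sim \mathcal{N}(0, 1)$ and $0 < \epsilon < 1$,

$$\Pr\left[\,\left|\frac{1}{k}\sum_{i=1}^{k} X_i^2 - 1\right| > \epsilon\,\right] < 2e^{-k\epsilon^2/8}$$

Notice $X_i = \sqrt{k} \cdot Y_i$, where $Y_i = (Au)_i \sim \mathcal{N}(0, 1/k)$.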

slide-37
SLIDE 37

Apply the Chernoff bound for the $\chi^2$-distribution to $X_i = \sqrt{k}\,(Au)_i$, for which $\frac{1}{k}\sum_{i=1}^{k} X_i^2 = \|Au\|_2^2$:

$$\Pr\left[\,\left|\,\|Au\|_2^2 - 1\,\right| > \epsilon\,\right] < 2e^{-k\epsilon^2/8} < 1/n^3$$

for suitable $k \in O(\epsilon^{-2} \cdot \log n)$.
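To make "suitable $k$" concrete: solving $2e^{-k\epsilon^2/8} \le n^{-3}$ for $k$ gives $k \ge 8\ln(2n^3)/\epsilon^2$. The sketch below computes the smallest such integer; the function name `jl_target_dimension` is hypothetical, and the constant 8 comes from the particular Chernoff bound quoted above, not from any optimized analysis.

```python
import math

def jl_target_dimension(n, eps):
    """Smallest integer k with 2 * exp(-k * eps^2 / 8) <= n^(-3)."""
    if n < 1 or not (0 < eps < 1):
        raise ValueError("need n >= 1 and 0 < eps < 1")
    # 2 * exp(-k eps^2 / 8) <= n^-3  <=>  k >= 8 * ln(2 n^3) / eps^2
    return math.ceil(8.0 * math.log(2.0 * n ** 3) / eps ** 2)

def failure_bound(k, eps):
    """The Chernoff bound 2 * exp(-k * eps^2 / 8) for a single unit vector."""
    return 2.0 * math.exp(-k * eps ** 2 / 8.0)
```

Note that the result depends on $n$ only logarithmically and does not depend on the original dimension $d$ at all.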

slide-48
SLIDE 48

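Chernoff bound for the $\chi^2$-distribution: for i.i.d. $X_1, X_2, \ldots, X_k \sim \mathcal{N}(0, 1)$ and $0 < \epsilon < 1$,

$$\Pr\left[\,\left|\frac{1}{k}\sum_{i=1}^{k} X_i^2 - 1\right| > \epsilon\,\right] < 2e^{-k\epsilon^2/8}$$

If $X \sim \mathcal{N}(0, 1)$ and $s < 1/2$, then

$$\mathbb{E}\left[e^{sX^2}\right] = \frac{1}{\sqrt{1 - 2s}}$$

Annotations on the derivation: the intermediate inequality holds when $\lambda \le 1/4$; finally let $\lambda = \epsilon/4$.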
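One standard way to obtain the upper tail from the moment generating function is sketched below. It is consistent with the annotations ($\lambda \le 1/4$, then $\lambda = \epsilon/4$), but the specific intermediate inequality $(1-2\lambda)^{-1/2} \le e^{\lambda + 2\lambda^2}$ is an assumption about how the argument goes, not something stated in the slides.

```latex
\begin{align*}
\Pr\Bigl[\sum_{i=1}^{k} X_i^2 > (1+\epsilon)k\Bigr]
  &\le e^{-\lambda(1+\epsilon)k}\; \mathbb{E}\Bigl[e^{\lambda \sum_i X_i^2}\Bigr]
     && \text{(Markov, } \lambda > 0\text{)} \\
  &= e^{-\lambda(1+\epsilon)k}\, (1-2\lambda)^{-k/2}
     && \text{(independence, MGF with } s = \lambda\text{)} \\
  &\le e^{-\lambda(1+\epsilon)k}\, e^{(\lambda + 2\lambda^2)k}
     && \text{(valid for } 0 < \lambda \le 1/4\text{)} \\
  &= e^{-(\lambda\epsilon - 2\lambda^2)k}
   \;=\; e^{-k\epsilon^2/8}
     && \text{(taking } \lambda = \epsilon/4\text{)} .
\end{align*}
```

The lower tail can be handled the same way with $e^{-\lambda X^2}$, and adding the two tails accounts for the factor 2 in the bound.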

slide-49
SLIDE 49

Theorem (Johnson-Lindenstrauss 1984):

$\forall\, 0 < \epsilon < 1$, for any set $S$ of $n$ points from $\mathbb{R}^d$, there is a $\phi: \mathbb{R}^d \to \mathbb{R}^k$ with $k \in O(\epsilon^{-2} \cdot \log n)$, such that $\forall x_i, x_j \in S$:

$$(1 - \epsilon)\,\|x_i - x_j\|_2^2 \;\le\; \|\phi(x_i) - \phi(x_j)\|_2^2 \;\le\; (1 + \epsilon)\,\|x_i - x_j\|_2^2$$

"Even better, it is very easy to find such a $\phi$: just sample a random $k \times d$ matrix $A$."

"The JLT states that in Euclidean space, it is always possible to embed a set of $n$ points in arbitrary dimension into $O(\log n)$ dimensions with constant distortion."

slide-51
SLIDE 51

Nearest Neighbor Search (NNS)

Setup: metric space $(X, \mathrm{dist})$, where $X$ is a set and $\mathrm{dist}$ is a distance function satisfying the triangle inequality
Data: $n$ points $y_1, y_2, \ldots, y_n \in X$
Query: given a point $\vec{x} \in X$, find the $y_i$ that is closest to $\vec{x}$

slide-54
SLIDE 54

Nearest Neighbor Search (NNS)

NNS has many applications in:

  • database systems
  • pattern recognition
  • machine learning
  • bioinformatics

slide-55
SLIDE 55

Nearest Neighbor Search (NNS)

Data: $n$ points $y_1, y_2, \ldots, y_n \in U^d$ for some finite $U$
Query: given a point $\vec{x} \in U^d$, find the $y_i$ that is closest to $\vec{x}$
Goal: answer the query efficiently

Which kinds of efficiency do we care about?

  • Usually space and time

Trivial solution:

  • No preprocessing, just linear search

When the dimension $d$ is small:

  • Binary search when $d = 1$
  • $k$-d tree
  • Voronoi diagram

slide-58
SLIDE 58

Nearest Neighbor Search (NNS)

What if the dimension $d$ is large, say $d \gg \log n$?

Curse of dimensionality: it is conjectured that solving NNS in high dimension requires either super-polynomial($n$) space or super-polynomial($d$) time.

Blessing: randomization + approximation

slide-59
SLIDE 59

Approximate Near(est) Neighbor (ANN)

Setup: metric space $(X, \mathrm{dist})$
Data: $n$ points $y_1, y_2, \ldots, y_n \in X$
Query: given a point $\vec{x} \in X$,

$c$-ANN (Approximate Nearest Neighbor):

  • Return a $y_i$ s.t. $\mathrm{dist}(\vec{x}, y_i) \le c \cdot \min_{1 \le j \le n} \mathrm{dist}(\vec{x}, y_j)$

$(c, r)$-ANN (Approximate Near Neighbor):

  • Return a $y_i$ s.t. $\mathrm{dist}(\vec{x}, y_i) \le cr$ if $\exists y_j$: $\mathrm{dist}(\vec{x}, y_j) \le r$
  • Return "no" if $\forall y_j$: $\mathrm{dist}(\vec{x}, y_j) > cr$
  • Arbitrary answer otherwise

slide-64
SLIDE 64

If we can solve $(c, r)$-ANN, then we can solve $c$-ANN with little overhead.

$D_{\min} = \min_{1 \le i < j \le n} \mathrm{dist}(y_i, y_j)$

$D_{\max} = \max_{1 \le i < j \le n} \mathrm{dist}(y_i, y_j)$

$R = \left\{ \frac{D_{\min}}{2} \cdot (\sqrt{c})^{-1},\ \frac{D_{\min}}{2} \cdot (\sqrt{c})^{0},\ \frac{D_{\min}}{2} \cdot (\sqrt{c})^{1},\ \ldots,\ D_{\max} \right\}$

Let $r^*$ be the minimum $r \in R$ such that $(\sqrt{c}, r^*)$-ANN returns "yes" with some $y^*$. Then

  • $\mathrm{dist}(\vec{x}, y^*) \le \sqrt{c} \cdot r^*$, and
  • for every data point $y_i$: $\mathrm{dist}(\vec{x}, y_i) > r^*/\sqrt{c}$, because the query at the preceding radius $r^*/\sqrt{c}$ returned "no".

Hence $\mathrm{dist}(\vec{x}, y^*) \le c \cdot \min_i \mathrm{dist}(\vec{x}, y_i)$.

slide-65
SLIDE 65

If we can solve $(c, r)$-ANN, then we can solve $c$-ANN with little overhead:

If, for all $r$, $(\sqrt{c}, r)$-ANN can be solved with space $s$ and query time $t$, then $c$-ANN can be solved with space $O\left(s \cdot \log_c \frac{D_{\max}}{D_{\min}}\right)$ and query time $O\left(t \cdot \log_2 \log_c \frac{D_{\max}}{D_{\min}}\right)$.

The space bound corresponds to keeping one data structure per radius, and the query time to a binary search over the radii.
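The reduction can be sketched as a binary search over a geometric sequence of radii. The code below is a minimal illustration under stated assumptions: `near_neighbor` is a brute-force stand-in for a real $(c, r)$-ANN structure, all function names are hypothetical, and the handling of queries farther than the largest radius (returning `None`) is a choice made here, not taken from the slides.

```python
import math

def near_neighbor(query, points, c, r, dist):
    """Stand-in (c, r)-ANN oracle: any point within c*r, or None for "no"."""
    for y in points:
        if dist(query, y) <= c * r:
            return y
    return None

def radii(d_min, d_max, c):
    """Radii D_min/(2*sqrt(c)), then times sqrt(c) each step, capped by D_max."""
    root, r, out = math.sqrt(c), d_min / (2.0 * math.sqrt(c)), []
    while r < d_max:
        out.append(r)
        r *= root
    out.append(d_max)
    return out

def approx_nearest(query, points, c, d_min, d_max, dist):
    """c-ANN via binary search over (sqrt(c), r)-ANN queries."""
    root, R = math.sqrt(c), radii(d_min, d_max, c)
    ask = lambda r: near_neighbor(query, points, root, r, dist)
    first = ask(R[0])
    if first is not None:
        return first          # within D_min/2 of the query: an exact nearest neighbor
    best = ask(R[-1])
    if best is None:
        return None           # assumption: such far-away queries are out of scope
    lo, hi = 0, len(R) - 1    # invariant: "no" at R[lo], "yes" (best) at R[hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        answer = ask(R[mid])
        if answer is None:
            lo = mid
        else:
            hi, best = mid, answer
    return best
```

In a real data structure each radius would have its own prebuilt index, so only the $O(\log |R|)$ probes of the binary search are paid at query time.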

slide-67
SLIDE 67

Setup: consider the Hamming space $\{0,1\}^d$
Data: $n$ points $y_1, y_2, \ldots, y_n \in \{0,1\}^d$
Query: given a point $\vec{x} \in \{0,1\}^d$,

$(c, r)$-ANN (Approximate Near Neighbor):

  • Return a $y_i$ s.t. $\mathrm{dist}(\vec{x}, y_i) \le cr$ if $\exists y_j$: $\mathrm{dist}(\vec{x}, y_j) \le r$
  • Return "no" if $\forall y_j$: $\mathrm{dist}(\vec{x}, y_j) > cr$
  • Arbitrary answer otherwise

Let $k$, $p$ and $s$ be fixed later.
Sample a $k \times d$ Boolean matrix $A$ with i.i.d. entries from Bernoulli($p$).
For $i = 1, 2, \ldots, n$: let $z_i = A y_i \in \{0,1\}^k$ over the finite field GF(2).
Store all $s$-balls $B_s(u) = \{ y_i \mid \mathrm{dist}(u, z_i) \le s \}$ for all $u \in \{0,1\}^k$.
Upon a query $\vec{x} \in \{0,1\}^d$: retrieve $B_s(A\vec{x})$. If $B_s(A\vec{x}) = \emptyset$, return "no"; else return any $y_i \in B_s(A\vec{x})$.

GF(2): two elements $\{0, 1\}$, with XOR as addition and AND as multiplication. Therefore

$$(z_i)_j = (A y_i)_j = \sum_{l=1}^{d} A_{jl} \cdot (y_i)_l \bmod 2$$

Space: $O(n \cdot 2^k)$
Query time: $O(dk)$ computation + $O(1)$ memory access

slide-70
SLIDE 70

For suitable $k \in O(\log n)$, $p$ and $s$, and for all $\vec{x}, \vec{y} \in \{0,1\}^d$: with high probability, $\mathrm{dist}(A\vec{x}, A\vec{y}) \le s$ when $\mathrm{dist}(\vec{x}, \vec{y}) \le r$, and $\mathrm{dist}(A\vec{x}, A\vec{y}) > s$ when $\mathrm{dist}(\vec{x}, \vec{y}) > cr$.

Hence $(c, r)$-ANN is solved w.h.p.

slide-75
SLIDE 75

Random $k \times d$ Boolean matrix $A$ with i.i.d. entries from Bernoulli($p$); computation over GF(2): $(A\vec{x})_i = \sum_{j=1}^{d} A_{ij} \cdot \vec{x}_j \bmod 2$.

Each row vector $A_i$ of $A$ has i.i.d. entries from Bernoulli($p$). An alternative view of how $A_i$ is generated:

  • build $C \subseteq [d]$ by choosing each element of $[d]$ independently with probability $2p$;
  • set each coordinate in $C$ independently to 0 or 1, each with probability 1/2 (coordinates outside $C$ are 0).

Observations, for fixed $\vec{x}, \vec{y} \in \{0,1\}^d$:

  • if $j \notin C$ for all coordinates $j$ where $\vec{x}_j \ne \vec{y}_j$, then $(A\vec{x})_i = (A\vec{y})_i$;
  • otherwise, if such a $j \in C$ exists, then once all other entries in $A_i$ are fixed, exactly one of the two choices for $A_{ij}$ makes $(A\vec{x})_i = (A\vec{y})_i$.
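The two observations translate directly into the probability that one coordinate of the sketch separates $\vec{x}$ from $\vec{y}$. A short worked derivation, where the second line assumes $p$ is chosen so that $1 - 2p = 2^{-1/r}$:

```latex
\begin{align*}
\Pr\bigl[(A\vec{x})_i \ne (A\vec{y})_i\bigr]
  &= \frac{1}{2}\Bigl(1 - (1-2p)^{\mathrm{dist}(\vec{x},\vec{y})}\Bigr), \\[4pt]
\mathrm{dist}(\vec{x},\vec{y}) \le r
  &\;\Longrightarrow\; \Pr\bigl[(A\vec{x})_i \ne (A\vec{y})_i\bigr] \le \frac{1}{2}\bigl(1 - 2^{-1}\bigr) = \frac{1}{4}, \\[4pt]
\mathrm{dist}(\vec{x},\vec{y}) > cr
  &\;\Longrightarrow\; \Pr\bigl[(A\vec{x})_i \ne (A\vec{y})_i\bigr] > \frac{1}{2}\bigl(1 - 2^{-c}\bigr) = \frac{1}{2} - 2^{-(c+1)} .
\end{align*}
```

The first line holds because $C$ misses all differing coordinates with probability $(1-2p)^{\mathrm{dist}(\vec{x},\vec{y})}$, and otherwise the two bits differ with probability exactly 1/2.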

slide-83
SLIDE 83

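Random $k \times d$ Boolean matrix $A$ with i.i.d. entries from Bernoulli($p$); computation over GF(2): $(A\vec{x})_i = \sum_{j=1}^{d} A_{ij} \cdot \vec{x}_j \bmod 2$.

Choose $p$ to satisfy $1 - 2p = 2^{-1/r}$.

$\mathrm{dist}(A\vec{x}, A\vec{y}) = X = \sum_{i=1}^{k} X_i$, where $X_i = 1$ if $(A\vec{x})_i \ne (A\vec{y})_i$ and $X_i = 0$ otherwise. The $X_i$ are independent.

Choose

$$s = \frac{\left(\frac{1}{4} + \frac{1}{2} - 2^{-(c+1)}\right) k}{2} = \left(\frac{3}{8} - 2^{-(c+2)}\right) k$$

Chernoff bound: let $X_1, X_2, \ldots, X_k \in \{0, 1\}$ be independent r.v.s and let $X = \sum_{i=1}^{k} X_i$. Then for any deviation $t > 0$:

$$\Pr[X \ge \mathbb{E}[X] + t] \le \exp\left(-\frac{2t^2}{k}\right), \qquad \Pr[X \le \mathbb{E}[X] - t] \le \exp\left(-\frac{2t^2}{k}\right)$$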

slide-87
SLIDE 87

$(c, r)$-ANN in the Hamming space $\{0,1\}^d$, with data $y_1, y_2, \ldots, y_n \in \{0,1\}^d$ and query $\vec{x} \in \{0,1\}^d$:

Let $k = \dfrac{\ln n}{\left(1/8 - 2^{-(c+2)}\right)^2}$, $p = \dfrac{1 - 2^{-1/r}}{2}$ and $s = \left(\dfrac{3}{8} - 2^{-(c+2)}\right) k$.
Sample a $k \times d$ Boolean matrix $A$ with i.i.d. entries from Bernoulli($p$).
For $i = 1, 2, \ldots, n$: let $z_i = A y_i \in \{0,1\}^k$ over the finite field GF(2).
Store all $s$-balls $B_s(u) = \{ y_i \mid \mathrm{dist}(u, z_i) \le s \}$ for all $u \in \{0,1\}^k$.
Upon a query $\vec{x} \in \{0,1\}^d$: retrieve $B_s(A\vec{x})$. If $B_s(A\vec{x}) = \emptyset$, return "no"; else return any $y_i \in B_s(A\vec{x})$.

Space: $O(n \cdot 2^k) = n^{O(1)}$
Query time: $O(dk) = O(d \log n)$ computation + $O(1)$ memory access

This solves $(c, r)$-ANN w.h.p.
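A compact sketch of this scheme is given below, with one deliberate simplification that should be read as an assumption: instead of precomputing the table of all $s$-balls (which needs $O(n \cdot 2^k)$ space), the query scans the $n$ stored sketches. Points are encoded as Python integers used as bit vectors, the names (`choose_parameters`, `HammingANN`, and so on) are hypothetical, and the formula for `k` uses the squared gap as an assumption about the intended constant.

```python
import math
import random

def choose_parameters(n, c, r):
    """Parameters k, p, s for (c, r)-ANN; the squared gap in k is an assumption."""
    gap = 1.0 / 8.0 - 2.0 ** -(c + 2)
    k = max(1, math.ceil(math.log(n) / gap ** 2))
    p = (1.0 - 2.0 ** (-1.0 / r)) / 2.0
    s = (3.0 / 8.0 - 2.0 ** -(c + 2)) * k
    return k, p, s

def parity(v):
    """Sum of the bits of v modulo 2 (addition in GF(2))."""
    return bin(v).count("1") & 1

def sketch(rows, x):
    """Compute A x over GF(2); rows of A and the vector x are bit masks."""
    z = 0
    for i, row in enumerate(rows):
        z |= parity(row & x) << i
    return z

def hamming(a, b):
    return bin(a ^ b).count("1")

class HammingANN:
    def __init__(self, points, c, r, d, seed=0):
        rng = random.Random(seed)
        self.k, self.p, self.s = choose_parameters(len(points), c, r)
        # Each of the k rows has d i.i.d. Bernoulli(p) bits.
        self.rows = [sum((rng.random() < self.p) << j for j in range(d))
                     for _ in range(self.k)]
        self.table = [(sketch(self.rows, y), y) for y in points]

    def query(self, x):
        """Return a point whose sketch is within distance s of A x, else None."""
        zx = sketch(self.rows, x)
        for z, y in self.table:  # linear scan replaces the s-ball lookup table
            if hamming(zx, z) <= self.s:
                return y
        return None
```

Because the sketch is linear over GF(2), `sketch(rows, x ^ y)` equals `sketch(rows, x) ^ sketch(rows, y)`, which is what makes the distance between two sketches depend only on the positions where the original points differ.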

slide-89
SLIDE 89

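Locality-Sensitive Hashing (LSH)

Given a metric space $(X, \mathrm{dist})$, a random $h: X \to U$ drawn from $\mathcal{H}$ is an $(r, cr, p, q)$-LSH, with $p > q$, if for all $\vec{x}, \vec{y} \in X$:

  • $\mathrm{dist}(\vec{x}, \vec{y}) \le r \implies \Pr[h(\vec{x}) = h(\vec{y})] \ge p$
  • $\mathrm{dist}(\vec{x}, \vec{y}) > cr \implies \Pr[h(\vec{x}) = h(\vec{y})] \le q$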

slide-91
SLIDE 91

Locality-Sensitive Hashing (LSH)

If there exists an $(r, cr, p, q)$-LSH $h: X \to U$, then there exists an $(r, cr, p^k, q^k)$-LSH $g: X \to U^k$.

Independently draw $h_1, h_2, \ldots, h_k$ according to the distribution of $h$, and let $g(x) = (h_1(x), h_2(x), \ldots, h_k(x)) \in U^k$.
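The concatenation step can be illustrated with a concrete hash family. The sketch below uses bit sampling for Hamming space, where $h(x) = x_j$ for a uniformly random coordinate $j$; this family is chosen here purely as an example and is not named in the slides, and the function names are hypothetical.

```python
import random

def draw_bit_sampling_hash(d, rng):
    """Draw h(x) = x[j] for a uniformly random coordinate j in {0, ..., d-1}."""
    j = rng.randrange(d)
    return lambda x: x[j]

def draw_concatenated_hash(k, d, rng):
    """Draw g(x) = (h_1(x), ..., h_k(x)) from k independent copies of h."""
    hs = [draw_bit_sampling_hash(d, rng) for _ in range(k)]
    return lambda x: tuple(h(x) for h in hs)
```

For bit sampling, two points at Hamming distance $t$ collide under a single $h$ with probability $1 - t/d$, so they collide under $g$ with probability $(1 - t/d)^k$, matching the $p^k$ versus $q^k$ amplification.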

slide-96
SLIDE 96

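$(c, r)$-ANN

Setup: metric space $(X, \mathrm{dist})$
Data: $n$ points $y_1, y_2, \dots, y_n \in X$
Query: given a point $x \in X$:

  • Return a $y_i$ s.t. $\mathrm{dist}(x, y_i) \le cr$ if $\exists y_j$: $\mathrm{dist}(x, y_j) \le r$
  • Return “no” if $\forall y_j$: $\mathrm{dist}(x, y_j) > cr$
  • Arbitrary answer otherwise

Suppose we have an $(r, cr, p^*, 1/n)$-LSH $g: X \to U$, that is, for all $x, y \in X$:

  • $\mathrm{dist}(x, y) \le r \implies \Pr[g(x) = g(y)] \ge p^*$
  • $\mathrm{dist}(x, y) > cr \implies \Pr[g(x) = g(y)] \le 1/n$

Store $y_1, y_2, \dots, y_n$ in nondecreasing order of $g(y_i)$.
Upon query $x \in X$: find all $y_i$ such that $g(x) = g(y_i)$ by binary search. If we encounter some $y_i$ such that $\mathrm{dist}(x, y_i) \le cr$ then return this $y_i$; otherwise return “no”.

If the real answer is “no”: always correct.
If the real answer is “yes”: correct with probability at least $p^*$.

Space: $O(n)$
Time: $O(\log n)$ + $O(1)$ in expectation, since the expected number of far points colliding with $x$ is at most $n \cdot 1/n = 1$.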
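A minimal sketch of the single sorted table, assuming hash values are orderable so that binary search applies. The names `build_table` and `query` are hypothetical, and the toy hash `x // 4` on integers is a deterministic stand-in for a drawn LSH function, used only to exercise the lookup logic.

```python
from bisect import bisect_left, bisect_right

def build_table(points, g):
    """Sort the points by hash value; return parallel lists (keys, points)."""
    pairs = sorted(((g(y), y) for y in points), key=lambda kp: kp[0])
    return [k for k, _ in pairs], [y for _, y in pairs]

def query(table, g, x, dist, c, r):
    """Return some y with g(y) == g(x) and dist(x, y) <= c*r, else None ("no")."""
    keys, pts = table
    key = g(x)
    # Binary search for the block of stored points whose hash equals g(x).
    lo, hi = bisect_left(keys, key), bisect_right(keys, key)
    for y in pts[lo:hi]:
        if dist(x, y) <= c * r:
            return y
    return None

# Toy example on the integer line (assumed setup, not from the slides).
g = lambda v: v // 4
dist = lambda a, b: abs(a - b)
table = build_table([1, 9, 22, 40], g)

print(query(table, g, 10, dist, c=2, r=1))   # 9 shares bucket 2 with 10
print(query(table, g, 30, dist, c=2, r=1))   # bucket 7 is empty
```

Because a point is returned only after its distance is checked, a “no” instance can never produce a wrong answer, which matches the one-sided guarantee above.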

slide-106
SLIDE 106

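$(c, r)$-ANN in metric space $(X, \mathrm{dist})$, given an $(r, cr, p^*, 1/n)$-LSH $g: X \to U$
Data: $n$ points $y_1, y_2, \dots, y_n \in X$
Query: some point $x \in X$

Let $l = 1/p^*$, independently draw $g_1, g_2, \dots, g_l$.
Maintain $l$ sorted tables: for $j = 1, 2, \dots, l$, store $y_1, y_2, \dots, y_n$ in table-$j$ in nondecreasing order of $g_j(y_i)$.
Upon query $x \in X$: find the first $10 \cdot l$ such $y_i$ that $\exists j$: $g_j(x) = g_j(y_i)$ by binary search. If we encounter some $y_i$ such that $\mathrm{dist}(x, y_i) \le cr$ then return this $y_i$; otherwise return “no”.

If the real answer is “no”: always correct.
If there exists $y_s$ such that $\mathrm{dist}(x, y_s) \le r$, then

$\Pr[\text{answer “no”}] \le \Pr[\forall j:\ g_j(x) \ne g_j(y_s)] + \Pr[\text{there exist } 10l \text{ bad } y_i \text{ with } \mathrm{dist}(x, y_i) > cr \text{ yet } \exists j \text{ s.t. } g_j(x) = g_j(y_i)]$

where

  • $\Pr[\forall j:\ g_j(x) \ne g_j(y_s)] \le (1 - p^*)^l \le 1/e$
  • $\Pr[\text{there exist } 10l \text{ bad } y_i] \le \dfrac{\mathbb{E}[\text{number of such bad } y_i]}{10l} \le \dfrac{n \cdot l \cdot 1/n}{10l} = 0.1$ (Markov's inequality)

Hence $\Pr[\text{answer “no”}] \le 1/e + 0.1 < 0.5$.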
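The two terms of the bound can be checked numerically. The sketch below is only a sanity check: `failure_bound` is a hypothetical helper, and rounding $l = 1/p^*$ up to an integer is an assumption made here so that $l$ counts tables.

```python
import math

def failure_bound(p_star, n):
    """Return (miss_close, too_many_bad, total) for l = ceil(1/p*) tables."""
    l = math.ceil(1 / p_star)
    # Probability that the close point y_s collides with x in none of the l tables.
    miss_close = (1 - p_star) ** l
    # Markov: E[#bad collisions] <= n * l * (1/n), compared against the 10*l budget.
    too_many_bad = (n * l * (1 / n)) / (10 * l)
    return miss_close, too_many_bad, miss_close + too_many_bad

for p_star in (0.5, 0.1, 0.01):
    print(p_star, failure_bound(p_star, n=1000))
```

For every value of $p^*$ tried, the first term stays below $1/e \approx 0.368$ and the total stays below $0.5$, in line with the derivation.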

slide-108
SLIDE 108

$(c, r)$-ANN in metric space $(X, \mathrm{dist})$, given an $(r, cr, p^*, 1/n)$-LSH $g: X \to U$
Data: $n$ points $y_1, y_2, \dots, y_n \in X$
Query: some point $x \in X$

With $l = 1/p^*$ independently drawn $g_1, g_2, \dots, g_l$, one sorted table per $g_j$, and at most $10 \cdot l$ colliding candidates examined per query:

If the real answer is “no”: always correct.
If there exists $y_s$ such that $\mathrm{dist}(x, y_s) \le r$, then $\Pr[\text{answer “no”}] < 0.5$.

Space: $O(nl) = O(n / p^*)$
Time: $O(l \cdot \log n) = O((\log n) / p^*)$
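The multi-table scheme can be sketched as follows. This is an illustrative implementation under assumptions: `MultiTableLSH` is a hypothetical class name, hash values must be orderable, and the shifted-grid hashes in the example are deterministic stand-ins for independently drawn $g_j$.

```python
from bisect import bisect_left, bisect_right

class MultiTableLSH:
    """l sorted tables, one per hash function g_j; examines at most 10*l candidates."""

    def __init__(self, points, hash_functions, dist, c, r):
        self.gs = list(hash_functions)
        self.dist, self.c, self.r = dist, c, r
        self.tables = []
        for g in self.gs:
            pairs = sorted(((g(y), y) for y in points), key=lambda kp: kp[0])
            self.tables.append(([k for k, _ in pairs], [y for _, y in pairs]))

    def query(self, x):
        budget = 10 * len(self.gs)          # give up after 10*l far candidates
        for g, (keys, pts) in zip(self.gs, self.tables):
            key = g(x)
            lo, hi = bisect_left(keys, key), bisect_right(keys, key)
            for y in pts[lo:hi]:            # points colliding with x in this table
                if self.dist(x, y) <= self.c * self.r:
                    return y
                budget -= 1                 # a point seen in two tables counts twice
                if budget == 0:
                    return None
        return None                         # "no"

# Toy example: two shifted grids on the integer line (assumed, not a real LSH draw).
gs = [lambda v: v // 4, lambda v: (v + 2) // 4]
index = MultiTableLSH([1, 9, 22, 40], gs, dist=lambda a, b: abs(a - b), c=2, r=1)
print(index.query(7))    # misses 9 in the first table, finds it in the second
print(index.query(30))   # no collision in either table
```

The example shows why several tables help: the query 7 and the point 9 fall into different buckets under the first hash but collide under the second.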

slide-109
SLIDE 109

$(c, r)$-ANN

Setup: metric space $(X, \mathrm{dist})$
Data: $n$ points $y_1, y_2, \dots, y_n \in X$
Query: given a point $x \in X$:

  • Return a $y_i$ s.t. $\mathrm{dist}(x, y_i) \le cr$ if $\exists y_j$: $\mathrm{dist}(x, y_j) \le r$
  • Return “no” if $\forall y_j$: $\mathrm{dist}(x, y_j) > cr$
  • Arbitrary answer otherwise

Suppose we have an $(r, cr, p, q)$-LSH $h: X \to U$.
Then we have an $(r, cr, p^k, 1/n)$-LSH $g: X \to U^k$ where $k = \log_{1/q} n$, implying

$p^k = p^{\log_{1/q} n} = n^{-\rho}, \qquad \rho = \dfrac{\log p}{\log q}$

Hence we can solve $(c, r)$-ANN with space $O(n^{1+\rho})$, query time $O(n^{\rho} \cdot \log n)$, and one-sided error $< 0.5$.
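To get a feel for the parameters, the following sketch computes $k$, $\rho$ and the number of tables $l = 1/p^k$ for concrete values. `lsh_parameters` is a hypothetical helper, and rounding $k$ and $l$ up to integers is an assumption made here; the numbers $n = 1000$, $p = 0.9$, $q = 0.5$ are made up for illustration.

```python
import math

def lsh_parameters(n, p, q):
    """Return (k, rho, l) for amplifying an (r, cr, p, q)-LSH on n points."""
    k = math.ceil(math.log(n) / math.log(1 / q))   # smallest k with q**k <= 1/n
    rho = math.log(p) / math.log(q)                # exponent in n**(-rho)
    l = math.ceil(1 / p ** k)                      # number of tables, l = 1/p*
    return k, rho, l

k, rho, l = lsh_parameters(n=1000, p=0.9, q=0.5)
print(k, round(rho, 3), l)
```

The closer $p$ is to 1 relative to $q$, the smaller $\rho$ becomes, and with it both the $n^{1+\rho}$ space and the $n^{\rho} \log n$ query time.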

slide-115
SLIDE 115

$(c, r)$-ANN in Hamming Space $\{0,1\}^d$

Data: $n$ points $y_1, y_2, \dots, y_n \in \{0,1\}^d$
Query: given a point $x \in \{0,1\}^d$:

  • Return a $y_i$ s.t. $\mathrm{dist}(x, y_i) \le cr$ if $\exists y_j$: $\mathrm{dist}(x, y_j) \le r$
  • Return “no” if $\forall y_j$: $\mathrm{dist}(x, y_j) > cr$
  • Arbitrary answer otherwise

$\mathcal{H} = \{\, h_i \mid h_i(x) = x_i \text{ for } i = 1, 2, \dots, d \,\}$, and $h$ is chosen uniformly at random from $\mathcal{H}$.

We have an $(r, cr, 1 - r/d, 1 - cr/d)$-LSH $h: \{0,1\}^d \to \{0,1\}$, with

$\rho = \dfrac{\log(1 - r/d)}{\log(1 - cr/d)} \le \dfrac{1}{c}$

We can solve $(c, r)$-ANN in Hamming space with space $O(n^{1+1/c})$, query time $O(n^{1/c} \cdot \log n)$, and one-sided error $< 0.5$.
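The bit-sampling family is easy to write down. The sketch below is an illustration with hypothetical function names: it draws $g$ as $k$ independently sampled coordinates, computes the exact single-coordinate collision probability $1 - \mathrm{dist}(x, y)/d$, and evaluates $\rho$ for made-up values of $d$, $r$ and $c$ (it assumes $cr < d$ so the logarithm is defined). The inequality $\rho \le 1/c$ follows from Bernoulli's inequality $(1 - r/d)^c \ge 1 - cr/d$.

```python
import math
import random

def hamming(x, y):
    """Hamming distance between two equal-length bit tuples."""
    return sum(a != b for a, b in zip(x, y))

def draw_bit_sampling_hash(d, k, rng):
    """g(x) = (x[i_1], ..., x[i_k]) for k independent uniform coordinates."""
    idx = [rng.randrange(d) for _ in range(k)]
    return lambda x: tuple(x[i] for i in idx)

def collision_probability(x, y):
    """Exact Pr[h(x) = h(y)] for one uniformly sampled coordinate."""
    return 1 - hamming(x, y) / len(x)

def rho(d, r, c):
    """rho = log(1 - r/d) / log(1 - c*r/d); requires c*r < d."""
    return math.log(1 - r / d) / math.log(1 - c * r / d)

rng = random.Random(1)
g = draw_bit_sampling_hash(d=8, k=4, rng=rng)
x = (1, 0, 1, 1, 0, 0, 1, 0)
y = (1, 0, 1, 0, 0, 0, 1, 1)            # Hamming distance 2 from x
print(g(x), g(y), collision_probability(x, y))
print(rho(d=100, r=5, c=2))              # stays below 1/c = 0.5
```

With $d = 100$, $r = 5$, $c = 2$ the exponent comes out slightly below $1/2$, so the $n^{1/c}$ bounds in the statement are a safe upper estimate.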

slide-116
SLIDE 116

Recap

Dimension reduction (low-distortion metric embedding)

  • Johnson-Lindenstrauss Theorem: in Euclidean space, it is easy to embed a set of $n$ points in arbitrary dimension to $O(\log n)$ dimension with constant distortion.

Nearest neighbor search

  • The exact version of it is difficult in high-dimensional space.
  • Approximation and randomization help.
  • If we can solve $(c, r)$-ANN (Approximate Near Neighbor), then we can solve $c$-ANN (Approximate Nearest Neighbor) with limited overhead.
  • Locality-sensitive hashing (LSH) is a powerful tool to solve $(c, r)$-ANN. (Collisions could be helpful for hashing.)