[PPT] - Dimension Reduction and Nearest Neighbor Search Advanced PowerPoint Presentation

SLIDE 1

Dimension Reduction and Nearest Neighbor Search

Advanced Algorithms Nanjing University, Fall 2018

SLIDE 3

Dimension reduction: Why do we care?

High-dimensional data are common, yet working on them directly is expensive.

Or, we just want a map!

low-distortion metric embedding

SLIDE 5

Dimension reduction: What do we want?

Usually we want 𝑙 ≪ 𝑒.
How small can 𝑙 be?
For what distance ‖⋅‖?
The embedding should be efficiently constructible.

SLIDE 7

The JLT (Johnson-Lindenstrauss Theorem)

“In Euclidean space, it is always possible to embed a set of 𝑜 points in arbitrary dimension into 𝑃(log 𝑜) dimensions with constant distortion.”

Theorem (Johnson-Lindenstrauss 1984):

∀ 0 < 𝜗 < 1, for any set 𝑇 of 𝑜 points from ℝ^𝑒, there is a 𝜚: ℝ^𝑒 → ℝ^𝑙 with 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜), such that ∀ 𝑦_𝑗, 𝑦_𝑘 ∈ 𝑇:

(1 − 𝜗) ‖𝑦_𝑗 − 𝑦_𝑘‖₂² ≤ ‖𝜚(𝑦_𝑗) − 𝜚(𝑦_𝑘)‖₂² ≤ (1 + 𝜗) ‖𝑦_𝑗 − 𝑦_𝑘‖₂²

SLIDE 10

Images extracted from https://graphics.stanford.edu/courses/cs468-06-fall/Slides/aneesh-michael.pdf

SLIDE 11

Theorem (Johnson-Lindenstrauss 1984):

∀ 0 < 𝜗 < 1, for any set 𝑇 of 𝑜 points from ℝ^𝑒, there is a 𝜚: ℝ^𝑒 → ℝ^𝑙 with 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜), such that ∀ 𝑦_𝑗, 𝑦_𝑘 ∈ 𝑇:

(1 − 𝜗) ‖𝑦_𝑗 − 𝑦_𝑘‖₂² ≤ ‖𝜚(𝑦_𝑗) − 𝜚(𝑦_𝑘)‖₂² ≤ (1 + 𝜗) ‖𝑦_𝑗 − 𝑦_𝑘‖₂²

“Even better, it is very easy to find such 𝜚”
“Just sample a random 𝑙 × 𝑒 matrix 𝐵”

How to construct (sample) 𝐵?

Project onto a uniform random 𝑙-dimensional subspace; (Johnson-Lindenstrauss; Dasgupta-Gupta)
Independent Gaussian entries; (Indyk-Motwani)
Simply i.i.d. +1/−1 entries. (Achlioptas)

SLIDE 14

Choose some suitable 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜).
For each entry 𝑏_{𝑗,𝑘} of 𝐵 ∈ ℝ^(𝑙×𝑒): independently sample 𝑏_{𝑗,𝑘} from the Gaussian distribution 𝑶(0, 1/𝑙).
For each 𝑚 ∈ [𝑜]: let 𝑧_𝑚 = 𝐵𝑦_𝑚.

Gaussian distribution (a.k.a. normal distribution) 𝑶(𝜈, 𝜏²): 𝔽[𝑌] = 𝜈, Var[𝑌] = 𝜏², and

Pr[𝑌 ≤ 𝑢] = ∫_{−∞}^{𝑢} 1/√(2π𝜏²) ⋅ exp(−(𝑦 − 𝜈)²/(2𝜏²)) d𝑦
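The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy sizes (6 points, 1000 dimensions reduced to 400) and the fixed seed are made up for the example, and `k`, `d` in the code play the roles of the slides' 𝑙 and 𝑒.

```python
import math
import random

def gaussian_projection(points, k, rng):
    """Project each d-dimensional point to k dimensions with a random Gaussian matrix."""
    d = len(points[0])
    sigma = 1.0 / math.sqrt(k)  # standard deviation, so each entry has variance 1/k
    matrix = [[rng.gauss(0.0, sigma) for _ in range(d)] for _ in range(k)]
    return [[sum(a * x for a, x in zip(row, p)) for row in matrix] for p in points]

def sq_dist(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

rng = random.Random(0)
points = [[rng.uniform(-1, 1) for _ in range(1000)] for _ in range(6)]
images = gaussian_projection(points, 400, rng)

# Ratio of squared distances after / before projection, for every pair of points
ratios = [sq_dist(images[i], images[j]) / sq_dist(points[i], points[j])
          for i in range(6) for j in range(i + 1, 6)]
print(min(ratios), max(ratios))
```

The printed ratios should cluster around 1, which is the distortion guarantee in empirical form.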

SLIDE 19

Norm Preservation

∀ 0 < 𝜗 < 1: let 𝐵 be an 𝑙 × 𝑒 matrix with 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜). If each entry of 𝐵 is independently drawn from 𝑶(0, 1/𝑙), then with high probability, for all 𝑦_𝑗, 𝑦_𝑘 ∈ 𝑇:

(1 − 𝜗) ‖𝑦_𝑗 − 𝑦_𝑘‖₂² ≤ ‖𝐵𝑦_𝑗 − 𝐵𝑦_𝑘‖₂² ≤ (1 + 𝜗) ‖𝑦_𝑗 − 𝑦_𝑘‖₂²

Equivalently, where (𝑦_𝑗 − 𝑦_𝑘)/‖𝑦_𝑗 − 𝑦_𝑘‖₂ is a unit vector:

1 − 𝜗 ≤ ‖𝐵 (𝑦_𝑗 − 𝑦_𝑘)/‖𝑦_𝑗 − 𝑦_𝑘‖₂ ‖₂² ≤ 1 + 𝜗

For any unit vector 𝑣 ∈ ℝ^𝑒: Pr[ |‖𝐵𝑣‖₂² − 1| > 𝜗 ] < 1/𝑜³

Then take a union bound over the 𝑃(𝑜²) pairs 𝑦_𝑗, 𝑦_𝑘 ∈ 𝑇.
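The union-bound step can be spelled out as follows. This is a short sketch in the slides' own letters (𝑜 points, per-pair failure probability below 1/𝑜³), not a transcription of a slide:

```latex
\Pr\bigl[\exists\, j<k:\ \text{the pair } (y_j, y_k) \text{ is distorted by more than } \vartheta\bigr]
\;\le\; \binom{o}{2}\cdot\frac{1}{o^{3}}
\;<\; \frac{o^{2}}{2}\cdot\frac{1}{o^{3}}
\;=\; \frac{1}{2o}.
```

So all pairs are preserved simultaneously with probability greater than 1 − 1/(2𝑜).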

SLIDE 20

Let 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜) and 𝐵 ∈ ℝ^(𝑙×𝑒), with each entry of 𝐵 chosen i.i.d. from 𝑶(0, 1/𝑙). Then for any unit vector 𝑣 ∈ ℝ^𝑒:

Pr[ |‖𝐵𝑣‖₂² − 1| > 𝜗 ] < 1/𝑜³

SLIDE 26

Let 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜) and 𝐵 ∈ ℝ^(𝑙×𝑒), with each entry of 𝐵 chosen i.i.d. from 𝑶(0, 1/𝑙). Then for any unit vector 𝑣 ∈ ℝ^𝑒:

Pr[ |‖𝐵𝑣‖₂² − 1| > 𝜗 ] < 1/𝑜³

Each 𝐵_{𝑗𝑘} is chosen i.i.d. from 𝑶(0, 1/𝑙)
A linear combination of independent Gaussian r.v.s is also Gaussian:
𝑌 ~ 𝑶(𝜈_𝑌, 𝜏_𝑌²), 𝑍 ~ 𝑶(𝜈_𝑍, 𝜏_𝑍²) ⟹ 𝑏𝑌 + 𝑐𝑍 ~ 𝑶(𝑏𝜈_𝑌 + 𝑐𝜈_𝑍, 𝑏²𝜏_𝑌² + 𝑐²𝜏_𝑍²)

𝑣 is a unit vector

Moreover, the entries (𝐵𝑣)_𝑗 are mutually independent!
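The facts listed above combine into the following computation. It is a sketch of the intended step, since the slide's own formula was not preserved; here \mathcal{N} is the Gaussian distribution and \mathbb{E} the expectation (printed 𝑶 and 𝔽 in this transcript):

```latex
(Bv)_j \;=\; \sum_{k=1}^{e} B_{jk}\, v_k
\;\sim\; \mathcal{N}\!\Bigl(0,\ \tfrac{1}{l}\sum_{k=1}^{e} v_k^{2}\Bigr)
\;=\; \mathcal{N}\!\bigl(0,\ \tfrac{1}{l}\bigr),
\qquad
\mathbb{E}\bigl[\lVert Bv\rVert_2^{2}\bigr]
\;=\; \sum_{j=1}^{l}\mathbb{E}\bigl[(Bv)_j^{2}\bigr]
\;=\; l\cdot\tfrac{1}{l} \;=\; 1 .
```

The variance simplifies because 𝑣 is a unit vector, so the squared norm is correct in expectation.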

SLIDE 30

Let 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜) and 𝐵 ∈ ℝ^(𝑙×𝑒), with each entry of 𝐵 chosen i.i.d. from 𝑶(0, 1/𝑙). Then for any unit vector 𝑣 ∈ ℝ^𝑒:

Pr[ |‖𝐵𝑣‖₂² − 1| > 𝜗 ] < 1/𝑜³

In terms of expectation we are fine, but how fast do we deviate from the expectation?

SLIDE 35

Let 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜) and 𝐵 ∈ ℝ^(𝑙×𝑒), with each entry of 𝐵 chosen i.i.d. from 𝑶(0, 1/𝑙). Then for any unit vector 𝑣 ∈ ℝ^𝑒:

Pr[ |‖𝐵𝑣‖₂² − 1| > 𝜗 ] < 1/𝑜³

Chernoff bound for the χ²-distribution: For i.i.d. 𝑌_1, 𝑌_2, ⋯, 𝑌_𝑙 ~ 𝑶(0,1) and 0 < 𝜗 < 1:

Pr[ |(1/𝑙) Σ_{𝑗=1}^{𝑙} 𝑌_𝑗² − 1| > 𝜗 ] < 2 exp(−𝑙𝜗²/8)

Notice 𝑌_𝑗 = √𝑙 ⋅ 𝑍_𝑗, which rescales an 𝑶(0, 1/𝑙) variable to 𝑶(0, 1).

SLIDE 37

Let 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜) and 𝐵 ∈ ℝ^(𝑙×𝑒), with each entry of 𝐵 chosen i.i.d. from 𝑶(0, 1/𝑙). Then for any unit vector 𝑣 ∈ ℝ^𝑒:

Pr[ |‖𝐵𝑣‖₂² − 1| > 𝜗 ] < 1/𝑜³

Chernoff bound for the χ²-distribution: For i.i.d. 𝑌_1, 𝑌_2, ⋯, 𝑌_𝑙 ~ 𝑶(0,1) and 0 < 𝜗 < 1:

Pr[ |(1/𝑙) Σ_{𝑗=1}^{𝑙} 𝑌_𝑗² − 1| > 𝜗 ] < 2 exp(−𝑙𝜗²/8)

For suitable 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜), the Chernoff bound gives the claimed 1/𝑜³.
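To make "suitable" concrete, one can solve 2 exp(−𝑙𝜗²/8) < 1/𝑜³ for 𝑙. The helper below is a hypothetical illustration (the name `jl_dimension` and the constants come only from the bound quoted above, with `n`, `eps`, `k` standing for the slides' 𝑜, 𝜗, 𝑙); other proofs give different constants.

```python
import math

def jl_dimension(n, eps):
    """Smallest integer k with 2*exp(-k*eps^2/8) < n^-3, i.e. k > 8*ln(2*n^3)/eps^2."""
    return math.floor(8 * math.log(2 * n ** 3) / eps ** 2) + 1

k = jl_dimension(1000, 0.5)
print(k, 2 * math.exp(-k * 0.5 ** 2 / 8) < 1000 ** -3)
```

Note that the result depends on the number of points and on the distortion, but not on the original dimension.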

SLIDE 38

Chernoff bound for the χ²-distribution: For i.i.d. 𝑌_1, 𝑌_2, ⋯, 𝑌_𝑙 ~ 𝑶(0,1) and 0 < 𝜗 < 1:

Pr[ |(1/𝑙) Σ_{𝑗=1}^{𝑙} 𝑌_𝑗² − 1| > 𝜗 ] < 2 exp(−𝑙𝜗²/8)

SLIDE 43

Chernoff bound for the χ²-distribution: For i.i.d. 𝑌_1, 𝑌_2, ⋯, 𝑌_𝑙 ~ 𝑶(0,1) and 0 < 𝜗 < 1:

Pr[ |(1/𝑙) Σ_{𝑗=1}^{𝑙} 𝑌_𝑗² − 1| > 𝜗 ] < 2 exp(−𝑙𝜗²/8)

If 𝑌 ~ 𝑶(0,1) and 𝑡 < 1/2, then 𝔽[exp(𝑡𝑌²)] = 1/√(1 − 2𝑡)

SLIDE 48

Chernoff bound for the χ²-distribution: For i.i.d. 𝑌_1, 𝑌_2, ⋯, 𝑌_𝑙 ~ 𝑶(0,1) and 0 < 𝜗 < 1:

Pr[ |(1/𝑙) Σ_{𝑗=1}^{𝑙} 𝑌_𝑗² − 1| > 𝜗 ] < 2 exp(−𝑙𝜗²/8)

If 𝑌 ~ 𝑶(0,1) and 𝑡 < 1/2, then 𝔽[exp(𝑡𝑌²)] = 1/√(1 − 2𝑡)

when 𝜇 ≤ 1/4
let 𝜇 = 𝜗/4
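The intermediate steps of this proof were not preserved in the transcript. One standard way the moment generating function and the two annotations fit together, shown here for the upper tail only and offered as a sketch rather than the slides' exact derivation, is:

```latex
\begin{align*}
\Pr\Bigl[\sum_{j=1}^{l} Y_j^{2} > (1+\vartheta)\,l\Bigr]
  &\le \exp\bigl(-\mu(1+\vartheta)l\bigr)\,(1-2\mu)^{-l/2}
     && \text{(Markov on } \exp(\mu\textstyle\sum_j Y_j^2)\text{, independence)} \\
  &\le \exp\bigl(-\mu(1+\vartheta)l + (\mu+2\mu^{2})\,l\bigr)
     && \text{(when } \mu \le 1/4:\ -\tfrac12\ln(1-2\mu) \le \mu+2\mu^{2}\text{)} \\
  &= \exp\bigl(-\mu\vartheta l + 2\mu^{2} l\bigr)
   = \exp\bigl(-l\vartheta^{2}/8\bigr)
     && \text{(let } \mu = \vartheta/4\text{)}.
\end{align*}
```

The lower tail is handled symmetrically, which accounts for the factor 2 in the stated bound.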

SLIDE 49

Theorem (Johnson-Lindenstrauss 1984):

∀ 0 < 𝜗 < 1, for any set 𝑇 of 𝑜 points from ℝ^𝑒, there is a 𝜚: ℝ^𝑒 → ℝ^𝑙 with 𝑙 ∈ 𝑃(𝜗⁻² ⋅ log 𝑜), such that ∀ 𝑦_𝑗, 𝑦_𝑘 ∈ 𝑇:

(1 − 𝜗) ‖𝑦_𝑗 − 𝑦_𝑘‖₂² ≤ ‖𝜚(𝑦_𝑗) − 𝜚(𝑦_𝑘)‖₂² ≤ (1 + 𝜗) ‖𝑦_𝑗 − 𝑦_𝑘‖₂²

“Even better, it is very easy to find such 𝜚: just sample a random 𝑙 × 𝑒 matrix 𝐵.”
“The JLT states that in Euclidean space, it is always possible to embed a set of 𝑜 points in arbitrary dimension into 𝑃(log 𝑜) dimensions with constant distortion.”

SLIDE 51

Nearest Neighbor Search (NNS)

Setup: metric space (𝑌, dist), i.e. a set 𝑌 with a distance function dist satisfying the triangle inequality
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: given a point 𝑦 ∈ 𝑌, find the 𝑧_𝑗 which is closest to 𝑦

SLIDE 53

Nearest Neighbor Search (NNS)

Setup: metric space (𝑌, dist)
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: given a point 𝑦 ∈ 𝑌, find the 𝑧_𝑗 which is closest to 𝑦

Applications can be found in:

database systems
pattern recognition
machine learning
bioinformatics
…

SLIDE 55

Nearest Neighbor Search (NNS)

Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑉^𝑒 for some finite 𝑉
Query: given a point 𝑦 ∈ 𝑉^𝑒, find the 𝑧_𝑗 which is closest to 𝑦
Goal: Efficiently answer the query

What efficiency do we care about?

Usually space and time

Trivial solution:

No preprocessing, just linear search

When dimension 𝑒 is small:

Binary search when 𝑒 = 1
k-d tree
Voronoi diagram
…

[Figures: k-d tree; Voronoi diagram]

SLIDE 58

Nearest Neighbor Search (NNS)

Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑉^𝑒 for some finite 𝑉
Query: given a point 𝑦 ∈ 𝑉^𝑒, find the 𝑧_𝑗 which is closest to 𝑦
Goal: Efficiently answer the query

What if dimension 𝑒 is large, say 𝑒 ≫ log 𝑜?

Curse of dimensionality: It is conjectured that solving NNS in high dimension requires either super-polynomial(𝑜) space or super-polynomial(𝑒) time.

Blessing: Randomization + Approximation

SLIDE 59

Approximate Near(est) Neighbor (ANN)

Setup: metric space (𝑌, dist)
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: given a point 𝑦 ∈ 𝑌,

𝒅-ANN (Approximate Nearest Neighbor):

Return a 𝑧_𝑗 s.t. dist(𝑦, 𝑧_𝑗) ≤ 𝑑 ⋅ min_{1≤𝑘≤𝑜} dist(𝑦, 𝑧_𝑘)

(𝒅, 𝒔)-ANN (Approximate Near Neighbor):

Return a 𝑧_𝑗 s.t. dist(𝑦, 𝑧_𝑗) ≤ 𝑑𝑠 if ∃𝑧_𝑘: dist(𝑦, 𝑧_𝑘) ≤ 𝑠
Return “no” if ∀𝑧_𝑘: dist(𝑦, 𝑧_𝑘) > 𝑑𝑠
Arbitrary answer otherwise

SLIDE 64

If we can solve (𝑑, 𝑠)-ANN, then we can solve 𝑑-ANN with little overhead.

𝐸_min = min_{1≤𝑗<𝑘≤𝑜} dist(𝑧_𝑗, 𝑧_𝑘)
𝐸_max = max_{1≤𝑗<𝑘≤𝑜} dist(𝑧_𝑗, 𝑧_𝑘)

𝑆 = { (𝐸_min/2) ⋅ (√𝑑)^(−1), (𝐸_min/2) ⋅ (√𝑑)^0, (𝐸_min/2) ⋅ (√𝑑)^1, ⋯, 𝐸_max }

Let 𝑠* be the min in 𝑆 s.t. (√𝑑, 𝑠*)-ANN returns yes with 𝑧*. Then:

dist(𝑦, 𝑧*) ≤ √𝑑 ⋅ 𝑠*
∀𝑧_𝑗: dist(𝑦, 𝑧_𝑗) > 𝑠*/√𝑑 (the query at the previous radius 𝑠*/√𝑑 did not return yes)

so 𝑧* is within a factor 𝑑 of the nearest neighbor.

SLIDE 65

If we can solve (𝑑, 𝑠)-ANN, then we can solve 𝑑-ANN with little overhead.

If ∀𝑠: (√𝑑, 𝑠)-ANN can be solved with space 𝑡 and query time 𝑢,

then 𝑑-ANN can be solved with space 𝑃(𝑡 ⋅ log_𝑑(𝐸_max/𝐸_min)) and query time 𝑃(𝑢 ⋅ log₂ log_𝑑(𝐸_max/𝐸_min)).
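The reduction can be illustrated with a small Python sketch. Everything here is an assumption made for the example: `near_neighbor` is a brute-force stand-in for a real near-neighbor structure, the radii are tried in increasing order rather than by the binary search that yields the stated log₂ log query time, and the loop is not capped at the largest pairwise distance.

```python
import math

def near_neighbor(points, q, c, r, dist):
    """Brute-force stand-in for a (c, r)-ANN structure: return some point
    within c*r of q if one exists, else None (meaning "no")."""
    for p in points:
        if dist(q, p) <= c * r:
            return p
    return None

def approx_nearest(points, q, c, dist):
    """c-approximate nearest neighbor via near-neighbor queries with factor sqrt(c)."""
    root_c = math.sqrt(c)
    pairs = [dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]]
    d_min = min(pairs)
    assert d_min > 0, "sketch assumes distinct data points"
    r = d_min / (2 * root_c)           # smallest radius of the geometric grid
    while True:                        # radii grow by a factor sqrt(c) each round
        hit = near_neighbor(points, q, root_c, r, dist)
        if hit is not None:
            return hit
        r *= root_c

data = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (7.0, 7.0)]
queries = [(1.0, 1.0), (6.0, 5.0), (20.0, 20.0), (0.0, 0.0)]
answers = [approx_nearest(data, q, 2.0, math.dist) for q in queries]
```

A "no" at one radius followed by a hit at the next pins the nearest distance to within a factor √𝑑 on each side, which is where the overall factor 𝑑 comes from.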

SLIDE 67

Setup: consider Hamming space {0,1}^𝑒
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ {0,1}^𝑒
Query: given a point 𝑦 ∈ {0,1}^𝑒,

(𝒅, 𝒔)-ANN (Approximate Near Neighbor):

Return a 𝑧_𝑗 s.t. dist(𝑦, 𝑧_𝑗) ≤ 𝑑𝑠 if ∃𝑧_𝑘: dist(𝑦, 𝑧_𝑘) ≤ 𝑠
Return “no” if ∀𝑧_𝑘: dist(𝑦, 𝑧_𝑘) > 𝑑𝑠
Arbitrary answer otherwise

Let 𝑙, 𝑞 and 𝑡 be fixed later.
Sample an 𝑙 × 𝑒 Boolean matrix 𝐵 with i.i.d. entries from Bernoulli(𝑞).
For 𝑗 = 1, 2, ⋯, 𝑜: let 𝑨_𝑗 = 𝐵𝑧_𝑗 ∈ {0,1}^𝑙 over the finite field GF(2).
Store all 𝑡-balls 𝐶_𝑡(𝑣) = { 𝑧_𝑗 | dist(𝑣, 𝑨_𝑗) ≤ 𝑡 } for all 𝑣 ∈ {0,1}^𝑙.

Now, upon a query 𝑦 ∈ {0,1}^𝑒:
Retrieve 𝐶_𝑡(𝐵𝑦).
If 𝐶_𝑡(𝐵𝑦) = ∅ return “no”, else return any 𝑧_𝑗 ∈ 𝐶_𝑡(𝐵𝑦).

GF(2): two elements {0,1}, XOR as sum, AND as multiplication. Therefore, (𝑨_𝑗)_𝑘 = (𝐵𝑧_𝑗)_𝑘 = Σ_{𝑚=1}^{𝑒} 𝐵_{𝑘𝑚} ⋅ (𝑧_𝑗)_𝑚 mod 2.

Space: 𝑃(𝑜 ⋅ 2^𝑙)
Query time: 𝑃(𝑒𝑙) computation + 𝑃(1) memory access

SLIDE 70

With the scheme above (random Boolean matrix 𝐵, sketches 𝑨_𝑗 = 𝐵𝑧_𝑗 over GF(2), stored 𝑡-balls):

For suitable 𝑙 ∈ 𝑃(log 𝑜), 𝑞 and 𝑡; ∀ 𝑦, 𝑧 ∈ {0,1}^𝑒:

(𝑑, 𝑠)-ANN is solved w.h.p.

SLIDE 75

random 𝑙 × 𝑒 Boolean matrix 𝐵 with i.i.d. entries ∈ Bernoulli(𝑞)
computation on GF(2): (𝐵𝑦)_𝑗 = Σ_{𝑘=1}^{𝑒} 𝐵_{𝑗𝑘} ⋅ 𝑦_𝑘 mod 2
for suitable 𝑙 ∈ 𝑃(log 𝑜), 𝑞 and 𝑡; ∀ 𝑦, 𝑧 ∈ {0,1}^𝑒:

each row vector 𝐵_𝑗 of 𝐵 has i.i.d. entries ∈ Bernoulli(𝑞)

an alternative view regarding the generation of 𝐵_𝑗:

build 𝐷 ⊆ [𝑒] s.t. each element in [𝑒] is chosen independently with pr. 2𝑞
each coordinate in 𝐷 is independently set to 0 or 1, each with pr. 1/2

Observations:
if 𝑘 ∉ 𝐷 for all coordinates 𝑘 where 𝑦_𝑘 ≠ 𝑧_𝑘, then (𝐵𝑦)_𝑗 = (𝐵𝑧)_𝑗
otherwise, if there exists such 𝑘 ∈ 𝐷, then once all other entries in 𝐵_𝑗 are fixed, exactly one of the two choices for 𝐵_{𝑗𝑘} will make (𝐵𝑦)_𝑗 = (𝐵𝑧)_𝑗
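The two observations determine the probability that one sketch coordinate separates 𝑦 and 𝑧. The formula below is derived from them as a sketch (the slide's own display was not preserved); h abbreviates dist(𝑦, 𝑧), a name introduced only for this block:

```latex
\Pr\bigl[(By)_j \neq (Bz)_j\bigr]
\;=\; \underbrace{\bigl(1-(1-2q)^{h}\bigr)}_{\text{some differing coordinate lies in } D}
      \cdot \tfrac12
\;=\; \tfrac12\bigl(1-(1-2q)^{h}\bigr),
\qquad h = \operatorname{dist}(y,z).
```

The probability grows with h, so close pairs disagree on fewer sketch coordinates than far pairs, which is what the threshold test on sketches exploits.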

SLIDE 83

random 𝑙 × 𝑒 Boolean matrix 𝐵 with i.i.d. entries ∈ Bernoulli(𝑞)
computation on GF(2): (𝐵𝑦)_𝑗 = Σ_{𝑘=1}^{𝑒} 𝐵_{𝑗𝑘} ⋅ 𝑦_𝑘 mod 2
for suitable 𝑙 ∈ 𝑃(log 𝑜), 𝑞 and 𝑡; ∀ 𝑦, 𝑧 ∈ {0,1}^𝑒:

choose 𝑞 to satisfy 1 − 2𝑞 = 2^(−1/𝑠)

dist(𝐵𝑦, 𝐵𝑧) = 𝑌 = Σ_{𝑗=1}^{𝑙} 𝑌_𝑗, where 𝑌_𝑗 = 1 if (𝐵𝑦)_𝑗 ≠ (𝐵𝑧)_𝑗 and 𝑌_𝑗 = 0 otherwise; the 𝑌_𝑗 are independent

choose 𝑡 = ((1/4 + 1/2 − 2^(−(𝑑+1)))/2) ⋅ 𝑙 = (3/8 − 2^(−(𝑑+2))) ⋅ 𝑙

Chernoff bound: Let 𝑌_1, 𝑌_2, ⋯, 𝑌_𝑙 ∈ {0,1} be independent r.v.s and let 𝑌 = Σ_{𝑗=1}^{𝑙} 𝑌_𝑗. Then for 𝑡 > 0:

Pr[𝑌 ≥ 𝔽[𝑌] + 𝑡] ≤ exp(−2𝑡²/𝑙)
Pr[𝑌 ≤ 𝔽[𝑌] − 𝑡] ≤ exp(−2𝑡²/𝑙)

SLIDE 87

Let 𝑙 = ln 𝑜 / (1/8 − 2^(−(𝑑+2))), 𝑞 = (1 − 2^(−1/𝑠))/2 and 𝑡 = (3/8 − 2^(−(𝑑+2))) ⋅ 𝑙.
Sample an 𝑙 × 𝑒 Boolean matrix 𝐵 with i.i.d. entries from Bernoulli(𝑞).
For 𝑗 = 1, 2, ⋯, 𝑜: let 𝑨_𝑗 = 𝐵𝑧_𝑗 ∈ {0,1}^𝑙 over the finite field GF(2).
Store all 𝑡-balls 𝐶_𝑡(𝑣) = { 𝑧_𝑗 | dist(𝑣, 𝑨_𝑗) ≤ 𝑡 } for all 𝑣 ∈ {0,1}^𝑙.

Now, upon a query 𝑦 ∈ {0,1}^𝑒:
Retrieve 𝐶_𝑡(𝐵𝑦).
If 𝐶_𝑡(𝐵𝑦) = ∅ return “no”, else return any 𝑧_𝑗 ∈ 𝐶_𝑡(𝐵𝑦).

Space: 𝑃(𝑜 ⋅ 2^𝑙) = 𝑜^𝑃(1)
Query time: 𝑃(𝑒𝑙) computation + 𝑃(1) memory access = 𝑃(𝑒 log 𝑜)

Solves (𝑑, 𝑠)-ANN w.h.p.
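A minimal Python sketch of this scheme follows, under explicit simplifications: instead of precomputing all 2^𝑙 balls it scans the stored sketches at query time (which gives the same answers but not the stated query time), the sketch length `k` is passed in by the caller rather than derived, and all function names are invented for the illustration. In the code `c`, `r`, `p` stand for the slides' 𝑑, 𝑠, 𝑞.

```python
import random

def hamming(a, b):
    """Hamming distance between two equal-length bit sequences."""
    return sum(x != y for x, y in zip(a, b))

def flip_probability(r):
    """Bernoulli parameter p chosen so that 1 - 2p = 2^(-1/r)."""
    return (1 - 2 ** (-1 / r)) / 2

def project(matrix, x):
    """Matrix-vector product over GF(2): AND as multiplication, sum mod 2 as XOR."""
    return tuple(sum(a & b for a, b in zip(row, x)) % 2 for row in matrix)

def build(points, k, r, rng):
    """Sample the k x d Boolean matrix and sketch every data point."""
    d = len(points[0])
    p = flip_probability(r)
    matrix = [[1 if rng.random() < p else 0 for _ in range(d)] for _ in range(k)]
    return matrix, [project(matrix, y) for y in points]

def query(matrix, sketches, points, x, c):
    """Return a point whose sketch is within the threshold of x's sketch, else None."""
    threshold = (3 / 8 - 2 ** (-(c + 2))) * len(matrix)
    zx = project(matrix, x)
    for y, zy in zip(points, sketches):
        if hamming(zx, zy) <= threshold:
            return y
    return None

rng = random.Random(0)
points = [tuple(rng.randrange(2) for _ in range(64)) for _ in range(8)]
matrix, sketches = build(points, k=40, r=4, rng=rng)
answer = query(matrix, sketches, points, points[2], c=2)
```

With this choice of `p`, a pair at distance exactly `r` disagrees on a given sketch coordinate with probability 1/4, which is the lower of the two levels the threshold sits between.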

SLIDE 89

Locality-Sensitive Hashing (LSH)

Given a metric space (𝑌, dist), a random ℎ: 𝑌 → 𝑉 drawn from ℋ is an (𝑠, 𝑑𝑠, 𝑞, 𝑟)-LSH if two collision-probability conditions hold for all 𝑦, 𝑧 ∈ 𝑌, where 𝑞 > 𝑟.
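The displayed conditions of this definition did not survive in the transcript. Assuming the standard definition of locality-sensitive hashing, written in the slide's letters, they read:

```latex
\begin{align*}
\operatorname{dist}(y,z) \le s \;&\Longrightarrow\; \Pr_{h\sim\mathcal{H}}\bigl[h(y)=h(z)\bigr] \ge q,\\
\operatorname{dist}(y,z) > d\,s \;&\Longrightarrow\; \Pr_{h\sim\mathcal{H}}\bigl[h(y)=h(z)\bigr] \le r.
\end{align*}
```

Close pairs collide with probability at least 𝑞 and far pairs with probability at most 𝑟, so the requirement 𝑞 > 𝑟 is what makes collisions informative.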

SLIDE 91

Locality-Sensitive Hashing (LSH)

Given a metric space (𝑌, dist), a random ℎ: 𝑌 → 𝑉 drawn from ℋ is an (𝑠, 𝑑𝑠, 𝑞, 𝑟)-LSH if, for all 𝑦, 𝑧 ∈ 𝑌:

If there exists an (𝑠, 𝑑𝑠, 𝑞, 𝑟)-LSH ℎ: 𝑌 → 𝑉, then there exists an (𝑠, 𝑑𝑠, 𝑞^𝑙, 𝑟^𝑙)-LSH 𝑕: 𝑌 → 𝑉^𝑙

Independently draw ℎ_1, ℎ_2, ⋯, ℎ_𝑙 according to the distribution of ℎ
𝑕(𝑦) = (ℎ_1(𝑦), ℎ_2(𝑦), ⋯, ℎ_𝑙(𝑦)) ∈ 𝑉^𝑙

SLIDE 96

(𝒅, 𝒔)-ANN

Suppose we have (𝑠, 𝑑𝑠, 𝑞∗, Τ 1 𝑜)-LSH 𝑕: 𝑌 → 𝑉 ∀ Ԧ 𝑦, Ԧ 𝑧 ∈ 𝑌: Store 𝑧1, 𝑧2, ⋯ , 𝑧𝑜 in nondecreasing order of 𝑕(𝑧𝑗). Upon query Ԧ 𝑦 ∈ 𝑌: Find all 𝑧𝑗 such that 𝑕 Ԧ 𝑦 = 𝑕(𝑧𝑗) by binary search. If encounter some 𝑧𝑗 such that dist Ԧ 𝑦, 𝑧𝑗 ≤ 𝑑𝑠 then return this 𝑧𝑗;

therwise return “no”.

If the real answer is “no”: always correct If the real answer is “yes”: correct with probability at least 𝑞∗ Space: 𝑃(𝑜) Time: 𝑃(log 𝑜) + 𝑃(1) in expectation
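The single-table scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `g` stands in for the hash function 𝑕, `cr` for the threshold 𝑑𝑠, and the bucketing hash in the toy example is a hypothetical stand-in that does not carry the LSH guarantees.

```python
from bisect import bisect_left, bisect_right

def build_table(points, g):
    """Sort the data points by hash value and keep the sorted keys alongside."""
    table = sorted(points, key=g)
    keys = [g(p) for p in table]
    return keys, table

def query(keys, table, g, dist, x, cr):
    """Binary-search the bucket of x; return a point within cr, or None for "no"."""
    key = g(x)
    lo = bisect_left(keys, key)    # first position whose key equals g(x)
    hi = bisect_right(keys, key)   # one past the last such position
    for y in table[lo:hi]:
        if dist(x, y) <= cr:
            return y
    return None

# Toy example on the real line (illustrative hash, not a real LSH family).
points = [0.5, 3.2, 3.9, 7.1, 9.8]
g = lambda p: int(p // 2)          # buckets of width 2
dist = lambda a, b: abs(a - b)
keys, table = build_table(points, g)
hit = query(keys, table, g, dist, 3.0, 1.0)
miss = query(keys, table, g, dist, 5.5, 1.0)
```

Here `hit` is a data point sharing the bucket of the query, while `miss` is `None` because the query's bucket is empty.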

SLIDE 97

(𝒅, 𝒔)-ANN

Setup: metric space (𝑌, dist)
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: given a point 𝑦⃗ ∈ 𝑌:

Return a 𝑧_𝑗 s.t. dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 if ∃𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) ≤ 𝑠

Return “no” if ∀𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) > 𝑑𝑠

Arbitrary answer otherwise

Suppose we have an (𝑠, 𝑑𝑠, 𝑞∗, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉, i.e. ∀ 𝑦⃗, 𝑧⃗ ∈ 𝑌:
dist(𝑦⃗, 𝑧⃗) ≤ 𝑠 ⟹ Pr[𝑕(𝑦⃗) = 𝑕(𝑧⃗)] ≥ 𝑞∗
dist(𝑦⃗, 𝑧⃗) > 𝑑𝑠 ⟹ Pr[𝑕(𝑦⃗) = 𝑕(𝑧⃗)] ≤ 1/𝑜

Let 𝑚 = 1/𝑞∗, independently draw 𝑕_1, 𝑕_2, ⋯, 𝑕_𝑚.
Maintain 𝑚 sorted tables:
For 𝑘 = 1, 2, ⋯, 𝑚: store 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 in table-𝑘 in nondecreasing order of 𝑕_𝑘(𝑧_𝑗).
Upon query 𝑦⃗ ∈ 𝑌:
Find the first 10 ⋅ 𝑚 points 𝑧_𝑗 such that ∃𝑘: 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗) by binary search.
If some 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 is encountered, return this 𝑧_𝑗;
otherwise return “no”.

SLIDE 98

Let 𝑚 = 1/𝑞∗, independently draw 𝑕_1, 𝑕_2, ⋯, 𝑕_𝑚.
Maintain 𝑚 sorted tables:
For 𝑘 = 1, 2, ⋯, 𝑚: store 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 in table-𝑘 in nondecreasing order of 𝑕_𝑘(𝑧_𝑗).
Upon query 𝑦⃗ ∈ 𝑌:
Find the first 10 ⋅ 𝑚 points 𝑧_𝑗 such that ∃𝑘: 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗) by binary search.
If some 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 is encountered, return this 𝑧_𝑗;
otherwise return “no”.

(𝑑, 𝑠)-ANN in metric space (𝑌, dist)
(𝑠, 𝑑𝑠, 𝑞∗, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: some point 𝑦⃗ ∈ 𝑌

SLIDE 99

Let 𝑚 = 1/𝑞∗, independently draw 𝑕_1, 𝑕_2, ⋯, 𝑕_𝑚.
Maintain 𝑚 sorted tables:
For 𝑘 = 1, 2, ⋯, 𝑚: store 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 in table-𝑘 in nondecreasing order of 𝑕_𝑘(𝑧_𝑗).
Upon query 𝑦⃗ ∈ 𝑌:
Find the first 10 ⋅ 𝑚 points 𝑧_𝑗 such that ∃𝑘: 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗) by binary search.
If some 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 is encountered, return this 𝑧_𝑗;
otherwise return “no”.

(𝑑, 𝑠)-ANN in metric space (𝑌, dist)
(𝑠, 𝑑𝑠, 𝑞∗, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: some point 𝑦⃗ ∈ 𝑌

If the real answer is “no”: always correct.

SLIDE 100

Let 𝑚 = 1/𝑞∗, independently draw 𝑕_1, 𝑕_2, ⋯, 𝑕_𝑚.
Maintain 𝑚 sorted tables:
For 𝑘 = 1, 2, ⋯, 𝑚: store 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 in table-𝑘 in nondecreasing order of 𝑕_𝑘(𝑧_𝑗).
Upon query 𝑦⃗ ∈ 𝑌:
Find the first 10 ⋅ 𝑚 points 𝑧_𝑗 such that ∃𝑘: 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗) by binary search.
If some 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 is encountered, return this 𝑧_𝑗;
otherwise return “no”.

(𝑑, 𝑠)-ANN in metric space (𝑌, dist)
(𝑠, 𝑑𝑠, 𝑞∗, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: some point 𝑦⃗ ∈ 𝑌

If the real answer is “no”: always correct.
If there exists 𝑧_𝑡 such that dist(𝑦⃗, 𝑧_𝑡) ≤ 𝑠, then
Pr[answer “no”] ≤

SLIDE 101

Let 𝑚 = 1/𝑞∗, independently draw 𝑕_1, 𝑕_2, ⋯, 𝑕_𝑚.
Maintain 𝑚 sorted tables:
For 𝑘 = 1, 2, ⋯, 𝑚: store 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 in table-𝑘 in nondecreasing order of 𝑕_𝑘(𝑧_𝑗).
Upon query 𝑦⃗ ∈ 𝑌:
Find the first 10 ⋅ 𝑚 points 𝑧_𝑗 such that ∃𝑘: 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗) by binary search.
If some 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 is encountered, return this 𝑧_𝑗;
otherwise return “no”.

(𝑑, 𝑠)-ANN in metric space (𝑌, dist)
(𝑠, 𝑑𝑠, 𝑞∗, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: some point 𝑦⃗ ∈ 𝑌

If the real answer is “no”: always correct.
If there exists 𝑧_𝑡 such that dist(𝑦⃗, 𝑧_𝑡) ≤ 𝑠, then
Pr[answer “no”] ≤ Pr[∀𝑘: 𝑕_𝑘(𝑦⃗) ≠ 𝑕_𝑘(𝑧_𝑡)] + Pr[there exist 10𝑚 bad 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) > 𝑑𝑠 yet ∃𝑘 s.t. 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗)]

SLIDE 103

Let 𝑚 = 1/𝑞∗, independently draw 𝑕_1, 𝑕_2, ⋯, 𝑕_𝑚.
Maintain 𝑚 sorted tables:
For 𝑘 = 1, 2, ⋯, 𝑚: store 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 in table-𝑘 in nondecreasing order of 𝑕_𝑘(𝑧_𝑗).
Upon query 𝑦⃗ ∈ 𝑌:
Find the first 10 ⋅ 𝑚 points 𝑧_𝑗 such that ∃𝑘: 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗) by binary search.
If some 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 is encountered, return this 𝑧_𝑗;
otherwise return “no”.

(𝑑, 𝑠)-ANN in metric space (𝑌, dist)
(𝑠, 𝑑𝑠, 𝑞∗, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: some point 𝑦⃗ ∈ 𝑌

If the real answer is “no”: always correct.
If there exists 𝑧_𝑡 such that dist(𝑦⃗, 𝑧_𝑡) ≤ 𝑠, then
Pr[answer “no”] ≤ Pr[∀𝑘: 𝑕_𝑘(𝑦⃗) ≠ 𝑕_𝑘(𝑧_𝑡)] + Pr[there exist 10𝑚 bad 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) > 𝑑𝑠 yet ∃𝑘 s.t. 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗)]

First term: Pr[∀𝑘: 𝑕_𝑘(𝑦⃗) ≠ 𝑕_𝑘(𝑧_𝑡)] ≤ (1 − 𝑞∗)^𝑚 ≤ 1/e, where e is the base of the natural logarithm.

SLIDE 105

Let 𝑚 = 1/𝑞∗, independently draw 𝑕_1, 𝑕_2, ⋯, 𝑕_𝑚.
Maintain 𝑚 sorted tables:
For 𝑘 = 1, 2, ⋯, 𝑚: store 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 in table-𝑘 in nondecreasing order of 𝑕_𝑘(𝑧_𝑗).
Upon query 𝑦⃗ ∈ 𝑌:
Find the first 10 ⋅ 𝑚 points 𝑧_𝑗 such that ∃𝑘: 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗) by binary search.
If some 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 is encountered, return this 𝑧_𝑗;
otherwise return “no”.

(𝑑, 𝑠)-ANN in metric space (𝑌, dist)
(𝑠, 𝑑𝑠, 𝑞∗, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: some point 𝑦⃗ ∈ 𝑌

If the real answer is “no”: always correct.
If there exists 𝑧_𝑡 such that dist(𝑦⃗, 𝑧_𝑡) ≤ 𝑠, then
Pr[answer “no”] ≤ Pr[∀𝑘: 𝑕_𝑘(𝑦⃗) ≠ 𝑕_𝑘(𝑧_𝑡)] + Pr[there exist 10𝑚 bad 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) > 𝑑𝑠 yet ∃𝑘 s.t. 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗)]

First term: Pr[∀𝑘: 𝑕_𝑘(𝑦⃗) ≠ 𝑕_𝑘(𝑧_𝑡)] ≤ (1 − 𝑞∗)^𝑚 ≤ 1/e, where e is the base of the natural logarithm.

Second term (by Markov's inequality): ≤ 𝔼[number of such bad 𝑧_𝑗] / (10𝑚) ≤ (𝑜 ⋅ 𝑚 ⋅ 1/𝑜) / (10𝑚) = 0.1

SLIDE 106

Let 𝑚 = 1/𝑞∗, independently draw 𝑕_1, 𝑕_2, ⋯, 𝑕_𝑚.
Maintain 𝑚 sorted tables:
For 𝑘 = 1, 2, ⋯, 𝑚: store 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 in table-𝑘 in nondecreasing order of 𝑕_𝑘(𝑧_𝑗).
Upon query 𝑦⃗ ∈ 𝑌:
Find the first 10 ⋅ 𝑚 points 𝑧_𝑗 such that ∃𝑘: 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗) by binary search.
If some 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 is encountered, return this 𝑧_𝑗;
otherwise return “no”.

(𝑑, 𝑠)-ANN in metric space (𝑌, dist)
(𝑠, 𝑑𝑠, 𝑞∗, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: some point 𝑦⃗ ∈ 𝑌

If the real answer is “no”: always correct.
If there exists 𝑧_𝑡 such that dist(𝑦⃗, 𝑧_𝑡) ≤ 𝑠, then
Pr[answer “no”] ≤ Pr[∀𝑘: 𝑕_𝑘(𝑦⃗) ≠ 𝑕_𝑘(𝑧_𝑡)] + Pr[there exist 10𝑚 bad 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) > 𝑑𝑠 yet ∃𝑘 s.t. 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗)]

First term: Pr[∀𝑘: 𝑕_𝑘(𝑦⃗) ≠ 𝑕_𝑘(𝑧_𝑡)] ≤ (1 − 𝑞∗)^𝑚 ≤ 1/e, where e is the base of the natural logarithm.

Second term (by Markov's inequality): ≤ 𝔼[number of such bad 𝑧_𝑗] / (10𝑚) ≤ (𝑜 ⋅ 𝑚 ⋅ 1/𝑜) / (10𝑚) = 0.1

Together: Pr[answer “no”] ≤ 1/e + 0.1 < 0.5
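The two bounds in this error analysis are easy to check numerically. The sketch below is an illustration, not part of the slides: the names `p_star`, `num_tables` and `num_points` are hypothetical stand-ins for 𝑞∗, 𝑚 and 𝑜, and rounding 1/𝑞∗ up to an integer is an added assumption.

```python
import math

def miss_bound(p_star):
    """(1 - p*)^m with m = ceil(1/p*): no table makes the near point collide."""
    num_tables = math.ceil(1 / p_star)
    return (1 - p_star) ** num_tables

def overflow_bound(num_points, num_tables):
    """Markov bound: E[#bad collisions] / (10 m), with E <= n * m * (1/n)."""
    expected_bad = num_points * num_tables * (1 / num_points)
    return expected_bad / (10 * num_tables)

# The first term stays below 1/e for every tried value of p*.
bounds = [miss_bound(p) for p in (0.5, 0.1, 0.01, 0.001)]

# The second term is 0.1 regardless of the sizes, so the total is below 0.5.
total = 1 / math.e + overflow_bound(num_points=1000, num_tables=100)
```

The first bound approaches 1/e from below as `p_star` shrinks, which is why 1/e is the right constant in the analysis.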

SLIDE 107

Let 𝑚 = 1/𝑞∗, independently draw 𝑕_1, 𝑕_2, ⋯, 𝑕_𝑚.
Maintain 𝑚 sorted tables:
For 𝑘 = 1, 2, ⋯, 𝑚: store 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 in table-𝑘 in nondecreasing order of 𝑕_𝑘(𝑧_𝑗).
Upon query 𝑦⃗ ∈ 𝑌:
Find the first 10 ⋅ 𝑚 points 𝑧_𝑗 such that ∃𝑘: 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗) by binary search.
If some 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 is encountered, return this 𝑧_𝑗;
otherwise return “no”.

(𝑑, 𝑠)-ANN in metric space (𝑌, dist)
(𝑠, 𝑑𝑠, 𝑞∗, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: some point 𝑦⃗ ∈ 𝑌

If the real answer is “no”: always correct.
If there exists 𝑧_𝑡 such that dist(𝑦⃗, 𝑧_𝑡) ≤ 𝑠, then Pr[answer “no”] < 0.5

SLIDE 108

Let 𝑚 = 1/𝑞∗, independently draw 𝑕_1, 𝑕_2, ⋯, 𝑕_𝑚.
Maintain 𝑚 sorted tables:
For 𝑘 = 1, 2, ⋯, 𝑚: store 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 in table-𝑘 in nondecreasing order of 𝑕_𝑘(𝑧_𝑗).
Upon query 𝑦⃗ ∈ 𝑌:
Find the first 10 ⋅ 𝑚 points 𝑧_𝑗 such that ∃𝑘: 𝑕_𝑘(𝑦⃗) = 𝑕_𝑘(𝑧_𝑗) by binary search.
If some 𝑧_𝑗 with dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 is encountered, return this 𝑧_𝑗;
otherwise return “no”.

(𝑑, 𝑠)-ANN in metric space (𝑌, dist)
(𝑠, 𝑑𝑠, 𝑞∗, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: some point 𝑦⃗ ∈ 𝑌

If the real answer is “no”: always correct.
If there exists 𝑧_𝑡 such that dist(𝑦⃗, 𝑧_𝑡) ≤ 𝑠, then Pr[answer “no”] < 0.5
Space: 𝑃(𝑜𝑚) = 𝑃(𝑜/𝑞∗)
Time: 𝑃(𝑚 ⋅ log 𝑜) = 𝑃((log 𝑜)/𝑞∗)
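A minimal Python sketch of the multi-table scheme follows. It is an illustration under assumptions: `hashes` plays the role of the independently drawn 𝑕_1, ⋯, 𝑕_𝑚, `cr` the threshold 𝑑𝑠, and the shifted-grid hashes in the toy example are hypothetical stand-ins rather than a real LSH family. A point colliding in several tables is counted once per table, which only makes the inspection budget more conservative.

```python
from bisect import bisect_left, bisect_right

def build_tables(points, hashes):
    """One sorted table per hash function: (sorted keys, points in that order)."""
    tables = []
    for g in hashes:
        ordered = sorted(points, key=g)
        tables.append(([g(p) for p in ordered], ordered))
    return tables

def query(tables, hashes, dist, x, cr):
    """Inspect at most 10 * m colliding points; return one within cr, else None."""
    budget = 10 * len(hashes)
    for (keys, ordered), g in zip(tables, hashes):
        key = g(x)
        lo, hi = bisect_left(keys, key), bisect_right(keys, key)
        for y in ordered[lo:hi]:
            if dist(x, y) <= cr:
                return y
            budget -= 1
            if budget == 0:      # 10 * m far collisions seen: give up with "no"
                return None
    return None

# Toy example: two grids of width 2, the second shifted by 1 (illustrative only).
points = [0.5, 3.2, 3.9, 7.1, 9.8]
hashes = [lambda p, s=s: int((p + s) // 2) for s in (0.0, 1.0)]
dist = lambda a, b: abs(a - b)
tables = build_tables(points, hashes)
hit = query(tables, hashes, dist, 4.1, 1.0)    # found only via the second table
miss = query(tables, hashes, dist, 20.0, 1.0)  # no collision in any table
```

In the toy run the query 4.1 falls into an empty bucket of the first table, and the second table recovers a nearby point; this is the effect that repeating over several tables is meant to achieve.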

SLIDE 109

(𝒅, 𝒔)-ANN

Setup: metric space (𝑌, dist)
Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ 𝑌
Query: given a point 𝑦⃗ ∈ 𝑌:

Return a 𝑧_𝑗 s.t. dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 if ∃𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) ≤ 𝑠

Return “no” if ∀𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) > 𝑑𝑠

Arbitrary answer otherwise

Suppose we have an (𝑠, 𝑑𝑠, 𝑞, 𝑟)-LSH ℎ: 𝑌 → 𝑉.
Concatenating 𝑙 independent copies of ℎ gives an (𝑠, 𝑑𝑠, 𝑞^𝑙, 1/𝑜)-LSH 𝑕: 𝑌 → 𝑉^𝑙, where 𝑙 = log_{1/𝑟} 𝑜, implying
𝑞^𝑙 = 𝑞^(log_{1/𝑟} 𝑜) = 𝑜^(−𝜍), where 𝜍 = (log 𝑞) / (log 𝑟).
Hence we can solve (𝑑, 𝑠)-ANN with space 𝑃(𝑜^(1+𝜍)), query time 𝑃(𝑜^𝜍 ⋅ log 𝑜), and one-sided error < 0.5.
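The parameter arithmetic on this slide can be worked through numerically. In the sketch below, `p_near`, `p_far` and `num_points` are hypothetical names for 𝑞, 𝑟 and 𝑜; the concatenation length is left as a real number, whereas an implementation would presumably round it up to an integer.

```python
import math

def amplify(p_near, p_far, num_points):
    """Return (length, rho, p_star) for the concatenated hash.

    length = log_{1/p_far}(num_points), so that p_far ** length == 1 / num_points
    rho    = log(p_near) / log(p_far)
    p_star = p_near ** length, which equals num_points ** (-rho)
    """
    length = math.log(num_points) / math.log(1 / p_far)
    rho = math.log(p_near) / math.log(p_far)
    p_star = p_near ** length
    return length, rho, p_star

# Worked example with assumed values: q = 0.9, r = 0.5, o = 1024.
length, rho, p_star = amplify(p_near=0.9, p_far=0.5, num_points=1024)
far_collision = 0.5 ** length          # should be 1/1024
tables_needed = 1 / p_star             # m = 1/q*, i.e. about 1024 ** rho
```

With these values the concatenation length is 10, and the number of tables 1/𝑞∗ grows like 𝑜^𝜍, which is where the 𝑃(𝑜^(1+𝜍)) space bound comes from.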

SLIDE 110

(𝒅, 𝒔)-ANN in Hamming Space {0,1}^𝑒

Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ {0,1}^𝑒
Query: given a point 𝑦⃗ ∈ {0,1}^𝑒:

Return a 𝑧_𝑗 s.t. dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 if ∃𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) ≤ 𝑠

Return “no” if ∀𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) > 𝑑𝑠

Arbitrary answer otherwise

SLIDE 111

(𝒅, 𝒔)-ANN in Hamming Space {0,1}^𝑒

Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ {0,1}^𝑒
Query: given a point 𝑦⃗ ∈ {0,1}^𝑒:

Return a 𝑧_𝑗 s.t. dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 if ∃𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) ≤ 𝑠

Return “no” if ∀𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) > 𝑑𝑠

Arbitrary answer otherwise

ℋ = {ℎ_𝑗 | ℎ_𝑗(𝑦⃗) = 𝑦_𝑗 for 𝑗 = 1, 2, ⋯, 𝑒}, where 𝑦_𝑗 is the 𝑗-th coordinate of 𝑦⃗
ℎ is chosen uniformly at random from ℋ

SLIDE 113

(𝒅, 𝒔)-ANN in Hamming Space {0,1}^𝑒

Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ {0,1}^𝑒
Query: given a point 𝑦⃗ ∈ {0,1}^𝑒:

Return a 𝑧_𝑗 s.t. dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 if ∃𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) ≤ 𝑠

Return “no” if ∀𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) > 𝑑𝑠

Arbitrary answer otherwise

ℋ = {ℎ_𝑗 | ℎ_𝑗(𝑦⃗) = 𝑦_𝑗 for 𝑗 = 1, 2, ⋯, 𝑒}, where 𝑦_𝑗 is the 𝑗-th coordinate of 𝑦⃗
ℎ is chosen uniformly at random from ℋ
We have an (𝑠, 𝑑𝑠, 1 − 𝑠/𝑒, 1 − 𝑑𝑠/𝑒)-LSH ℎ: {0,1}^𝑒 → {0,1}

SLIDE 114

(𝒅, 𝒔)-ANN in Hamming Space {0,1}^𝑒

Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ {0,1}^𝑒
Query: given a point 𝑦⃗ ∈ {0,1}^𝑒:

Return a 𝑧_𝑗 s.t. dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 if ∃𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) ≤ 𝑠

Return “no” if ∀𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) > 𝑑𝑠

Arbitrary answer otherwise

ℋ = {ℎ_𝑗 | ℎ_𝑗(𝑦⃗) = 𝑦_𝑗 for 𝑗 = 1, 2, ⋯, 𝑒}, where 𝑦_𝑗 is the 𝑗-th coordinate of 𝑦⃗
ℎ is chosen uniformly at random from ℋ
We have an (𝑠, 𝑑𝑠, 1 − 𝑠/𝑒, 1 − 𝑑𝑠/𝑒)-LSH ℎ: {0,1}^𝑒 → {0,1}

𝜍 = log(1 − 𝑠/𝑒) / log(1 − 𝑑𝑠/𝑒) ≤ 1/𝑑

SLIDE 115

(𝒅, 𝒔)-ANN in Hamming Space {0,1}^𝑒

Data: 𝑜 points 𝑧_1, 𝑧_2, ⋯, 𝑧_𝑜 ∈ {0,1}^𝑒
Query: given a point 𝑦⃗ ∈ {0,1}^𝑒:

Return a 𝑧_𝑗 s.t. dist(𝑦⃗, 𝑧_𝑗) ≤ 𝑑𝑠 if ∃𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) ≤ 𝑠

Return “no” if ∀𝑧_𝑘: dist(𝑦⃗, 𝑧_𝑘) > 𝑑𝑠

Arbitrary answer otherwise

ℋ = {ℎ_𝑗 | ℎ_𝑗(𝑦⃗) = 𝑦_𝑗 for 𝑗 = 1, 2, ⋯, 𝑒}, where 𝑦_𝑗 is the 𝑗-th coordinate of 𝑦⃗
ℎ is chosen uniformly at random from ℋ
We have an (𝑠, 𝑑𝑠, 1 − 𝑠/𝑒, 1 − 𝑑𝑠/𝑒)-LSH ℎ: {0,1}^𝑒 → {0,1}

𝜍 = log(1 − 𝑠/𝑒) / log(1 − 𝑑𝑠/𝑒) ≤ 1/𝑑

We can solve (𝑑, 𝑠)-ANN in Hamming space with space 𝑃(𝑜^(1+1/𝑑)), query time 𝑃(𝑜^(1/𝑑) ⋅ log 𝑜), and one-sided error < 0.5
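The bit-sampling family ℋ and the bound on 𝜍 can be illustrated with a short Python sketch. The names `dim`, `radius` and `approx` are hypothetical stand-ins for 𝑒, 𝑠 and 𝑑, and the bound is only checked for parameters with `approx * radius < dim`, where the logarithms are defined.

```python
import math
import random

def sample_bit_hash(dim, rng):
    """Draw h_j(x) = x[j] for a coordinate j chosen uniformly at random."""
    j = rng.randrange(dim)
    return lambda x: x[j]

def collision_probability(x, y):
    """Exact Pr over j of x[j] == y[j], i.e. 1 - hamming(x, y) / dim."""
    hamming = sum(a != b for a, b in zip(x, y))
    return 1 - hamming / len(x)

def rho(radius, approx, dim):
    """rho = log(1 - r/dim) / log(1 - c*r/dim) for the bit-sampling family."""
    return math.log(1 - radius / dim) / math.log(1 - approx * radius / dim)

# Two 8-bit points at Hamming distance 2 collide with probability 1 - 2/8.
x = (0, 1, 1, 0, 1, 0, 0, 1)
y = (0, 1, 0, 0, 1, 0, 1, 1)
p_collide = collision_probability(x, y)

# Using one sampled hash function (seeded only to make the run repeatable).
h = sample_bit_hash(len(x), random.Random(0))
same_bucket = h(x) == h(y)

# rho stays below 1/approx on a small grid of parameters.
checks = [rho(r, c, 100) <= 1 / c for r in (1, 5, 10) for c in (2, 3, 4)]
```

The inequality 𝜍 ≤ 1/𝑑 follows from Bernoulli's inequality (1 − 𝑠/𝑒)^𝑑 ≥ 1 − 𝑑𝑠/𝑒, which the grid check above confirms on a few sample values.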

SLIDE 116

Recap

Dimension reduction (low-distortion metric embedding)

Johnson-Lindenstrauss Theorem: in Euclidean space, it is easy to embed a set of 𝑜 points in arbitrary dimension into 𝑃(log 𝑜) dimensions with constant distortion.

Nearest neighbor search

The exact version is difficult in high-dimensional space.
Approximation and randomization help.
If we can solve (𝒅, 𝒔)-ANN (Approximate Near Neighbor), then we can solve 𝒅-ANN (Approximate Nearest Neighbor) with limited overhead.
Locality-sensitive hashing (LSH) is a powerful tool to solve (𝑑, 𝑠)-ANN. (Collisions could be helpful for hashing.)