Dimension Reduction and Nearest Neighbor Search
Advanced Algorithms Nanjing University, Fall 2018
Dimension reduction: why do we care? High-dimensional data are common, yet working on them directly is expensive.
low-distortion metric embedding
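For every $0 < \epsilon < 1$ and any set $S$ of $n$ points from $\mathbb{R}^d$, there is a $\varphi: \mathbb{R}^d \to \mathbb{R}^k$ with $k \in O(\epsilon^{-2} \cdot \log n)$ such that for all $x_i, x_j \in S$:
$$(1-\epsilon)\,\|x_i - x_j\|_2^2 \;\le\; \|\varphi(x_i) - \varphi(x_j)\|_2^2 \;\le\; (1+\epsilon)\,\|x_i - x_j\|_2^2$$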
Images extracted from https://graphics.stanford.edu/courses/cs468-06-fall/Slides/aneesh-michael.pdf
How to construct (sample) $A$, the matrix of a linear embedding $\varphi(x) = Ax$? (Johnson-Lindenstrauss; Dasgupta-Gupta)
Choose some suitable $k \in O(\epsilon^{-2} \cdot \log n)$.
For each entry $a_{ij}$ of $A \in \mathbb{R}^{k \times d}$: independently sample $a_{ij}$ from the Gaussian distribution $\mathcal{N}(0, 1/k)$.
For each $l \in [n]$: let $y_l = A x_l$.
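The sampling steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names, the seed, and the sizes `d = 1000`, `k = 300` are assumptions chosen for the demo, not the tuned $k$ from the theorem.

```python
import math
import random

def sample_jl_matrix(k, d, rng):
    """Return a k x d matrix with i.i.d. N(0, 1/k) entries (std 1/sqrt(k))."""
    sigma = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, sigma) for _ in range(d)] for _ in range(k)]

def project(A, x):
    """Compute y = A x as a plain list of floats."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

rng = random.Random(0)      # fixed seed, only for reproducibility
d, k = 1000, 300            # illustrative sizes (assumption)
A = sample_jl_matrix(k, d, rng)
x1 = [rng.random() for _ in range(d)]
x2 = [rng.random() for _ in range(d)]

# Ratio of squared distances after and before projection; it concentrates around 1.
ratio = sq_dist(project(A, x1), project(A, x2)) / sq_dist(x1, x2)
print(ratio)
```

Because $A$ is linear, `project(A, x1)` minus `project(A, x2)` equals the projection of `x1 - x2`, which is why the analysis only needs to consider a single difference vector.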
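Gaussian distribution (a.k.a. normal distribution) $\mathcal{N}(\mu, \sigma^2)$: $\mathbb{E}[X] = \mu$, $\mathrm{Var}[X] = \sigma^2$, and
$$\Pr[X \le t] = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/(2\sigma^2)}\,\mathrm{d}x$$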
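For every $0 < \epsilon < 1$: let $A$ be a $k \times d$ matrix with $k \in O(\epsilon^{-2} \cdot \log n)$. If each entry of $A$ is independently drawn from $\mathcal{N}(0, 1/k)$, then with high probability, for all $x_i, x_j \in S$:
$$(1-\epsilon)\,\|x_i - x_j\|_2^2 \;\le\; \|Ax_i - Ax_j\|_2^2 \;\le\; (1+\epsilon)\,\|x_i - x_j\|_2^2$$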
Dividing through by $\|x_i - x_j\|_2^2$, the condition is equivalent to
$$1-\epsilon \;\le\; \left\| A\,\frac{x_i - x_j}{\|x_i - x_j\|_2} \right\|_2^2 \;\le\; 1+\epsilon,$$
where $(x_i - x_j)/\|x_i - x_j\|_2$ is a unit vector!
It therefore suffices to show that for any unit vector $u \in \mathbb{R}^d$: $\Pr\left[\,\left|\|Au\|_2^2 - 1\right| > \epsilon\,\right] < 1/n^3$, and then take a union bound over the $O(n^2)$ pairs $x_i, x_j \in S$.
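The union bound step can be written out as a short worked inequality. This is a sketch that assumes the points are distinct, so there are $\binom{n}{2}$ unordered pairs, each failing with probability below $1/n^3$:

```latex
\Pr\Big[\exists\, i<j:\ \|A(x_i - x_j)\|_2^2 \notin (1 \pm \epsilon)\,\|x_i - x_j\|_2^2\Big]
  \;\le\; \binom{n}{2}\cdot\frac{1}{n^3}
  \;=\; \frac{n-1}{2n^2}
  \;<\; \frac{1}{2n}
```

So all pairwise distances are preserved simultaneously with probability greater than $1 - 1/(2n)$.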
Lemma: let $k \in O(\epsilon^{-2} \cdot \log n)$ and $A \in \mathbb{R}^{k \times d}$, with each entry of $A$ chosen i.i.d. from $\mathcal{N}(0, 1/k)$. Then for any unit vector $u \in \mathbb{R}^d$:
$$\Pr\left[\,\left|\|Au\|_2^2 - 1\right| > \epsilon\,\right] < 1/n^3$$
Fact: for independent $X \sim \mathcal{N}(\mu_X, \sigma_X^2)$ and $Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2)$: $aX + bY \sim \mathcal{N}(a\mu_X + b\mu_Y,\; a^2\sigma_X^2 + b^2\sigma_Y^2)$.
Hence each coordinate $(Au)_i = \sum_{j=1}^{d} a_{ij} u_j$ is distributed as $\mathcal{N}\big(0, \sum_j u_j^2 / k\big) = \mathcal{N}(0, 1/k)$, because $u$ is a unit vector.
Moreover, these $(Au)_i$ are mutually independent!
Therefore $\mathbb{E}\big[\|Au\|_2^2\big] = \sum_{i=1}^{k} \mathbb{E}\big[(Au)_i^2\big] = k \cdot (1/k) = 1$.
In terms of expectation we are fine, but how fast do we deviate from expectation?
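Chernoff bound for the $\chi^2$-distribution: for i.i.d. $X_1, X_2, \dots, X_k \sim \mathcal{N}(0,1)$ and $0 < \epsilon < 1$,
$$\Pr\left[\,\left|\frac{1}{k}\sum_{i=1}^{k} X_i^2 - 1\right| > \epsilon\,\right] < 2e^{-k\epsilon^2/8}$$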
Notice that $X_i = \sqrt{k} \cdot Y_i$, where $Y_i = (Au)_i \sim \mathcal{N}(0, 1/k)$, gives i.i.d. $X_i \sim \mathcal{N}(0,1)$ with $\|Au\|_2^2 = \sum_{i=1}^{k} Y_i^2 = \frac{1}{k}\sum_{i=1}^{k} X_i^2$. Applying the Chernoff bound, for suitable $k \in O(\epsilon^{-2} \cdot \log n)$:
$$\Pr\left[\,\left|\|Au\|_2^2 - 1\right| > \epsilon\,\right] < 2e^{-k\epsilon^2/8} \le 1/n^3$$
Proof of the Chernoff bound for the $\chi^2$-distribution. The proof uses the moment generating function of $X^2$: if $X \sim \mathcal{N}(0,1)$ and $s < 1/2$, then
$$\mathbb{E}\big[e^{sX^2}\big] = \frac{1}{\sqrt{1-2s}}$$
Markov's inequality is applied to $e^{\lambda \sum_i X_i^2}$, the resulting expression is simplified by an estimate that holds when $\lambda \le 1/4$, and the bound follows by letting $\lambda = \epsilon/4$.
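The intermediate steps of this proof are not preserved in the text, so the following is one standard way to fill them in for the upper tail, not necessarily the lecture's own derivation. It assumes the elementary estimate $(1-2\lambda)^{-1/2} \le e^{\lambda + 2\lambda^2}$ for $0 < \lambda \le 1/4$, which is where the condition $\lambda \le 1/4$ enters.

```latex
\begin{aligned}
\Pr\Big[\sum_{i=1}^{k} X_i^2 > (1+\epsilon)k\Big]
  &= \Pr\Big[e^{\lambda \sum_i X_i^2} > e^{\lambda(1+\epsilon)k}\Big] \\
  &\le e^{-\lambda(1+\epsilon)k}\,\prod_{i=1}^{k}\mathbb{E}\big[e^{\lambda X_i^2}\big]
     && \text{(Markov, independence)} \\
  &= e^{-\lambda(1+\epsilon)k}\,(1-2\lambda)^{-k/2} \\
  &\le e^{-\lambda(1+\epsilon)k}\,e^{(\lambda+2\lambda^2)k}
     && (\lambda \le 1/4) \\
  &= e^{(2\lambda^2-\lambda\epsilon)k}
   = e^{-k\epsilon^2/8}
     && (\lambda = \epsilon/4)
\end{aligned}
```

The lower tail can be handled by the same argument applied to $e^{-\lambda \sum_i X_i^2}$, and adding the two tails accounts for the factor 2 in the stated bound.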
Setup: metric space $(X, \mathrm{dist})$, where $X$ is a set and $\mathrm{dist}$ is a distance function satisfying the triangle inequality.
Data: $n$ points $y_1, y_2, \dots, y_n \in X$.
Query: given a point $x \in X$, find the $y_i$ which is closest to $x$.
Nearest neighbor search has many applications.
Data: $n$ points $y_1, y_2, \dots, y_n \in U^d$ for some finite $U$.
Query: given a point $x \in U^d$, find the $y_i$ which is closest to $x$.
Goal: efficiently answer the query.
What efficiency do we care about?
Trivial solution:
When the dimension $d$ is small: $k$-d tree, Voronoi diagram.
What if the dimension $d$ is large, say $d \gg \log n$?
Curse of dimensionality: it is conjectured that solving NNS in high dimension requires either super-polynomial($n$) space or super-polynomial($d$) time.
Blessing: randomization + approximation.
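Setup: metric space $(X, \mathrm{dist})$. Data: $n$ points $y_1, y_2, \dots, y_n \in X$. Query: given a point $x \in X$.
$c$-ANN (Approximate Nearest Neighbor): return a $y_i$ such that $\mathrm{dist}(x, y_i) \le c \cdot \min_{1 \le j \le n} \mathrm{dist}(x, y_j)$.
$(c, r)$-ANN (Approximate Near Neighbor): return a $y_i$ with $\mathrm{dist}(x, y_i) \le cr$ if $\exists y_j: \mathrm{dist}(x, y_j) \le r$; return "no" if $\forall y_j: \mathrm{dist}(x, y_j) > cr$. In the remaining case either answer is acceptable.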
If we can solve $(c, r)$-ANN, then we can solve $c$-ANN with little overhead.
Let $D_{\min} = \min_{1 \le i < j \le n} \mathrm{dist}(y_i, y_j)$ and $D_{\max} = \max_{1 \le i < j \le n} \mathrm{dist}(y_i, y_j)$, and consider the set of radii
$$R = \left\{ \frac{D_{\min}}{2}(\sqrt{c})^{-1},\; \frac{D_{\min}}{2}(\sqrt{c})^{0},\; \frac{D_{\min}}{2}(\sqrt{c})^{1},\; \dots,\; D_{\max} \right\}$$
Let $r^*$ be the minimum radius in $R$ such that $(\sqrt{c}, r^*)$-ANN returns yes with some $y^*$. Then $\mathrm{dist}(x, y^*) \le \sqrt{c} \cdot r^*$ and, because the query with the next smaller radius $r^*/\sqrt{c}$ answered "no", $\forall y_i: \mathrm{dist}(x, y_i) > r^*/\sqrt{c}$. Hence $\mathrm{dist}(x, y^*) \le c \cdot \min_i \mathrm{dist}(x, y_i)$.
If, for all $r$, $(\sqrt{c}, r)$-ANN can be solved with space $s$ and query time $t$, then $c$-ANN can be solved with space $O\big(s \cdot \log_c (D_{\max}/D_{\min})\big)$ and query time $O\big(t \cdot \log_2 \log_c (D_{\max}/D_{\min})\big)$.
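The reduction can be sketched as a binary search over the geometric sequence of radii. In the sketch below, `near_neighbor_oracle` is a hypothetical brute-force stand-in for a real $(\sqrt{c}, r)$-ANN structure, the data points are assumed distinct, and the query is assumed to have some data point within the largest radius (otherwise the sketch returns `None`).

```python
import math
from itertools import combinations

def near_neighbor_oracle(points, x, r, c, dist):
    """Hypothetical (c, r)-ANN oracle: a point within c*r of x, or None."""
    for y in points:
        if dist(x, y) <= c * r:
            return y
    return None

def approx_nearest(points, x, c, dist):
    """c-ANN via binary search over radii, using a (sqrt(c), r)-ANN oracle."""
    root = math.sqrt(c)
    pair_dists = [dist(a, b) for a, b in combinations(points, 2)]
    d_min, d_max = min(pair_dists), max(pair_dists)
    radii = [d_min / (2 * root)]
    while radii[-1] < d_max:                 # geometric sequence up to D_max
        radii.append(radii[-1] * root)

    def ask(i):
        return near_neighbor_oracle(points, x, radii[i], root, dist)

    best = ask(0)
    if best is not None:                     # within D_min/2: exact nearest
        return best
    lo, hi = 0, len(radii) - 1               # invariant: ask(lo) "no", ask(hi) "yes"
    best = ask(hi)
    if best is None:
        return None                          # outside the assumed range
    while hi - lo > 1:
        mid = (lo + hi) // 2
        answer = ask(mid)
        if answer is None:
            lo = mid
        else:
            hi, best = mid, answer
    return best

pts = [0.0, 1.0, 5.0, 9.0]
line_dist = lambda a, b: abs(a - b)
found = approx_nearest(pts, 4.2, 2.0, line_dist)
print(found)
```

The search ends with adjacent radii where the smaller one answered "no" and the larger one answered "yes", which is exactly the situation used in the argument above. A real data structure would prebuild one oracle per radius instead of recomputing pairwise distances per query.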
Setup: consider the Hamming space $\{0,1\}^d$. Data: $n$ points $y_1, y_2, \dots, y_n \in \{0,1\}^d$. Query: given a point $x \in \{0,1\}^d$, solve $(c, r)$-ANN (Approximate Near Neighbor): return a $y_i$ with $\mathrm{dist}(x, y_i) \le cr$ if $\exists y_j: \mathrm{dist}(x, y_j) \le r$; return "no" if $\forall y_j: \mathrm{dist}(x, y_j) > cr$.
Let $k$, $p$ and $s$ be fixed later.
Sample a $k \times d$ Boolean matrix $A$ with i.i.d. entries from Bernoulli($p$).
For $i = 1, 2, \dots, n$: let $z_i = A y_i \in \{0,1\}^k$ over the finite field GF(2).
Store all $s$-balls $B_s(u) = \{ y_i \mid \mathrm{dist}(u, z_i) \le s \}$ for all $u \in \{0,1\}^k$.
Upon a query $x \in \{0,1\}^d$: retrieve $B_s(Ax)$. If $B_s(Ax) = \emptyset$ return "no", else return any $y_i \in B_s(Ax)$.
GF(2): two elements $\{0,1\}$, with XOR as sum and AND as multiplication. Therefore $(z_i)_j = (A y_i)_j = \sum_{l=1}^{d} A_{jl} \cdot (y_i)_l \bmod 2$.
Space: $O(n \cdot 2^k)$. Query time: $O(dk)$ computation + $O(1)$ memory access.
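The scheme can be illustrated with a small Python sketch. One deliberate simplification: instead of precomputing the table of all $2^k$ balls, the query below scans the stored sketches $z_i$, which returns the same kind of answer but with $O(nk)$ query time. The function names and the parameter values in the demo are assumptions for illustration only.

```python
import random

def sample_matrix(k, d, p, rng):
    """k x d Boolean matrix with i.i.d. Bernoulli(p) entries."""
    return [[1 if rng.random() < p else 0 for _ in range(d)] for _ in range(k)]

def gf2_matvec(A, x):
    """(Ax)_i = sum_j A_ij * x_j mod 2: AND as product, XOR as sum."""
    return [sum(a & b for a, b in zip(row, x)) % 2 for row in A]

def hamming(u, v):
    """Hamming distance between two equal-length bit vectors."""
    return sum(a != b for a, b in zip(u, v))

def query(A, data, sketches, x, s):
    """Return some y_i whose sketch is within distance s of Ax, else None ("no")."""
    zx = gf2_matvec(A, x)
    for y, z in zip(data, sketches):
        if hamming(zx, z) <= s:
            return y
    return None

rng = random.Random(0)
d, k, p, s = 12, 6, 0.2, 1          # illustrative values (assumption)
data = [[rng.randrange(2) for _ in range(d)] for _ in range(5)]
A = sample_matrix(k, d, p, rng)
sketches = [gf2_matvec(A, y) for y in data]   # z_i = A y_i over GF(2)
answer = query(A, data, sketches, data[1], s)
print(answer)
```

Since the map $x \mapsto Ax$ is linear over GF(2), the sketch of `x XOR y` equals the XOR of the two sketches, so the sketch distance depends only on the coordinates where $x$ and $y$ differ.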
For suitable $k \in O(\log n)$, $p$ and $s$, for all $x, y \in \{0,1\}^d$: if $\mathrm{dist}(x, y) \le r$ then $\mathrm{dist}(Ax, Ay) \le s$ with high probability, and if $\mathrm{dist}(x, y) > cr$ then $\mathrm{dist}(Ax, Ay) > s$ with high probability. Consequently, $(c, r)$-ANN is solved w.h.p.
Analysis. Recall that $A$ is a random $k \times d$ Boolean matrix with i.i.d. entries from Bernoulli($p$), and that computation is over GF(2): $(Ax)_i = \sum_{j=1}^{d} A_{ij} \cdot x_j \bmod 2$.
Each row vector $A_i$ of $A$ has i.i.d. entries from Bernoulli($p$).
An alternative view regarding the generation of $A_i$: for each $j$, with probability $1 - 2p$ set $A_{ij} = 0$; otherwise choose $A_{ij} \in \{0,1\}$ uniformly at random.
If the first case occurs for every $j$ with $x_j \ne y_j$, then $(Ax)_i = (Ay)_i$.
Otherwise, fix all entries of $A_i$ except one uniformly chosen $A_{ij}$ with $x_j \ne y_j$: exactly one of the two choices for $A_{ij}$ will make $(Ax)_i = (Ay)_i$.
Hence $\Pr[(Ax)_i \ne (Ay)_i] = \frac{1}{2}\big(1 - (1-2p)^{\mathrm{dist}(x,y)}\big)$.
Choose $p$ to satisfy $1 - 2p = 2^{-1/r}$. Then $\mathrm{dist}(x,y) \le r$ implies $\Pr[(Ax)_i \ne (Ay)_i] \le 1/4$, and $\mathrm{dist}(x,y) > cr$ implies $\Pr[(Ax)_i \ne (Ay)_i] > 1/2 - 2^{-(c+1)}$.
$\mathrm{dist}(Ax, Ay) = X = \sum_{i=1}^{k} X_i$, where $X_i = 1$ if $(Ax)_i \ne (Ay)_i$ and $X_i = 0$ otherwise. The $X_i$ are independent.
Choose the threshold halfway between the two cases:
$$s = \frac{1/4 + 1/2 - 2^{-(c+1)}}{2}\,k = \left(\frac{3}{8} - 2^{-(c+2)}\right)k$$
Chernoff bound: let $X_1, X_2, \dots, X_k \in \{0,1\}$ be independent random variables and let $X = \sum_{i=1}^{k} X_i$. Then for $t > 0$:
$$\Pr[X \ge \mathbb{E}[X] + t] \le \exp\left(-\frac{2t^2}{k}\right), \qquad \Pr[X \le \mathbb{E}[X] - t] \le \exp\left(-\frac{2t^2}{k}\right)$$
Applying it with deviation $t = \big(1/8 - 2^{-(c+2)}\big)k$, each of the two error events has probability at most $\exp\big(-2(1/8 - 2^{-(c+2)})^2 k\big)$.
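The per-coordinate probabilities in this analysis can be checked numerically. The sketch below compares the closed form $\frac{1}{2}(1-(1-2p)^D)$ with a direct recursion for the XOR of $D$ independent Bernoulli($p$) bits, then evaluates the two cases and the threshold $s$. The values `r = 4`, `c = 2`, `k = 1000` are arbitrary assumptions for the demo.

```python
def flip_probability(p, dist):
    """Closed form for Pr[(Ax)_i != (Ay)_i] when x and y differ in `dist` coordinates."""
    return (1 - (1 - 2 * p) ** dist) / 2

def xor_is_one_probability(p, dist):
    """Pr[XOR of `dist` independent Bernoulli(p) bits = 1], by direct recursion."""
    prob_one = 0.0
    for _ in range(dist):
        # The running XOR flips exactly when the next bit equals 1.
        prob_one = prob_one * (1 - p) + (1 - prob_one) * p
    return prob_one

r, c, k = 4, 2, 1000                 # illustrative values (assumption)
p = (1 - 2 ** (-1 / r)) / 2          # so that 1 - 2p = 2^(-1/r)
near = flip_probability(p, r)        # equals 1/4 at distance exactly r
far = flip_probability(p, c * r)     # equals 1/2 - 2^-(c+1) at distance exactly c*r
s = (3 / 8 - 2 ** -(c + 2)) * k      # threshold halfway between near*k and far*k
print(near, far, s)
```

The threshold sits strictly between the two expected sketch distances, which is what allows a concentration bound to separate the near case from the far case.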
Final parameters for $(c, r)$-ANN in the Hamming space $\{0,1\}^d$: let
$$k = \frac{\ln n}{\left(1/8 - 2^{-(c+2)}\right)^2}, \qquad p = \frac{1 - 2^{-1/r}}{2}, \qquad s = \left(\frac{3}{8} - 2^{-(c+2)}\right)k$$
With these choices, sample the $k \times d$ Boolean matrix $A$ with i.i.d. Bernoulli($p$) entries, let $z_i = A y_i \in \{0,1\}^k$ over GF(2), store all $s$-balls $B_s(u)$ for $u \in \{0,1\}^k$, and answer a query $x$ by retrieving $B_s(Ax)$: return "no" if it is empty, else return any $y_i \in B_s(Ax)$.
Space: $O(n \cdot 2^k) = n^{O(1)}$. Query time: $O(dk)$ computation + $O(1)$ memory access, that is, $O(d \log n)$.
This solves $(c, r)$-ANN w.h.p.
Given a metric space $(X, \mathrm{dist})$, a random $h: X \to U$ drawn from $\mathcal{H}$ is an $(r, cr, p, q)$-LSH if, for all $x, y \in X$: $\mathrm{dist}(x, y) \le r \Rightarrow \Pr[h(x) = h(y)] \ge p$, and $\mathrm{dist}(x, y) > cr \Rightarrow \Pr[h(x) = h(y)] \le q$, where $p > q$.
If there exists an $(r, cr, p, q)$-LSH $h: X \to U$, then there exists an $(r, cr, p^k, q^k)$-LSH $g: X \to U^k$: independently draw $h_1, h_2, \dots, h_k$ according to the distribution of $h$, and let $g(x) = (h_1(x), h_2(x), \dots, h_k(x)) \in U^k$.
Setup: metric space $(X, \mathrm{dist})$. Data: $n$ points $y_1, y_2, \dots, y_n \in X$. Query: given a point $x \in X$, solve $(c, r)$-ANN: return a $y_i$ with $\mathrm{dist}(x, y_i) \le cr$ if $\exists y_j: \mathrm{dist}(x, y_j) \le r$; return "no" if $\forall y_j: \mathrm{dist}(x, y_j) > cr$.
Suppose we have an $(r, cr, p^*, 1/n)$-LSH $g: X \to U$, that is, for all $x, y \in X$: $\mathrm{dist}(x, y) \le r \Rightarrow \Pr[g(x) = g(y)] \ge p^*$, and $\mathrm{dist}(x, y) > cr \Rightarrow \Pr[g(x) = g(y)] \le 1/n$.
Store $y_1, y_2, \dots, y_n$ in nondecreasing order of $g(y_i)$.
Upon query $x \in X$: find all $y_i$ such that $g(x) = g(y_i)$ by binary search. If we encounter some $y_i$ such that $\mathrm{dist}(x, y_i) \le cr$, then return this $y_i$; otherwise return "no".
If the real answer is "no": always correct. If the real answer is "yes": correct with probability at least $p^*$.
Space: $O(n)$. Time: $O(\log n)$ for the binary search, plus $O(1)$ in expectation for inspecting colliding points, since at most $n \cdot (1/n) = 1$ far points collide with $x$ in expectation.
$(c, r)$-ANN in metric space $(X, \mathrm{dist})$ with an $(r, cr, p^*, 1/n)$-LSH $g: X \to U$. Data: $n$ points $y_1, y_2, \dots, y_n \in X$. Query: some point $x \in X$.
Let $l = 1/p^*$, and independently draw $g_1, g_2, \dots, g_l$.
Maintain $l$ sorted tables: for $j = 1, 2, \dots, l$, store $y_1, y_2, \dots, y_n$ in table-$j$ in nondecreasing order of $g_j(y_i)$.
Upon query $x \in X$: by binary search, find the first $10 \cdot l$ points $y_i$ such that $\exists j: g_j(x) = g_j(y_i)$. If we encounter some $y_i$ such that $\mathrm{dist}(x, y_i) \le cr$, then return this $y_i$; otherwise return "no".
If the real answer is "no": always correct.
If there exists $y_s$ such that $\mathrm{dist}(x, y_s) \le r$, then
$$\Pr[\text{answer "no"}] \le \Pr[\forall j:\ g_j(x) \ne g_j(y_s)] + \Pr[\text{there exist } 10l \text{ bad } y_i \text{ with } \mathrm{dist}(x, y_i) > cr \text{ yet } \exists j \text{ s.t. } g_j(x) = g_j(y_i)]$$
The first term is at most $(1 - p^*)^l \le 1/e$. By Markov's inequality, the second term is at most $\mathbb{E}[\text{number of such bad } y_i]/(10l) \le n \cdot l \cdot (1/n)/(10l) = 0.1$. In total, $\Pr[\text{answer "no"}] \le 1/e + 0.1 < 0.5$.
Space: $O(nl) = O(n/p^*)$. Time: $O(l \cdot \log n) = O((\log n)/p^*)$.
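The multi-table scheme can be sketched with sorted lists and the standard-library `bisect` module. The class name `LSHIndex`, the helper `sample_g`, and the bit-sampling hash used for the demo are assumptions for illustration; the sketch also counts inspected far points against the $10 \cdot l$ budget rather than collecting candidates first.

```python
import bisect
import random

def hamming(u, v):
    """Hamming distance between two equal-length bit tuples."""
    return sum(a != b for a, b in zip(u, v))

def sample_g(d, k, rng):
    """g(x) = (x[j1], ..., x[jk]) for k random coordinates (bit sampling, assumption)."""
    coords = [rng.randrange(d) for _ in range(k)]
    return lambda x: tuple(x[j] for j in coords)

class LSHIndex:
    def __init__(self, data, hashers, dist):
        self.data, self.hashers, self.dist = data, hashers, dist
        # One table per g_j: (g_j(y_i), i) pairs in nondecreasing order of hash value.
        self.tables = [sorted((g(y), i) for i, y in enumerate(data)) for g in hashers]

    def query(self, x, c, r):
        budget = 10 * len(self.hashers)          # inspect at most 10*l colliding points
        for g, table in zip(self.hashers, self.tables):
            key = g(x)
            pos = bisect.bisect_left(table, (key, -1))   # first entry with this key
            while pos < len(table) and table[pos][0] == key:
                y = self.data[table[pos][1]]
                if self.dist(x, y) <= c * r:
                    return y
                budget -= 1
                if budget == 0:
                    return None                  # answer "no"
                pos += 1
        return None                              # answer "no"

rng = random.Random(1)
d = 16
data = [tuple(rng.randrange(2) for _ in range(d)) for _ in range(6)]
hashers = [sample_g(d, 4, rng) for _ in range(3)]
index = LSHIndex(data, hashers, hamming)
hit = index.query(data[2], c=2, r=1)
print(hit)
```

Any point the index returns has been verified to lie within distance $cr$, which is why the error is one-sided: only a "no" answer can be wrong.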
Suppose we have an $(r, cr, p, q)$-LSH $h: X \to U$. Then we have an $(r, cr, p^k, 1/n)$-LSH $g: X \to U^k$ where $k = \log_{1/q} n$, implying
$$p^k = p^{\log_{1/q} n} = n^{-\rho}, \qquad \rho = \frac{\log p}{\log q}$$
Hence we can solve $(c, r)$-ANN with space $O(n^{1+\rho})$, query time $O(n^{\rho} \cdot \log n)$, and one-sided error $< 0.5$.
$(c, r)$-ANN in Hamming space $\{0,1\}^d$. Data: $n$ points $y_1, y_2, \dots, y_n \in \{0,1\}^d$. Query: given a point $x \in \{0,1\}^d$, return a $y_i$ with $\mathrm{dist}(x, y_i) \le cr$ if $\exists y_j: \mathrm{dist}(x, y_j) \le r$; return "no" if $\forall y_j: \mathrm{dist}(x, y_j) > cr$.
$\mathcal{H} = \{ h_i \mid h_i(x) = x_i \text{ for } i = 1, 2, \dots, d \}$, and $h$ is chosen uniformly at random from $\mathcal{H}$.
Since $\Pr[h(x) = h(y)] = 1 - \mathrm{dist}(x, y)/d$, we have an $(r, cr, 1 - r/d, 1 - cr/d)$-LSH $h: \{0,1\}^d \to \{0,1\}$, with
$$\rho = \frac{\log(1 - r/d)}{\log(1 - cr/d)} \le \frac{1}{c}$$
We can solve $(c, r)$-ANN in Hamming space with space $O(n^{1+1/c})$, query time $O(n^{1/c} \cdot \log n)$, and one-sided error $< 0.5$.
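To make the exponent $\rho$ concrete, the following sketch computes the bit-sampling parameters for given $n$, $d$, $r$ and $c$. The function name and the example numbers are assumptions, and the sketch requires $cr < d$ so that both logarithms are defined.

```python
import math

def bit_sampling_parameters(n, d, r, c):
    """Return (rho, k, tables) for the bit-sampling LSH on {0,1}^d; needs c*r < d."""
    p = 1 - r / d                  # collision probability bound for near points
    q = 1 - c * r / d              # collision probability bound for far points
    rho = math.log(p) / math.log(q)
    k = math.ceil(math.log(n) / math.log(1 / q))   # k = log_{1/q} n, rounded up
    tables = math.ceil(n ** rho)                   # roughly l = 1 / p^k
    return rho, k, tables

rho, k, tables = bit_sampling_parameters(n=10_000, d=100, r=2, c=3)
print(rho, k, tables)
```

Rounding $k$ up makes $q^k \le 1/n$ hold exactly, at the cost of $p^k$ being slightly below $n^{-\rho}$, so the table count here is an approximation of $1/p^k$.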
Dimension reduction (low-distortion metric embedding): it is easy to embed a set of $n$ points in arbitrary dimension to $O(\log n)$ dimension with constant distortion.
Nearest neighbor search: if we can solve $(c, r)$-ANN (Approximate Near Neighbor), then we can solve $c$-ANN (Approximate Nearest Neighbor) with limited overhead.
Locality-sensitive hashing can be used to solve $(c, r)$-ANN. (Collisions could be helpful for hashing.)