SLIDE 1

Empirical Comparisons of Fast Methods

Dustin Lang and Mike Klaas

{dalang, klaas}@cs.ubc.ca

University of British Columbia December 17, 2004

Fast N-Body Learning - Empirical Comparisons – p. 1

SLIDE 2

A Map of Fast Methods

Sum-Kernel Methods: Dual-Tree (KD-tree, Anchors); Fast Gauss Transform and Improved FGT (Gaussian kernel); Fast Multipole Method; Box Filter (regular grid).

Max-Kernel Methods: Dual-Tree (KD-tree, Anchors); Distance Transform (regular grid).

SLIDE 3

The Role of Fast Methods

We claim that to be useful for other researchers, Fast Methods need:

  • guaranteed, adjustable error bounds: users can set the error bound low during the development stage, then experiment once they know their code works.
  • no parameters that need to be adjusted by users (other than error tolerance).
  • documented error behaviour: we must explain the properties of our approximation errors.

SLIDE 4

Testing Framework

We tested:

  • Sum-Kernel: f_j = Σ_{i=1}^{N} w_i exp( −‖x_i − y_j‖₂² / h² )
  • Max-Kernel: x*_j = argmax_{i=1..N} w_i exp( −‖x_i − y_j‖₂² / h² )

Gaussian kernel, fixed bandwidth h, non-negative weights w_i, j = 1 . . . N.
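As a reference point, the naive O(N²) evaluation of the Sum-Kernel (the baseline every fast method is compared against) can be sketched in a few lines. This is an illustrative sketch, not the authors' C/Matlab test code:

```python
import numpy as np

def naive_sum_kernel(x, y, w, h):
    """Naive O(N^2) Gaussian Sum-Kernel:
    f_j = sum_i w_i * exp(-||x_i - y_j||^2 / h^2)."""
    # Pairwise squared distances between sources x (N x D) and targets y (N x D).
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    return (w[:, None] * np.exp(-d2 / h ** 2)).sum(axis=0)

rng = np.random.default_rng(0)
x = rng.random((500, 3))   # sources, uniform in the unit cube
y = rng.random((500, 3))   # targets
w = rng.random(500)        # non-negative weights
f = naive_sum_kernel(x, y, w, h=0.1)
```

The quadratic cost comes directly from the N × N distance matrix; the fast methods below exist to avoid forming it.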

SLIDE 5

Testing Framework (2)

For the Sum-Kernel problem, we allow a given error tolerance ε: |f_j − f_j^true| ≤ ε for each j. We tested:

  • Fast Gauss Transform (FGT)
  • Improved Fast Gauss Transform (IFGT)
  • Dual-Tree with kd-tree (KDtree)
  • Dual-Tree with ball-tree constructed via the Anchors Hierarchy (Anchors)
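The acceptance criterion applied to every method above can be written down directly; a trivial sketch, with the per-target (not aggregate) comparison being the point:

```python
def within_tolerance(f_approx, f_true, eps):
    """Accept an approximation only if |f_j - f_j^true| <= eps for EVERY target j."""
    return all(abs(a - t) <= eps for a, t in zip(f_approx, f_true))
```

A method that satisfies the bound on average but not per-target would fail this test.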

SLIDE 6

Methods Tested

Fast Gauss Transform (FGT) code by Firas Hamze of UBC. KDtree and Anchors Dual-Tree code by Dustin; the same Dual-Tree code was used for both.

SLIDE 7

Methods Tested (2)

Ramani Duraiswami and Changjiang Yang generously gave their code for the Improved Fast Gauss Transform (IFGT). To make the IFGT fit in our testing framework, we had to devise a method for choosing parameters. Our method seems reasonable but is probably not optimal. All methods: in C with Matlab bindings.

SLIDE 8

Results (1): A Worst-Case Scenario

Uniformly distributed points, uniformly distributed weights, 3 dimensions, large bandwidth h = 0.1, ε = 10⁻⁶: CPU time.

  • Naive is usually fastest.
  • Only FGT is faster, and only by ∼3×.
  • IFGT may become faster after 1.5 hours of compute time.

[Figure: log-log plot of CPU time (s) vs. N (10² to 10⁵) for Naive, FGT, IFGT, Anchors, KDtree.]

SLIDE 9

Results (1): A Worst-Case Scenario

Uniformly distributed points, uniformly distributed weights, 3 dimensions, large bandwidth h = 0.1, ε = 10⁻⁶: memory.

  • Dual-Tree memory requirements are an issue.

[Figure: log-log plot of memory usage (bytes) vs. N for FGT, IFGT, Anchors, KDtree.]

SLIDE 10

Results (2)

Uniformly distributed points, uniformly distributed weights, 3 dimensions, smaller bandwidth h = 0.01, ε = 10⁻⁶.

  • IFGT cannot be run: more than 10¹⁰ expansion terms required for N = 100 points.
  • Dual-Tree and FGT are fast, but not O(N).

[Figure: log-log plot of CPU time (s) vs. N for Naive, FGT, Anchors, KDtree, with O(N√N) and O(N) reference curves.]

SLIDE 11

Results (2)

Uniformly distributed points, uniformly distributed weights, 3 dimensions, smaller bandwidth h = 0.01, ε = 10⁻⁶.

  • Memory requirements are still an issue.

[Figure: log-log plot of memory usage (bytes) vs. N for FGT, Anchors, KDtree.]

SLIDE 12

Results (3)

Uniform data and weights, N = 10,000, ε = 10⁻³, h = 0.01, varying dimension: CPU time.

  • IFGT very fast in 1D, infeasible beyond 2D.
  • KDtree and Anchors show (unexpected?) optimal behaviour around 3 or 4 dimensions.

[Figure: CPU time (s) vs. dimension for Naive, FGT, IFGT, Anchors, KDtree.]

SLIDE 13

Results (3)

Uniform data and weights, N = 10,000, ε = 10⁻³, h = 0.01, varying dimension: memory usage.

[Figure: memory usage (bytes) vs. dimension for Naive, FGT, IFGT, Anchors, KDtree.]

SLIDE 14

Results (4)

Uniform sources, uniform targets, N = 10,000, h = 0.01, D = 3, varying ε: CPU time.

  • Cost of Dual-Tree methods increases slowly with accuracy.
  • FGT cost rises more quickly.

[Figure: CPU time (s) vs. ε (10⁻¹¹ to 10⁻¹) for Naive, FGT, Anchors, KDtree.]

SLIDE 15

Results (4)

Uniform sources, uniform targets, N = 10,000, h = 0.01, D = 3, varying ε: real error.

  • Error of Dual-Tree methods is almost exactly as large as allowed (ε).
  • FGT (and presumably IFGT) overestimate the error, and thus do more work than required.

[Figure: real error vs. ε for FGT, Anchors, KDtree.]

SLIDE 16

Clumpy Data

Uniform data is a worst-case scenario for these methods. Next: clumpy data! [Animation frames: Clumpiness = 1.0, 1.1, 1.2, 1.3, 1.5, 2.0, 3.0.]

SLIDE 23

Results (5): clumpy sources

Clumpy sources, uniform targets, N = 10,000, h = 0.01, D = 3, ε = 10⁻⁶, varying clumpiness: CPU time. As clumpiness increases, Dual-Tree methods get faster.

[Figure: CPU time vs. data clumpiness (1 to 3) for Naive, FGT, Anchors, KDtree.]

SLIDE 24

Results (5): clumpy sources

Clumpy sources, uniform targets, N = 10,000, h = 0.01, D = 3, ε = 10⁻⁶, varying clumpiness: CPU time relative to uniform data. The speedup is especially large for Anchors.

[Figure: CPU time relative to uniform data vs. clumpiness (1 to 3) for Naive, FGT, Anchors, KDtree.]

SLIDE 25

Results (6): clumpy sources and targets

Clumpy sources, clumpy targets, N = 10,000, h = 0.01, D = 3, ε = 10⁻⁶, varying clumpiness: CPU time. Even bigger improvements!

[Figure: CPU time vs. clumpiness for Naive, FGT, Anchors, KDtree.]

SLIDE 26

Results (6): clumpy sources and targets

Clumpy sources, clumpy targets, N = 10,000, h = 0.01, D = 3, ε = 10⁻⁶, varying clumpiness: CPU time relative to uniform data. Large variance: due to details of the particular clumpy data sets?

[Figure: CPU time relative to uniform data vs. clumpiness for Naive, FGT, Anchors, KDtree.]

SLIDE 27

Results (7): clumpy, dimensionality

Clumpy sources and targets (C = 2), N = 10,000, h = 0.01, ε = 10⁻³, varying dimension: CPU time. Not qualitatively different from uniform data!

[Figure: CPU time (s) vs. dimension for Naive, IFGT, Anchors, KDtree.]

SLIDE 28

Results (7): clumpy, dimensionality

Clumpy sources and targets (C = 2), N = 10,000, h = 0.01, ε = 10⁻³, varying dimension: CPU time. For reference: the non-clumpy results.

[Figure: CPU time (s) vs. dimension for Naive, FGT, IFGT, Anchors, KDtree (uniform data).]

SLIDE 29

Summary (1)

  • Synthetic-data tests; each algorithm is required to guarantee results within a given error tolerance.
  • IFGT:
    • We devised a method of choosing parameters; a different method might work better.
    • The error bounds seem to be very loose, so it does much more work than necessary.

SLIDE 30

Summary (2)

Dual-Tree:

  • Works well when either the kernel is highly local (small bandwidth) or the data has strong structure.
  • Works well across a wide range of error tolerances; gives errors that are close to the estimate.
  • Memory requirements are an issue (some heuristics could be used).
  • In these tests, the Anchors Hierarchy doesn't outperform KDtree, though it improves significantly with clumpiness.

SLIDE 31

And Now For Something Slightly Different: Max-Kernel

SLIDE 32

The Problem

  • Given:
    • N target points y_j
    • M source points x_i with weights w_i
  • Compute, for each y_j: f_j^MAX = max_i w_i K(x_i, y_j)
  • Cost: O(MN)

SLIDE 33

The Problem

  • Given:
    • N target points y_j
    • M source points x_i with weights w_i
  • Compute, for each y_j: f_j^MAX = max_i w_i K(x_i, y_j)
  • Cost: O(MN)
  • Applications:
    • maximum a posteriori belief propagation
    • Viterbi algorithm for chains
    • (MAP) particle smoothing
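The brute-force O(MN) computation just described can be sketched directly with a Gaussian kernel K; an illustrative sketch, not the authors' code:

```python
import numpy as np

def naive_max_kernel(x, y, w, h):
    """For each target y_j, return argmax_i and max_i of w_i * K(x_i, y_j),
    with K(x, y) = exp(-||x - y||^2 / h^2). Cost: O(MN)."""
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)  # M x N squared distances
    vals = w[:, None] * np.exp(-d2 / h ** 2)                 # M x N weighted kernel values
    best = vals.argmax(axis=0)                               # winning source per target
    return best, vals[best, np.arange(y.shape[0])]

rng = np.random.default_rng(1)
x = rng.random((200, 3))   # M source points
y = rng.random((50, 3))    # N target points
w = rng.random(200)
idx, f_max = naive_max_kernel(x, y, w, h=0.1)
```

The fast methods below aim to avoid forming the full M × N table of kernel values.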

SLIDE 34

The Methods

  • 1. Distance Transform

SLIDE 35

The Methods

  • 1. Distance Transform
  • as previously presented
  • can be extended to handle Monte Carlo grids in 1D
  • increases cost to O(M log M + N log N)

SLIDE 36

The Methods

  • 1. Distance Transform
  • as previously presented
  • can be extended to handle Monte Carlo grids in 1D
  • increases cost to O(M log M + N log N)
  • 2. Dual-tree algorithm

SLIDE 37

The Methods

  • 1. Distance Transform
  • as previously presented
  • can be extended to handle Monte Carlo grids in 1D
  • increases cost to O(M log M + N log N)
  • 2. Dual-tree algorithm
    • "bound and prune" recursion
    • details: Klaas, Lang, de Freitas. "Fast maximum a-posteriori inference in Monte Carlo state spaces". AISTATS 2005 (to appear).
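The slides do not reproduce the distance transform itself. For a regular 1D grid, the standard O(N) lower-envelope algorithm (in the style of Felzenszwalb and Huttenlocher) computes D_j = min_i ((j − i)² + f_i); for the Gaussian max-kernel, set f_i = −h² ln w_i, so that max_i w_i e^{−(i−j)²/h²} = e^{−D_j/h²}. The sketch below is illustrative, not the authors' implementation:

```python
import math

def distance_transform_1d(f):
    """Lower-envelope squared-distance transform:
    D[q] = min_i ((q - i)**2 + f[i]) for all q, in O(n)."""
    n = len(f)
    v = [0] * n             # grid index of the k-th parabola in the lower envelope
    z = [0.0] * (n + 1)     # parabola v[k] is lowest on the range [z[k], z[k+1])
    z[0], z[1] = -math.inf, math.inf
    k = 0
    for q in range(1, n):
        # intersection of parabola q with the rightmost envelope parabola
        s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
        while s <= z[k]:    # parabola q hides the rightmost one; pop it
            k -= 1
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
        k += 1
        v[k], z[k], z[k + 1] = q, s, math.inf
    d, k = [0.0] * n, 0
    for q in range(n):      # read the envelope back out, left to right
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d
```

Two linear passes over the grid replace the quadratic min over all source/target pairs, which is where the speedup over naive comes from.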

SLIDE 38

1D time series

  • MAP particle smoothing
  • Non-linear, multi-modal time series
  • Note log-log scale
  • Both beat naïve by orders of magnitude
  • Dist. trans. 2-3× faster than dual-tree
  • Similar asymptotic growth
  • Clearly, dist. trans. should be used when possible!

[Figure: log-log plot of time (s) vs. number of particles for naive, dual-tree, dist. transform.]

SLIDE 39

Applied example: beat-tracking

  • Particle-filter based beat tracker
  • MAP smoothing on a 3D Monte-Carlo state space
    • distance transform cannot be used
  • Dual-tree is faster after 10 ms of compute time
  • Dual-tree exhibits asymptotic O(N log N) growth
  • Takes seconds rather than days to process a song.

[Figure: log-log plot of time (s) vs. number of particles for naive, dual-tree.]

SLIDE 40

Other factors: dimensionality

  • The behaviour of dual-tree algorithms as N grows is well understood
  • What about other factors?

SLIDE 41

Other factors: dimensionality

  • The behaviour of dual-tree algorithms as N grows is well understood
  • What about other factors?
  • Synthetic test:
    • 20,000 data points (fixed)
    • Gaussian kernel with fixed bandwidth
    • distribution: uniform, clustered
    • clustered data formed by drawing from k Gaussians
    • k = 4 (dash), 20 (dash-dot), 100 (dotted), uniform (solid)
    • kd-trees (red) vs. metric trees (green)

SLIDE 42

Dimensionality (cont.)

  • Two examples: distance computations (left); time (right)
  • Dual-tree methods can be slower than naïve, and this is due to inherent complexity, not just high constants; i.e., they use O(N²) distance computations.

[Figure: two panels vs. dimension (1-40): left, distance computations (k = 20); right, time (s) (k = 100); naive, anchors, kd-tree.]

SLIDE 43

Dimensionality (relative)

[Figure: time relative to kd-tree (= 1) vs. dimension (1-40), four panels (uniform, k = 4, k = 20, k = 100); anchors vs. kd-tree.]

  • Clustering is necessary for metric trees to be effective.

SLIDE 44

Summary

  • Distance transform and dual-tree methods are fast

SLIDE 45

Summary

  • Distance transform and dual-tree methods are fast
  • ...but dual-tree has more overhead.

SLIDE 46

Summary

  • Distance transform and dual-tree methods are fast
  • ...but dual-tree has more overhead.
  • Use the distance transform when:
  • kernel is e^(−‖x−y‖²) or e^(−‖x−y‖) (or others?)
  • data is one dimensional, or lies on a regular grid.

SLIDE 47

Summary

  • Distance transform and dual-tree methods are fast
  • ...but dual-tree has more overhead.
  • Use the distance transform when:
  • kernel is e^(−‖x−y‖²) or e^(−‖x−y‖) (or others?)
  • data is one dimensional, or lies on a regular grid.
  • Although we focus on performance as N grows, it is the "constants" that really matter
    • these are determined by the data distribution, the kernel, and the spatial index.
    • huge potential for future investigation.

SLIDE 48

Thanks! Time for Questions!

SLIDE 49

Q&A

  • Clumpy Data generation
  • Choosing IFGT params

SLIDE 50

Clumpy Data (back)

We generate clumpy data with clumpiness C by recursively distributing points into n sub-boxes such that the occupancies {N_i} satisfy:

Σ_{i=1}^{n} N_i = N

var({N_i}) = (C − 1) · mean({N_i})²

This describes the width of the distribution of 'mass' among boxes. Recurse until N ≤ 10.
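The occupancy rule above can be realised in several ways. The sketch below is one illustrative interpretation (not the authors' generator): it recursively splits a box into 2^D sub-boxes and draws box masses with coefficient of variation √(C − 1), so that var({N_i}) = (C − 1) · mean({N_i})² holds only approximately after clamping and rounding:

```python
import random

def clumpy_points(n_points, C, lo=(0.0, 0.0, 0.0), hi=(1.0, 1.0, 1.0)):
    """Generate n_points in the box [lo, hi] with clumpiness C >= 1.
    C = 1 gives (roughly) uniform data; larger C concentrates mass in
    fewer sub-boxes at every level of the recursion."""
    dim = len(lo)
    if n_points <= 10:   # base case: scatter uniformly inside the box
        return [tuple(random.uniform(lo[d], hi[d]) for d in range(dim))
                for _ in range(n_points)]
    n_boxes = 2 ** dim
    # Box masses with std dev sqrt(C - 1) around mean 1, clamped positive.
    masses = [max(1e-9, random.gauss(1.0, (C - 1.0) ** 0.5)) for _ in range(n_boxes)]
    total = sum(masses)
    raw = [n_points * m / total for m in masses]
    counts = [int(r) for r in raw]                  # floor, then hand out the remainder
    order = sorted(range(n_boxes), key=lambda b: raw[b] - counts[b], reverse=True)
    for i in range(n_points - sum(counts)):
        counts[order[i % n_boxes]] += 1
    mid = [(lo[d] + hi[d]) / 2.0 for d in range(dim)]
    pts = []
    for b in range(n_boxes):                        # bit d of b picks low/high half in dim d
        sub_lo = tuple(lo[d] if not (b >> d) & 1 else mid[d] for d in range(dim))
        sub_hi = tuple(mid[d] if not (b >> d) & 1 else hi[d] for d in range(dim))
        pts += clumpy_points(counts[b], C, sub_lo, sub_hi)
    return pts

random.seed(0)
pts = clumpy_points(1000, C=2.0)
```

By construction the counts at every level sum exactly to N, so the total number of generated points is exact even though the variance target is only approximate.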

SLIDE 51

Choosing IFGT Parameters (back)

K: number of source clusters; r_y: influence radius of clusters; p: number of expansion terms. We choose a maximum number of clusters K*. The complexity is O(NK), so to be O(N), K* must be a constant. In these tests we instead set K* = √N, since we tested across orders of magnitude.

SLIDE 52

Choosing IFGT Parameters (2)

Four constraints:

C1: outside-of-influence-radius error E_C ≤ ε
C2: truncation error E_T ≤ ε
C3: K ≤ K*
C4: r_x r_y / h² ≤ 1

The first three are hard; the fourth is soft (it helps convergence). (Each source point contributes to the error through either E_C or E_T.)

SLIDE 53

Choosing IFGT Parameters (3)

for k = 1 to K*:
    run the k-centers algorithm
    find the largest cluster radius r_x
    using r_y = r_y(ideal), compute C1, C4
    if C1 AND C4 satisfied: break
if k < K*:   // C4 can be satisfied.
    set r_y = min r_y such that C1 AND C4
else:        // C4 cannot be satisfied.
    set r_y = min r_y such that C1
set p = min p such that C2
