

slide-1
SLIDE 1

On the Variety of Static Control Parts in Real-World Applications: from Affine via Multi-dimensional to Polynomial and Just-in-Time

Andreas Simbürger, Armin Größlinger

4th International Workshop on Polyhedral Compilation Techniques

1 / 25

slide-5
SLIDE 5

Defining the Real World

◮ LLVM (llvm.org)
◮ Polly (polly.llvm.org)
◮ PolyJIT (www.infosun.fim.uni-passau.de/cl/PolyJIT)

2 / 25

slide-6
SLIDE 6

Automatic Detection of SCoPs in LLVM

LLVM IR → Loop detection → Loop normalization → Scalar evolution → SCoP detection → Polly IR

3 / 25

slide-10
SLIDE 10

Effectiveness of Automatic Polyhedral Optimization

LLVM IR → SCoP detection → Polly optimizer → optimized LLVM IR

Detection: Applicability and potential of valid loops
Exploitation of parallelism (transformations)

The detection process lacks thorough empirical evaluation!

4 / 25

slide-11
SLIDE 11

PolyJIT: pprof

◮ Set of 50 programs commonly used in various domains.
◮ 8 domains (Multimedia, Scientific, Simulation, Encryption, Compilation, Compression, Databases, Verification).
◮ Extract run time and compile time statistics.

5 / 25

slide-12
SLIDE 12

Measuring a SCoP’s fraction of the total run time

What fraction of a program’s total run time is spent inside SCoPs?

    for (int i=0; i<=n; ++i)
      for (int j=i; j<=n; ++j)
        if (i >= n-j) {
          S: A[i+n][j+i] = B[n+2*i-1][j];
          T: B[i+n][j-i] = A[n-2*i+1][j];
        }

Definition (Execution SCoP coverage)

\[ \mathrm{ExecCov} = \frac{\text{Time spent inside SCoPs}}{\text{Total program run time}} \]
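As a minimal sketch of this definition, the ratio can be computed from two accumulated timings. The struct and function names below are hypothetical and are not part of pprof or PolyJIT; how the timings are collected is left open.

```c
#include <stdio.h>

/* Hypothetical container for accumulated timings (in seconds). */
typedef struct {
    double scop_seconds;   /* time spent inside detected SCoPs */
    double total_seconds;  /* total program run time */
} run_profile;

/* ExecCov = time inside SCoPs / total run time.
   Returns 0.0 if no total run time was measured. */
double exec_cov(run_profile p) {
    if (p.total_seconds <= 0.0)
        return 0.0;
    return p.scop_seconds / p.total_seconds;
}

/* Print the coverage of one program as a percentage. */
void print_exec_cov(const char *name, run_profile p) {
    printf("%s: ExecCov = %.1f%%\n", name, 100.0 * exec_cov(p));
}
```

A program that spends 2 s of an 8 s run inside SCoPs has an ExecCov of 0.25, i.e. 25%.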

6 / 25

slide-18
SLIDE 18

Static Control Parts: Class Static

Detection at compile time

    for (int i=0; i<=n; ++i)
      for (int j=i; j<=n; ++j)
        if (i >= n-j) {
          S: A[i+n][j+i] = B[n+2*i-1][j];
          T: B[i+n][j-i] = A[n-2*i+1][j];
        }

1. Affine expressions in
   ◮ Loop bounds
   ◮ Conditions
   ◮ Memory accesses
2. Static control flow
3. Function calls with known side effects

What can we do if it is not a static (affine) SCoP?

7 / 25

slide-19
SLIDE 19

Problem 1: Multi-dimensional array accesses

Contiguous

A[i][j];

8 / 25

slide-20
SLIDE 20

Problem 1: Multi-dimensional array accesses

Contiguous

clang -O0

    %0 = mul nsw i32 %i, %n
    %idx = getelementptr float* %A, i32 %0
    %idx1 = getelementptr float* %idx, i32 %j

A[i][j]

8 / 25

slide-21
SLIDE 21

Problem 1: Multi-dimensional array accesses

Contiguous

clang -O1

    %0 = mul nsw i32 %i, %n
    %idx.s = add i32 %0, %j
    %idx1 = getelementptr float* %A, i32 %idx.s

A[n*i+j]
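That the two-dimensional access and the linearized subscript n*i+j name the same element can be checked with a small C sketch. It assumes a contiguous row-major array with n columns; the function names are illustrative only.

```c
#include <stddef.h>

/* Row-major offset of element (i, j) in an array with n columns:
   the linearized subscript n*i + j. */
size_t linear_index(size_t n, size_t i, size_t j) {
    return n * i + j;
}

/* Read A[i][j] through the two-dimensional view
   (C99 variably modified parameter type). */
float read_2d(size_t rows, size_t n, float A[rows][n], size_t i, size_t j) {
    return A[i][j];
}

/* Read the same element through the flat one-dimensional view. */
float read_flat(const float *A, size_t n, size_t i, size_t j) {
    return A[linear_index(n, i, j)];
}
```

Once the subscript is flattened, the multiplication of the parameter n with the iterator i is no longer affine, which is why the access must be delinearized before it fits class Static.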

8 / 25

slide-25
SLIDE 25

Delinearization of array accesses

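A[n*i+i+j]

n*i + i + j = (n + 1)*i + j

A[i][i+j]   (row length n)
A[i][j]     (row length n+1)

[Figure: two array layouts with axes i and j, one with row length n and one with row length n+1]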
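The regrouping n*i + i + j = (n + 1)*i + j can be tried out with a small C sketch that splits a flat offset into a row and a column for a given row length. The type and function names are hypothetical, and the split is only the intended one when 0 ≤ j ≤ n.

```c
#include <stdbool.h>

/* Result of splitting a flat offset into a (row, column) pair. */
typedef struct {
    long row;
    long col;
} index2d;

/* Delinearize `offset` for a row length of `dim`.
   Returns false for a non-positive row length or a negative offset,
   where the split is not meaningful. */
bool delinearize(long offset, long dim, index2d *out) {
    if (dim <= 0 || offset < 0)
        return false;
    out->row = offset / dim;
    out->col = offset % dim;
    return true;
}
```

For n = 4, i = 2, j = 3 the offset is 13. With row length n + 1 = 5 this yields (2, 3), i.e. A[i][j]. With row length n = 4 it yields (3, 1), because the column i + j = 5 of A[i][i+j] does not fit into a row of length 4.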

9 / 25

slide-37
SLIDE 37

Static Control Parts: Class Multi

Let’s allow delinearizable accesses!

A[(n+2+m)*i]    A[i'+2*m+2*n]

a = a′:  ni + 2i + mi = i′ + 2m + 2n
a − a′ = 0:  ni + 2i + mi − i′ − 2m − 2n = 0

Split into terms: ni, 2i, mi, −i′, −2m, −2n
Group by parameters: n(i − 2) + m(i − 2) + 1(2i − i′)
Factor out common expressions: (n + m)(i − 2) + (1)(2i − i′)

[Figure: number line with marks at −n−m and n+m]

Bounds check: |1(2i − i′)| ≤ |n + m| − 1

a − a′ = 0 ⇔ i − 2 = 0 ∧ 2i − i′ = 0

i = 2 and i′ = 4
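The result i = 2, i′ = 4 can be cross-checked by brute force. The sketch below enumerates iterator pairs in a finite window and reports any pair that passes the bounds check and hits the same element but is not the predicted solution. The function names and the search window are assumptions made for illustration.

```c
#include <stdlib.h>

/* The two subscripts from the example: a = (n+2+m)*i and a' = i'+2m+2n. */
long sub_a(long n, long m, long i)        { return (n + 2 + m) * i; }
long sub_a_prime(long n, long m, long ip) { return ip + 2 * m + 2 * n; }

/* Count pairs (i, i') in [lo, hi]^2 that satisfy the bounds check
   |2i - i'| <= |n + m| - 1 and access the same element (a == a'),
   but are NOT the predicted solution i = 2, i' = 4. */
int count_unexpected_collisions(long n, long m, long lo, long hi) {
    int unexpected = 0;
    for (long i = lo; i <= hi; i++) {
        for (long ip = lo; ip <= hi; ip++) {
            if (labs(2 * i - ip) > labs(n + m) - 1)
                continue;               /* bounds check violated */
            if (sub_a(n, m, i) != sub_a_prime(n, m, ip))
                continue;               /* different elements */
            if (i != 2 || ip != 4)
                unexpected++;
        }
    }
    return unexpected;
}
```

For n = 5 and m = 3 the function finds no unexpected collision. The pair i = 3, i′ = 14 also satisfies a = a′ (both are 30), but it violates the bounds check because |2·3 − 14| = 8 > 7, which shows why the check is needed.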

10 / 25

slide-38
SLIDE 38

Static Control Parts: Class Multi

Let’s allow delinearizable accesses!

\[ a - a' = \sum_{x=1}^{k} \pi_x \gamma_x \tag{1} \]

where the $\pi_x$ are polynomials in the parameters and the $\gamma_x$ are affine expressions in the iterators.

\[ \forall\, i \in D : |\pi_x \gamma_x| \le |\pi_{x+1}| - 1 \quad \text{for } 1 \le x < k \tag{2} \]

When (1) and (2) hold, $a - a' = 0$ is equivalent to $\gamma_1 = 0 \wedge \dots \wedge \gamma_k = 0$.
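For the two-term case (k = 2), the equivalence can be justified in a few lines. This is a sketch that is not taken from the slides; it assumes that the γ values are integers and that π₁ ≠ 0.

```latex
\begin{aligned}
&\text{Assume } \pi_1\gamma_1 + \pi_2\gamma_2 = 0
  \text{ and } |\pi_1\gamma_1| \le |\pi_2| - 1. \\
&\text{If } \gamma_2 \ne 0, \text{ then } |\gamma_2| \ge 1, \text{ so }
  |\pi_2\gamma_2| \ge |\pi_2| > |\pi_1\gamma_1|, \\
&\text{which contradicts } |\pi_1\gamma_1| = |\pi_2\gamma_2|.
  \text{ Hence } \gamma_2 = 0. \\
&\text{Then } \pi_1\gamma_1 = 0, \text{ and with } \pi_1 \ne 0
  \text{ it follows that } \gamma_1 = 0.
\end{aligned}
```

In the worked example with A[(n+2+m)*i] and A[i'+2*m+2*n], this corresponds to π₁ = 1, γ₁ = 2i − i′, π₂ = n + m and γ₂ = i − 2.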

11 / 25

slide-41
SLIDE 41

Static Control Parts: Class Algebraic

Let’s allow polynomials!

    for (int i=0; i<=n; i++) {
      A[m*i+n] = __;
      __ = A[m*(i-1)+n];
    }

Accept arbitrary polynomials in
◮ Loop bounds
◮ Array subscripts
◮ Unsupported: Products in the iterators (i*i)

Multi is a subset of Algebraic
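To illustrate what a dependence test on such subscripts has to solve, consider when the write A[m*i+n] in iteration i and the read A[m*(i-1)+n] in iteration i′ touch the same element. The derivation below is a sketch that assumes m ≠ 0.

```latex
m\,i + n = m\,(i' - 1) + n
\;\iff\; m\,(i - i' + 1) = 0
\;\iff\; i' = i + 1 \qquad (m \ne 0)
```

The read in iteration i + 1 therefore accesses the element written in iteration i. The product of the parameter m and the iterator i is what makes the subscript polynomial rather than affine.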

12 / 25

slide-42
SLIDE 42

Problem 2: Multi-dimensional array accesses

Non-Contiguous

    float **A;
    A[i][j] = A[i-1][j-1];

13 / 25

slide-43
SLIDE 43

Problem 2: Multi-dimensional array accesses

Non-Contiguous

    %i = load i64* %i.addr
    %j = load i64* %j.addr
    %outer = load float*** %A
    %arrayidx3 = getelementptr inbounds float** %outer, i64 %i
    %inner = load float** %arrayidx3
    %arrayidx4 = getelementptr inbounds float* %inner, i64 %j

13 / 25

slide-46
SLIDE 46

Static Control Parts: Class Pointer to Pointer

Let’s allow pointers to pointers!

    float **A;
    for (int i=0; i<=n; ++i)
      for (int j=i; j<=n; ++j)
        A[i][j] = A[i-1][j-1];

◮ No aliasing between inner dimensions.
◮ No aliasing of the outer dimension with other pointers/arrays.
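The slides do not say how these conditions are established. Purely as an illustration, the first condition could be tested at run time by checking that no two rows overlap in memory. The function names are hypothetical, and the check covers only the inner dimensions, not aliasing of the outer pointer array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the half-open byte ranges [a, a+len) and [b, b+len) overlap. */
static bool ranges_overlap(uintptr_t a, uintptr_t b, size_t len) {
    return a < b + len && b < a + len;
}

/* Hypothetical run-time check: returns true when all rows of A
   (each holding `cols` floats) are pairwise disjoint in memory. */
bool rows_are_disjoint(float **A, size_t rows, size_t cols) {
    size_t len = cols * sizeof(float);
    for (size_t r = 0; r < rows; r++)
        for (size_t s = r + 1; s < rows; s++)
            if (ranges_overlap((uintptr_t)A[r], (uintptr_t)A[s], len))
                return false;
    return true;
}
```

The pairwise comparison is quadratic in the number of rows, so a check like this would only pay off if the guarded loop nest is expensive enough.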

14 / 25

slide-47
SLIDE 47

Static Control Parts: Class Dynamic

Let’s be lazy and do everything at run time

    for (int i=0; i<=n; i++) {
      A[m*i+n] = __;
      __ = A[m*(i-1)+n];
    }

15 / 25

slide-49
SLIDE 49

Static Control Parts: Class Dynamic

Let’s be lazy and do everything at run time

    for (int i=0; i<=n; i++) {
      A[42*i+n] = __;
      __ = A[42*(i-1)+n];
    }

◮ Run time specialization for
  ◮ Known parameter values

15 / 25

slide-52
SLIDE 52

Static Control Parts: Class Dynamic

Let’s be lazy and do everything at run time

    for (int i=0; i<=n; i++) {
      A[42*i+n] = __;
      __ = B[42*(i-1)+n];
    }

◮ Run time specialization for
  ◮ Known parameter values
  ◮ Known aliasing
◮ Function calls to other SCoPs
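The idea of specializing for a known parameter value can be sketched in plain C with a hand-written variant and a dispatcher. This only mimics the effect of a JIT and is not how PolyJIT is implemented. The function names are hypothetical, the loop body stands in for the `__` placeholders, and the loop starts at i = 1 so that every read stays in bounds for non-negative m.

```c
#include <stddef.h>

/* Generic kernel: m is only known at run time, so m*i is a
   parameter-times-iterator product and the subscript is not affine. */
void kernel_generic(float *A, int n, int m) {
    for (int i = 1; i <= n; i++)
        A[m * i + n] = A[m * (i - 1) + n] + 1.0f;
}

/* Variant specialized for m == 42: every subscript is affine in i and n. */
void kernel_m42(float *A, int n) {
    for (int i = 1; i <= n; i++)
        A[42 * i + n] = A[42 * (i - 1) + n] + 1.0f;
}

/* Hypothetical dispatcher: pick the specialized variant when the
   run-time value of m matches, otherwise fall back to the generic one. */
void kernel_dispatch(float *A, int n, int m) {
    if (m == 42)
        kernel_m42(A, n);
    else
        kernel_generic(A, n, m);
}
```

Both paths compute the same result; the specialized one merely exposes the constant so that the loop fits class Static.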

15 / 25

slide-53
SLIDE 53

Expectations

Compile time

◮ Multi-dimensional array accesses are used often, so Multi (Algebraic) should contain a lot more SCoPs than Static.
◮ Pointer to Pointer should cover a few more SCoPs than Static.

Run time

Static ⊆ Multi ⊆ Algebraic ⊆ Dynamic
Static ⊆ Pointer to Pointer ⊆ Pointer to Pointer (No Alias)

16 / 25

slide-55
SLIDE 55

Reality

[The "Stat | Alg | Dyn" column of this table was graphical; only the Sst column is recoverable.]

Name          Sst

Compilation
js            120
python         84
ruby          100
tcc            18

Compression
7za           100
bzip2          30
gzip           23
xz             23

Multimedia
avconv        950
povray        110
x264           55

Scientific
linpack         9
xeigtstc      630
xeigtstd      610
xeigtsts      610
xlintstd      690
xlintstds     160
xlintstrfc    170
xlintstrfd    150
xlintstrfs    150
xlintstrfz    170
xlintsts      530
xlintstzc     190

Encryption
blowfish
bn             24
cast            2
ccrypt          3
des             9
dsa            24
ecdsa          24
hmac           24
mcrypt-aes     38
mcrypt-cip     39
md5            24
openssl        67
rc4             1
rsa            24
sha1           24
sha256         24
sha512         24
ssl            32

Simulation
crafty         57
lammps        330
lulesh-omp     16
lulesh         14

18 / 25

slide-60
SLIDE 60

Findings

Class Algebraic and Multi

◮ Very low increment in SCoPs (between 2 and 37) over Static.
◮ Only 10 out of 50 experiments show a small number of non-affine expressions.
◮ Povray shows an ExecCov_Algebraic of 41% as compared to an ExecCov_Static of 8.5%.
◮ No notable increase in ExecCov_Algebraic in the other experiments.

19 / 25

slide-61
SLIDE 61

Findings

Name          Sst   Spp   Sppa

Compilation
js            120    36     95
python         84     6     33
ruby          100    19     40
tcc            18

Compression
7za           100     9     24
bzip2          30
gzip           23
xz             23     1      1

Multimedia
avconv        950    11    310
povray        110     3     71
x264           55    14     44

Scientific
linpack         9
xeigtstc      630     5      5
xeigtstd      610
xeigtsts      610
xlintstd      690
xlintstds     160
xlintstrfc    170
xlintstrfd    150
xlintstrfs    150
xlintstrfz    170
xlintsts      530
xlintstzc     190

Encryption
blowfish
bn             24     1     10
cast            2
ccrypt          3
des             9     7*
dsa            24     1     10
ecdsa          24     1     10
hmac           24     1     10
mcrypt-aes     38
mcrypt-cip     39
md5            24     1     10
openssl        67     1     18
rc4             1
rsa            24     1     10
sha1           24     1     10
sha256         24     1     10
sha512         24     1     10
ssl            32     1     12

Simulation
crafty         57     9*
lammps        330    45    470
lulesh-omp     16
lulesh         14

* Only one value besides Sst survives in the source; whether it belongs to Spp or Sppa cannot be determined.

20 / 25

slide-65
SLIDE 65

Findings

Class Pointer to Pointer

◮ Low increment in SCoPs compared to Static (between 1 and 45). 22 out of 50 experiments show an increment in SCoP count.
◮ The SCoP count can be (optimistically) increased by disabling alias checks.
◮ No ExecCov_(Pointer to Pointer) information available yet.

21 / 25

slide-66
SLIDE 66

Run-time findings

[Plot: ExecCov [%] per class (Static, Dynamic, Algebraic); axis ticks at 20, 40, 60, 80]

22 / 25

slide-70
SLIDE 70

Threats to Validity

1. Construct Validity: Timing causes overhead.
2. External Validity: Generalizability depends on the sample size.
3. Internal Validity: Quality of the input data. Relying on developer’s testing.

23 / 25

slide-71
SLIDE 71

Questions?

24 / 25

slide-72
SLIDE 72

Static Control Parts: Class Multi

Let’s allow delinearizable accesses!

1. Split $a - a'$ into its terms, i.e., $a - a' = \sum_{x=1}^{l} t_x$, where each $t_x$ is a product of iterators and parameters (and a constant).
2. Group terms by their parameters, i.e., $a - a' = \sum_{x=1}^{m} \rho_x \gamma_x$, where each $\rho_x$ is a product of parameters (or a constant).
3. Factor out common $\gamma_x$, i.e., $a - a' = \sum_{x=1}^{k} \pi_x \gamma_x$.
4. Check the (total) ordering criterion using quantifier elimination, i.e., $\forall\, i, i', p \; ( i \in D \wedge i' \in D' \rightarrow |\pi_x \gamma_x| \le |\pi_y| - 1 )$.

25 / 25