SLIDE 1

RFC: A new divergence analysis for LLVM

Simon Moll, Thorsten Klößner and Sebastian Hack

http://compilers.cs.uni-saarland.de
Compiler Design Lab, Saarland University, Saarland Informatics Campus

SLIDE 4

Recap: VPlan+RV

  • VPlan: new vectorization infrastructure for LLVM.
    → Under development.

  • RV: the Region Vectorizer (github.com/uni-saarland/rv).
    → Vectorizer for outer loops and whole functions.
    → Available today!

  • VPlan+RV: bring RV’s analyses and transformations to VPlan.
    → Today: Divergence Analysis.
    → Coming up: Partial Control-Flow Linearization (PLDI ’18).

SLIDE 11

DivergenceAnalysis

    for (int i = 0; i < n; ++i) {
      for (int j = 0; j < m; ++j) {
        uni_var = f(i);
        varying_var = foo(i) + bar(j);
      }
    }

[Figure: the loop nest above with one loop annotated "vectorized"; the lane-value illustration is not recoverable from the transcript.]

  • Integrated with LoopVectorizer (vplan-rv fork).
    → Not much to do: only single-block loops with LLVM’s LV.
    → Unit tests show what’s possible.

  • Won’t be required by VPlan before patch series #3.
    → Until then, let’s fix LLVM’s DivergenceAnalysis for GPUs.
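The uniform/varying split in the loop example can be sketched as a simple propagation over data dependences. This is an illustrative toy, not the analysis from the RFC: the `Inst` struct, `classify`, and `classifyLoopExample` are hypothetical names, the inner `j` loop is assumed to be the vectorized one, and `f`, `foo` and `bar` are assumed to be pure functions of their argument.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy instruction (hypothetical): the value it defines and the values it reads.
struct Inst {
    std::string def;
    std::vector<std::string> uses;
};

// Marks a value as varying if any operand is varying. `seed` is the value
// that differs between vector lanes. Instructions must be listed in
// def-before-use order, so a single forward pass is enough.
std::map<std::string, bool> classify(const std::vector<Inst>& insts,
                                     const std::string& seed) {
    std::map<std::string, bool> varying;
    varying[seed] = true;
    for (const Inst& inst : insts) {
        bool v = false;
        for (const std::string& use : inst.uses) {
            auto it = varying.find(use);
            if (it != varying.end() && it->second) v = true;
        }
        varying[inst.def] = v;
    }
    return varying;
}

// The loop body from the slide, assuming the inner j loop is vectorized.
std::map<std::string, bool> classifyLoopExample() {
    const std::vector<Inst> body = {
        {"f_i", {"i"}},                       // f(i)
        {"uni_var", {"f_i"}},                 // uni_var = f(i)
        {"foo_i", {"i"}},                     // foo(i)
        {"bar_j", {"j"}},                     // bar(j)
        {"varying_var", {"foo_i", "bar_j"}},  // foo(i) + bar(j)
    };
    return classify(body, "j");
}
```

Under these assumptions `uni_var` stays uniform because it only reads `i`, while `varying_var` becomes varying through `bar(j)`. Control-induced divergence is deliberately left out of this sketch.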

SLIDE 18

LLVM’s DivergenceAnalysis (NVPTX/AMDGPU)

[Figure: control-flow graph with blocks A and B and two φ nodes; labels: "divergent branch", "varying φ", "uniform φ", "?".]

  • LLVM’s DivergenceAnalysis is invalid for unstructured CFGs.
  • Our analysis supports unstructured control flow.
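To illustrate why φ nodes matter here: a φ node has to be treated as varying when its block is reached by two disjoint paths from a divergent branch, and in an unstructured CFG such a block need not be the point where all paths reconverge. The following is a minimal sketch for acyclic CFGs only, not the algorithm from the RFC. The `CFG` alias, `joinBlocks`, `exampleCFG`, and the CFG shape itself are hypothetical and are not the graph shown on the slide.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy CFG (hypothetical): blocks are numbered in topological order, so the
// sketch only covers acyclic graphs. succs[b] lists the successors of b.
using CFG = std::vector<std::vector<int>>;

// Returns the blocks where two disjoint paths starting at `branch` meet.
// A phi node in such a block may merge values from different paths.
std::set<int> joinBlocks(const CFG& succs, int branch) {
    const int n = static_cast<int>(succs.size());
    // incoming[b]: path labels arriving at b over its incoming edges.
    std::vector<std::set<int>> incoming(n);
    std::set<int> joins;
    // Each edge leaving the branch starts its own path, labeled by its target.
    for (int s : succs[branch]) incoming[s].insert(s);
    for (int b = branch + 1; b < n; ++b) {
        if (incoming[b].empty()) continue;  // not reachable from the branch
        int label = *incoming[b].begin();
        if (incoming[b].size() > 1) {       // two different paths meet here
            joins.insert(b);
            label = b;                      // they continue as one path from b
        }
        for (int s : succs[b]) incoming[s].insert(label);
    }
    return joins;
}

// Hypothetical unstructured CFG: X=0, A=1, B=2, C=3, D=4, E=5.
//   X -> A, B    A -> C, D    B -> D    C -> E    D -> E
CFG exampleCFG() {
    return {{1, 2}, {3, 4}, {4}, {5}, {5}, {}};
}
```

With a divergent branch in X, this sketch reports both D and E as join blocks: D is reached via A and via B, although only E is where every path from X reconverges.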

SLIDE 19

DivergenceAnalysis

  • GPUDivergenceAnalysis (NVPTX/AMDGPU, StructurizeCFG): use-rv-da
  • LoopDivergenceAnalysis (LoopVectorizer): vectorizer-use-da

Available at github.com/cdl-saarland/vplan-rv