RFC: A new divergence analysis for LLVM (Simon Moll, Thorsten Klößner and Sebastian Hack)

Feb 19, 2023

SLIDE 1

RFC: A new divergence analysis for LLVM

Simon Moll, Thorsten Klößner and Sebastian Hack

http://compilers.cs.uni-saarland.de
Compiler Design Lab, Saarland University, Saarland Informatics Campus

SLIDE 4

Recap: VPlan+RV

VPlan: new vectorization infrastructure for LLVM.

→ under development.

RV: The Region Vectorizer (github.com/uni-saarland/rv)

→ Vectorizer for outer loops and whole functions.
→ available today!

VPlan+RV: Bring RV’s analyses and transformations to VPlan.

→ Today: Divergence Analysis
→ Coming up: Partial Control-Flow Linearization (PLDI ’18).

SLIDE 11

DivergenceAnalysis

    for (int i = 0; i < n; ++i) {
      for (int j = 0; j < m; ++j) {
        uni_var = f(i);
        varying_var = foo(i) + bar(j);
      }
    }

[Figure: the vectorized loop with per-lane values (7 7 7 7 / 6 2)]

Integrated with LoopVectorizer (vplan-rv fork).

→ Not much to do: only single-block loops with LLVM’s LV.
→ Unit tests show what’s possible.

Won’t be required by VPlan before patch series #3.

→ Until then, let’s fix LLVM’s DivergenceAnalysis for GPUs.
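
The uniform/varying split in the loop example above can be illustrated with a small data-dependence propagation. This is only a sketch and not the analysis from the RFC: the `Operands` mini-IR, `computeVarying` and `slideLoopBody` are hypothetical names introduced here, and it assumes (as the variable names on the slide suggest) that the vector lanes differ in `j` while `i` is the same for all lanes.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical mini-IR: maps each value to the values it is computed from.
using Operands = std::map<std::string, std::vector<std::string>>;

// Data-dependence half of a divergence analysis: a value is varying if it is
// a seed (differs between lanes) or if at least one operand is varying.
std::set<std::string> computeVarying(const Operands &operands,
                                     const std::set<std::string> &seeds) {
  std::set<std::string> varying = seeds;
  bool changed = true;
  while (changed) { // iterate until no new varying value is found
    changed = false;
    for (const auto &entry : operands) {
      if (varying.count(entry.first))
        continue; // already known to be varying
      for (const auto &op : entry.second) {
        if (varying.count(op)) {
          varying.insert(entry.first);
          changed = true;
          break;
        }
      }
    }
  }
  return varying;
}

// The loop body from the slide, written as operand lists (assumed encoding).
Operands slideLoopBody() {
  return {
      {"f(i)", {"i"}},
      {"foo(i)", {"i"}},
      {"bar(j)", {"j"}},
      {"uni_var", {"f(i)"}},
      {"varying_var", {"foo(i)", "bar(j)"}},
  };
}
```

Calling `computeVarying(slideLoopBody(), {"j"})` marks `bar(j)` and `varying_var` as varying and leaves `f(i)`, `foo(i)` and `uni_var` uniform. Control flow is ignored here; values can also become varying through divergent branches.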


SLIDE 18

LLVM’s DivergenceAnalysis (NVPTX/AMDGPU)

[Figure: CFG with blocks A and B and two φ nodes. Labels: divergent branch, varying φ, uniform φ (?)]

LLVM’s DivergenceAnalysis is invalid for unstructured CFGs.
Our analysis supports unstructured control flow.
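
To make the φ problem concrete: a φ node becomes varying when threads that split at a divergent branch reach its block along different paths, even if every incoming value is uniform. One way to find such blocks without assuming structured control flow is label propagation, sketched below. This is a simplified illustration, not the algorithm from the RFC: `CFG` and `divergentJoins` are hypothetical names, and the sketch assumes an acyclic CFG whose block ids are already in topological order.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical CFG: block ids 0..n-1 are in topological order (acyclic);
// succs[b] lists the successors of block b.
using CFG = std::vector<std::vector<int>>;

// Returns the blocks where threads that split at `branch` can meet again
// after taking different paths. Phi nodes in these blocks must be treated
// as varying even if all of their incoming values are uniform.
std::set<int> divergentJoins(const CFG &succs, int branch) {
  const int kNone = -1;
  std::vector<int> label(succs.size(), kNone);
  std::set<int> joins;

  // Each successor of the divergent branch starts its own path.
  for (int s : succs[branch]) {
    assert(s > branch && "blocks must be topologically ordered");
    label[s] = s;
  }

  // Visit blocks in topological order and push labels along the edges.
  for (int b = branch + 1; b < static_cast<int>(succs.size()); ++b) {
    if (label[b] == kNone)
      continue; // not reachable from the branch
    for (int s : succs[b]) {
      assert(s > b && "blocks must be topologically ordered");
      if (label[s] == kNone) {
        label[s] = label[b]; // first path to reach s
      } else if (label[s] != label[b]) {
        joins.insert(s); // two different paths meet here
        label[s] = s;    // s starts a fresh path
      }
    }
  }
  return joins;
}
```

For example, with a divergent branch in block 0 and edges 0→1, 0→2, 1→3, 2→3, 2→4, 3→4 (an unstructured CFG made up for this sketch), both block 3 and block 4 are reported, so the φ nodes in both blocks are varying.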


SLIDE 19

DivergenceAnalysis

→ GPUDivergenceAnalysis: NVPTX/AMDGPU, StructurizeCFG (use-rv-da)
→ LoopDivergenceAnalysis: LoopVectorizer (vectorizer-use-da)