slide-1
SLIDE 1

Introduction to the LLVM Compiler System

Chris Lattner
llvm.org Architect
November 4, 2008
ACAT’08 - Erice, Sicily

slide-3
SLIDE 3

http://llvm.org

What is the LLVM Project?

  • Collection of industrial strength compiler technology
    ■ Optimizer and Code Generator
    ■ llvm-gcc and Clang Front-ends
    ■ MSIL and .NET Virtual Machines
  • Open Source Project with many contributors
    ■ Industry, Research Groups, Individuals

http://llvm.org/

slide-6
SLIDE 6

Why New Compilers?

  • Existing Open Source C Compilers have Stagnated!
  • How?
    ■ Based on decades-old code generation technology
    ■ No modern techniques like cross-file optimization and JIT codegen
    ■ Aging code bases: difficult to learn, hard to change substantially
    ■ Can’t be reused in other applications
    ■ Keep getting slower with every release

slide-8
SLIDE 8

What I want!

  • A set of production-quality reusable libraries:
    ■ ... which implement the best known techniques drawing from modern literature
    ■ ... which focus on compile time
    ■ ... and performance of the generated code
  • Ideally support many different languages and applications!
slide-10
SLIDE 10

LLVM Vision and Approach

  • Primary mission: build a set of modular compiler components:
    ■ Reduces the time & cost to construct a particular compiler
    ■ Components are shared across different compilers
    ■ Allows choice of the right component for the job
  • Secondary mission: Build compilers out of these components
    ■ ... for example, a truly great C compiler
    ■ ... for example, a runtime specialization engine

[Diagram: modular LLVM component libraries, labeled Core, Optzn, xforms, X86, Support, Code gen, Target, PPC, DWARF, analysis, LTO, linker, LL IO, BC IO, System, CBE, GC, IPO, GCC, JIT, clang, ...]

slide-11
SLIDE 11

Talk Overview

  • Intro and Motivation
  • LLVM as a C and C++ Compiler
  • Other LLVM Capabilities
  • Going Forward
slide-13
SLIDE 13

LLVM-GCC 4.2

  • C, C++, Objective-C, Ada and Fortran
  • Standard GCC command line options
  • Supports almost all GCC language features and extensions
  • Supports many targets, including X86, X86-64, PowerPC, etc.
  • Extremely compatible with GCC 4.2

What does it mean to be both LLVM and GCC?

slide-15
SLIDE 15

LLVM GCC 4.2 Design

  • Replace GCC optimizer and code generator with LLVM
    ■ Reuses GCC parser and runtime libraries

[Diagram: two compiler pipelines, each taking C, C++, ... input and producing a .s file]
  GCC 4.2:      GCC 4.2 Front-end -> GCC Optimizer -> GCC Code Generator
  LLVM GCC 4.2: GCC 4.2 Front-end -> LLVM Optimizer -> LLVM Code Generator

slide-18
SLIDE 18

Linking LLVM and GCC compiled code

  • Safe to mix and match .o files between compilers
  • Safe to call into libraries built with other compilers

[Diagram: a.c and b.c on disk storage are compiled separately, one with llvm-gcc -O3 and one with gcc -O2; the resulting .o files go to the linker, which produces the exe]

slide-22
SLIDE 22

Potential Impact of LLVM Optimizer

  • Generated Code
    ■ How fast does the code run?
  • Compile Times
    ■ How fast can we get code from the compiler?
  • New Features
    ■ Link Time Optimization

slide-26
SLIDE 26

New Feature: Link Time Optimization

  • Optimize (e.g. inline, constant fold, etc.) across files with -O4
  • Optimize across language boundaries too!

[Diagram: a.c, b.c, c.c and d.cpp on disk storage are compiled separately with llvm-gcc -O4, llvm-g++ -O4 and gcc -O3; the resulting .o files pass through the linker and the LLVM Link Time Optimizer to produce the exe]
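As a rough illustration of what "optimize across files" can mean, the sketch below places the contents of two hypothetical source files in one listing. The file split and the names `scale` and `compute` are assumptions made for this example, not taken from the talk. When each file is compiled on its own, the compiler of a.c sees only a declaration of `scale`; an optimizer that sees both bodies at link time is free to inline the call and fold the result to a constant.

```c
/* ---- b.c (hypothetical second source file) ---- */
int scale(int x) {
    return x * 4;
}

/* ---- a.c (hypothetical first source file) ---- */
int scale(int x);   /* per-file compilation only sees this declaration */

int compute(void) {
    /* With both bodies visible, this could be inlined and
       constant folded to a single "return 42". */
    return scale(10) + 2;
}
```

Whether a given build actually performs this folding depends on the compiler and the flags used; the listing only shows the kind of opportunity that opens up once file boundaries stop hiding function bodies.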

slide-30
SLIDE 30

SPEC INT 2000 Compile Time

In seconds: Lower is Better

Optimization Level   GCC 4.2   LLVM GCC 4.2
-O0                  79s       74s
-O0 -g               90s       97s
-O1                  133s      112s
-O2                  164s      126s
-O3                  187s      131s
-O4: LTO             n/a       144s

  • 18% Faster at -O1!
  • 30% Faster at -O2!
  • 42% Faster at -O3!
  • -O4 (LTO): Faster than GCC at -O2!
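The percentages on this chart can be reproduced with a small calculation. The sketch below assumes that the GCC 4.2 times are 133s, 164s and 187s at -O1, -O2 and -O3, that the LLVM GCC 4.2 times are 112s, 126s and 131s, and that "faster" means the ratio of the two times minus one. This pairing and this formula are inferred from the order of the numbers on the slide, not stated by it. The function name `percent_faster` is hypothetical.

```c
/* Percent by which a candidate is faster than a baseline, computed
   from elapsed times as (baseline / candidate - 1) * 100.
   The formula is an assumption that matches the slide's callouts. */
double percent_faster(double baseline_seconds, double candidate_seconds) {
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0;
}
```

Under these assumptions, `percent_faster(164.0, 126.0)` comes out just above 30, in line with the -O2 callout, and the -O1 and -O3 pairs land near 18 and 42 in the same way.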

slide-34
SLIDE 34

SPEC 2000 Execution Time

Relative to GCC -O2: Lower is Faster

Optimization Level   GCC 4.2           LLVM GCC 4.2
-O2                  100% (baseline)   95.1%
-O3                  96.3%             92.5%
-O4 (LTO)            n/a               80.3%

  • 5% Faster at -O2!
  • 4% Faster at -O3!
  • 20% Faster than -O3!

slide-35
SLIDE 35

llvm-gcc 4.2 Summary

  • Drop-in replacement for GCC 4.2
    ■ Compatible with GCC command line options and languages
    ■ Works with existing makefiles (e.g. “make CC=llvm-gcc”)
  • Benefits of LLVM Optimizer and Code Generator
    ■ Much faster optimizer: ~30-40% at -O3 in most cases
    ■ Slightly better codegen at a given level: ~5-10% on x86/x86-64
    ■ Link-Time Optimization at -O4: optimize across source files

slide-36
SLIDE 36

Talk Overview

  • Intro and Motivation
  • LLVM as a C and C++ Compiler
  • Other LLVM Capabilities
  • LLVM Going Forward
slide-37
SLIDE 37

LLVM For Compiler Hackers

  • LLVM is a great target for new languages
    ■ Well defined, simple to program for
    ■ Easy to retarget existing compiler to use LLVM backend
  • LLVM supports Just-In-Time optimization and compilation
    ■ Optimize code at runtime based on dynamic information
    ■ Easy to retarget existing bytecode interpreter to LLVM JIT
    ■ Great for performance, not just for traditional “compilers”

slide-40
SLIDE 40

Colorspace Conversion JIT Optimization

  • Code to convert from one color format to another:
    ■ e.g. BGRA 444R -> RGBA 8888
    ■ Hundreds of combinations, importance depends on input

    for each pixel {
      switch (infmt) {
      case RGBA 5551:
        R = (*in >> 11) & C
        G = (*in >> 6) & C
        B = (*in >> 1) & C
        ...
      }
      switch (outfmt) {
      case RGB888:
        *outptr = R << 16 | G << 8 ...
      }
    }

Run-time specialize:

    for each pixel {
      R = (*in >> 11) & C;
      G = (*in >> 6) & C;
      B = (*in >> 1) & C;
      *outptr = R << 16 | G << 8 ...
    }

Compiler optimizes shifts and masking

Speedup depends on src/dest format: 5.4x speedup on average, 19.3x max speedup (13.3MB/s to 257.7MB/s)
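The two loops above can be written out as plain C to make the difference concrete. This is a hand-written sketch, not the talk's implementation and not JIT-generated code. The enum names, the 5-bit mask `0x1F` standing in for the slide's `C`, and the widening of each 5-bit channel to 8 bits with a left shift by 3 are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format tags; only one input and one output are handled. */
enum pixel_format { FMT_RGBA5551, FMT_RGB888 };

/* Generic loop: re-dispatches on both formats for every pixel. */
void convert_generic(const uint16_t *in, uint32_t *out, size_t n,
                     enum pixel_format infmt, enum pixel_format outfmt) {
    for (size_t i = 0; i < n; i++) {
        uint32_t r = 0, g = 0, b = 0;
        switch (infmt) {
        case FMT_RGBA5551:
            r = (in[i] >> 11) & 0x1F;   /* assumed 5-bit mask */
            g = (in[i] >> 6) & 0x1F;
            b = (in[i] >> 1) & 0x1F;
            break;
        default:
            break;
        }
        switch (outfmt) {
        case FMT_RGB888:
            /* assumed widening: 5-bit channel shifted up to 8 bits */
            out[i] = (r << 3) << 16 | (g << 3) << 8 | (b << 3);
            break;
        default:
            out[i] = 0;
            break;
        }
    }
}

/* Specialized loop: both format choices are fixed, so the switches
   disappear and the shifts combine into one per channel. */
void convert_rgba5551_to_rgb888(const uint16_t *in, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        uint32_t r = (in[i] >> 11) & 0x1F;
        uint32_t g = (in[i] >> 6) & 0x1F;
        uint32_t b = (in[i] >> 1) & 0x1F;
        out[i] = r << 19 | g << 11 | b << 3;
    }
}
```

The specialized function is what a run-time specializer would aim to produce once `infmt` and `outfmt` are known, here written by hand for a single format pair.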

slide-41
SLIDE 41

Another example: RegEx Compilation

  • Many regexes are matched millions of times:
    ■ Match time is critical
  • Common regex engines ‘compile’ to ‘bytecode’ and interpret:
    ■ regcomp/regexec
  • Why not compile to native code? Partial Evaluation!
    ■ regcomp compiles regex to a native function
    ■ Much faster matching, could even vectorize common idioms
  • Excellent way to handle multiple different Unicode encodings
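To make the contrast concrete, the sketch below pairs the POSIX `regcomp`/`regexec` route with a hand-written C function that plays the role of the native code a JIT might emit for one fixed pattern. The pattern `^ab*c$` and both function names are assumptions chosen for illustration; the second function is written by hand here, not generated by LLVM.

```c
#include <regex.h>
#include <stddef.h>

/* Interpreted route: POSIX regcomp/regexec. Returns 1 on match, else 0.
   Compiles on every call for brevity; real code would call regcomp once
   and reuse the compiled regex_t across many regexec calls. */
int match_interpreted(const char *pattern, const char *text) {
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    int matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}

/* Stand-in for a natively compiled matcher for the fixed pattern ^ab*c$:
   the pattern has been "partially evaluated" into straight-line code. */
int match_ab_star_c(const char *s) {
    if (*s++ != 'a')
        return 0;
    while (*s == 'b')
        s++;
    return s[0] == 'c' && s[1] == '\0';
}
```

Both functions should accept and reject the same strings for this one pattern; the specialized version has no pattern left to interpret at match time, which is the effect the slide attributes to compiling a regex to a native function.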
slide-42
SLIDE 42

Talk Overview

  • Intro and Motivation
  • LLVM as a C and C++ Compiler
  • Other LLVM Capabilities
  • LLVM Going Forward
slide-45
SLIDE 45

LLVM Going Forward

  • More of the same...
    ■ Even faster optimizer
    ■ Even better optimizations
    ■ More features for non-C languages
    ■ Debug Info Improvements
    ■ Many others...

Better tools for source level analysis of C/C++ programs!
slide-48
SLIDE 48

Clang Frontend: What is it?

  • C, Objective-C, and C++ front-end
  • Aggressive project with many goals...
    ■ Compatibility with GCC
    ■ Fast compilation
    ■ Expressive error messages
  • Host for a broad range of source-level tools

t.c:6:49: error: invalid operands to binary expression ('int' and 'struct A')
    return intArg + func(intArg ? ((someA.X+40) + someA) / 42 : someA.X));
                                   ~~~~~~~~~~~~ ^ ~~~~~

slide-51
SLIDE 51

Clang Compile Time

PostgreSQL -fsyntax-only Time: 665K lines of C code in 619 files

  GCC 4.2: 49s
  clang: 21s (2.3x faster)

slide-55
SLIDE 55

LLVM Overview

  • New compiler architecture built with reusable components
    ■ Retarget existing languages to JIT or static compilation
    ■ Many optimizations and supported targets
  • llvm-gcc: drop-in GCC-compatible compiler
    ■ Better & faster optimizer
    ■ Production quality
  • Clang front-end: C/ObjC/C++ front-end
    ■ Several times faster than GCC
    ■ Much better end-user features (warnings/errors)
  • LLVM 2.4 release this week!

Come join us at: http://llvm.org http://clang.llvm.org