Skip the FFI! Embedding Clang for C Interoperability

slide-1
SLIDE 1

Skip the FFI!

Embedding Clang for C Interoperability

Jordan Rose

Compiler Engineer, Apple

John McCall

Compiler Engineer, Apple

slide-5
SLIDE 5

Problem

Languages don’t exist in a vacuum

But C has its own ABI

And its APIs are written in C, not ${LANG}

slide-9
SLIDE 9

Solutions?

Manually write glue code (JNI, Python, Ruby)

Generate the glue code (SWIG)

Extend C (C++, Objective-C)

slide-10
SLIDE 10

Better solution: just use Clang

slide-11
SLIDE 11

Embedding Clang for C Interoperability

Clang as a library

Importing from C

ABI compatibility

Sharing an llvm::Module

slide-13
SLIDE 13

Goal

typedef struct {
  float x, y;
} Point2f;

static inline Point2f flipOverXAxis(Point2f point) {
  // ...
}

slide-16
SLIDE 16

Goal

static inline Point2f flipOverXAxis(Point2f point) {
  // ...
}

let flipped = flipOverXAxis(originalPoint)

No external symbol!

Handled differently on different platforms!

slide-17
SLIDE 17

From C to ${LANG}…

slide-21
SLIDE 21

Roadmap

Set up a clang::CompilerInstance

Load Clang modules

Import declarations we care about

slide-36
SLIDE 36

Setting up a clang::CompilerInstance

createInvocationFromCommandLine()

Attach custom observers

  • Diagnostic consumer
  • PP callbacks (for module import)

Manually run most of ExecuteAction()

  • Set up several compiler components
  • Parse a single decl from a dummy file
  • Finalize the AST

"clang -fsyntax-only -x c …"

CompilerInvocation

DiagnosticConsumer PPCallbacks

CompilerInstance

slide-37
SLIDE 37

Setting up a clang::CompilerInstance

createInvocationFromCommandLine()

Attach custom observers

  • Diagnostic consumer
  • PP callbacks (for module import)

Manually run most of ExecuteAction()

  • Set up several compiler components
  • Parse a single decl from a dummy file
  • Finalize the AST

Actually works well

slide-38
SLIDE 38

Setting up a clang::CompilerInstance

createInvocationFromCommandLine()

Attach custom observers

  • Diagnostic consumer
  • PP callbacks (for module import)

Manually run most of ExecuteAction()

  • Set up several compiler components
  • Parse a single decl from a dummy file
  • Finalize the AST

A bit harder than it should be

slide-39
SLIDE 39

Setting up a clang::CompilerInstance

createInvocationFromCommandLine()

Attach custom observers

  • Diagnostic consumer
  • PP callbacks (for module import)

Manually run most of ExecuteAction()

  • Set up several compiler components
  • Parse a single decl from a dummy file
  • Finalize the AST

Mostly okay

slide-40
SLIDE 40

Setting up a clang::CompilerInstance

createInvocationFromCommandLine()

Attach custom observers

  • Diagnostic consumer
  • PP callbacks (for module import)

Manually run most of ExecuteAction()

  • Set up several compiler components
  • Parse a single decl from a dummy file
  • Finalize the AST

*sadness* (this is really the only reason)

slide-46
SLIDE 46

Clang Modules

Self-contained units of API

  • No cross-header pollution!

Separate semantics from syntax

  • Same mechanism as PCH
See also: Modules (Doug Gregor, 2012 Developers’ Meeting)

slide-53
SLIDE 53

Importing Clang Modules

CompilerInstance::loadModule

Look up the decls we want

  • Use TU-wide lookup and filter

Geometry

typedef … Point2f;
Point2f flipOverXAxis(…);
Point2f flipOverYAxis(…);
void drawGraph(…);
…

flipOverXAxis(originalPoint)

slide-54
SLIDE 54

Importing Clang Modules

CompilerInstance::loadModule

Look up the decls we want

  • Use TU-wide lookup and filter

Requires a SourceLocation

Awkward for submodules

slide-55
SLIDE 55

Importing Clang Modules

CompilerInstance::loadModule

Look up the decls we want

  • Use TU-wide lookup and filter

Definitely something to improve
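
The "TU-wide lookup and filter" step can be pictured with a toy model. This is only a sketch: `ToyDecl`, `exampleTranslationUnit`, `lookupInModule`, and the made-up `OtherModule` are hypothetical stand-ins, not Clang types or APIs. It assumes every declaration visible in the translation unit remembers which module provided it, so a by-name lookup over everything can be narrowed afterwards.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a declaration; not a Clang type.
struct ToyDecl {
  std::string name;          // identifier being declared
  std::string owningModule;  // module that provided the declaration
};

// Toy contents of the Geometry module from the slides, plus one declaration
// from a made-up module so the filter has something to reject.
inline std::vector<ToyDecl> exampleTranslationUnit() {
  return {{"Point2f", "Geometry"},
          {"flipOverXAxis", "Geometry"},
          {"flipOverYAxis", "Geometry"},
          {"drawGraph", "Geometry"},
          {"drawGraph", "OtherModule"}};
}

// Step 1: match the name against every declaration in the translation unit.
// Step 2: keep only the matches owned by the requested module.
inline std::vector<ToyDecl> lookupInModule(const std::vector<ToyDecl> &tu,
                                           const std::string &name,
                                           const std::string &module) {
  std::vector<ToyDecl> found;
  for (const ToyDecl &decl : tu) {
    if (decl.name != name) continue;            // TU-wide lookup by name
    if (decl.owningModule != module) continue;  // filter by owning module
    found.push_back(decl);
  }
  return found;
}
```

Calling `lookupInModule(exampleTranslationUnit(), "drawGraph", "Geometry")` returns only the Geometry declaration, even though the name is visible twice in the toy translation unit.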

slide-58
SLIDE 58

Importing Clang Modules

CompilerInstance::loadModule

Look up the decls we want

  • Use TU-wide lookup and filter

What if two modules conflict?

OldLibrary

… typedef unsigned status_t; …

NewLibrary

… typedef enum {…} status_t; …

slide-59
SLIDE 59

Importing Declarations

static inline
 Point2f flipOverXAxis(Point2f point) {
 // ...
 }

slide-65
SLIDE 65

Importing Declarations

clang::FunctionDecl

Point2f flipOverXAxis(Point2f point)

clang::TypedefDecl

typedef … Point2f

clang::RecordDecl

struct [anonymous] { … }

clang::FieldDecl

float x

clang::FieldDecl

float y

slide-78
SLIDE 78

Importing Declarations

…using clang::ASTVisitor

flipOverXAxis(…)    →  func flipOverXAxis
typedef … Point2f   →  typealias Point2f
struct {…}          →  struct _
float x             →  var x: Float
float y             →  var y: Float

slide-79
SLIDE 79

Importing Declarations

…using clang::ASTVisitor

flipOverXAxis(…)                →  func flipOverXAxis
typedef … Point2f, struct {…}   →  struct Point2f
float x                         →  var x: Float
float y                         →  var y: Float
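
The order in which the slides reveal the imported declarations (fields first, then the struct, then the typedef, then the function) is a dependencies-first walk. A minimal sketch of that ordering follows. `ToyCDecl`, `importDecl`, and `exampleFunction` are hypothetical names for illustration and do not correspond to Clang or Swift APIs, and the final step that folds `struct _` and `typealias Point2f` into `struct Point2f` is not modeled.

```cpp
#include <string>
#include <vector>

// Hypothetical node in a toy declaration tree; not a Clang type.
struct ToyCDecl {
  std::string swiftSpelling;           // what this node becomes after import
  std::vector<ToyCDecl> dependencies;  // declarations this one refers to
};

// Import everything a declaration depends on before the declaration itself.
inline void importDecl(const ToyCDecl &decl,
                       std::vector<std::string> &imported) {
  for (const ToyCDecl &dependency : decl.dependencies)
    importDecl(dependency, imported);
  imported.push_back(decl.swiftSpelling);
}

// The flipOverXAxis example: function -> typedef -> anonymous struct -> fields.
inline ToyCDecl exampleFunction() {
  ToyCDecl fieldX{"var x: Float", {}};
  ToyCDecl fieldY{"var y: Float", {}};
  ToyCDecl record{"struct _", {fieldX, fieldY}};
  ToyCDecl alias{"typealias Point2f", {record}};
  return ToyCDecl{"func flipOverXAxis", {alias}};
}
```

Running `importDecl(exampleFunction(), imported)` fills `imported` in the same order as the build-up on the slides, ending with `func flipOverXAxis`.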

slide-80
SLIDE 80

Success!

clang::FunctionDecl

Point2f flipOverXAxis(Point2f point)

swift::FuncDecl

flipOverXAxis
Arguments: (Point2f)
Returns: Point2f

slide-81
SLIDE 81

…and back to C

slide-82
SLIDE 82

ABIs

slide-83
SLIDE 83

Platforms and ABIs

Every language/platform combination forms an ABI

ABI defines how the language is implemented on that platform

Necessary for interoperation:

  • ...between compilers offered by different vendors
  • ...between different versions of the same compiler
  • ...between compiled code and hand-written code (e.g. in assembly)
  • ...between compiled code and various inspection/instrumentation tools

slide-84
SLIDE 84

ABIs for other languages

All languages/extensions supported by Clang have ABIs defined mostly in terms of C

  • Caveat: often require additional linker support
  • Caveat: sometimes use slightly different calling conventions

"Itanium" C++ ABI: weak linkage

Visual Studio C++ ABI: weak linkage, different CC for member functions

GNUStep Objective-C ABI: pure C

Apple Objective-C ABI: some Apple-specific linker behavior

Objective-C Blocks ABI: pure C

slide-85
SLIDE 85

ABIs for C

Often written by the architecture vendor and then tweaked by the OS vendor

Includes:

  • Stack alignment rules
  • Calling conventions and register use rules
  • Size/alignment of fundamental types
  • Layout rules for structs and unions
  • Existence of various extended types
  • Object file structure and linker behavior
  • Guaranteed runtime facilities
  • ...and a whole lot more

slide-86
SLIDE 86

ABIs and undefined behavior

An ABI doesn't mean language-specific restrictions aren't still in effect!

struct A {
  virtual void foo();
};

void *loadVTable(A *a) {
  return *reinterpret_cast<void**>(a);
}

Still undefined behavior

slide-87
SLIDE 87

Memory

slide-88
SLIDE 88

Working with C values in memory

Often need to allocate storage for C values

All complete types in C have an ABI size and alignment:

getASTContext().getTypeInfoInChars(someType)

For normal types, sizeof(T) is always a multiple of alignof(T)

  • ...but attributes on typedefs can arbitrarily change alignment requirements
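
The size and alignment rules above can be observed from ordinary C++ without touching Clang's APIs. The sketch below is illustrative only: `Pair`, `StorageFor`, `storageIsUsable`, and `OverAlignedInt` are hypothetical names, and the over-aligned typedef relies on the GCC/Clang `aligned` attribute, so it is guarded by `__GNUC__`.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

struct Pair {
  char tag;
  double value;
};

// For ordinary types the size is a whole number of alignment units, which is
// what keeps every element of an array of T correctly aligned.
template <typename T>
constexpr bool sizeIsMultipleOfAlignment() {
  return sizeof(T) % alignof(T) == 0;
}

// Raw storage with the size and alignment that T requires.
template <typename T>
struct StorageFor {
  alignas(T) unsigned char bytes[sizeof(T)];
};

// Construct a Pair in raw storage and check that its address is aligned.
inline bool storageIsUsable() {
  StorageFor<Pair> storage;
  Pair *pair = new (storage.bytes) Pair{'a', 1.5};
  bool aligned = reinterpret_cast<std::uintptr_t>(pair) % alignof(Pair) == 0;
  return aligned && pair->tag == 'a' && pair->value == 1.5;
}

#if defined(__GNUC__)
// Compiler extension: the attribute raises the alignment of the typedef but
// leaves its size alone, so the size is no longer a multiple of the alignment.
// Query it directly; passing it as a template argument drops the attribute.
typedef int OverAlignedInt __attribute__((aligned(16)));
#endif
```

A frontend that allocates storage for imported C values would take both numbers from Clang rather than assuming one follows from the other.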

slide-89
SLIDE 89

Storage Padding

For many types, sizeof includes some extra storage:

  • Contents are undefined: not required to preserve those bits
  • If you share pointers with C code, it won't promise to preserve them either
  • Special case: C99 _Bool / C++ bool are always stored as 0 or 1 (not necessarily 1 byte)

struct Foo {
  void *x;
  long double d;
  char c;
};

void *x; long double d; char c;
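
A short sketch of what padding means in practice, under the assumption of a mainstream platform where `int` is more strictly aligned than `char`. The names `Padded`, `equalByFields`, and `storedBits` are hypothetical. Because padding contents are not preserved, the comparison goes field by field instead of comparing raw bytes.

```cpp
#include <cstddef>
#include <cstring>

struct Padded {
  char c;
  int i;
};

// Bytes in sizeof(Padded) that belong to no field.
constexpr std::size_t paddingInPadded =
    sizeof(Padded) - (sizeof(char) + sizeof(int));

// Compare by fields; a bytewise comparison would also read the padding,
// whose contents nobody promises to preserve.
inline bool equalByFields(const Padded &a, const Padded &b) {
  return a.c == b.c && a.i == b.i;
}

// Combine every byte of a bool's storage. On the usual ABIs the result is
// exactly 0 or 1, however many bytes a bool occupies.
inline unsigned storedBits(bool value) {
  unsigned char bytes[sizeof(bool)];
  std::memcpy(bytes, &value, sizeof(bool));
  unsigned combined = 0;
  for (unsigned char byte : bytes) combined |= byte;
  return combined;
}
```

The same reasoning applies to the `struct Foo` on the slide, where `long double` may itself carry unused bytes on some targets.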

slide-91
SLIDE 91

struct/union Layout

Often tempting to do your own C struct layout:

struct Foo {
  void *x;
  long double d;
  char c;
};

%struct.Foo = { opaque*, x86_fp80, i8 }

It's a trap!

slide-92
SLIDE 92

struct/union Layout

C/C++ language guarantees:

  • All union members have same address
  • First struct member has same address as struct
  • Later struct member addresses > earlier struct member addresses
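
These three guarantees can be checked directly. The following is a small sketch with hypothetical types `Sample` and `Variant` and hypothetical helper names; it only restates the guarantees as code and says nothing about where the later members actually land.

```cpp
#include <cstddef>

struct Sample {
  char tag;
  double value;
  int count;
};

union Variant {
  int i;
  float f;
  char bytes[8];
};

// The first struct member lives at the address of the struct itself.
inline bool firstMemberSharesStructAddress(const Sample &s) {
  return static_cast<const void *>(&s) == static_cast<const void *>(&s.tag);
}

// Later members have strictly greater offsets than earlier ones.
inline bool membersAreInDeclarationOrder() {
  return offsetof(Sample, tag) < offsetof(Sample, value) &&
         offsetof(Sample, value) < offsetof(Sample, count);
}

// Every union member starts at the address of the union.
inline bool unionMembersShareAddress(const Variant &v) {
  const void *base = &v;
  return base == static_cast<const void *>(&v.i) &&
         base == static_cast<const void *>(&v.f) &&
         base == static_cast<const void *>(v.bytes);
}
```

Everything beyond these guarantees (exact offsets, padding, total size) comes from the platform ABI.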

slide-93
SLIDE 93

Universal C Layout Algorithm

struct.size = 0, struct.alignment = 1
for field in struct.fields:
  struct.size = roundUpToAlignment(struct.size, field.alignment)
  struct.alignment = max(struct.alignment, field.alignment)
  offsets[field] = struct.size
  struct.size += field.size
struct.size = roundUpToAlignment(struct.size, struct.alignment)

Not guaranteed, but might as well be
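
A runnable rendering of the pseudocode above, as a sketch. `FieldInfo`, `Layout`, `layoutStruct`, and `layoutOfFoo` are hypothetical names. The field sizes and alignments are taken from the host compiler via `sizeof` and `alignof`, so the result can be compared against what that compiler does for the `struct Foo` used in the earlier slides.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct FieldInfo {
  std::size_t size;
  std::size_t alignment;
};

struct Layout {
  std::size_t size = 0;
  std::size_t alignment = 1;
  std::vector<std::size_t> offsets;  // one entry per field, in order
};

inline std::size_t roundUpToAlignment(std::size_t value,
                                      std::size_t alignment) {
  return (value + alignment - 1) / alignment * alignment;
}

// The "universal" algorithm: place each field at the next offset that
// satisfies its alignment, then pad the total size to the struct alignment.
inline Layout layoutStruct(const std::vector<FieldInfo> &fields) {
  Layout layout;
  for (const FieldInfo &field : fields) {
    layout.size = roundUpToAlignment(layout.size, field.alignment);
    layout.alignment = std::max(layout.alignment, field.alignment);
    layout.offsets.push_back(layout.size);
    layout.size += field.size;
  }
  layout.size = roundUpToAlignment(layout.size, layout.alignment);
  return layout;
}

struct Foo {
  void *x;
  long double d;
  char c;
};

// Describe Foo's fields using the host compiler's own sizes and alignments.
inline Layout layoutOfFoo() {
  return layoutStruct({{sizeof(void *), alignof(void *)},
                       {sizeof(long double), alignof(long double)},
                       {sizeof(char), alignof(char)}});
}
```

On common targets the computed offsets agree with `offsetof(Foo, …)`. The sketch deliberately ignores bitfields, attributes, pragmas, and C++ rules, which is where the simple algorithm stops being reliable.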

slide-94
SLIDE 94

Universal C Layout Algorithm?

Bitfield rules differ massively between platforms

Many different attributes and pragmas affect layout

C++...

slide-95
SLIDE 95

Use Clang

Type info for struct/union types reflects results of layout

Can get offsets of individual members:

ASTContext::getASTRecordLayout(const RecordDecl *D)

IRGen provides interfaces for:

  • lowering types to IR
  • projecting the address of an ordinary field
  • loading and storing to a bitfield

slide-100
SLIDE 100

Calls

Lowering from Clang function types to LLVM function types

Inputs: AST calling convention, parameter types, return type

Outputs: LLVM calling convention, parameter types, return type, parameter attributes

slide-101
SLIDE 101

Why not just use the C type system?

Things that affect CC lowering:

  • Exact structure of unions
  • Existence and placement of bitfields
  • Attributes
  • Special cases for types that structurally resemble others
  • Everything!

Would have to render entire C type system in LLVM, including all extensions

slide-102
SLIDE 102

Frontend/backend mutual aggression pact

Backend figures out how to represent different ways to pass arguments, results

  • Specific IR types
  • Specific attributes on call site

Frontend contrives to mutilate arguments into that form

slide-103
SLIDE 103

Examples

typedef struct {
  float x, y;
} Point2f;

static inline Point2f flipOverXAxis(Point2f point) {
  // ...
}

slide-104
SLIDE 104

Examples

// aarch64-apple-ios
define %struct.Point2f @flipOverXAxis(float, float)

slide-105
SLIDE 105

Examples

// i386-apple-macosx
define i64 @flipOverXAxis(float, float)

slide-106
SLIDE 106

Examples

// thumbv7-apple-ios
define void @flipOverXAxis(%struct.Point2f* sret, [2 x i32])

slide-107
SLIDE 107

Examples

// x86_64-apple-macosx
define <2 x float> @flipOverXAxis(<2 x float>)

slide-108
SLIDE 108

Relief

LLVM does make an informal ABI guarantee:

A type is "register-filling" if it's a pointer or pointer-sized integer.

If:

  1) all the arguments are register-filling and
  2) the return value is either register-filling or void

Then the obvious type lowering will match the C ABI

slide-109
SLIDE 109

Relief

Guaranteed by all the normal CPU backends

Does not apply to floats, structs, vectors, too-small integers, too-large integers, etc.

Extremely useful for free-coding calls to known functions in your language runtime
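
The "register-filling" rule is mechanical enough to express as a compile-time check. The sketch below is one possible encoding using C++ function types as a stand-in for C signatures. The trait names (`IsRegisterFilling`, `AllRegisterFilling`, `ObviousLoweringMatchesCABI`) are hypothetical and not part of LLVM or Clang, and "pointer-sized" is taken to mean `sizeof(T) == sizeof(void *)` on the host.

```cpp
#include <cstdint>
#include <type_traits>

// A type is register-filling if it is a pointer or a pointer-sized integer.
template <typename T>
struct IsRegisterFilling
    : std::integral_constant<bool,
                             std::is_pointer<T>::value ||
                                 (std::is_integral<T>::value &&
                                  sizeof(T) == sizeof(void *))> {};

// void has no size; it is handled separately as a return type below.
template <>
struct IsRegisterFilling<void> : std::false_type {};

// True when every type in the pack is register-filling (vacuously for none).
template <typename... Ts>
struct AllRegisterFilling : std::true_type {};

template <typename T, typename... Rest>
struct AllRegisterFilling<T, Rest...>
    : std::integral_constant<bool, IsRegisterFilling<T>::value &&
                                       AllRegisterFilling<Rest...>::value> {};

// The rule from the slide, applied to a whole function signature.
template <typename Signature>
struct ObviousLoweringMatchesCABI;

template <typename Ret, typename... Args>
struct ObviousLoweringMatchesCABI<Ret(Args...)>
    : std::integral_constant<bool,
                             (std::is_void<Ret>::value ||
                              IsRegisterFilling<Ret>::value) &&
                                 AllRegisterFilling<Args...>::value> {};
```

For example, `ObviousLoweringMatchesCABI<void *(void *, std::uintptr_t)>::value` is true, while a signature that takes a `float`, a `char`, or a struct by value is rejected and should go through Clang's lowering instead.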

slide-110
SLIDE 110

Breakdown in negotiations

The current situation is pretty gross and increasingly untenable

Backends feel the need to be pretty heroic about what types they accept

Difficult for frontends to tweak CCs, which is often useful when moving beyond C

slide-111
SLIDE 111

Entente

Representing whole C type system is unworkable

We should consider going the other way:

  • Allow frontends more explicit control of registers and stack
  • Make consistent rules about how different IR types are passed otherwise

slide-112
SLIDE 112

Use Clang

IRGen provides an interface for examining function type lowering

  • Extremely detailed, poorly documented
  • Not a good combination!
  • Still better than doing it yourself

In progress: extracting better interfaces to do this lowering

slide-113
SLIDE 113

Sharing a Module with Clang

slide-114
SLIDE 114

Types and global declarations

Your frontend's IR types and Clang's can coexist in a module

Your frontend and Clang will sometimes both need to refer to the same entity

The types won't always match

slide-115
SLIDE 115

Global declarations

IRGen is pretty forgiving about the type of a declaration

Feel free to emit your own declaration with its own type

Those code paths are well-covered in IRGen because of incomplete types

slide-116
SLIDE 116

If Clang has to emit the definition, it may have to change the type

This will invalidate your own references to that declaration

  • ...unless you hold onto them with a ValueHandle
  • ...which is best practice anyway

slide-117
SLIDE 117

Lazy declaration emission

IRGen only emits certain entities if they're actually used:

  • static or inline functions
  • certain v-tables

To get IRGen to emit it, you simply:

  • tell IRGen that it has a definition (by adding it)
  • ask IRGen for a declaration
  • ensure that all deferred declarations are emitted

Better APIs for this are in progress

slide-119
SLIDE 119

Summary

You can use Clang to import C types and declarations directly into your language

Let Clang handle the ABI rules for you instead of reinventing them

Most of the APIs for this could be improved

slide-120
SLIDE 120