Faculty of Science Information and Computing Sciences 1
Concepts of programming languages
Lecture 9
Wouter Swierstra
Talks – advice
▶ Focus on what makes your language different or interesting.
▶ Giving small examples can really help give a feel for a new language.
▶ Go slow. Take the time to talk us through these examples. Remember: it’s the first time for most people in the audience to read code in this language.
▶ Don’t try to cover too much ground: instead try to convey a handful of key points.
Last time
▶ Racket is a strict, dynamically typed functional language of the Lisp/Scheme family.
▶ The syntax and language features are fairly minimalistic.
▶ But this makes it very easy to manipulate code as data.
▶ Macros let us define new programming constructs in terms of existing ones.
▶ By customizing the parser even further, we can embed other languages in Racket or define our own Racket dialects.
Metaprogramming in Racket
> (eval '(+ (* 1 2) 3))
6
▶ Quoting can turn any piece of code into data.
▶ Using eval we can interpret a quoted expression and compute the associated value.
Today
▶ How do other (typed) languages support this style of metaprogramming?
▶ Case studies: when do people use reflection?
▶ Embedding DSLs using quasiquotation.
Two separate aspects
We can tease apart two separate issues:
▶ How to inspect a piece of code? (Reflection or quotation)
▶ How to generate or run new code? (Metaprogramming)
These ideas pop up over and over again in different languages. Many dynamically typed languages borrow ideas from the Lisp-family of languages.
JavaScript: object reflection
Objects in JavaScript are little more than a map from their attributes and methods to the associated values. Given any object o, we can iterate over all its methods and attributes:
for (var key in o) {
  console.log(key + " -> " + o[key]);
}
This lets us reflect and inspect the object structure.
JavaScript: eval
As we saw previously, we can evaluate any string corresponding to a program fragment:
var x = 3;
var y = 7;
var z = eval("x * y")
Note that eval is incredibly unsafe!
Question: In what ways can this fail?
▶ Illegal syntax: eval("2#2k-sa[2da")
▶ Reference to unbound variables: eval("3 * w")
▶ Type errors: eval("(1).toUpperCase()")
▶ Dynamic execution errors: eval("new Array(-1)") (note that eval("1/0") does not fail in JavaScript: it evaluates to Infinity)
C# reflection
In C# there are limited ways in which we can reflect the type of a piece of data:
int i = 42;
System.Type type = i.GetType();
System.Console.WriteLine(type);
This prints System.Int32.
Similarly to JavaScript, we can request the attributes and functions of a class and iterate over these.
But what about generating new code?
Metaprogramming in C#
There is no way to generate new C# ASTs…
But you can emit ‘machine code’ instructions dynamically.
All .NET languages, including C#, F#, and Visual Basic, are compiled to the Common Intermediate Language (CIL).
This is a stack-based assembly language that is either executed natively or run on a virtual machine.
Example: Emitting CIL instructions
A simple call to WriteLine:
Console.WriteLine("Hello World");
Becomes…
private void EmitCode(MethodBuilder builder, string text)
{
    ILGenerator generator = builder.GetILGenerator();
    generator.Emit(OpCodes.Ldstr, text);
    MethodInfo methodInfo = typeof(System.Console).GetMethod(
        "WriteLine",
        BindingFlags.Public | BindingFlags.Static,
        null,
        new Type[] { typeof(string) },
        null);
    generator.Emit(OpCodes.Call, methodInfo);
    generator.Emit(OpCodes.Ret);
}
With another 100 lines of boilerplate…
Metaprogramming in C#
We can generate new bytecode and inspect existing classes…
▶ this doesn’t scale very well to larger examples;
▶ emitting CIL codes is essentially untyped and unsafe.
But it is a good example of a different style of metaprogramming where the generating language (C#) and generated language (CIL) are different.
Metaprogramming in other languages
More advanced systems, such as Scala’s Lightweight Modular Staging, will be presented after the Christmas break.
What about metaprogramming in a strongly typed language such as Haskell?
Template Haskell
There is a language extension, Template Haskell, that provides metaprogramming support for Haskell. It supports quoting code into data and splicing generated code back into your program.
Template Haskell
import Language.Haskell.TH
three :: Int
three = 1 + 2
threeQ :: ExpQ
threeQ = [| 1 + 2 |]
Rather than Racket’s quote function, we can enclose expressions in quotation brackets [| ... |]. This turns code into a quoted expression of type ExpQ.
Unquoting
> three
3
> $threeQ
3
We can splice an expression e back into our program by writing $e (note the lack of space between $ and e). This runs the associated metaprogram and replaces the occurrence $e with its result.
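A minimal sketch of the same splice in a compiled module rather than at the interpreter prompt, assuming GHC with the TemplateHaskell extension. The quotation is written inline because GHC's stage restriction does not let a top-level splice run a quoted expression defined in the same module; a value such as threeQ would have to be imported from another module. The name threeSpliced is only illustrative.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

-- Ordinary code: the addition happens at run time.
three :: Int
three = 1 + 2

-- The quotation builds the syntax tree for 1 + 2 at compile time and the
-- splice inserts that tree here, as if we had written threeSpliced = 1 + 2.
threeSpliced :: Int
threeSpliced = $([| 1 + 2 |])

main :: IO ()
main = print (three, threeSpliced)
```

Both definitions produce the same value; the difference is only in when the expression is assembled.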
Inspecting code
What happens when we quote an expression?
> runQ [| 1 + 2 |]
InfixE (Just (LitE (IntegerL 1))) (VarE GHC.Num.+) (Just (LitE (IntegerL 2)))
Template Haskell defines a data type Exp corresponding to Haskell expressions.
The type ExpQ is a synonym for Q Exp: a value of type Exp in the quotation monad Q.
The runQ function returns the quoted expression associated with an ExpQ.
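The inspection above can also be done from an ordinary program. The sketch below, assuming GHC with TemplateHaskell, runs the quotation in IO with runQ, prints the raw constructors, and pretty-prints the tree with pprint. The helper isOnePlusTwo is a hypothetical name introduced only to show that the result is plain data we can pattern match on.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH

-- True when the expression is an infix application of some operator
-- to the integer literals 1 and 2.
isOnePlusTwo :: Exp -> Bool
isOnePlusTwo (InfixE (Just (LitE (IntegerL 1))) (VarE _) (Just (LitE (IntegerL 2)))) = True
isOnePlusTwo _ = False

main :: IO ()
main = do
  e <- runQ [| 1 + 2 |]    -- run the Q computation to obtain the Exp
  print e                  -- the raw constructors
  putStrLn (pprint e)      -- the same tree rendered as Haskell source
  print (isOnePlusTwo e)
```

The exact module prefix printed for the + operator may differ between GHC versions, which is why the helper ignores the operator name.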
The Exp data type
data Exp = VarE Name          -- variables
         | ConE Name          -- constructors
         | LitE Lit           -- literals such as 5 or 'c'
         | AppE Exp Exp       -- application
         | ParensE Exp        -- parentheses
         | LamE [Pat] Exp     -- lambdas
         | TupE [Exp]         -- tuples
         | LetE [Dec] Exp     -- let
         | CondE Exp Exp Exp  -- if-then-else
         ...
Not to mention unboxed tuples, record updates, record construction, lists, list comprehensions,…
Template Haskell: programming with expressions
incr :: Int -> Int
incr x = x + 1
incrE :: Exp -> Exp
incrE e = AppE (VarE 'incr) e
▶ The first function, incr, increments a number;
▶ The second takes a quoted expression as argument and builds a new expression by passing its argument to incr.
▶ We can ‘quote’ variable names with the prefix quotation mark: 'incr
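To make the two functions above concrete, here is a small runnable sketch, assuming GHC with TemplateHaskell. It builds the application with incrE and pretty-prints the generated code instead of splicing it, so everything fits in one module.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH

incr :: Int -> Int
incr x = x + 1

-- Build the syntax tree of the application (incr e).
incrE :: Exp -> Exp
incrE e = AppE (VarE 'incr) e

main :: IO ()
main = do
  arg <- runQ [| 1 + 2 |]
  -- Render the generated application as Haskell source.
  putStrLn (pprint (incrE arg))
```

Note that incrE is an ordinary pure function on syntax trees: nothing is evaluated until the resulting expression is spliced into a program.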
Template Haskell: programming with expressions
-- Let x = [| 1 + 2 |]
x :: Exp
x = InfixE (Just (LitE (IntegerL 1)))...
y :: Int
y = $(incrE x)
Question: What is the result of evaluating y?
But this might go wrong…
Typing Template Haskell
What happens when we create ill-typed expressions?
-- Let x = [| "Hello World" |]
x :: Exp
x = LitE (StringL "Hello World")
y :: Int
y = $(incrE x)
Question: What is the result of this program?
We get a type error:
No instance for (Num [Char]) arising from a use of ‘incr’
In the expression: incr "Hello World"...
Typing Template Haskell
Template Haskell is a staged programming language. The compiler starts by type checking the program – even the program fragments building up expressions:
x :: Exp
x = LitE (StringL "Hello World")
But it ignores any splices:
y :: Int
y = $(incrE x)
It does not yet know if y is type correct or not…
Typing Template Haskell
Once the basic types are correct, the compiler can safely compute new code (such as that arising from the call incrE x).
So during type checking the compiler needs to perform evaluation.
It can then replace splices, such as $(incrE x), with the result of evaluation.
Typing Template Haskell
But even then, we don’t know if the generated code is type correct.
At this point, the splice $(incrE x) will be replaced by incr "Hello World".
The compiler type checks the generated code, which may raise an error.
Question: Is two passes enough?
Not in general: the generated code may contain new program splices.
Type safety
Question: So is Template Haskell statically typed?
Yes: all generated code is type checked.
No: metaprograms are essentially untyped.
Type safe metaprograms
Template Haskell also provides a data type for ‘typed expressions’:
newtype TExp a = TExp { unType :: Exp }
The type variable is not used, but tags an expression with the type we expect it to have. This is sometimes known as a phantom type. This can be used to give us some more type safety:
appE :: TExp (a -> b) -> TExp a -> TExp b
appE (TExp f) (TExp x) = TExp (AppE f x)
Question: Where does this break?
Limited safety…
There are lots of ways to break this. Referring to variables is one way:
bogus :: TExp String
bogus = TExp (VarE 'incr)
But more generally, there are plenty of situations where we cannot easily figure out the types of the metaprogram that we generate.
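The phantom-type idea and its weakness can be tried out without depending on the library's own TExp. The sketch below defines a hypothetical wrapper Typed with the same shape (the names Typed, untyped, appTyped, succT and oneT are all illustrative assumptions, not part of Template Haskell) and shows both a well-tagged application and a mislabelled expression that the type checker happily accepts.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH

-- Hypothetical stand-in for TExp: the parameter a is a phantom type.
newtype Typed a = Typed { untyped :: Exp }

-- Application is only accepted when the phantom tags line up.
appTyped :: Typed (a -> b) -> Typed a -> Typed b
appTyped (Typed f) (Typed x) = Typed (AppE f x)

-- Honest tags.
succT :: Typed (Int -> Int)
succT = Typed (VarE 'succ)

oneT :: Typed Int
oneT = Typed (LitE (IntegerL 1))

-- Nothing stops us from lying: the tag says String,
-- but the tree refers to a function.
bogus :: Typed String
bogus = Typed (VarE 'succ)

main :: IO ()
main = do
  putStrLn (pprint (untyped (appTyped succT oneT)))
  putStrLn (pprint (untyped bogus))
```

The tag only constrains how wrapped expressions are combined; whoever applies the constructor directly is trusted to choose the right type.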
Beyond expressions
The Exp data type is used to reflect expressions. But Template Haskell also provides data types describing:
▶ Patterns
▶ Function definitions
▶ Data type declarations
▶ Class declarations
▶ Instance definitions
▶ Compiler pragmas
▶ …
Together with the technology to reflect code into such data types.
Beyond expressions
As a result, we can generate arbitrary code fragments using Template Haskell:
▶ new type signatures;
▶ new data type declarations;
▶ new classes or class instances;
▶ …
Any pattern in our code that we can describe programmatically can be automated through Template Haskell.
Template Haskell: examples
There are several ways in which people use Template Haskell:
▶ Generalizing a certain pattern of functions:
zip :: [a] -> [b] -> [(a,b)]
zip3 :: [a] -> [b] -> [c] -> [(a,b,c)]
zip4 :: [a] -> [b] -> [c] -> [d] -> [(a,b,c,d)]
...
▶ Automating boilerplate code (lenses);
▶ Including system information (git-embed);
▶ Interfacing safely to an external data source, such as a database, which requires computing new marshalling/unmarshalling functions (printf).
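As an illustration of the first use case, the sketch below generates the body of an n-ary zip. This is one possible encoding, not how any particular library does it: it assumes n is at least 2 and builds the result with the ZipList applicative from base. The names zipN and arity are hypothetical.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Control.Applicative (ZipList (..))
import Control.Monad (replicateM)
import Language.Haskell.TH

-- zipN n builds the expression
--   \xs1 ... xsn -> getZipList (pure (,..,) <*> ZipList xs1 <*> ... <*> ZipList xsn)
-- Assumes n >= 2, since there is no tuple constructor for a single component.
zipN :: Int -> Q Exp
zipN n = do
  xs <- replicateM n (newName "xs")
  let tupleCon = conE (tupleDataName n)
      wrapped  = map (\x -> [| ZipList $(varE x) |]) xs
      body     = foldl (\acc w -> [| $(acc) <*> $(w) |]) [| pure $(tupleCon) |] wrapped
  lamE (map varP xs) [| getZipList $(body) |]

-- Number of lambda-bound arguments of a generated expression.
arity :: Exp -> Int
arity (LamE ps _) = length ps
arity _ = 0

main :: IO ()
main = do
  e <- runQ (zipN 3)
  putStrLn (pprint e)   -- show the generated three-argument zip
  print (arity e)
```

Because of the stage restriction, actually using the generated function requires importing zipN into another module and writing a splice such as $(zipN 3) there.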
Example: git-embed
Suppose that we want to include version information about our program.
> mytool --version
Built from the branch 'master' with hash 410d5264a
We could do this in numerous ways:
▶ maintain a version variable in our code manually;
▶ have a shell script that generates this information whenever we release code;
▶ splice this information into our code using Template Haskell.
Example: git-embed
There is a library using Template Haskell, git-embed, that provides precisely this functionality. Using it is easy enough:
import Git.Embed
gitRev :: String
gitRev = $(embedGitShortRevision)
gitBranch :: String
gitBranch = $(embedGitBranch)
How is it implemented?
Example: git-embed ideas
Functions such as embedGitBranch need to compute a string, corresponding to the current git branch. To do so:
1. We perform a bit of I/O, running git branch with suitable arguments;
2. The result of this command contains the information that we are after.
3. Quoting this result back into a string literal yields the desired value.
Note: we can run IO computations while metaprogramming…
Example: git-embed implementation
embedGitBranch :: ExpQ
embedGitBranch = embedGit ["rev-parse", "--abbrev-ref", "HEAD"]
embedGit :: [String] -> ExpQ
embedGit args = do
  addRefDependentFiles
  gitOut <- runIO (readProcess "git" args "")
  return $ LitE (StringL gitOut)
The addRefDependentFiles action adds the files from the .git directory as dependencies. If these files change, the module will be recompiled.
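The same compile-time I/O pattern can be shown without git. The sketch below is not git-embed's code; it is a stripped-down variant, assuming GHC with TemplateHaskell, that reads the USER environment variable while compiling and freezes its value into the program as a string literal. The name builtBy and the choice of environment variable are illustrative assumptions.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH
import System.Environment (lookupEnv)

-- The value of USER on the machine that compiled this module,
-- embedded in the binary as a string literal.
builtBy :: String
builtBy =
  $(do user <- runIO (lookupEnv "USER")   -- I/O performed by the compiler
       litE (stringL (maybe "unknown" id user)))

main :: IO ()
main = putStrLn ("Built by: " ++ builtBy)
```

The metaprogram is written inline because a top-level splice may only call functions imported from other modules. Running the compiled program on a different machine still prints the name recorded at build time.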
Quotation and I/O
This example illustrates that we can run I/O operations during quotation.
Question: Why should this be allowed? And what are the drawbacks?
▶ It makes it possible to read data from a file, network, database, etc., and use this information to generate new code or write new data to a file.
▶ Compiling code may have side effects! You can write a Haskell program that formats your hard drive when compiled.
Example: printf
If you’ve ever done any debugging with C, you will have encountered printf:
char* userName;
x = ... ; y = ... ;
printf("x,y, and user are now:");
printf("x=%d,y=%d,user=%s",x,y,userName);
The printf function takes a variable number of arguments: depending on the format string, it expects a different number of integers and strings.
What is its type?
Example: printf using Template Haskell
We’ll sketch how to implement a printf function in Haskell:
> $(printf "x=%d,s=%s") 6 "Hello"
"x=6,s=Hello"
The key idea is to use the argument string to compute a function taking suitable arguments. Splicing $(printf "x=%d,s=%s") will compute the term:
\n0 -> \s1 -> "x=" ++ show n0 ++ ",s=" ++ s1
Example: printf
printf :: String -> ExpQ
printf s = gen (parse s)
data Format = D | S | L String
parse :: String -> [Format]
gen :: [Format] -> ExpQ
▶ The printf function maps a string to an expression;
▶ The parse function reads in the string and splits it into a series of format instructions. It doesn’t use any Template Haskell and I won’t cover it further.
▶ Depending on these instructions, the gen function will compute a different expression.
Example: printf
Let’s try to figure out how the gen function works in several different steps.
data Format = D | S | L String
gen :: [Format] -> ExpQ
gen [] = return (LitE (StringL ""))
If the list of formatting directives is empty, we compute the empty string – that was easy enough.
Example: printf
Now suppose we only ever have to worry about handling a single formatting directive:
data Format = D | S | L String
gen :: [Format] -> ExpQ
gen [D] = [| \n -> show n |]
gen [S] = [| \s -> s |]
gen [L str] = return (LitE (StringL str))
Each individual case generates the code that we would write by hand otherwise.
Example: printf
printf :: String -> ExpQ
printf s = gen (parse s) [| "" |]
gen :: [Format] -> ExpQ -> ExpQ
gen [] e = e
gen (D:fmts) e = [| \n -> $(gen fmts [| $e ++ show n |]) |]
gen (S:fmts) e = [| \s -> $(gen fmts [| $e ++ s |]) |]
gen (L s:fmts) e = gen fmts [| $e ++ $(return (LitE (StringL s))) |]
The gen function is defined using an accumulating parameter. Initially, this is just the empty string.
As we encounter more formatting directives, we add an additional lambda if necessary and perform a recursive call. Note the subtle interplay between splicing and quoting.
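Putting the pieces together, the following is a self-contained sketch of the generator, assuming GHC with TemplateHaskell. The parse function was not shown in the lecture, so the one here is a deliberately simple assumption: %d and %s are directives and everything else is literal text. The helper spanLiteral is likewise hypothetical. The program prints the generated lambda rather than splicing it, since a splice would have to live in a different module.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH

data Format = D | S | L String
  deriving (Eq, Show)

-- Assumed parser: no escape syntax for a literal percent sign.
parse :: String -> [Format]
parse [] = []
parse ('%' : 'd' : rest) = D : parse rest
parse ('%' : 's' : rest) = S : parse rest
parse str = L lit : parse rest
  where
    (lit, rest) = spanLiteral str

-- Take at least one character, then everything up to the next '%'.
spanLiteral :: String -> (String, String)
spanLiteral (c : cs) = let (xs, ys) = break (== '%') cs in (c : xs, ys)
spanLiteral [] = ([], [])

-- The second argument accumulates the expression for the output so far.
gen :: [Format] -> ExpQ -> ExpQ
gen [] e = e
gen (D : fmts) e = [| \n -> $(gen fmts [| $e ++ show n |]) |]
gen (S : fmts) e = [| \s -> $(gen fmts [| $e ++ s |]) |]
gen (L str : fmts) e = gen fmts [| $e ++ $(litE (stringL str)) |]

printf :: String -> ExpQ
printf s = gen (parse s) [| "" |]

main :: IO ()
main = do
  print (parse "x=%d,s=%s")
  e <- runQ (printf "x=%d,s=%s")
  putStrLn (pprint e)   -- the generated lambda, rendered as source
```

With printf imported from a separate module, a call site would look like $(printf "x=%d,s=%s") 6 "Hello", exactly as on the earlier slide.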
Example: printf
This example shows how to compute new expressions from existing data. This same pattern pops up whenever we want to interface with an external data source, such as a database:
▶ Request information about the table layout;
▶ Parse the result and generate corresponding types;
▶ Generate functions to access the data.
Record management
Records in Haskell provide a convenient way to organize structured data.
data Person = Person {name :: String, address :: Address}
data Address = Address {street :: String, city :: String}
In practice, these records can be huge.
Nested records
We can use the record fields to project out the desired information. If we need to access nested fields, we can define our own projection functions:
personCity :: Person -> City
personCity = city . address
Record projections compose nicely. What about record updates?
Nested records
But setting nested fields is pretty painful:
setCity :: City -> Address -> Address
setCity newCity a = a {city = newCity}
setAddress :: Address -> Person -> Person
setAddress newAddress p = p {address = newAddress}
setPersonCity :: City -> Person -> Person
setPersonCity newCity p = setAddress (setCity newCity (address p)) p
This is already quite some ‘boilerplate’ code – code that is not interesting and follows a fixed pattern.
Example: lenses
To automate this, we can package the getter and setter functions in a single data type, sometimes referred to as a lens:
data (:->) a b = Lens { get :: a -> b
                      , set :: b -> a -> a }
Such lenses compose nicely:
compose :: (b :-> c) -> (a :-> b) -> (a :-> c)
Example: lenses
In our example, suppose we are given lenses for every record field:
city :: (Address :-> String)
address :: (Person :-> Address)
We can compose these lenses by hand, to assemble the pieces of data that we’re interested in:
personCity :: Person :-> City
personCity = compose city address
updateCity :: City -> Person -> Person
updateCity newCity = set personCity newCity
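For reference, here is a runnable sketch of the lens type with a possible definition of compose, which the slides only give a type for. It assumes that City is a synonym for String and that the record fields carry an underscore prefix so they do not clash with the lens names; both are assumptions made for this example.

```haskell
{-# LANGUAGE TypeOperators #-}
module Main where

type City = String

data Address = Address { _street :: String, _city :: City } deriving Show
data Person = Person { _name :: String, _address :: Address } deriving Show

-- A lens packages a getter and a setter for a part b of a structure a.
data a :-> b = Lens { get :: a -> b, set :: b -> a -> a }

-- Compose an inner lens (from b to c) with an outer lens (from a to b).
compose :: (b :-> c) -> (a :-> b) -> (a :-> c)
compose inner outer = Lens
  { get = get inner . get outer
  , set = \c a -> set outer (set inner c (get outer a)) a
  }

city :: Address :-> City
city = Lens { get = _city, set = \c a -> a { _city = c } }

address :: Person :-> Address
address = Lens { get = _address, set = \ad p -> p { _address = ad } }

personCity :: Person :-> City
personCity = compose city address

updateCity :: City -> Person -> Person
updateCity = set personCity

main :: IO ()
main = do
  let p = Person "Ada" (Address "Main Street" "Utrecht")
  putStrLn (get personCity p)
  print (updateCity "Amsterdam" p)
```

The setter of the composite lens reads the inner structure with the outer getter, updates it with the inner setter, and writes it back with the outer setter, which is exactly the hand-written setPersonCity pattern.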
Example: lenses
Lenses make the manipulation of nested records manageable.
But who writes the lenses? This is not hard to do by hand:
city :: Address :-> City
city = Lens { get = \a -> _city a
            , set = \nc a -> a {_city = nc} }
-- here the record field is assumed to be named _city, so that it does not clash with the lens city
But this could clearly use some automation.
Example: fclabels
The fclabels package defines the type of lenses together with a Template Haskell module that generates the lenses associated with any given record:
data Person = Person { _name :: String, _address :: Address}
data Address = Address { _city :: City , _street :: String}
mkLabel ''Address
mkLabel ''Person
The double quote in ''Address quotes the type Address, whereas a single quote quotes a value-level name; the distinction is necessary because a type and a constructor can share the same name.
Example: fclabels
What does the Template Haskell code do?
1. Given a quoted record name, look up the associated record declaration;
2. For each field, generate a suitable lens by:
   2.1 looking up the type of the field;
   2.2 generating the type signature of the required lens;
   2.3 generating a name for the lens, based on the field name;
   2.4 generating an expression corresponding to the lens definition;
3. Splice all these declarations back into the file.
About 700 loc, but it handles different flavours of lenses and generalizations.
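The first step above, looking up a record declaration from a quoted name, can be sketched with reify. This is not fclabels code; it is a minimal illustration, assuming GHC with TemplateHaskell, that reifies the Version record from base's Data.Version and embeds its field names as a list of strings. The name versionFields is hypothetical, and the metaprogram is written inline because a top-level splice cannot call helpers defined in the same module.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Data.Version (Version)
import Language.Haskell.TH

-- Field names of the record type Version, computed at compile time.
versionFields :: [String]
versionFields =
  $(do info <- reify ''Version
       let names = case info of
             -- a data type with a single record constructor
             TyConI (DataD _ _ _ _ [RecC _ fields] _) ->
               [nameBase field | (field, _, _) <- fields]
             -- anything else: give up and report no fields
             _ -> []
       listE (map (litE . stringL) names))

main :: IO ()
main = print versionFields
```

A lens generator would continue from the same field list, using each field's name and type to build one declaration per field instead of a string.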
Example: fclabels
This example sketches how you might want to use Template Haskell to generate code. There is a clear pattern showing how to define a lens for any given record… But we need metaprogramming technology to automate this.
Quasiquotation
As a last example, I want to briefly mention quasiquotation. We’ve seen how to embed domain specific languages in Haskell using deep/shallow embeddings. When we do so, we are constrained by Haskell’s syntax and static semantics. Racket shows how to use macros to write custom language dialects. Haskell’s quasiquotation framework borrows these ideas.
Quasiquotation: example
Suppose I’m writing a Haskell library for manipulating and generating C code. But working with C ASTs directly is pretty painful:
add n = Func (DeclSpec [ ] [ ] (Tint Nothing)) (Id "add") DeclRoot (Args [Arg (Just (Id "x")) ...
What I’d like to do is embed (a fragment of) C in my Haskell library.
Using quasiquotation
The quasiquoter allows me to do just that:
add n = [cfun|
  int add (int x) { return x + $int:n$; }
|]
The cfun quasiquoter tells me how to turn a string into a suitable Exp.
Defining quasiquoters
A quasiquoter is nothing more than a series of parsers for expressions, patterns, types and declarations:
data QuasiQuoter = QuasiQuoter {
  quoteExp  :: String -> Q Exp,
  quotePat  :: String -> Q Pat,
  quoteType :: String -> Q Type,
  quoteDec  :: String -> Q [Dec] }
Whenever the Haskell parser encounters a quasiquotation [myQQ| ... |] it will run the parser associated with the quasiquoter myQQ to generate the quoted expression/pattern/type/declaration.
Multiline strings
As a simple example, suppose we want to have multi-line string literals. We can define a quasiquoter:
ml :: QuasiQuoter
ml = QuasiQuoter { quoteExp = (\a -> return (LitE (StringL a))), ... }
And call it as follows:
example :: String
example = [ml|hello
beautiful
world|]
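A complete version of this quasiquoter might look as follows. It is a sketch in which the three parsers elided above are filled in with failing stubs, which is an assumption about what the "..." stands for. Because a quasiquoter can only be used in a module other than the one defining it, the program below runs the expression parser by hand instead of writing a quasiquotation.

```haskell
module Main where

import Language.Haskell.TH
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- A quasiquoter that turns its body verbatim into a string literal.
-- It only makes sense in expression position, so the other parsers fail.
ml :: QuasiQuoter
ml = QuasiQuoter
  { quoteExp  = \str -> return (LitE (StringL str))
  , quotePat  = \_ -> fail "ml: patterns are not supported"
  , quoteType = \_ -> fail "ml: types are not supported"
  , quoteDec  = \_ -> fail "ml: declarations are not supported"
  }

main :: IO ()
main = do
  -- Run the expression parser directly, as the compiler would for a quasiquotation.
  e <- runQ (quoteExp ml "hello\nbeautiful\nworld")
  print e
```

In a client module with the QuasiQuotes extension enabled and ml imported, the multi-line example from the slide then type checks as a String.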
Quasiquoting
The quasiquoting mechanism allows you to embed arbitrary syntax within your Haskell program.
You can still use Template Haskell’s quotation and splicing to mix your object language with Haskell code.
This is a mix of the embedded and stand-alone approaches to domain specific languages that we saw over the last few weeks.
Drawbacks of metaprogramming
Metaprogramming has many applications, but several crucial drawbacks:
▶ The Template Haskell AST and the ‘real’ AST are often out of step.
▶ The AST is much more complex than S-expressions: the Template Haskell library is huge!
▶ Debugging type errors in generated code is hard.
▶ The Q monad allows arbitrary IO during compilation, which may be a security risk.
▶ Large computations can slow down compile times.
Looking ahead
▶ Student presentations next week.
▶ Parallel and concurrent programming in Erlang and Haskell.
▶ …