CMSC 245 Wrap-up



SLIDE 1

CMSC 245 Wrap-up

SLIDE 2

This class is about understanding how programs work

SLIDE 3

To do this, we’re going to have to learn how a computer works

SLIDE 4

Learned a ton in the class

SLIDE 5

Lexical vs. Dynamic Scoping

Closures

Heaps

Stacks

Assembly

Calling conventions

Functions

Objects

Classes

Method dispatch

C++

Racket

Parsing

Regexp

Garbage collection

JS

SLIDE 6

To apologize for making you write so much, I wrote 732 lines of C++ yesterday

SLIDE 7
  • Today we’re going to design an interpreter
  • Our source language will be a subset of Scheme
  • Numbers, variables, if, lambdas, let, begin, set!
  • We’ll write our own lexer, grammar, and parser
  • Starting from what you already wrote in labs
  • Our interpreter will use data structures from the course
  • And will include garbage collection under the hood
SLIDE 8

Raw Text -> Lexer -> Parser -> AST -> Interpreter

Lexer: Regex

Parser: CFG

AST: C++ (Sub)classes

Interpreter: Methods on AST

SLIDE 9

Raw Text -> Lexer -> Parser -> AST -> Interpreter

scanner.l: ~20 lines of code

parser.cc: ~150 lines of code

interpreter.h: ~220 lines of code

interpreter.cc: ~160 lines of code

SLIDE 10

Lesson

Sometimes the best way to do something is to find someone who’s already done it for you…

SLIDE 11

Symbol Table

Garbage Collector

SLIDE 12

HAMT

Boehm GC

SLIDE 13

HAMT

Hash Array-Mapped Trie

Think of this as a hash table that is “quick” to copy

Boehm GC

High-performance GC for C

We’ll use this so that our interpreter’s memory is garbage collected automatically

SLIDE 14

HAMT

Boehm GC

We’ll have our hash table use the GC under the hood

So when we put things into the HAMT, they are automatically GC’d

SLIDE 15

The grammar…

START -> E $
E -> number
E -> identifier
E -> ( OP E+ )
E -> ( begin E+ )
E -> ( lambda (ID+) E )
E -> ( set! x E )
E -> ( E+ )
OP -> + | - | * | =

(In EBNF, allows E+)

SLIDE 16

The grammar…


Note! Not LL(1)! Several E productions begin with "(", so one token of lookahead cannot choose between them.

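One way to cope with the shared "(" prefix is to peek one token past the paren before choosing a production. The sketch below is only a recognizer under that assumption, not the code in parser.cc: the `Recognizer` type, its helpers, and the pre-split token list are all hypothetical, and it builds no AST.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Recognizer for the slide grammar over a pre-split token list.
// All names here are hypothetical; it only answers "does this parse?".
struct Recognizer {
    std::vector<std::string> toks;
    std::size_t pos = 0;

    explicit Recognizer(std::vector<std::string> t) : toks(std::move(t)) {}

    // Token k positions ahead, or "" past the end of input.
    std::string peek(std::size_t k = 0) const {
        return pos + k < toks.size() ? toks[pos + k] : "";
    }
    bool eat(const std::string& t) {
        if (peek() != t) return false;
        ++pos;
        return true;
    }
    static bool isNumber(const std::string& s) {
        std::size_t i = (!s.empty() && s[0] == '-') ? 1 : 0;
        if (i == s.size()) return false;  // "" or a lone "-"
        for (; i < s.size(); ++i)
            if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
        return true;
    }
    static bool isOp(const std::string& s) {
        return s == "+" || s == "-" || s == "*" || s == "=";
    }
    static bool isId(const std::string& s) {
        return !s.empty() && std::isalpha(static_cast<unsigned char>(s[0])) &&
               s != "begin" && s != "lambda" && s != "set!";
    }
    // E+ : one or more expressions, stopping at the closing paren.
    bool parseMany() {
        if (!parseE()) return false;
        while (pos < toks.size() && peek() != ")")
            if (!parseE()) return false;
        return true;
    }
    bool parseE() {
        if (isNumber(peek()) || isId(peek())) { ++pos; return true; }
        if (peek() != "(") return false;
        const std::string next = peek(1);  // the second token of lookahead
        ++pos;                             // consume "("
        if (isOp(next) || next == "begin") {
            ++pos;
            return parseMany() && eat(")");
        }
        if (next == "lambda") {
            ++pos;
            if (!eat("(") || !isId(peek())) return false;
            while (isId(peek())) ++pos;    // ID+
            return eat(")") && parseE() && eat(")");
        }
        if (next == "set!") {
            ++pos;
            if (!isId(peek())) return false;
            ++pos;
            return parseE() && eat(")");
        }
        return parseMany() && eat(")");    // E -> ( E+ )
    }
};

// START -> E $ : exactly one expression followed by end of input.
bool accepts(const std::vector<std::string>& tokens) {
    Recognizer r(tokens);
    return r.parseE() && r.pos == r.toks.size();
}
```

For example, `accepts({"(", "+", "1", "2", ")"})` succeeds, while `accepts({"(", ")"})` fails because E+ needs at least one expression.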

SLIDE 17

AST

Idea: Represent using subclasses

AstNode
  • ConstantNode: Number
  • VariableNode: Variable name
  • LambdaNode: Arguments, Lambda body
  • IfThenElseNode: Guard, Then, Else
  • FunctionCallNode: Function to call, Arguments
  • SetNode: Variable name, Expression

We’ll dig into this in a few mins

SLIDE 18

The lexer…

[ \t]        { continue; }
[\n]         { tokenCount++; return NEWLINE; }
";".*        { continue; }
"("          { tokenCount++; return LPAREN; }
")"          { tokenCount++; return RPAREN; }
"+"          { tokenCount++; return PLUS; }
"-"          { tokenCount++; return MINUS; }
"*"          { tokenCount++; return TIMES; }
"lambda"     { tokenCount++; return LAMBDA; }
"let"        { tokenCount++; return LET; }
"<EOF>"      { tokenCount++; return END_OF_INPUT; }
-?{digit}+   { tokenCount++; return INT; }
{identifier} { tokenCount++; return IDENTIFIER; }
.            { scannerError(); continue; }
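For comparison, the same token classes can be sketched in plain C++ with `std::regex`. This is an illustration rather than what scanner.l generates: the `Token` struct and `tokenize` function are hypothetical, the identifier pattern is an assumption (the slide does not show how `{identifier}` is defined), and the `"<EOF>"` rule is left out. Unlike flex, `std::regex` tries alternatives left to right instead of picking the longest match, so INT is listed before the bare "-".

```cpp
#include <regex>
#include <string>
#include <vector>

struct Token {
    std::string kind;  // e.g. "LPAREN", "INT", "IDENTIFIER"
    std::string text;  // the matched characters
};

std::vector<Token> tokenize(const std::string& src) {
    // Ungrouped alternatives (blanks, ";" comments) are skipped.
    // Groups: 1 newline, 2 "(", 3 ")", 4 integer, 5 operator,
    //         6 identifier or keyword (assumed pattern), 7 anything else.
    static const std::regex rule(
        R"re([ \t]+|;[^\n]*|(\n)|(\()|(\))|(-?[0-9]+)|(\+|-|\*)|([A-Za-z_][A-Za-z0-9_!?]*)|(.))re");
    std::vector<Token> out;
    for (std::sregex_iterator it(src.begin(), src.end(), rule), end; it != end; ++it) {
        const std::smatch& m = *it;
        const std::string text = m.str();
        if (m[1].matched) {
            out.push_back({"NEWLINE", text});
        } else if (m[2].matched) {
            out.push_back({"LPAREN", text});
        } else if (m[3].matched) {
            out.push_back({"RPAREN", text});
        } else if (m[4].matched) {
            out.push_back({"INT", text});
        } else if (m[5].matched) {
            out.push_back({text == "+" ? "PLUS" : text == "-" ? "MINUS" : "TIMES", text});
        } else if (m[6].matched) {
            // Keywords are identifiers with a reserved spelling.
            out.push_back({text == "lambda" ? "LAMBDA" : text == "let" ? "LET" : "IDENTIFIER", text});
        } else if (m[7].matched) {
            out.push_back({"ERROR", text});
        }
        // Otherwise the match was whitespace or a comment: emit nothing.
    }
    return out;
}
```

Matching keywords through the identifier pattern avoids splitting a name such as `lambdas` into a keyword plus a stray `s`, which an ordered-alternative regex would otherwise do.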

SLIDE 19

The Parser

I started from code we gave you in Lab 5…

But I cheated because it’s not LL(1)

See parser.cc

(5 minute tour)

SLIDE 20

The Symbol Table

  • Is a dictionary that maps strings to addresses in the heap
  • Means most things are stored on the heap
  • Necessitates GC (we’ll discuss next)

SLIDE 21

The Symbol Table

typedef hamt<HashedString, Address> environment;
  • hamt: HAMT is a dictionary
  • HashedString: Wrapper for strings
  • Address: Representation of pointers

Two methods:
  • Get: Takes a dictionary and key, gives us an address, which we then look up in the heap
  • Insert: Takes a dictionary, key, and value; returns a new dictionary
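The key property is that Insert returns a new dictionary and leaves the old one untouched. A minimal sketch of that behavior, using a shared linked list of bindings instead of a real HAMT (so lookups here are linear, not hash-based); `Binding`, `Env`, `envInsert`, and `envGet` are hypothetical names, and addresses are plain ints for illustration:

```cpp
#include <memory>
#include <optional>
#include <string>

// One binding plus a shared pointer to the rest of the environment.
struct Binding {
    std::string key;
    int address;
    std::shared_ptr<const Binding> rest;
};
using Env = std::shared_ptr<const Binding>;  // nullptr is the empty environment

// Returns a NEW environment; `env` itself is unchanged and is shared as the tail.
Env envInsert(const Env& env, const std::string& key, int address) {
    return std::make_shared<Binding>(Binding{key, address, env});
}

// The newest binding for a key shadows older ones.
std::optional<int> envGet(const Env& env, const std::string& key) {
    for (const Binding* b = env.get(); b != nullptr; b = b->rest.get())
        if (b->key == key) return b->address;
    return std::nullopt;
}
```

Because old environments stay valid, a closure can keep the environment it was created in while the interpreter goes on extending newer ones, and "copying" an environment is just copying a pointer.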

SLIDE 22

The Heap

Stores two possible things:
  • Plain old numbers
  • Closures

You could add other things (strings, etc.)

To find x, we look up its address in the symbol table, then use that address to look up the value in the heap
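The two-step lookup can be sketched with ordinary `std::map`s standing in for the HAMTs. Everything below is an assumption made for illustration: `Address` is a plain int, the heap holds only numbers, and `lookupVariable` is a hypothetical helper.

```cpp
#include <map>
#include <optional>
#include <string>

using Address = int;                                  // stand-in for the real Address type
using Environment = std::map<std::string, Address>;   // name -> address
using Heap = std::map<Address, int>;                  // address -> value (numbers only here)

// Step 1: symbol table gives the address. Step 2: heap gives the value.
std::optional<int> lookupVariable(const Environment& env, const Heap& heap,
                                  const std::string& name) {
    auto slot = env.find(name);
    if (slot == env.end()) return std::nullopt;       // unbound variable
    auto cell = heap.find(slot->second);
    if (cell == heap.end()) return std::nullopt;      // dangling address
    return cell->second;
}
```

With `env = {{"x", 1}}` and `heap = {{1, 42}}`, `lookupVariable(env, heap, "x")` yields 42.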

SLIDE 23

Symbol tables

typedef hamt<HashedString, Address> environment;

HashedString: Wrapper around std::string

SLIDE 24

Closures

struct Closure {
    AstNode *function;
    environment *environment;
};

SLIDE 25

Values

typedef variant<int, Closure> value;

Variant is new in C++17

A container that holds exactly one value at a time, drawn from a fixed set of types

get<int>(x) // gets the integer value, assuming x holds an integer
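A small sketch of how a `variant<int, Closure>` is inspected. The `Closure` here is a hypothetical stand-in holding just a description string, and `describe` is an illustrative helper, not part of the interpreter. Note that `std::get<T>` throws `std::bad_variant_access` if the variant currently holds a different type, which is why the check with `std::holds_alternative` comes first.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-in for the interpreter's Closure struct.
struct Closure {
    std::string description;
};

using value = std::variant<int, Closure>;

// Check which alternative is stored before extracting it.
std::string describe(const value& v) {
    if (std::holds_alternative<int>(v)) {
        return "int " + std::to_string(std::get<int>(v));
    }
    return "closure " + std::get<Closure>(v).description;
}
```

For example, `describe(value{42})` returns "int 42".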

SLIDE 26

Heap

hamt<Address, value> *heap = new hamt<Address, value>();

SLIDE 27

Address *putValueInHeap(value v) {
    heapSize++;
    // Placement new into memory allocated by the GC
    Address *addr = new ((Address *)GC_MALLOC(sizeof(Address))) Address({heapSize});
    value *val = new ((value *)GC_MALLOC(sizeof(value))) value(v);
    // insert returns a new dictionary, so rebind the global heap to it
    heap = const_cast<hamt<Address, value> *>(heap->insert(addr, val));
    return addr;
}

value getValueFromHeap(Address a) {
    return *heap->get(&a);
}

SLIDE 28

In putValueInHeap, GC_MALLOC tracks the object with the GC

SLIDE 29

Every AstNode implementation has a method

execute : symbol table -> value

There is a “top level” symbol table where global variables go

(top of interpreter.cc)
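The shape of `execute` can be sketched with three of the node types. This is a simplified illustration, not the code in interpreter.h: values are plain ints, the symbol table is a `std::map` from names straight to values (no heap, no HAMT), the guard treats any nonzero number as true, and the constructors and member names are all assumptions.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

using Environment = std::map<std::string, int>;  // simplified symbol table

struct AstNode {
    virtual ~AstNode() = default;
    // execute : symbol table -> value
    virtual int execute(const Environment& env) const = 0;
};

struct ConstantNode : AstNode {
    int number;
    explicit ConstantNode(int n) : number(n) {}
    int execute(const Environment&) const override { return number; }
};

struct VariableNode : AstNode {
    std::string name;
    explicit VariableNode(std::string n) : name(std::move(n)) {}
    int execute(const Environment& env) const override {
        auto it = env.find(name);
        if (it == env.end()) throw std::runtime_error("unbound variable: " + name);
        return it->second;
    }
};

struct IfThenElseNode : AstNode {
    std::unique_ptr<AstNode> guard, thenBranch, elseBranch;
    IfThenElseNode(std::unique_ptr<AstNode> g, std::unique_ptr<AstNode> t,
                   std::unique_ptr<AstNode> e)
        : guard(std::move(g)), thenBranch(std::move(t)), elseBranch(std::move(e)) {}
    int execute(const Environment& env) const override {
        // Only the chosen branch is evaluated.
        return guard->execute(env) != 0 ? thenBranch->execute(env)
                                        : elseBranch->execute(env);
    }
};
```

Each subclass knows how to evaluate itself, so the interpreter is just a virtual call on the root of the AST with the top-level symbol table.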

SLIDE 30

REPL

void executeToplevelAst(AstNode *node);

int main() {
    while (true) {
        cout << "> ";
        AstNode *AST = parseE();
        executeToplevelAst(AST);
    }
}

// Evaluate in the global environment and print the result
void executeToplevelAst(AstNode *node) {
    value result = node->execute(globalEnvironment);
    if (holds_alternative<int>(result)) {
        cout << get<int>(result) << endl;
    } else {
        // A closure is printed by rendering its lambda's AST
        get<Closure>(result).function->render();
        cout << endl;
    }
}