CSE 3341: Principles of Programming Languages Recursive Descent



SLIDE 1

CSE 3341: Principles of Programming Languages
Recursive Descent Parsing

Jeremy Morris

SLIDE 2

Parsing

• A grammar is a generator for a language
  - The rules tell us how to create strings in the language
• A parser is a recognizer for a language
  - Confirms or rejects a string as being in the language or not
• For an arbitrary CFG, we can prove that the upper bound on its running time is O(n^3), where n is the length of the input
  - Earley's algorithm and the CYK algorithm
• Fortunately, if the CFG is carefully constructed, we can do much better than that
  - LL or LR grammars

SLIDE 3

Top-down vs. Bottom-up parsing

• "Top-down" or predictive parsing (or LL parsing)
  - Starts from the root node of the parse tree (the grammar's start symbol) and the left-most token.
  - Builds the parse tree "top down", using tokens to decide which rule is expanded next.
  - Predictive parsers are most often written by hand.
• "Bottom-up" parsing (or LR parsing)
  - Builds the parse tree from the leaves upward, matching a collection of nodes to rule expansions.
  - Also starts with the left-most token, but with no fixed first rule to expand.
  - Bottom-up parsers are most often developed using a parser generator such as Bison or YACC.
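
To make "tokens drive which rule is expanded" concrete, here is a minimal Python sketch of a predictive recognizer. The grammar S → a S b | c is a toy chosen for illustration, not the CORE grammar, and the name `recognize` is hypothetical.

```python
def recognize(tokens):
    """Return True if tokens can be derived from S -> 'a' S 'b' | 'c'."""
    pos = 0  # index of the current (left-most unread) token

    def parse_s():
        nonlocal pos
        # The current token alone predicts which alternative to expand.
        if pos < len(tokens) and tokens[pos] == "a":   # S -> a S b
            pos += 1
            if not parse_s():
                return False
            if pos < len(tokens) and tokens[pos] == "b":
                pos += 1
                return True
            return False
        if pos < len(tokens) and tokens[pos] == "c":   # S -> c
            pos += 1
            return True
        return False

    # Start from the start symbol; accept only if all input was consumed.
    return parse_s() and pos == len(tokens)
```

Calling `recognize(list("aacbb"))` accepts, while `recognize(list("acbb"))` rejects because a token is left over after S has been fully expanded.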

SLIDE 4

CORE parsing practice

program
    int Y, Z;
begin
    Y = 20;
    Z = 5;
    Y = Y - Z;
    write Y;
end

SLIDE 5

Recursive Descent

• An algorithm for walking an already constructed AST
  - Top-down rather than bottom-up
  - Useful for interpreting parsed code, printing parsed code, generating new code from parsed code
• Basic idea:
  - Create one method/procedure for each non-terminal
    - The body of that method decides how to walk through its children based on the rules of the language
  - Start by calling the procedure for the starting non-terminal
  - The algorithm ends when you have walked the entire tree
    - Never ends? You have an infinite loop.

SLIDE 6

Recursive Descent Example

void executeIf(??)
    bool b = evaluateCond(??)
    if (b) then
        executeSS(??)
    else
        executeSS(??)

Parse tree shown on the slide: an <if> node whose children are <cond>, <stmt-seq>, and <stmt-seq> (subtrees elided).

SLIDE 7

Arrays to represent parse trees?

• Each node in the tree → one row in the array.
• Each row has n columns:
  1. An integer corresponding to the non-terminal for the node.
  2. An integer corresponding to which alternative is used to expand that non-terminal.
  3. The row numbers of the children used (columns 3 through n).
     - This is how we determine n above: the maximum number of children needed for our language, plus 2.

Disclaimer: Your instructor does not advocate the use of arrays for hand-built parsers in the year 2016! But you should understand how this algorithm works.
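
A minimal Python sketch of this table layout follows. Only the code 8 for `<if>` appears in these slides; the other integer codes, the `leaf_text` side table, and the sample statements are assumptions made so the example is self-contained. Index 0 of each row is left unused so that columns 1 to 5 line up with the slides' numbering.

```python
# Hypothetical integer codes for non-terminals (only IF = 8 is from the slides).
IF, COND, STMT_SEQ = 8, 9, 10
NO_CHILD = -1  # marks an unused child column


def make_row(nonterminal, alternative, children=()):
    """Build one row: [unused, non-terminal, alternative, child1, child2, child3]."""
    kids = list(children) + [NO_CHILD] * (3 - len(children))
    return [None, nonterminal, alternative] + kids


pt = [
    make_row(IF, 2, children=(1, 2, 3)),  # row 0: if-then-else (alternative 2)
    make_row(COND, 1),                    # row 1: the condition
    make_row(STMT_SEQ, 1),                # row 2: the "then" statements
    make_row(STMT_SEQ, 1),                # row 3: the "else" statements
]

# Simplification: rows 1 to 3 are treated as leaves whose text is stored here.
leaf_text = {1: "Y < Z", 2: "Y = 20;", 3: "Z = 5;"}


def render_if(n, pt):
    """Walk the <if> node in row n and return its source text."""
    parts = ["if", leaf_text[pt[n][3]], "then", leaf_text[pt[n][4]]]
    if pt[n][2] == 2:  # alternative 2 carries an else branch in column 5
        parts += ["else", leaf_text[pt[n][5]]]
    parts.append("end;")
    return " ".join(parts)
```

With the table above, `render_if(0, pt)` returns `"if Y < Z then Y = 20; else Z = 5; end;"`. In a full walker the leaves would be rendered by their own procedures instead of a lookup table.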

SLIDE 8

Recursive Descent Example (revisited)

void executeIf(int n, int[][] pt)
    bool b = evaluateCond(pt[n,3], pt)   // column 3: row of the <cond> child
    if (b) then
        executeSS(pt[n,4], pt)           // column 4: row of the "then" <stmt-seq>
    else if (pt[n,2] == 2) then          // alternative 2 has an else branch
        executeSS(pt[n,5], pt)           // column 5: row of the "else" <stmt-seq>

Note that this pseudocode may not be complete. Specifically, it lacks the error checking it would need in order to report errors.

SLIDE 9

Recursive Descent Example (revisited)

void printIf(int n, int[][] pt)
    print("if")
    printCond(pt[n,3], pt)
    print("then")
    printSS(pt[n,4], pt)
    if (pt[n,2] == 2) then
        print("else")
        printSS(pt[n,5], pt)
    print("end;")

SLIDE 10

Recursive Descent Example (again)

void printAssign(int n, int[][] pt)
    printId(pt[n,3], pt)
    print(" = ")
    printExp(pt[n,4], pt)

SLIDE 11

Recursive Descent Example (again)

void execAssign(int n, int[][] pt)
    int result = evalExp(pt[n,4], pt)   // evaluate the right-hand side first
    assignIdVal(pt[n,3], result)        // then store it in the id on the left

SLIDE 12

Recursive Descent Parsing

• Parsing is harder
  - Instead of walking the tree, we are building it as we go
  - Same idea: one method for each non-terminal…
  - …except that now each method writes values to the table instead of reading from it
  - Calling a parse method creates an empty "node" in the tree by using the next free row in the table
    - Requires us to keep track of the rows being used
    - (Also requires us to have a big table or grow it dynamically)
    - Ignore this for now: there's a better approach we'll focus on once we have the idea down

SLIDE 13

Recursive Descent Parsing Example

void parseIf(int n, int[][] pt)
    pt[n,1] = 8
    String s = t.currentToken()   // should be "if"
    t.nextToken()                 // consume the token
    pt[n,3] = currentRow++
    parseCond(pt[n,3], pt)
    pt[n,4] = currentRow++
    t.nextToken()                 // consume the "then" token
    parseSS(pt[n,4], pt)
    s = t.currentToken()
    if (s is "else") then
        t.nextToken()             // consume the token
        pt[n,2] = 2               // indicate we're using the second expansion
        pt[n,5] = currentRow++
        parseSS(pt[n,5], pt)
    else
        pt[n,2] = 1               // indicate we're using the first expansion
    t.nextToken()                 // why do this?
    t.nextToken()                 // and this?
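
The parse procedures rely on a tokenizer `t` that the slides do not define. A minimal Python sketch of one possible shape is given below; the class, its regular expression, and the `"EOF"` sentinel are assumptions for illustration, not the course's tokenizer. It separates looking at the current token from consuming it, which is why the parse code calls `nextToken()` once for every terminal it steps over (the last two calls in `parseIf` presumably step over `end` and `;`).

```python
import re


class Tokenizer:
    """Hypothetical tokenizer offering the two operations the slides use."""

    # Identifiers/keywords, integers, or any single symbol character.
    TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z0-9]*|\d+|[^\sA-Za-z0-9]")

    def __init__(self, source):
        self.tokens = self.TOKEN_RE.findall(source)
        self.pos = 0

    def current_token(self):
        """Return the token under the cursor without consuming it."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "EOF"

    def next_token(self):
        """Advance the cursor past the current token."""
        if self.pos < len(self.tokens):
            self.pos += 1
```

For the input `"write Y; end"` the successive current tokens are `write`, `Y`, `;`, `end`, and then `EOF` once the input is exhausted.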

SLIDE 14

Recursive Descent Parsing

• Are you feeling good about this code?
  - As an algorithm it's fine, but as far as code goes it leaves a bit to be desired
• The code suffers from a severe lack of abstraction
  - We're talking about trees but operating on a table
  - Why aren't we operating on a tree?
• Let's talk about an approach that uses a bit more abstraction
  - Encapsulate the data into a parse tree class
  - Hide our operations a bit: let the parse tree class take care of details while we focus on the bigger picture
  - A dip into object-oriented design

SLIDE 15

Parse Tree Class Design

• Let's think about the interface
  - We're going to have a tree with a cursor: a means of moving from node to node in the tree
  - For each node we need to store:
    - The non-terminal identity
    - The rule alternative used in the expansion of this non-terminal
  - For the cursor we need to be able to:
    - Move it to each child (child 1, 2 and 3)
    - Move it back up to the parent node
  - We need to be able to check:
    - Is there a child?
    - Is there a parent (i.e., are we at the root node?)

SLIDE 16

Parse Tree Class Design – Interface1

interface ParseTree
    // To get the contents of the node
    int getIdentity()
    int getAlternative()
    // To get the number of children
    int getChildCount()
    // To find out if it is the root
    boolean hasParent()
    // To move the cursor
    void moveToChild(int index)
    void moveToParent()
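
One way such a cursor-based tree could be realized is sketched below in Python. The `_Node` helper class and the error handling are assumptions; the three construction helpers at the bottom mirror operations the deck introduces later for parsing and are included only so the example can build a tree.

```python
class _Node:
    """One parse tree node; identity and alternative are integer codes."""

    def __init__(self, parent=None):
        self.identity = 0
        self.alternative = 0
        self.parent = parent
        self.children = []


class ParseTree:
    def __init__(self):
        self._cursor = _Node()  # the cursor starts at the root

    def get_identity(self):
        return self._cursor.identity

    def get_alternative(self):
        return self._cursor.alternative

    def get_child_count(self):
        return len(self._cursor.children)

    def has_parent(self):
        return self._cursor.parent is not None

    def move_to_child(self, index):
        # Children are numbered from 1, matching the slides.
        if not 1 <= index <= len(self._cursor.children):
            raise IndexError(f"no child {index} at this node")
        self._cursor = self._cursor.children[index - 1]

    def move_to_parent(self):
        if self._cursor.parent is None:
            raise ValueError("cursor is already at the root")
        self._cursor = self._cursor.parent

    # Construction helpers, assumed here so the example can build a tree.
    def add_child(self):
        self._cursor.children.append(_Node(parent=self._cursor))

    def set_identity(self, ident):
        self._cursor.identity = ident

    def set_alternative(self, alternative):
        self._cursor.alternative = alternative
```

Because every operation acts on the node under the cursor, callers must pair each `move_to_child` with a `move_to_parent` to leave the tree where they found it.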

SLIDE 17

Recursive Descent Example (ParseTree)

void printIf(ParseTree pt)
    print("if")
    pt.moveToChild(1)
    printCond(pt)
    print("then")
    pt.moveToParent()
    pt.moveToChild(2)
    printSS(pt)
    pt.moveToParent()
    if (pt.getAlternative() == 2) then
        print("else")
        pt.moveToChild(3)
        printSS(pt)
        pt.moveToParent()   // set it back at the if node
    print("end;")

SLIDE 18

ParseTree Interface Design

• For dealing with variable assignment we need some more operations
  - If we're at an <id> node we need the id's name and value
  - Add a few more methods to the interface:

    // get the id string if we are at an id node
    String getIdString()
    // set the id's numeric value if we are at an id node
    void setIdValue(int value)
    // get the numeric value for an id at an id node
    int getIdValue()

SLIDE 19

Recursive Descent Example (ParseTree)

void execAssign(ParseTree pt)
    pt.moveToChild(2)      // move to the expression to evaluate
    int value = execExpr(pt)
    pt.moveToParent()
    pt.moveToChild(1)      // move to the ID node to store the value
    pt.setIdValue(value)
    pt.moveToParent()      // restore our cursor to the top of the assign

SLIDE 20

ParseTree Interface Parsing

• What about parsing?
  - For parsing we need to be able to:
    - Add nodes
    - Set the content of nodes
  - Need more operations to be able to do that:

    // add another child to the current node
    void addChild()
    // to set the contents of the node
    void setIdentity(int ident)
    void setAlternative(int alternative)

SLIDE 21

Recursive Descent Parsing Example (ParseTree)

void parseAssign(ParseTree pt)
    pt.setIdentity(7)      // set it to an assignment
    pt.setAlternative(1)   // use expansion 1
    pt.addChild()
    pt.addChild()          // add two children for the assignment node
    pt.moveToChild(1)
    parseID(pt)
    t.nextToken()          // why are we doing this?
    pt.moveToParent()
    pt.moveToChild(2)
    parseExpr(pt)
    t.nextToken()          // why?
    pt.moveToParent()      // why?

SLIDE 22

Recursive Descent Parsing

• Okay, so we have a bit more abstraction
  - Still not great: we can do better
• Let's make this all more object-oriented
  - Right now we're treating the whole tree like an object
  - Let's make each node an object instead
  - Make a separate class for each non-terminal
    - Build printing, parsing and executing logic into each non-terminal class
    - Build the children a node can have into each class, as fields

SLIDE 23

Node Class Design – Interfaces

interface programNode
    void parseProgram(Tokenizer t)
    void printProgram()
    void execProgram()

interface ifNode
    void parseIf(Tokenizer t)
    void printIf()
    void execIf()

interface stmtNode
    void parseStmt(Tokenizer t)
    void printStmt()
    void execStmt()

SLIDE 24

Recursive Descent Example (ProgramNode)

public class ProgramNode:
    private:
        DeclSeqNode ds
        StmtSeqNode ss
    public:
        ProgramNode()
            this.ds = new DeclSeqNode()
            this.ss = new StmtSeqNode()

        void parseProgram(Tokenizer t)
            t.nextToken()   // why?
            ds.parseDeclSeq(t)
            t.nextToken()   // why?
            ss.parseStmtSeq(t)
            t.nextToken()   // why?

        void printProgram()
            print("program")
            ds.printDeclSeq()
            print("begin")
            ss.printStmtSeq()
            print("end")

        void execProgram()
            ds.execDeclSeq()
            ss.execStmtSeq()

SLIDE 25

Recursive Descent Example (IfNode)

public class IfNode:
    private:
        CondNode condition
        StmtSeqNode thenSeq
        StmtSeqNode elseSeq
        int altNo
    public:
        IfNode()
            this.condition = new CondNode()
            this.thenSeq = new StmtSeqNode()
            this.elseSeq = null
            this.altNo = 1

        void parseIf(Tokenizer t)
            t.nextToken()   // why?
            condition.parseCondition(t)
            t.nextToken()   // why?
            thenSeq.parseStmtSeq(t)
            String token = t.currentToken()
            if (token is "else") then
                t.nextToken()   // why?
                this.altNo = 2
                elseSeq = new StmtSeqNode()
                elseSeq.parseStmtSeq(t)
            t.nextToken()
            t.nextToken()   // why?

SLIDE 26

Recursive Descent Example (IfNode continued)

void printIf()
    print("if")
    condition.printCondition()
    print("then")
    thenSeq.printStmtSeq()
    if (altNo == 2) then
        print("else")
        elseSeq.printStmtSeq()
    print("end;")

void execIf()
    bool c = condition.evalCondition()
    if (c) then
        thenSeq.execStmtSeq()
    else if (altNo == 2) then
        elseSeq.execStmtSeq()

SLIDE 27

Recursive Descent Example (StmtNode)

public class StmtNode:
    private:
        AssignNode assign
        IfNode ifNode
        LoopNode loop
        InputNode input
        OutputNode output
        int altNo
    public:
        StmtNode()
            this.assign = null
            this.ifNode = null
            this.loop = null
            this.input = null
            this.output = null
            this.altNo = 1

        void parseStmt(Tokenizer t)
            String tok = t.currentToken()
            if (tok is an id)
                assign = new AssignNode()
                altNo = 1
                assign.parseAssign(t)
            else if (tok is "if")
                ifNode = new IfNode()
                altNo = 2
                ifNode.parseIf(t)
            else if (tok is "loop")
                loop = new LoopNode()
                altNo = 3
                loop.parseLoop(t)
            …

Note: parseStmt is incomplete.

SLIDE 28

Recursive Descent Example (StmtNode continued)

void printStmt()
    if (altNo == 1)
        assign.printAssign()
    …

void execStmt()
    if (altNo == 1)
        assign.execAssign()
    …

Note: neither function here is complete.
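
The dispatch idea behind StmtNode can be sketched in runnable Python as follows. Everything here is a simplified assumption: the `Tokens` class stands in for the tokenizer, only two statement kinds are handled, an assignment's right-hand side is restricted to an integer literal, and alternative number 5 for output is made up. Unlike the slides, this sketch keeps one `child` field rather than one field per statement kind.

```python
class Tokens:
    """Hypothetical stand-in for the slides' Tokenizer."""

    def __init__(self, tokens):
        self.tokens, self.pos = list(tokens), 0

    def current_token(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next_token(self):
        self.pos += 1


class AssignNode:
    """Simplified rule: <assign> ::= id = int ;"""

    def parse(self, t):
        self.name = t.current_token()
        t.next_token()
        t.next_token()                      # consume "="
        self.value = int(t.current_token())
        t.next_token()
        t.next_token()                      # consume ";"

    def execute(self, env, out):
        env[self.name] = self.value


class OutputNode:
    """Simplified rule: <out> ::= write id ;"""

    def parse(self, t):
        t.next_token()                      # consume "write"
        self.name = t.current_token()
        t.next_token()
        t.next_token()                      # consume ";"

    def execute(self, env, out):
        out.append(env[self.name])


class StmtNode:
    def __init__(self):
        self.child, self.alt_no = None, 0

    def parse(self, t):
        tok = t.current_token()             # look ahead without consuming
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "write":                  # keywords are checked before ids
            self.alt_no, self.child = 5, OutputNode()
        elif tok.isidentifier():
            self.alt_no, self.child = 1, AssignNode()
        else:
            raise SyntaxError(f"unexpected token {tok!r}")
        self.child.parse(t)                 # the child consumes its own tokens

    def execute(self, env, out):
        self.child.execute(env, out)


def run(tokens):
    """Parse and execute statements until the input is exhausted."""
    t, env, out = Tokens(tokens), {}, []
    while t.current_token() is not None:
        stmt = StmtNode()
        stmt.parse(t)
        stmt.execute(env, out)
    return env, out
```

For example, `run(["Y", "=", "20", ";", "write", "Y", ";"])` returns `({"Y": 20}, [20])`.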

SLIDE 29

Identifier and Assign Nodes

• In this approach, we need to consider Identifier and Assign nodes a bit differently
  - Need to make sure that each Identifier is only created once
  - Later uses of the same Id should refer to the same Identifier object
  - Need to make sure that assignment works properly
  - Recall the symbol table from our earlier discussion? We need something to replace that

SLIDE 30

Recursive Descent Example (IdNode)

public class IdNode:
    private:
        String name
        int value
        bool initialized
        static Map<String, IdNode> symTab

        IdNode(String n)   // private: nodes are created only through parseId
            this.name = n
            this.initialized = false

    public:
        static IdNode parseId(Tokenizer t)
            String tok = t.currentToken()
            t.nextToken()
            if (tok not in symTab)
                IdNode node = new IdNode(tok)
                symTab[tok] = node
            return symTab[tok]

        void setValue(int v)
            this.value = v
            this.initialized = true

        int getValue()
            return this.value

        String getName()
            return this.name
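
A minimal Python sketch of the shared-identifier idea is shown below. The class-level dictionary plays the role of the static `symTab`; passing the tokens as a plain list and raising an error on an uninitialized read are assumptions added for illustration.

```python
class IdNode:
    _sym_tab = {}  # shared by all IdNode objects, like the static symTab

    def __init__(self, name):
        self.name = name
        self.value = 0
        self.initialized = False

    @classmethod
    def parse_id(cls, tokens):
        """Consume one identifier from the token list; return its unique node."""
        name = tokens.pop(0)
        if name not in cls._sym_tab:
            cls._sym_tab[name] = cls(name)  # first use: create the node
        return cls._sym_tab[name]           # later uses: reuse the same node

    def set_value(self, v):
        self.value = v
        self.initialized = True

    def get_value(self):
        # Assumed error check; the slides leave error handling out.
        if not self.initialized:
            raise RuntimeError(f"{self.name} used before assignment")
        return self.value
```

Because both occurrences of an identifier resolve to the same object, a value stored through one reference is visible through the other: after `a = IdNode.parse_id(["Y"])` and `b = IdNode.parse_id(["Y"])`, calling `a.set_value(20)` makes `b.get_value()` return 20.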

SLIDE 31

Recursive Descent Example (AssignNode)

public class AssignNode:
    private:
        IdNode id
        ExprNode expr
    public:
        AssignNode()
            this.id = null
            this.expr = new ExprNode()

        void parseAssign(Tokenizer t)
            id = IdNode.parseId(t)
            t.nextToken()   // why?
            this.expr.parseExpr(t)
            t.nextToken()

        void printAssign()
            print(id.getName())
            print("=")
            this.expr.printExpr()
            print(";")

        void execAssign()
            int value = this.expr.evalExpr()
            id.setValue(value)
