Parsing Parsers Jenna Zeigen JSConf Hawaii 2/5/2020



slide-1
SLIDE 1 @zeigenvector jenna.is/at-jsconfhi

Jenna Zeigen JSConf Hawaii 2/5/2020

Parsing Parsers

slide-2
SLIDE 2 @zeigenvector jenna.is/at-jsconfhi

Senior Frontend Engineer at Slack Organizer of EmpireJS Organizer of BrooklynJS

slide-3
SLIDE 3 @zeigenvector jenna.is/at-jsconfhi

@zeigenvector jenna.is/at-jsconfhi

slide-4
SLIDE 4 @zeigenvector jenna.is/at-jsconfhi

parsing parsers!

slide-5
SLIDE 5 @zeigenvector jenna.is/at-jsconfhi
  • 1. abcs of language
slide-6
SLIDE 6 @zeigenvector jenna.is/at-jsconfhi
  • 1. abcs of language
  • 2. hmm, actually, let's just step through a (small) parser

slide-7
SLIDE 7 @zeigenvector jenna.is/at-jsconfhi

the abcs of language

slide-8
SLIDE 8 @zeigenvector jenna.is/at-jsconfhi Justin Bieber - What Do You Mean? https://en.wikipedia.org/wiki/Language

the abcs of language

"language" is a structured system of communication

First you're up and you’re down And then between Oh I really want to know What do you mean? Ooh

♪♪♪

slide-9
SLIDE 9 @zeigenvector jenna.is/at-jsconfhi No Doubt - Don't Speak https://en.wikipedia.org/wiki/Language

the abcs of language

"natural language" is a naturally evolved system that humans use to communicate with each other

You speak And I know just what You're what you're sayin'

♪♪♪

slide-10
SLIDE 10 @zeigenvector jenna.is/at-jsconfhi Dua Lipa - New Rules https://en.wikipedia.org/wiki/Formal_language

the abcs of language

"formal languages" have an alphabet and words, which can be combined correctly based on specific rules

I got new rules, I count 'em I got new rules, I count 'em

♪♪♪

🎁

slide-11
SLIDE 11 @zeigenvector jenna.is/at-jsconfhi MC Hammer - U Can't Touch This https://en.wikipedia.org/wiki/Formal_grammar

grammar school

a language's grammar is the set of rules for that language

Stop! Grammar time!

♪♪♪

slide-12
SLIDE 12 @zeigenvector jenna.is/at-jsconfhi

grammar school

"formal grammars" put these rules in terms of replacement

To the left, to the left To the left, to the left (Mmm) To the left, to the left Non-terminals in the spot to the left To the left, to the left The grammar tells us for what symbols They are replaceable

♪♪♪

Beyoncé - Irreplaceable https://en.wikipedia.org/wiki/Formal_grammar

🎁

slide-13
SLIDE 13 @zeigenvector jenna.is/at-jsconfhi

grammar school

Jenna gave the talk

Sentence Verb Phrase Noun Phrase Noun Verb Noun Direct Object

slide-14
SLIDE 14 @zeigenvector jenna.is/at-jsconfhi

grammar school

Sentence = Noun + Verb Phrase
Verb Phrase = Verb + Noun Phrase
Noun Phrase = Direct Object + Noun
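To make "rules in terms of replacement" concrete, here is a minimal sketch that stores the three rules above as data and keeps replacing the leftmost non-terminal until only word categories remain. The names `sentenceRules` and `derive` are made up for illustration, and the symbol names simply follow the slide's labels.

```javascript
// Hypothetical encoding of the slide's rules: each non-terminal maps to
// the sequence of symbols that replaces it.
const sentenceRules = new Map([
  ['Sentence', ['Noun', 'VerbPhrase']],
  ['VerbPhrase', ['Verb', 'NounPhrase']],
  ['NounPhrase', ['DirectObject', 'Noun']],
]);

// Repeatedly replace the leftmost non-terminal, recording every
// intermediate step of the derivation.
function derive(start) {
  let current = [start];
  const steps = [current];
  while (current.some((symbol) => sentenceRules.has(symbol))) {
    const index = current.findIndex((symbol) => sentenceRules.has(symbol));
    current = [
      ...current.slice(0, index),
      ...sentenceRules.get(current[index]),
      ...current.slice(index + 1),
    ];
    steps.push(current);
  }
  return steps;
}

const derivationSteps = derive('Sentence');
for (const step of derivationSteps) {
  console.log(step.join(' '));
}
```

Each printed line is one replacement further along; the last line contains only symbols that no rule can replace.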

slide-15
SLIDE 15 @zeigenvector jenna.is/at-jsconfhi Ariana Grande - thank u, next https://www.ecma-international.org/ecma-262/10.0/index.html

grammar school

Programming language grammars are defined in their spec

thank u, spec thank u, spec thank u, spec I'm so very grateful for my spec

♪♪♪

slide-16
SLIDE 16 @zeigenvector jenna.is/at-jsconfhi

syntax city

slide-17
SLIDE 17 @zeigenvector jenna.is/at-jsconfhi

syntax city

javascript "front end" in:#random in: #general from:@jenna

slide-18
SLIDE 18 @zeigenvector jenna.is/at-jsconfhi

syntax city

javascript "front end" in:#random in: #general from:@jenna

Query → Term
Query → Term Query
Query → Filter
Query → Filter Query

slide-19
SLIDE 19 @zeigenvector jenna.is/at-jsconfhi

in:#random javascript in: #general from:@jenna

syntax city

"front end"

slide-20
SLIDE 20 @zeigenvector jenna.is/at-jsconfhi
ok, now

parsers

slide-21
SLIDE 21 @zeigenvector jenna.is/at-jsconfhi

moving parse

Miley Cyrus - Party in the USA https://en.wikipedia.org/wiki/Parsing

the process of analyzing language against the rules of its grammar

I got my rules up, And a bit of language Is its syntax okay? Yeah we're parsing in the USA

♪♪♪

slide-22
SLIDE 22 @zeigenvector jenna.is/at-jsconfhi

moving parse

Counting Crows - Mr. Jones https://dev.to/yelouafi/a-gentle-introduction-to-parser-combinators-21a0

a function that takes raw input and returns meaningful data created from the input, or an error

All the beautiful inputs Are very, very meaningful You know, space is my favorite delimiter I felt so symbolic yesterday

♪♪♪

slide-23
SLIDE 23 @zeigenvector jenna.is/at-jsconfhi

moving parse

https://en.wikipedia.org/wiki/Parsing

parsers usually have two parts: the lexer and the parser

lexer and parser making us a tree P-A-R-S-I-N-G

♪♪♪

slide-24
SLIDE 24 @zeigenvector jenna.is/at-jsconfhi Old Crow Medicine Show - Wagon Wheel https://tomassetti.me/guide-parsing-algorithms-terminology/

the lexer takes the text and breaks it down into meaningful units, called "tokens"

Reading through this code I've been asked to invoke Got a lexer out here first Made a nice short token

♪♪♪

lex go

slide-25
SLIDE 25 @zeigenvector jenna.is/at-jsconfhi

lex go

first, the "scanner" goes through and breaks the string of characters into the proper chunks, or "lexemes"

I was born to lex (Yes) According to the spec What amazing tech, Having this effect (Woo) And soon the parser will turn These strings into objects (Money)

♪♪♪

Cardi B - Money https://en.wikipedia.org/wiki/Lexical_analysis
slide-26
SLIDE 26 @zeigenvector jenna.is/at-jsconfhi

lex go

coding time

♪♪♪

Semisonic - Closing Time https://github.com/jennazee/sparse/blob/master/sparse.js https://jenna.is/sparse/sparse.html
slide-27
SLIDE 27 @zeigenvector jenna.is/at-jsconfhi

lex go

const lexemes = 'Jenna gave the talk'.split(' ');
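Splitting on spaces is enough for this sentence, but the search query from the "syntax city" slides contains a quoted phrase that has to stay in one piece. Below is a rough sketch of a hand-written scanner for that case. The function name `scan` and the choice to keep the quotes on the lexeme are assumptions for illustration, not the code from the sparse repo.

```javascript
// Hypothetical scanner: walks the input one character at a time and
// groups characters into lexemes. Whitespace separates lexemes, except
// inside double quotes, where it stays part of the phrase.
function scan(input) {
  const found = [];
  let current = '';
  let inQuotes = false;

  for (const char of input) {
    if (char === '"') {
      inQuotes = !inQuotes;
      current += char;
    } else if (/\s/.test(char) && !inQuotes) {
      if (current !== '') found.push(current);
      current = '';
    } else {
      current += char;
    }
  }
  if (current !== '') found.push(current);
  return found;
}

console.log(scan('Jenna gave the talk'));
console.log(scan('javascript "front end" in:#random in: #general from:@jenna'));
```

For the plain sentence this gives the same four chunks as `split(' ')`; for the query, `"front end"` comes out as a single lexeme.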

slide-28
SLIDE 28 @zeigenvector jenna.is/at-jsconfhi

lex go

"Jenna gave the talk" . split ( ' ' ) ; const lexemes =

slide-29
SLIDE 29 @zeigenvector jenna.is/at-jsconfhi

lex go

The Notorious B.I.G. – Sky's The Limit https://en.wikipedia.org/wiki/Lexical_analysis

then, the "evaluator" combines the lexeme's type with its value to create the "token"

I then begin to encounter with my parse, To split the text apart Break it down into sections Tokens from the lexemes

♪♪♪
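A small sketch of what an evaluator might look like for the search-query example: it looks at each lexeme, decides on a type, and pairs that type with a value. The type names `Modifier`, `Entity`, and `Term` come from the later slides; the classification rules themselves (and the name `evaluate`) are simplifying assumptions.

```javascript
// Hypothetical evaluator: turns one lexeme into a { type, value } token.
function evaluate(lexeme) {
  if (lexeme === 'in:' || lexeme === 'from:') {
    return { type: 'Modifier', value: lexeme };
  }
  // Assumption: anything starting with # or @ names a channel or a user.
  if (lexeme.startsWith('#') || lexeme.startsWith('@')) {
    return { type: 'Entity', value: lexeme };
  }
  // Everything else is a search term; drop surrounding double quotes.
  return { type: 'Term', value: lexeme.replace(/^"(.*)"$/, '$1') };
}

const queryLexemes = ['javascript', '"front end"', 'in:', '#random', 'from:', '@jenna'];
const queryTokens = queryLexemes.map(evaluate);
console.log(queryTokens);
```

The lexeme list is written out by hand here so the example stands alone; in a real lexer it would be the scanner's output.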

slide-30
SLIDE 30 @zeigenvector jenna.is/at-jsconfhi

lex go

coding time

♪♪♪

Semisonic - Closing Time https://github.com/jennazee/sparse/blob/master/sparse.js https://jenna.is/sparse/sparse.html
slide-31
SLIDE 31 @zeigenvector jenna.is/at-jsconfhi

lex go

"Jenna gave the talk" . split ( ' ' ) ; const lexemes =

slide-32
SLIDE 32 @zeigenvector jenna.is/at-jsconfhi

lex go

https://esprima.org/demo/parse.html#

Punctuator Identifier Punctuator String Punctuator Punctuator Keyword Punctuator Identifier String

slide-33
SLIDE 33 @zeigenvector jenna.is/at-jsconfhi

lex go

https://esprima.org/demo/parse.html#

[
  { "type": "Keyword", "value": "const" },
  { "type": "Identifier", "value": "lexemes" },
  { "type": "Punctuator", "value": "=" },
  { "type": "String", "value": "'Jenna gave the talk'" },
  { "type": "Punctuator", "value": "." },
  { "type": "Identifier", "value": "split" },
  { "type": "Punctuator", "value": "(" },
  { "type": "String", "value": "' '" },
  { "type": "Punctuator", "value": ")" },
  { "type": "Punctuator", "value": ";" }
]

weird lex but ok
slide-34
SLIDE 34 @zeigenvector jenna.is/at-jsconfhi

parse for the course

the parser will check that the syntax is correct while creating a structural representation

CHVRCHES ft. Marshmello - Here With Me https://www.geeksforgeeks.org/introduction-of-parsing-ambiguity-and-parsers-set-1/

Every single word Is perfect as it can be And I put it in a tree

♪♪♪

slide-35
SLIDE 35 @zeigenvector jenna.is/at-jsconfhi

parse for the course

Term Term Query FromToken InToken InToken

javascript "front end" in:#random in: #general from:@jenna

slide-36
SLIDE 36 @zeigenvector jenna.is/at-jsconfhi

parse for the course

I know who I want To read my code (It's you!)

♪♪♪

Semisonic - Closing Time https://github.com/jennazee/sparse/blob/master/sparse.js https://jenna.is/sparse/sparse.html
slide-37
SLIDE 37 @zeigenvector jenna.is/at-jsconfhi

parse for the course

https://esprima.org/demo/parse.html#

Program VariableDeclaration VariableDeclarator String CallExpression MemberExpression Arguments Identifier Identifier String

slide-38
SLIDE 38 @zeigenvector jenna.is/at-jsconfhi

parse for the course

https://esprima.org/demo/parse.html#

VariableDeclarator String CallExpression MemberExpression Arguments Identifier Identifier String

const lexemes = 'Jenna gave the talk'.split(' ');

slide-39
SLIDE 39 @zeigenvector jenna.is/at-jsconfhi

parse for the course

{
  "type": "Program",
  "body": [
    {
      "type": "VariableDeclaration",
      "declarations": [
        {
          "type": "VariableDeclarator",
          "id": { "type": "Identifier", "name": "lexemes" },
          "init": {
            "type": "CallExpression",
            "callee": {
              "type": "MemberExpression",
              "computed": false,
              "object": { "type": "Literal", "value": "Jenna gave the talk", "raw": "'Jenna gave the talk'" },
              "property": { "type": "Identifier", "name": "split" }
            },
            "arguments": [ { "type": "Literal", "value": " ", "raw": "' '" } ]
          }
        }
      ],
      "kind": "const"
    }
  ]
}

Computers can have a little JavaScript, as a tree

https://esprima.org/demo/parse.html#
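To see the tree shape more directly, here is a small sketch that walks a hand-copied version of this syntax tree and prints each node's `type`, indented by depth. The object is trimmed to the fields the walk needs, and `collectTypes` is a made-up helper, not part of Esprima.

```javascript
// The tree from the slide, trimmed to the fields needed for the walk.
const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'const',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'lexemes' },
          init: {
            type: 'CallExpression',
            callee: {
              type: 'MemberExpression',
              computed: false,
              object: { type: 'Literal', value: 'Jenna gave the talk' },
              property: { type: 'Identifier', name: 'split' },
            },
            arguments: [{ type: 'Literal', value: ' ' }],
          },
        },
      ],
    },
  ],
};

// Depth-first walk: record a node's type, then visit any nested
// objects or arrays one level deeper.
function collectTypes(node, depth = 0, out = []) {
  if (Array.isArray(node)) {
    for (const child of node) collectTypes(child, depth, out);
  } else if (node !== null && typeof node === 'object') {
    let childDepth = depth;
    if (typeof node.type === 'string') {
      out.push('  '.repeat(depth) + node.type);
      childDepth = depth + 1;
    }
    for (const value of Object.values(node)) collectTypes(value, childDepth, out);
  }
  return out;
}

const typeLines = collectTypes(ast);
console.log(typeLines.join('\n'));
```

The indentation in the output mirrors the nesting of the JSON: `Program` at the top, the two `Literal` nodes near the leaves.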
slide-40
SLIDE 40 @zeigenvector jenna.is/at-jsconfhi

in: javascript in: from:

syntax city

"front end"

#general #random @jenna

slide-41
SLIDE 41 @zeigenvector jenna.is/at-jsconfhi

syntax city

javascript "front end" in:#random in: #general from:@jenna

Query → Term
Query → Term Query
Query → Filter
Query → Filter Query
Filter → Modifier Entity

slide-42
SLIDE 42 @zeigenvector jenna.is/at-jsconfhi

syntax city

Term Term Query Modifier Entity Modifier Entity Modifier FromToken InToken InToken

javascript "front end" in:#random in: #general from:@jenna

slide-43
SLIDE 43 @zeigenvector jenna.is/at-jsconfhi

parse for the course

Read my co-odeeee

♪♪♪

Semisonic - Closing Time https://github.com/jennazee/sparse/blob/master/parse.js https://jenna.is/sparse/parse.html
slide-44
SLIDE 44 @zeigenvector jenna.is/at-jsconfhi

the more complicated stuff...

slide-45
SLIDE 45 @zeigenvector jenna.is/at-jsconfhi

advanced grammar school

/in: ?([^ ]+)|from: ?([^ ]+)|"([^"]+)"|\'([^\']+)\'|([^ ]+)/
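A sketch of how a pattern like this could drive a lexer, assuming it is meant as five alternatives: an `in:` filter, a `from:` filter, a double-quoted phrase, a single-quoted phrase, and a bare word. The `g` flag, the helper name `tokenize`, and the mapping from capture groups to the `InToken` / `FromToken` / `Term` labels used on the earlier slides are all assumptions.

```javascript
// Assumed reading of the slide's pattern, with the global flag added so
// that matchAll can step through the whole query.
const queryPattern = /in: ?([^ ]+)|from: ?([^ ]+)|"([^"]+)"|'([^']+)'|([^ ]+)/g;

// Whichever capture group matched tells us what kind of token we found.
function tokenize(query) {
  const tokens = [];
  for (const match of query.matchAll(queryPattern)) {
    const [, inEntity, fromEntity, doubleQuoted, singleQuoted, bare] = match;
    if (inEntity !== undefined) {
      tokens.push({ type: 'InToken', value: inEntity });
    } else if (fromEntity !== undefined) {
      tokens.push({ type: 'FromToken', value: fromEntity });
    } else {
      tokens.push({ type: 'Term', value: doubleQuoted ?? singleQuoted ?? bare });
    }
  }
  return tokens;
}

const regexTokens = tokenize('javascript "front end" in:#random in: #general from:@jenna');
console.log(regexTokens);
```

Because the alternatives are tried left to right, the specific forms (`in:`, `from:`, quoted phrases) get a chance before the catch-all bare word.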

slide-46
SLIDE 46 @zeigenvector jenna.is/at-jsconfhi https://en.wikipedia.org/wiki/Regular_grammar

A "regular grammar" is one where all the production rules are one of the following:

A → a
A → aB

advanced grammar school

slide-47
SLIDE 47 @zeigenvector jenna.is/at-jsconfhi

advanced grammar school

A → a
A → aB

Query → Term
Query → Term Query
Query → Filter
Query → Query Filter

slide-48
SLIDE 48 @zeigenvector jenna.is/at-jsconfhi

A → a
A → Ba

Query → Query Filter
Filter → Modifier Entity

advanced grammar school

slide-49
SLIDE 49 @zeigenvector jenna.is/at-jsconfhi

A → a
A → Ba

Query → Query Filter
Filter → Modifier Entity

advanced grammar school

oh no
slide-50
SLIDE 50 @zeigenvector jenna.is/at-jsconfhi https://en.wikipedia.org/wiki/Context-free_grammar

A "context-free grammar" has rules of the form A → α, where A is a single non-terminal and α is any combination of terminals and non-terminals

advanced grammar school

slide-51
SLIDE 51 @zeigenvector jenna.is/at-jsconfhi https://en.wikipedia.org/wiki/Context-free_grammar#Well-formed_parentheses

S → SS
S → ()
S → (S)
S → []
S → [S]
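A quick sketch of a recognizer for the language this grammar generates (non-empty, properly nested strings of `()` and `[]`). It uses a stack rather than following the production rules one by one, which is a swapped-in technique that accepts the same strings. The name `isWellFormed` is made up.

```javascript
// Stack-based recognizer: every opener pushes the closer it expects,
// and every closer must match the most recent expectation.
function isWellFormed(text) {
  const closerFor = new Map([
    ['(', ')'],
    ['[', ']'],
  ]);
  const expected = [];

  for (const char of text) {
    if (closerFor.has(char)) {
      expected.push(closerFor.get(char));
    } else if (expected.pop() !== char) {
      return false; // wrong closer, stray closer, or a foreign character
    }
  }
  // The grammar has no rule that produces the empty string.
  return text.length > 0 && expected.length === 0;
}

console.log(isWellFormed('([])[]')); // true
console.log(isWellFormed('([)]'));   // false
```

The stack is the important part: remembering an arbitrary depth of open brackets is exactly what the two regular-grammar rule shapes cannot express.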

advanced grammar school

slide-52
SLIDE 52 @zeigenvector jenna.is/at-jsconfhi

<div class="Wrapper">
  <input class="Input"/>
  <div class="Visualizer">
    <div class="Token">Grammars!</div>
  </div>
</div>

advanced grammar school

slide-53
SLIDE 53 @zeigenvector jenna.is/at-jsconfhi

real world parsing

slide-54
SLIDE 54 @zeigenvector jenna.is/at-jsconfhi

in: javascript "front end" #random in: #general from: @jenna

real world: parsers

slide-55
SLIDE 55 @zeigenvector jenna.is/at-jsconfhi

Modifier Term Term Entity

real world: parsers

Modifier Entity Modifier Entity

slide-56
SLIDE 56 @zeigenvector jenna.is/at-jsconfhi Ashlee Simpson - Pieces of Me https://www.cs.umd.edu/~mvz/cmsc430-s07/M08topdown.pdf

then, the parser goes through and matches the tokens to production rules

real world: parsers

It's as if you know me better Than I ever knew myself I love how you can tell All the pieces, pieces, pieces of me

♪♪♪
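Here is one way matching tokens to production rules could look in code: a small top-down (recursive descent) sketch for the Query grammar from the "syntax city" slides. It is an illustration under assumptions, not the code from the sparse repo; `parseQuery`, the token shape, and the tree node shape are all made up.

```javascript
// Grammar being followed:
//   Query  → Term | Term Query | Filter | Filter Query
//   Filter → Modifier Entity
function parseQuery(tokens, position = 0) {
  if (position >= tokens.length) {
    throw new SyntaxError('Expected a Term or a Filter but the query ended');
  }

  const token = tokens[position];
  let node;
  let next;

  if (token.type === 'Term') {
    node = { type: 'Term', value: token.value };
    next = position + 1;
  } else if (token.type === 'Modifier') {
    // Filter → Modifier Entity: a Modifier must be followed by an Entity.
    const entity = tokens[position + 1];
    if (!entity || entity.type !== 'Entity') {
      throw new SyntaxError(`Expected an Entity after "${token.value}"`);
    }
    node = { type: 'Filter', modifier: token.value, entity: entity.value };
    next = position + 2;
  } else {
    throw new SyntaxError(`Unexpected ${token.type} "${token.value}"`);
  }

  // Query → Term Query | Filter Query: recurse if tokens remain.
  const children = [node];
  if (next < tokens.length) children.push(parseQuery(tokens, next));
  return { type: 'Query', children };
}

const searchTokens = [
  { type: 'Term', value: 'javascript' },
  { type: 'Term', value: 'front end' },
  { type: 'Modifier', value: 'in:' },
  { type: 'Entity', value: '#random' },
  { type: 'Modifier', value: 'from:' },
  { type: 'Entity', value: '@jenna' },
];
const queryTree = parseQuery(searchTokens);
console.log(JSON.stringify(queryTree, null, 2));
```

The parser does both jobs from the earlier slide at once: it throws a `SyntaxError` when the tokens do not fit a rule, and it returns a nested `Query` tree when they do.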

slide-57
SLIDE 57 @zeigenvector jenna.is/at-jsconfhi

real world: parsers

Modifier Entity Term Modifier Entity Modifier Entity Term

https://www.cs.umd.edu/~mvz/cmsc430-s07/M08topdown.pdf
slide-58
SLIDE 58 @zeigenvector jenna.is/at-jsconfhi

real world: parsers

Modifier Entity Term Modifier Entity Modifier Entity Term Query

https://www.cs.umd.edu/~mvz/cmsc430-s07/M08topdown.pdf
slide-59
SLIDE 59 @zeigenvector jenna.is/at-jsconfhi

real world: parsers

Modifier Entity Term Modifier Entity Modifier Entity Query Term Query

https://www.cs.umd.edu/~mvz/cmsc430-s07/M08topdown.pdf
slide-60
SLIDE 60 @zeigenvector jenna.is/at-jsconfhi

real world: parsers

Modifier Entity Modifier Entity Modifier Entity Term Query Term Query Query

https://www.cs.umd.edu/~mvz/cmsc430-s07/M08topdown.pdf
slide-61
SLIDE 61 @zeigenvector jenna.is/at-jsconfhi

real world: parsers

Entity Modifier Entity Modifier Entity Term Query Term Query Query Query Term

https://www.cs.umd.edu/~mvz/cmsc430-s07/M08topdown.pdf

Modifier

slide-62
SLIDE 62 @zeigenvector jenna.is/at-jsconfhi

real world: parsers

Entity Modifier Entity Modifier Entity Term Query Term Query Query Query Term

oh no

Modifier

slide-63
SLIDE 63 @zeigenvector jenna.is/at-jsconfhi

real world: parsers

Entity Modifier Entity Modifier Entity Term Query Term Query Query Query Filter

https://www.cs.umd.edu/~mvz/cmsc430-s07/M08topdown.pdf

Modifier

slide-64
SLIDE 64 @zeigenvector jenna.is/at-jsconfhi

real world: parsers

Entity Modifier Entity Entity Term Modifier Query Term Query Query Query Filter Modifier

slide-65
SLIDE 65 @zeigenvector jenna.is/at-jsconfhi

Grammars! Lexers! Tokens! Parsers! Trees!

slide-66
SLIDE 66 @zeigenvector jenna.is/at-jsconfhi

from: @zeigenvector "thank you" "JSConf Hawaii" jenna.is/at-jsconfhi

slide-67
SLIDE 67 @zeigenvector jenna.is/at-jsconfhi MC Hammer - U Can't Touch This https://en.wikipedia.org/wiki/Noam_Chomsky https://en.wikipedia.org/wiki/Chomsky_hierarchy

extra credit

the "Chomsky hierarchy" describes different classes of formal grammars

Stop! Grammar time!

♪♪♪

slide-68
SLIDE 68 @zeigenvector jenna.is/at-jsconfhi https://en.wikipedia.org/wiki/Chomsky_hierarchy

extra credit

Type 0: Recursively Enumerable
Type 1: Context-Sensitive
Type 2: Context-Free
Type 3: Regular