SLIDE 1
CPSC 313: Chomsky Normal Form
October 25, 2020
Motivation
We want a simple standard format to describe the productions of a grammar so that we can do proofs and constructions more readily on the grammar. We must then show that every context-free grammar can be converted into this normal form.

In Chomsky Normal Form (CNF) all productions are of one of the following three types:
- A → BC (A, B, C ∈ V)
- A → a (A ∈ V and a ∈ Σ)
- S → ε (where S ∈ V is the start symbol)
In addition, S does not occur on the right-hand side of any production: (T, α) ∈ P ⇒ n_S(α) = 0, where n_S(α) is the number of occurrences of S in α.
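The definition above can be checked mechanically. The following is a minimal sketch, assuming a production is represented as a (head, body) pair in which body is a tuple of symbols and the empty tuple stands for ε. The function name is_cnf and this representation are illustrative choices, not part of the course material.

```python
def is_cnf(productions, variables, terminals, start):
    """Return True if every production has one of the three CNF shapes.

    productions: iterable of (head, body) pairs; body is a tuple of
    symbols, and the empty tuple stands for epsilon (assumed encoding).
    """
    for head, body in productions:
        if head not in variables:
            return False
        if start in body:
            return False  # S may not occur on any right-hand side
        if len(body) == 2 and all(sym in variables for sym in body):
            continue  # shape A -> BC
        if len(body) == 1 and body[0] in terminals:
            continue  # shape A -> a
        if len(body) == 0 and head == start:
            continue  # shape S -> epsilon
        return False  # anything else violates CNF
    return True


V = {"S", "A", "B"}
SIGMA = {"a", "b"}
good = [("S", ()), ("S", ("A", "B")), ("A", ("a",)), ("B", ("b",))]
print(is_cnf(good, V, SIGMA, "S"))                 # True
print(is_cnf([("A", ("a", "B"))], V, SIGMA, "S"))  # False: mixes a terminal and a variable
```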
To convert a grammar into CNF we:
- remove S from the right-hand side of productions
- remove epsilon productions not from the start symbol (A → ε and A ≠ S)
- remove mixing of terminals and non-terminals on the right-hand side (A → abCD)
- remove more than one alphabet symbol on the right-hand side (A → aa)
- remove more than two variables on the right-hand side (A → ABC)
- remove unit productions, whose right-hand side is a single variable (A → B)
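As a rough illustration of the epsilon-removal step in the list, here is a minimal Python sketch. It assumes productions are (head, body) pairs with body a tuple of symbols and the empty tuple standing for ε, and it assumes S has already been removed from all right-hand sides. The names nullable_variables and remove_epsilon_productions are hypothetical helpers, not from the slides. The result may still contain unit productions, which a later step handles.

```python
from itertools import product


def nullable_variables(productions):
    """Return the set of variables A with A =>* epsilon (fixed-point iteration).

    productions must be a list or set, since it is scanned repeatedly.
    """
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # An empty body is epsilon itself; otherwise every symbol
            # in the body must already be known to be nullable.
            if head not in nullable and all(sym in nullable for sym in body):
                nullable.add(head)
                changed = True
    return nullable


def remove_epsilon_productions(productions, start):
    """Return an equivalent production set whose only epsilon rule is start -> epsilon."""
    nullable = nullable_variables(productions)
    result = set()
    for head, body in productions:
        # Each nullable symbol may be kept (True) or omitted (False).
        options = [(True, False) if sym in nullable else (True,) for sym in body]
        for keep in product(*options):
            new_body = tuple(s for s, k in zip(body, keep) if k)
            if new_body:  # never emit a new epsilon production here
                result.add((head, new_body))
    if start in nullable:
        result.add((start, ()))  # keep S -> epsilon when S =>* epsilon
    return result


rules = [("S", ("B",)), ("B", ("A", "B", "C")), ("B", ("b",)),
         ("A", ()), ("A", ("a",)), ("C", ("c",))]
for head, body in sorted(remove_epsilon_productions(rules, "S")):
    print(head, "->", " ".join(body))
```

With A nullable, the production B → ABC is kept and B → BC is added alongside it, while A → ε itself is dropped.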
That is, we remove any productions that violate the rules.

To remove the epsilon productions, we determine which variables can derive ε and, for every production whose right-hand side contains such a variable, add a copy of that production with the variable omitted. For example, if A ⇒⋆ ε and we have a production B → ABC, then we add the production B → BC. If S ⇒⋆ ε, we must keep S → ε as the base case, since it is the only epsilon production CNF allows.

For the unit productions (A → B), we allow A to derive everything B can derive. So if we had A → B and B → CD | aa, we add A → CD and A → aa and remove A → B.

To remove alphabet symbols that appear alongside other symbols in our rules, we promote each such symbol to a variable and include a rule from that variable to the symbol as its only derivation. If we had A → aB we can add a new rule Xa → a and replace