anonymous
anonymous

Reputation: 19

Converting a CFG to CNF

I am trying to convert a CFG to a CNF, but I am unsure what to identify as 'variables'. Here is the problem:

S -> aA | ABa 
A -> AA | a  
B -> AbA | bb 

I have added a new start variable to make it

S' -> S  
S -> aA | ABa 
A -> AA | a  
B -> AbA | bb 

Then, after unit production removal, it is:

S' -> aA | ABa 
S -> aA | ABa 
A -> AA | a  
B -> AbA | bb

I know the next step would be to change any production that have more than 2 variables, but is ABa three variables? Or is it two variables and a terminal?

If it is two variables and a terminal, to finally simplify it, am I able to create something like this:

S' -> aA | Xa  
S -> aA | Xa  
A -> AA | a  
B -> Yb | bb  
X -> AB
Y -> AA 

Thanks!

Upvotes: 0

Views: 4352

Answers (1)

Nikolay Handzhiyski
Nikolay Handzhiyski

Reputation: 1579

The grammars in the Chomsky normal form has the following production formats:

A → BC, or
A → a,  or
S → ε,

It is made of symbols (a, A...). These symbols are in two sets: terminal symbols (as a and b, lower case letters, both part of the alphabet) and non terminal symbols (as A and B, upper case letters). The grammar also has productions (like S → a), and a starting symbol (a non-terminal) S.

You start from this grammar (there is no need of new starting rule):

S → aA | ABa
A → AA | a
B → AbA | bb 

Separate the concatenations into their own productions:

S → aA 
S → ABa
A → AA
A → a
B → AbA
B → bb 

Move all terminal symbols into their own non-terminal symbols (a into C (but keep the a in A, because its already valid in the CNF) and b into D):

S → CA 
S → ABC
A → AA
A → a
B → ADA
B → DD
C → a
D → b

Split the the right hand sides where there are more than two non-terminal symbols:

S → CA 
S → AE
A → AA
A → a
B → AF
B → DD
C → a
D → b
E → BC
F → DA

Then you are done.

Upvotes: 0

Related Questions