Reputation: 143
I'm still new to CFG and CNF
and have trouble sometimes understanding the concepts.
I'm trying to convert this CFG into Chomsky Normal Form:
G: S -> aSbS | bSaS | epsilon
I think the language generates all strings with same number of a and b, i.e. {a^n b^n |n>-0}
.
But to convert it into CNF, I've finished adding a new start state and eliminating epsilon-productions:
S_0 -> S | epsilon
S -> aSbS | bSaS | aS | bS | a | b
Perhaps I need two non-terminals(variables) A -> a and B -> b :
S_0 -> S | epsilon
S -> ASBS | BSAS | AS | BS | a | b
A -> a
B -> b
I'm stuck here and really don't know what the next step should be. There seem to be no unit productions or useless symbols.
Upvotes: 2
Views: 2175
Reputation: 632
S -> XY | YX | AS | BS | a | b
A -> a
B -> b
X ->AS
Y ->BS
Upvotes: 1
Reputation: 36
Chomsky normal is defined by having all productions of the following form:
A -> BC
(Where A,B and C are arbitrary nonterminals)
A -> a
(Where A is an arbitrary nonterminal and a an arbitrary terminal)
or S -> epsilon
In addition to this the start symbol may never appear on the right side of any production.
The general transformation of any CFG to CNF consists of 4 steps (Wikipedia uses the terms START, TERM, BIN, DEL, UNIT so let's use those)
The order of the operations can vary, but that is the general order commonly used.
START: Elimination of any start symbols that appear on the right hand side.
This is achieved by introducing a new starting symbol S0
and adding the production S0 -> S
.
TERM: Removing all terminals from productions with more than 1 symbol on the right side, this is what you were about to do.
BIN: Reduction of all right sides to at most two symbols. This is achieved by introducing new nonterminals as follows:
Given A -> X1,...,Xn
we simply reduce the right side by introducing new nonterminals and splitting the right side to satisfy the requirement as follows:
A -> X1,..,Xn-2,A1
A1 -> Xn-1,Xn
This process is repeated until the right side is 2 symbols long.
DEL: Elimination of epsilons from the right side (with the exception of course being S -> epsilon
, if epsilon is part of the language), which you have already done.
UNIT: The removal of unit productions (A -> B) This is achieved by substituting the result of the unit production with all its possible productions. For example
A -> B
B -> a | X1X2
would result in B being substituted on the right side with its productions:
A -> a | X1X2
and B being removed, along with its productions.
Generally these steps can be done in an arbitrary order, but note that in many cases the effect of later steps can break conditions satisfied by earlier steps.
Hope this helps.
Upvotes: 2