Scorch
Scorch

Reputation: 173

Creating Context-Free-Grammar with restrictions

I'm trying to understand CFG by using an example with some obstacles in it.

For example, I want to match the declaration of a double variable:

double d; In this case, "d" could be any other valid identifier.

There are some cases that should not be matched, e.g. "double double;", but I don't understand how to avoid a match of the second "double"

My approach:

G = (Σ, V, S, P)
Σ = {a-z}
V = {S,T,U,W}
P = { S -> doubleTUW 
T -> _(space)
U -> (a-z)U | (a-z)
W -> ;
}

Now there must be a way to limit the possible outcomes of this grammar by using the function L(G). Unfortunately, I couldn't find a syntax that meet my requirement to deny a second "double".

Upvotes: 0

Views: 145

Answers (1)

rici
rici

Reputation: 241701

Here's a somewhat tedious regular expression to match any identifier other than double. Converting it to a CFG can be done mechanically but it is even more tedious.

([a-ce-z]|d[a-np-z]|do[a-tv-z]|dou[ac-z]|doub[a-km-z]|doubl[a-df-z]|double[a-z])[a-z]*

Converting it to a CFG can be done mechanically but it is even more tedious:

ALPHA → a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
NOT_B → a|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
NOT_D → a|b|c|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
NOT_E → a|b|c|d|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
NOT_L → a|b|c|d|e|f|g|h|i|j|k|m|n|o|p|q|r|s|t|u|v|w|x|y|z
NOT_O → a|b|c|d|e|f|g|h|i|j|k|l|m|n|p|q|r|s|t|u|v|w|x|y|z
NOT_U → a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|v|w|x|y|z
WORD  → NOT_D
      | d NOT_O
      | do NOT_U
      | dou NOT_B
      | doub NOT_L
      | doubl NOT_E
      | double ALPHA
      | WORD   ALPHA

This is why many of us usually use scanner generators like (f)lex which handle such exclusions automatically.

Upvotes: 1

Related Questions