Reputation: 1704
I am working with LEX and YACC and have a question about how to define tokens. I have two regular expressions that share some characters; see the example below:
SHARED "+"|"-"|"/"|"="|"%"|"?"|"!"|"."|"$"|"_"|"~"|"&"|"^"|"<"|">"|"("|")"|","
REXP_1 {SHARED}|[a-zA-Z]|[ \t]+|[\\][\\\"]
REXP_2 {SHARED}|[a-zA-Z]|[ \t]+|"*"
My question is how to tell whether a character from the shared regular expression corresponds to REXP_1 or REXP_2 when I define the tokens in the third section of the .lex file.
I think I am misunderstanding something. I suspect the way I wrote the regular expressions is wrong, but I cannot find a better way to write them. Could you please give me some hints?
I would also appreciate it if someone could suggest some criteria for deciding when to define a token (file.lex) and when to define a symbol in the grammar (file.y). For some symbols it is easy to tell whether they should be tokens or grammar symbols, but for others I find it difficult to decide where to put them.
By the way I am working with this grammar
Upvotes: 0
Views: 229
Reputation: 5883
The OP wrote:
In case someone finds it interesting, I am going to write out the lessons I learned. The most important one is that common sense is a great tool for working out what should be an internal token in the .lex file and what is a suitable token to share with the .y file.
Since the term 'common sense' may be a bit ambiguous, here is an example:
ALPHA_NUMERIC [a-bA-B0-9]
SQ_CHAR {SHARED}|{ALPHA_NUMERIC}
SINGLE_QUOTED {SINGLE_QUOTE}{SQ_CHAR}{SQ_CHAR}*{SINGLE_QUOTE}
where ALPHA_NUMERIC is a good internal token (file.lex) but a bad token to share with the grammar file, whereas SINGLE_QUOTED may be a good token to share with the grammar (file.y). I wrote 'may be' because it depends heavily on the specific grammar we are working on; in my case it is a good token to share with the YACC file.
What I did was define as a token a regexp similar to the one @OGHaza advised me, in file.lex, and then use it in the grammar itself (file.y).
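To make the split concrete, here is a minimal sketch of how file.lex could look under this approach. It is an illustration, not my exact file. It assumes that file.y declares `%token SINGLE_QUOTED` and that yacc writes the token codes to `y.tab.h`. The definition of SINGLE_QUOTE and the full ranges in ALPHA_NUMERIC are also assumptions made for the example. The helper definitions stay internal to the scanner; only SINGLE_QUOTED is returned to the parser.

```lex
%{
/* Assumption: file.y contains "%token SINGLE_QUOTED" and yacc was run
   with -d, so the token codes are available in y.tab.h. */
#include "y.tab.h"
%}

/* flex-specific: avoids having to supply yywrap(); with classic lex,
   drop this line and link against the lex library instead. */
%option noyywrap

/* Internal helper definitions: never returned to the parser. */
SHARED        "+"|"-"|"/"|"="|"%"|"?"|"!"|"."|"$"|"_"|"~"|"&"|"^"|"<"|">"|"("|")"|","
ALPHA_NUMERIC [a-zA-Z0-9]
SINGLE_QUOTE  "'"
/* Parentheses keep the alternation grouped even in scanners that expand
   definitions textually. */
SQ_CHAR       ({SHARED}|{ALPHA_NUMERIC})

/* The one pattern that is exposed to the grammar as a token. */
SINGLE_QUOTED {SINGLE_QUOTE}{SQ_CHAR}{SQ_CHAR}*{SINGLE_QUOTE}

%%

{SINGLE_QUOTED}   { return SINGLE_QUOTED; }  /* whole quoted string = one token */
[ \t\n]+          { /* skip whitespace between tokens */ }
.                 { return yytext[0]; }      /* any other character as a literal token */

%%
```

Because the scanner always takes the longest match, a shared character that appears between single quotes is consumed as part of SINGLE_QUOTED, and the same character outside quotes falls through to the last rule. The grammar never needs to know about SHARED or ALPHA_NUMERIC.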
Upvotes: 1