Reputation: 41
I'm a newbie to Flex and Bison. I have tried to write a Flex lexical scanner and then a Bison grammar, but I have run into the following problem:
For example, suppose the word abc can be seen as either category1 or category2 in Flex. I would like Bison to choose category1 if abc parses without a syntax error as category1 in the Bison grammar but would be incorrect as category2. Conversely, if it causes a syntax error as category1 but not as category2, then Flex should classify it as category2.
Is there a way to do this? Or am I totally misunderstanding Flex and Bison?
Upvotes: 4
Views: 1945
Reputation: 5703
To reiterate Jonathan Leffler's above comment of Jan 13 at 19:39, you are trying to parse a context-sensitive language with context-free parser-generator tools. You need to re-think the grammar or re-think your choice of parser-generator tools. What you are doing is the equivalent of trying to use a screwdriver to hammer in a nail.
If it were me, I would go back to the books and the Interwebs to review handling of context-sensitive grammar parsing.
Upvotes: -1
Reputation: 754820
Flex supports 'start states' and 'exclusive start states' (the Flex manual calls them start conditions), which might allow you to achieve the effect you want. If you can tell in advance that the context is such that abc should be category1, then you can tell Flex to enter a state in which abc is classified as category1, while in other states it is classified as category2. Don't forget to switch the state back when you're done with the special state. This sort of technique can be used to treat selected words as keywords in some contexts and as identifiers in others. Usually, though, you have the lexical analyzer always classify the word the same way (e.g. as token KW_ABC) and let the grammar get on with using that token.
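As a minimal sketch of the start-state idea, assume a hypothetical rule that abc counts as category1 only when it directly follows the word decl. The token names KW_DECL, CATEGORY1, CATEGORY2, the state name SPECIAL, and the decl trigger are all made up for illustration; in a real project the token codes would come from the Bison-generated header rather than a local enum.

```lex
%{
/* Hypothetical token codes; normally supplied by the Bison-generated header. */
#include <stdio.h>
enum { KW_DECL = 258, CATEGORY1, CATEGORY2 };
%}

%option noyywrap
%x SPECIAL

%%

"decl"            { BEGIN(SPECIAL); return KW_DECL; }   /* enter the special state */
"abc"             { return CATEGORY2; }                 /* default classification */
[ \t\n]+          { /* skip whitespace */ }
.                 { return yytext[0]; }

<SPECIAL>"abc"    { BEGIN(INITIAL); return CATEGORY1; } /* contextual classification */
<SPECIAL>[ \t\n]+ { /* skip whitespace */ }
<SPECIAL>.        { yyless(0); BEGIN(INITIAL); }        /* anything else: switch back and rescan */

%%

/* Small driver so the scanner can be tried without a parser. */
int main(void)
{
    int tok;
    while ((tok = yylex()) != 0)
        printf("%d %s\n", tok, yytext);
    return 0;
}
```

Because SPECIAL is declared with %x, the unprefixed rules are inactive while it is in effect, so each SPECIAL rule has to either consume input or switch back. On the input "abc decl abc", the first abc should come back as CATEGORY2 and the second as CATEGORY1. This only works when the scanner alone can see the context; it does not let Bison retry a token after a syntax error.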
Upvotes: 0
Reputation: 241911
This situation typically arises with what are often called "semi-reserved" words, or what are called "contextual keywords" in C#. In bison/flex, these are a pain to deal with. (Lemon has an undocumented feature where you can define a fallback for a token using the %fallback directive, which is perfect for this use case; you simply make IDENTIFIER the fallback for any contextually reserved token.)
With some work, you might be able to achieve the same effect by defining non-terminals like:
identifier : IDENTIFIER | VAR | ADD | REMOVE | DYNAMIC | GLOBAL | ...
/* VAR is special in a local-variable-type: */
local_variable_type_identifier : IDENTIFIER | ADD | REMOVE | DYNAMIC | GLOBAL | ...
You can probably find the places you need to customize by using identifier throughout and then solving each conflict which includes a reduction to identifier by replacing it with a restricted non-terminal which excludes the semi-reserved words which participate in the conflict.
It's not great, but it's the best approach I know.
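To make the restricted non-terminal idea concrete, here is a small Bison sketch. The three statement forms are hypothetical and chosen only to create the kind of clash described above, and the token list is cut down to IDENTIFIER, VAR and ADD. It assumes a separate Flex scanner supplies yylex and always returns VAR for the word var.

```yacc
%{
#include <stdio.h>
int yylex(void);   /* assumed to be supplied by a Flex scanner */
void yyerror(const char *msg) { fprintf(stderr, "%s\n", msg); }
%}

%token IDENTIFIER VAR ADD

%%

program
    : /* empty */
    | program statement
    ;

statement
    : VAR identifier '=' identifier ';'              /* VAR acting as a keyword */
    | local_variable_type_identifier identifier ';'  /* "type name;" declaration */
    | identifier '=' identifier ';'                  /* plain assignment */
    ;

/* Any semi-reserved word may be used as an ordinary name... */
identifier
    : IDENTIFIER | VAR | ADD
    ;

/* ...except that VAR may not name a type at the start of a declaration. */
local_variable_type_identifier
    : IDENTIFIER | ADD
    ;

%%
```

If VAR were also allowed in local_variable_type_identifier, then after reading VAR with an identifier as the next token the parser could either keep going with the first statement form or reduce VAR to a type name, which is a shift/reduce conflict. Leaving VAR out of the restricted non-terminal removes that choice, while an input such as "var = x;" can still be read as an assignment to a variable named var.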
Upvotes: 2