Reputation: 25096
I am writing a grammar for a programming language, but I'm running headfirst into a shift/reduce problem. The problem can be found in the state:
fn_call -> ID . L_PAREN fn_args R_PAREN
assignment -> ID . ASSIGN value
assignment -> ID . ASSIGN container
value -> ID
Before explaining a bit further, I want to clarify:
Is this shift/reduce because the program can't determine if I am calling a function or using the ID as a value (eg. constant or variable)?
Moving on, is it possible to fix this? My language does not currently use line delimiters (such as ';' in C or '\n' in Python). The parser is LALR(1).
What is the most efficient (adding the fewest rules to the grammar) way to decipher between a function call or a variable with line delimiters?
EDIT: Here is the lookahead for that state.
! shift/reduce conflict for L_PAREN resolved as shift
L_PAREN shift and go to state 60
ASSIGN shift and go to state 61
COMMA reduce using rule 43 (value -> ID .)
R_PAREN reduce using rule 43 (value -> ID .)
DASH reduce using rule 43 (value -> ID .)
R_BRACE reduce using rule 43 (value -> ID .)
NONE reduce using rule 43 (value -> ID .)
DEFN reduce using rule 43 (value -> ID .)
FOR reduce using rule 43 (value -> ID .)
INT_T reduce using rule 43 (value -> ID .)
DBL_T reduce using rule 43 (value -> ID .)
STR_T reduce using rule 43 (value -> ID .)
ID reduce using rule 43 (value -> ID .)
INT reduce using rule 43 (value -> ID .)
DBL reduce using rule 43 (value -> ID .)
STR reduce using rule 43 (value -> ID .)
COMMENT_LINE reduce using rule 43 (value -> ID .)
L_BRACE reduce using rule 43 (value -> ID .)
SET reduce using rule 43 (value -> ID .)
! L_PAREN [ reduce using rule 43 (value -> ID .) ]
Upvotes: 1
Views: 825
Reputation: 95354
If this is the set of items that form a "state" of the parser, then you haven't written it down right:
fn_call -> ID . L_PAREN fn_args R_PAREN
assignment -> ID . ASSIGN value
assignment -> ID . ASSIGN container
value -> ID . *missing lookahead set*
You don't exhibit the rest of your language, so we cannot know what the lookahead set is for the rule
value -> ID
Under the assumption that you indeed have a shift-reduce conflict in this state, then the lookahead set must contain "ASSIGN" or "L_PAREN". I can't tell you how to fix your problem without knowing more.
Given that your present grammar has these issues, you cannot fix this simply "adding rules" of any kind, whether they involve line delimiters or not, because adding rules will not change what is already in lookahead sets (it may add more tokens to existing sets).
EDIT: One way out of your problem may be to switch parsing technologies. Your problem is the LALR parsers cannot handle the local ambiguity that you seem to have. However, your overall grammar may not have an actual ambiguity if you look further ahead. That depends on your language syntax but you are rolling you own so you can do as you please. I suggest looking into GLR parsing technology, which can handle arbitrary lookahead; check out recent versions of Bison.
Upvotes: 1
Reputation: 241721
The following is just guesswork since you haven't shown much of your grammar. I'm assuming that you allow expressions as statements, and not just function calls. In that case, an expression can start with an (
, and a statement can end with an ID
. Since you have no statement delimiters (I think), then the following is truly ambiguous:
a = b
(c + d)
After reading the b
(ID
), it is unclear whether to reduce it to value
, as part of the assignment
, or to leave it as an ID and shift the (
as part of fn_call
.
You can't remove ambiguity by adding productions. :)
Upvotes: 2