Sourav Kannantha B

Reputation: 3299

Yacc is prematurely stopping lexer?

I have a yacc file containing these rules:

%start program
.
.
.
statement
    : ';' { $$ = NULL; }
    | expression ';' { $$ = exprStmt($1); }
    | compound ';' { $$ = cmpndStmt($1); }
    ;

routine
    : routine statement { $$ = rtn($1, $2); }
    | { $$ = NULL; }
    ;

block
    : '{' routine '}' { $$ = $2; }
    ;

compound
    : block { $$ = cmpnd($1); }
    | if_cmpnd { $$ = $1; }
    ;

if_cmpnd
    : IF expression compound { $$ = ifcmpnd($2, $3); }
    | if_cmpnd ELSE compound { $$ = elsecmpnd($1, $3); }
    ;

program
    : routine { printf("YACC finish\n"); exit(0); }
    ;

For this parser, I gave a sample text file containing:

var a := 6
var b := 5
if a<5 {
b = 2
}
else {
b = 20
}

But when parsing with debug output enabled, I got this:

.
.
.
--accepting rule at line 47 ("if")
--accepting rule at line 62 (" ")
--accepting rule at line 47 ("a")
--accepting rule at line 56 ("<")
--accepting rule at line 29 ("5")
--accepting rule at line 62 (" ")
--accepting rule at line 57 ("{")
--accepting rule at line 60 ("
")
--accepting rule at line 47 ("b")
--accepting rule at line 56 (" = ")
--accepting rule at line 29 ("2")
--accepting rule at line 60 ("
")
--accepting rule at line 57 ("}")
--accepting rule at line 60 ("
")
--accepting rule at line 47 ("else")
YACC finish

As can be seen, the block after else is not lexed. But if I replace "else" with something else, like "if a>8", the text is lexed until EOF. I can't figure out the cause. Could someone help me out, please?

Upvotes: 1

Views: 85

Answers (1)

rici

Reputation: 241911

You complain that YACC is prematurely terminating. That's easily explained; your parser includes the following (emphasis added):

program
    : routine { printf("YACC finish\n"); exit(0); }

and the entire purpose of exit(0) is to terminate prematurely. So it's doing the job you asked it to do. However, it's hard to imagine a use case in which calling exit() from a parser action is correct, and the results you're getting show why.

It's particularly important to note that bison/yacc parsers (and especially bison parsers) may perform reductions before signalling an error, even in cases where the error token does not appear in the follow list for the production being reduced. That's probably the case here; had you not called exit(), you would have gotten a syntax error indication, which at least would have been less mysterious (and which could have been handled with an error-handling production in your grammar).

yyparse will return (with a return value of 0) if it manages to parse the entire input; it will return 1 if it encounters a syntax error and 2 if it detects memory over-use. (There are preprocessor symbols for these constants, but it is usually sufficient to test whether the return value of yyparse() was zero or not.) You should always allow the parser to return normally, to give it an opportunity to clean up allocated resources before returning.
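As a rough illustration of acting on that return value, here is a minimal sketch in C. The helper name report_parse_result and its messages are hypothetical (they are not part of yacc or bison); only the meaning of the values 0, 1 and 2 comes from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of yacc/bison: report the value returned by
 * yyparse() on `out` and translate it into a process exit status. */
int report_parse_result(int status, FILE *out)
{
    assert(out != NULL);
    switch (status) {
    case 0:   /* the entire input was parsed */
        fprintf(out, "parse succeeded\n");
        return EXIT_SUCCESS;
    case 1:   /* a syntax error was encountered */
        fprintf(out, "parse failed: syntax error\n");
        return EXIT_FAILURE;
    case 2:   /* the parser ran out of memory */
        fprintf(out, "parse failed: memory exhausted\n");
        return EXIT_FAILURE;
    default:  /* not expected from yyparse(); handled defensively */
        fprintf(out, "parse failed: unexpected status %d\n", status);
        return EXIT_FAILURE;
    }
}
```

With something like this in place, main() could end with return report_parse_result(yyparse(), stderr); and the action for program would only record its result (for example, in a global) instead of calling exit().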

You don't show your scanner, and the only trace you present is a trace of the scanner, not the parser. So it's not possible to know what token type the parser received when the scanner encountered the else token. Enabling parser traces instead of (or as well as) scanner traces would provide information more useful in debugging the parser (including the token type of each token passed to the parser).

However, I was struck by the fact that if and else are both handled by the same scanner action as your identifiers (at line 47 of your scanner description). That implies that your scanner does not have specific rules for each keyword. Presumably, you are doing something inefficient and error-prone like comparing identifier tokens with each possible keyword in order to return the correct token type to the parser. But it's hard to do more than guess about this because of the lack of information.
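To make that guess concrete, here is a sketch in C of what such a lookup inside the identifier action might look like. Everything in it is an assumption for illustration: the token names and values, the keyword list (including treating var as a keyword) and the function name classify_word are hypothetical, and a real project would take its token codes from the header that yacc/bison generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; a real project would use the ones yacc/bison
 * writes to its generated header. */
enum token { TOK_IDENTIFIER = 258, TOK_IF, TOK_ELSE, TOK_VAR };

struct keyword {
    const char *text;
    enum token  code;
};

/* Every reserved word must appear here; a missing entry silently turns the
 * keyword into an ordinary identifier. */
static const struct keyword keywords[] = {
    { "if",   TOK_IF   },
    { "else", TOK_ELSE },
    { "var",  TOK_VAR  },
};

/* Return the keyword token for `text`, or TOK_IDENTIFIER if it is not a
 * reserved word. The comparison is exact and case-sensitive. */
enum token classify_word(const char *text)
{
    assert(text != NULL);
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strcmp(text, keywords[i].text) == 0)
            return keywords[i].code;
    }
    return TOK_IDENTIFIER;
}
```

The weak point is the table: if else were left out of it, the scanner would hand the parser an identifier where the grammar expects ELSE. With flex, giving each keyword its own rule, placed before the identifier rule, avoids the lookup altogether, because flex breaks ties between equally long matches in favour of the earlier rule.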

Upvotes: 3
