Rachel

Reputation: 47

Yacc--What does 'error' mean?

What is the meaning of the token 'error'? How do I detect an error without a ';'?

Upvotes: 1

Views: 1109

Answers (2)

Chris Dodd

Reputation: 126175

One common source of confusion: the error token is for error recovery, not error detection. Syntax errors are detected and reported automatically by the parser. You can detect other errors in the actions and tell bison about them by using the YYERROR macro.
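As a rough illustration of that last point, an action can reject input that is syntactically valid but semantically wrong. The fragment below is only a sketch: it assumes an exp nonterminal with int semantic values, a user-supplied yyerror, and a precedence declaration for '/' elsewhere in the grammar, none of which come from the question.

```yacc
/* Fragment, not a complete grammar: exp, its int value type, and the
   error message are assumptions made for illustration. */
exp : exp '/' exp
        {
            if ($3 == 0) {
                /* YYERROR does not call yyerror itself, so report first. */
                yyerror("division by zero");
                /* Start error recovery as if a syntax error had occurred. */
                YYERROR;
            }
            $$ = $1 / $3;   /* not reached when YYERROR is taken */
        }
    ;
```

From this point on, recovery proceeds exactly as it would for an error the parser detected on its own.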

Conceptually, the error token replaces a sequence of zero or more input tokens, in an attempt to convert an invalid input stream into a valid one. When an error occurs, the bison-generated parser goes into error recovery mode, discarding tokens and states until it gets to a point where the error pseudo-token can be shifted. It then shifts the error token and attempts to continue from there.

Upvotes: 2

rici

Reputation: 241671

After the error pseudo-terminal is matched, the bison parser continues to parse in the normal way, except that it discards tokens which "cannot be handled".

If it encounters a token which can immediately follow the error token in the grammar, it can shift that token, which means that it will stop discarding tokens.

However, that is not the only way the parser can handle a token. It could also handle it by doing a reduction.

Here, the word "handled" is interpreted a bit loosely, since a reduction action does not actually accept the lookahead token. Nonetheless, it is sufficient for the error production to be reduced.

In such a case, care must be taken to not call yyerrok. If error handling is cancelled with yyerrok and the lookahead token cannot be shifted, then the error handler will be reentered and it is possible to fall into an endless loop.

For example,

commands: %empty
        | commands command
        ;

command : exp ';'   { printf("Value is %d\n", $1); }
        | error ';' { printf("Bad expression\n"); yyerrok; }
        | error     { printf("Missing semicolon\n"); }
        ;

The first command production causes the result of a correct expression to be printed out. The second production deals with syntax errors where there is still a semicolon. It can cancel error handling with yyerrok because the ; has already been shifted, so it is OK for error handling to start afresh on the next error.

The third production deals with a missing semicolon. Here, we cannot call yyerrok because it is possible that the lookahead token is an illegal token, such as !. If we were to call yyerrok, the error status would be cleared, and error-handling would be immediately reentered with the same exclamation mark as the lookahead token, causing an endless loop. But without yyerrok, the parser is still in error-handling mode and the offending token will be discarded.
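To experiment with this behaviour, the grammar can be dropped into a small self-contained Bison file. This is only a sketch: the exp rules, the hand-written yylex, and the file name calc.y are assumptions added for illustration and are not part of the original example, and %empty needs Bison 3.0 or later.

```yacc
%{
#include <ctype.h>
#include <stdio.h>

int yylex(void);
void yyerror(const char *msg);
%}

%token NUM
%left '+' '-'

%%

commands : %empty
         | commands command
         ;

command  : exp ';'    { printf("Value is %d\n", $1); }
         | error ';'  { printf("Bad expression\n"); yyerrok; }
         | error      { printf("Missing semicolon\n"); }  /* no yyerrok here */
         ;

/* Assumed expression rules; semantic values use the default int type. */
exp      : NUM
         | exp '+' exp  { $$ = $1 + $3; }
         | exp '-' exp  { $$ = $1 - $3; }
         ;

%%

/* Hypothetical minimal lexer: decimal integers and single-character tokens. */
int yylex(void)
{
    int c;

    /* Skip whitespace, including newlines. */
    while ((c = getchar()) == ' ' || c == '\t' || c == '\n')
        ;
    if (c == EOF)
        return 0;               /* 0 tells the parser the input has ended */
    if (isdigit(c)) {
        int value = 0;
        do {
            value = value * 10 + (c - '0');
        } while (isdigit(c = getchar()));
        ungetc(c, stdin);       /* push back the character after the number */
        yylval = value;
        return NUM;
    }
    return c;                   /* ';', '+', '-', or an unexpected one like '!' */
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

One way to build it is `bison -o calc.c calc.y && cc -o calc calc.c`. Inputs such as `1 + 2;`, `1 + ;`, and `1 + !` exercise the three command productions in turn. Adding yyerrok to the third action is a quick way to observe the loop described above.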

Note: The above was intended to help answer the question of what would be the effect of an error production with nothing following the error token. It was not intended to answer any question not being asked, such as "How do I do X?" (for various values of X). The provided example is a bit artificial. The original used a newline character as the expression terminator, and it was not necessary to include the second error-handling production since it is effectively impossible to leave out a terminating newline except at EOF.

Upvotes: 2
