mickeyj

Reputation: 101

Lex and Yacc do not report an error when an unexpected character is parsed.

In the code below, no error is reported when #set label sample is parsed, even though # is not a valid character.

Lex portion of the code:

identifier [\._a-zA-Z0-9\/]+ 

<INITIAL>{s}{e}{t} {
    return SET;
}

<INITIAL>{l}{a}{b}{e}{l} {
    return LABEL;
}

<INITIAL>{identifier} {
    strncpy(yylval.str, yytext, 1023);
    yylval.str[1023] = '\0';
    return IDENTIFIER;
}

Yacc portion of the code:

definition : SET LABEL IDENTIFIER
{
    cout<<"set label "<<$3<<endl;
};

When #set label sample is parsed, I expect an error because # is an unexpected character, but none is reported. How should I modify the code so that an error is reported?

Upvotes: 2

Views: 243

Answers (2)

Damien

Reputation: 1528

A Guide To Lex & Yacc by Thomas Niemann gives the following catch-all rule as an example:

/* anything else is an error */
. yyerror("invalid character");

Using it as written produced a compiler warning: warning: implicit declaration of function ‘yyerror’

The warning went away after adding a declaration:

extern void yyerror(char *error);
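
For reference, here is a minimal sketch of a definition that such a declaration could refer to. This is an assumption, not code from the guide or the question. It takes const char * rather than char * so that string literals can be passed without a conversion warning when the scanner is compiled as C++ (the question's actions use cout). error_count is a hypothetical counter added for illustration. Whichever parameter type is chosen, the declaration and the definition must agree on it.

```cpp
#include <cstdio>

// Hypothetical counter (not in the original code): lets the caller check
// after parsing whether any error was reported.
static int error_count = 0;

// Sketch of an error reporter. The const qualifier is an assumption made
// so that calls such as yyerror("invalid character") compile cleanly in C++.
void yyerror(const char *msg) {
    std::fprintf(stderr, "error: %s\n", msg);
    ++error_count;
}
```

With this in place, the caller could treat a non-zero error_count after yyparse() returns as a failed parse.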

Upvotes: 0

(Comments converted to a SO style Q&A format)

@JonathanLeffler wrote:

That's why you need a default rule in the lexical analyzer (typically with . as the pattern) that arranges for an error to be reported. Without it, the default action is just to echo the unmatched character to the output and carry on with the next one.

At the least you would want to include the specific character that is causing trouble in the error message. You might well want to return it as a single-character token, which will generally trigger an error in the grammar. So:

<*>. { cout << "Error: unexpected character " << yytext << endl; return *yytext; } 

might be appropriate.
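
To illustrate why returning the unmatched character as a token makes the parser complain, here is a minimal hand-written C++ sketch. It is not flex or yacc output: next_token is a hypothetical stand-in for yylex, parse_definition is a hypothetical stand-in for the definition rule from the question, and whitespace skipping is an assumption, since the question does not show that rule.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Named tokens start above 255 so that single characters can be returned
// as their own token codes, as yacc-generated parsers expect.
enum Token { END = 0, SET = 258, LABEL, IDENTIFIER };

// Characters accepted by the question's identifier pattern [\._a-zA-Z0-9\/]+
static bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) ||
           c == '.' || c == '_' || c == '/';
}

// Hypothetical stand-in for yylex(): returns the next token code and stores
// the matched text. Skipping whitespace is an assumption.
int next_token(const std::string &in, std::size_t &pos, std::string &text) {
    assert(pos <= in.size());
    while (pos < in.size() && std::isspace(static_cast<unsigned char>(in[pos])))
        ++pos;
    if (pos == in.size()) return END;

    if (is_ident_char(in[pos])) {
        std::size_t start = pos;
        while (pos < in.size() && is_ident_char(in[pos])) ++pos;
        text = in.substr(start, pos - start);
        std::string lower;
        for (char c : text)
            lower += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        if (lower == "set") return SET;      // keywords are case-insensitive
        if (lower == "label") return LABEL;
        return IDENTIFIER;
    }

    // Catch-all, like the "." rule: hand the unmatched character to the
    // parser as a single-character token instead of silently echoing it.
    text = std::string(1, in[pos]);
    return static_cast<unsigned char>(in[pos++]);
}

// Hypothetical stand-in for the rule: definition : SET LABEL IDENTIFIER
bool parse_definition(const std::string &in, std::string &error) {
    static const int expected[] = { SET, LABEL, IDENTIFIER, END };
    std::size_t pos = 0;
    std::string text;
    for (int want : expected) {
        int got = next_token(in, pos, text);
        if (got == want) continue;
        if (got == END)
            error = "unexpected end of input";
        else
            error = "unexpected '" + text + "'";
        return false;
    }
    return true;
}
```

Calling parse_definition("#set label sample", err) fails on the first token, because the code for # is not SET. In a real grammar the same thing happens: no rule mentions that character token, so the parser reports a syntax error.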

Upvotes: 1
