Reputation: 81
I have made a lex file, shown below:
%%
[\t\n]
"if" {printf("IF_TOKEN\n");}
"else" {printf("ELSE_TOKEN\n");}
"while" {printf("WHILE_TOKEN\n");}
"FOR" {printf("FOR_TOKEN\n");}
"BREAK" {printf("BREAK_TOKEN\n");}
"float" {printf("FLOAT_TOKEN\n");}
"int" {printf("INT_TOKEN\n");}
"long" {printf("LONG_TOKEN\n");}
"return" {printf("RETURN_TOKEN\n");}
"defFunction" {printf("DEFFUNCTION_TOKEN\n");}
"defClass" {printf("DEFCLASS_TOKEN\n");}
"\(" {printf("PAROPEN_TOKEN\n");}
"\)" {printf("PARCLOS_TOKEN\n");}
"\{" {printf("CBROPEN_TOKEN\n");}
"\}" {printf("CBRCLOS_TOKEN\n");}
"<" {printf("LESSTHN_TOKEN\n");}
">" {printf("GRTRTHN_TOKEN\n");}
"=" {printf("EQUALTO_TOKEN\n");}
"!=" {printf("NEQUALTO_TOKEN\n");}
"\+" {printf("SUM_TOKEN\n");}
"-" {printf("MINUS_TOKEN\n");}
"\*" {printf("STAR_TOKEN\n");}
"\/" {printf("SLASH_TOKEN\n");}
"%" {printf("REMAIN_TOKEN\n");}
"\[" {printf("BRAOPEN_TOKEN\n");}
"\]" {printf("BRACLOS_TOKEN\n");}
";" {printf("SEMICOL_TOKEN\n");}
[-]?[1-9][0-9]* {printf("NUMBER\n");}
[A-Za-z&_$][A-Za-z$_]* {printf("ID\n");}
. {printf("ERROR");}
%%
int yywrap (void) {
return 1;
}
int main (int argc, char** argv) {
yylex();
return 0;
}
If I give 125apple as input to the scanner built from this .l file, it should print an error, but it prints NUMBER ID instead. How can I make it treat 125apple as a single input?
Upvotes: 0
Views: 152
Reputation: 241721
In many languages, that's exactly how 125apple would be lexed, in part because that's the way a naive lex scanner definition works.
If you want it to be an error, you need to make it one explicitly, by adding a pattern which matches erroneous tokens. (f)lex always takes the longest match, and when two patterns match the same length it takes the one which appears first in the file. So if you put the error pattern after the pattern which matches valid numbers, an input which matches both at the same length is reported as a number, and the error pattern is free to also match valid tokens. That makes it a bit easier to write.
0|[-]?[1-9][0-9]* {printf("NUMBER\n");}
[-]?[0-9]+[0-9A-Za-z_]* {printf("ERROR\n");}
[A-Za-z&_$][A-Za-z$_]* {printf("ID\n");}
Above, I made a little change: your number pattern does not recognize 0, so I added it.
The error line not only catches 125apple. It also catches other erroneous tokens, like 0037 and -0. (I'm not convinced that -0 should be an error; you might want to fix that.) It does not treat 123$apple as an error, so you might want to change that, too.
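If it helps to see why those inputs come out the way they do, here is a rough sketch in plain C that mimics how the three rules above compete: the longest match wins, and on a tie the rule listed first wins. It is not what flex generates, just a hand-written imitation for experimenting. The names match_number, match_error, match_id, next_token and scan are made up for this sketch, and "OTHER" stands in for whatever the remaining rules in your file would do with the character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int is_digit(char c)  { return c >= '0' && c <= '9'; }
static int is_letter(char c) { return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'); }

/* Longest prefix of s matching 0|[-]?[1-9][0-9]*, or 0 if none. */
static size_t match_number(const char *s) {
    size_t i = 0;
    if (s[0] == '0') return 1;               /* first alternative: a lone 0 */
    if (s[i] == '-') i++;
    if (s[i] < '1' || s[i] > '9') return 0;
    i++;
    while (is_digit(s[i])) i++;
    return i;
}

/* Longest prefix of s matching [-]?[0-9]+[0-9A-Za-z_]*, or 0 if none. */
static size_t match_error(const char *s) {
    size_t i = 0;
    if (s[i] == '-') i++;
    if (!is_digit(s[i])) return 0;
    while (is_digit(s[i]) || is_letter(s[i]) || s[i] == '_') i++;
    return i;
}

/* Longest prefix of s matching [A-Za-z&_$][A-Za-z$_]*, or 0 if none. */
static size_t match_id(const char *s) {
    size_t i = 0;
    if (!(is_letter(s[0]) || s[0] == '&' || s[0] == '_' || s[0] == '$')) return 0;
    i++;
    while (is_letter(s[i]) || s[i] == '$' || s[i] == '_') i++;
    return i;
}

/* Pick a rule for the start of s (s must not be empty).
   Strict '>' means a later rule only wins with a longer match,
   so ties go to the rule listed first, as in lex. */
const char *next_token(const char *s, size_t *len) {
    const char *name = "NUMBER";
    size_t best = match_number(s);
    size_t e = match_error(s);
    size_t d = match_id(s);
    if (e > best) { best = e; name = "ERROR"; }
    if (d > best) { best = d; name = "ID"; }
    if (best == 0) { best = 1; name = "OTHER"; }  /* hypothetical: left to other rules */
    *len = best;
    return name;
}

/* Write the space-separated token names for input into out (capacity cap). */
void scan(const char *input, char *out, size_t cap) {
    assert(input != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    while (*input != '\0') {
        size_t len;
        const char *name = next_token(input, &len);
        if (out[0] != '\0') strncat(out, " ", cap - strlen(out) - 1);
        strncat(out, name, cap - strlen(out) - 1);
        input += len;
    }
}
```

With a buffer like char buf[64], calling scan("125apple", buf, sizeof buf) leaves ERROR in buf, while scan("123$apple", buf, sizeof buf) leaves NUMBER ID, which is the gap mentioned above.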
Upvotes: 2