Reputation: 12333
I have this lex file:
COMMENT \#.*\n
SPACE [\x20\n\r\t]
L [a-zA-Z_]
D [0-9]
%%
{COMMENT} |
{SPACE}+ ;
{L}({L}|{D})* { printf("identifier token: %s\n", yytext); return 1; }
-?{D}* { printf("int number token: %s\n", yytext); return 1; }
.* { printf("invalid token: %s\n", yytext); return -1; }
%%
#include <stdio.h>
int yywrap() {
    return 1;
}
int main() {
    while (yylex() > 0) {}
    return 0;
}
And I have, say, two files.
Case 1:
#comentario de prueba
print nestor
Case 2:
#comentario de mierda
print
Using this lex definition, I get an error, "invalid token: print nestor", for the first case, while the second case finishes with no error.
What am I doing wrong? The intention is for the first case to produce the tokens: (spaces)(identifier)(spaces)(identifier)
Upvotes: 0
Views: 336
Reputation: 56059
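Lex always takes the longest match, and when two rules match the same length, the rule listed first wins. In the first case, the longest match is going to be
.* { printf("invalid token: %s\n", yytext); return -1; }
because .* matches the entire line "print nestor" (12 characters), while the identifier rule matches only "print" (5 characters). In the second case both rules match the same 5 characters of "print", so the identifier rule, which comes first, wins and no error appears. Take out the * so that the last rule is just a single . and it should work.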
Upvotes: 2