Reputation: 18706
My current lex file looks like this:
%{
#include "foo.h"
void rem_as(char* string);
%}
DIGIT [0-9]
LITTERAL [a-zA-Z]
SEP [_-]|["."]|["\\"][ ]
FILE_NAME ({DIGIT}|{LITTERAL}|{SEP})*
PATH ({FILE_NAME}"/"{FILE_NAME})*|({FILE_NAME})
%%
"move" {return MOVE;}
"mv" {return MOVE;}
">" {return R_STDOUT;}
"2>" {return R_STDERR;}
"<" {return R_STDIN;}
"|" {return PIPE;}
"&" {return AND;}
"=" {return EQUAL_SIGN;}
"-"?{DIGIT}+ {yylval.integer = atoi(yytext); return NUM;}
{PATH} {rem_as(yytext); sscanf(yytext,"%[^\n]",yylval.string); return FILENAME;}
\n {return LINEBREAK;}
. ;
%%
That works quite well.
For example, thanks to this grammar
Move: MOVE FILENAME FILENAME { move($2, $3); }
;
I can do stuff like move a b.
Now my problem:
After adding this to my lex file
VAR_NAME [a-zA-Z][a-zA-Z0-9_-]*
...
{VAR_NAME} {return VAR_NAME;} // declared before the "=" rule
My previous rules break, especially FILENAME, which is now only recognized if it contains a '/'.
For example, with this grammar:
VarDecl: VAR_NAME EQUAL_SIGN FILENAME { puts("foo"); }
;
a=b/ works, while a=b throws a syntax error.
Any idea about the cause of the problem?
Thanks.
Reputation: 36
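The order in which you declare lex rules matters. Lex always takes the longest match, and when two rules match the same length it picks the one declared first. Here b matches both VAR_NAME and PATH with the same length, and the VAR_NAME rule comes first, so the VAR_NAME token is emitted and PATH never gets a chance. The parser then sees VAR_NAME EQUAL_SIGN VAR_NAME, which no rule in your grammar accepts. b/ still works because only PATH can consume the trailing '/', which makes it the longer match.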
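To make the matching behavior concrete, here is a minimal C sketch that mimics how lex chooses between the two competing rules. It is not how lex is implemented; the names match_var_name, match_path, pick_token and the enum are hypothetical, and PATH is simplified to a run of letters, digits, '_', '-', '.' and '/' (the escaped-space and quote cases of SEP are left out).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds, for illustration only. */
enum token { TOK_NONE, TOK_VAR_NAME, TOK_FILENAME };

/* Length of the longest prefix of s matching [a-zA-Z][a-zA-Z0-9_-]*,
 * or 0 if s does not start with a letter. */
static size_t match_var_name(const char *s) {
    if (!isalpha((unsigned char)s[0])) return 0;
    size_t n = 1;
    while (isalnum((unsigned char)s[n]) || s[n] == '_' || s[n] == '-') n++;
    return n;
}

/* Length of the longest prefix of s matching a simplified PATH:
 * any run of letters, digits, '_', '-', '.' and '/'. */
static size_t match_path(const char *s) {
    size_t n = 0;
    while (isalnum((unsigned char)s[n]) || s[n] == '_' || s[n] == '-' ||
           s[n] == '.' || s[n] == '/') n++;
    return n;
}

/* Longest match wins; on a tie the rule declared first (VAR_NAME) wins. */
static enum token pick_token(const char *s) {
    size_t var_len = match_var_name(s);
    size_t path_len = match_path(s);
    if (var_len == 0 && path_len == 0) return TOK_NONE;
    return path_len > var_len ? TOK_FILENAME : TOK_VAR_NAME;
}
```

With these assumptions, pick_token("b") yields TOK_VAR_NAME because both rules match one character and VAR_NAME is declared first, while pick_token("b/") yields TOK_FILENAME because only the path pattern can take the '/'.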
The easy solution is to make PATH a rule in your grammar rather than a pattern in your lexer:
PATH: VAR_NAME | FILE_NAME | VAR_NAME SLASH PATH | FILE_NAME SLASH PATH
and add just / as a token (SLASH) in your lex file.