Reputation: 111
lex file
%%
^((for)|(while))[(][)] return FORWHILE;
[ ]* return SPACE;
([a-z ]+[;])+ return LINE;
. ;
%%
yacc file
%start s
%token FORWHILE
%token SPACE
%token LINE
%%
s: FORWHILE SPACE '{' SPACE LINE SPACE '}' SPACE
{
printf("Data OK\n");
};
%%
#include <stdio.h>
#include <string.h>
#include "lex.yy.c"
int main()
{
return yyparse();
}
int yyerror()
{
printf("Error in data\n");
return 0;
}
This is what I tried.
I want the input for(){bla bla;bla bla;} to produce Data OK, but I can't get it to work.
What is the problem?
Thanks
Update:
%start s
%token FOR
%token WHILE
%token LINE
%token DO
%%
s: forwhile | dowhile { printf("Data OK\n"); };
dowhile: do_stmt '{' lines '}' while_stmt;
forwhile: for_or_while_stmt '{' lines '}';
lines: line | lines line;
line: LINE;
for_or_while_stmt: for_stmt | while_stmt;
for_stmt: FOR'()';
while_stmt: WHILE'()';
do_stmt: DO;
%%
#include <stdio.h>
#include <string.h>
#include "lex.yy.c"
int main()
{
return yyparse();
}
int yyerror()
{
printf("Error in data\n");
return 0;
}
lex
%%
"while" return WHILE;
"for" return FOR;
"do" return DO;
[a-z ]+[;] return LINE;
. return yytext[0];
%%
I also added do-while, but I still couldn't get it right.
Upvotes: 0
Views: 971
Reputation: 241671
According to your lexical scanner, bla bla;bla bla; contains two LINE tokens. Your grammar, however, only allows one:
s: FORWHILE SPACE '{' SPACE LINE SPACE '}' SPACE
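To see where the two LINE tokens come from, here is a toy model in plain C of the single-statement rule [a-z ]+[;] from the updated lexer. It is only a sketch, not flex itself: the helper names match_line and count_line_tokens are made up for illustration, ASCII input is assumed, and every character that does not start a LINE is simply treated as a one-character token. Note that the pattern in the first lexer, ([a-z ]+[;])+, has an outer + and can therefore extend one match across both statements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of a match for [a-z ]+[;] starting at s,
   or 0 if the pattern does not match there. Assumes ASCII. */
static size_t match_line(const char *s)
{
    size_t n = 0;
    while ((s[n] >= 'a' && s[n] <= 'z') || s[n] == ' ')
        n++;
    /* Need at least one [a-z ] character, followed directly by ';'. */
    return (n > 0 && s[n] == ';') ? n + 1 : 0;
}

/* Hypothetical helper: counts LINE tokens in input. Any character that
   does not start a LINE is consumed as a one-character token. */
static int count_line_tokens(const char *input)
{
    assert(input != NULL);
    int count = 0;
    size_t pos = 0;
    while (input[pos] != '\0') {
        size_t len = match_line(input + pos);
        if (len > 0) {
            count++;
            pos += len;
        } else {
            pos++;
        }
    }
    return count;
}
```

Calling count_line_tokens("{bla bla;bla bla;}") counts two LINE tokens, so a grammar rule with a single LINE between the braces cannot accept that input.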
There are some other issues which you should address. First, it is not a good idea for a lex pattern to match the empty string, as with [ ]* { return SPACE; }. If the match succeeds (which is only possible if no longer match exists), then you will find yourself in an endless loop, because the scanner will never advance.
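The reason an empty match stalls the scanner can be shown with a toy model in plain C. This is only a sketch of the idea, not how flex is implemented: match_spaces_star and scan_with_star_rule are hypothetical names, and the max_steps cap exists only so that the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the match for [ ]* at s: zero or more spaces, so it may be 0. */
static size_t match_spaces_star(const char *s)
{
    size_t n = 0;
    while (s[n] == ' ')
        n++;
    return n;
}

/* Applies only the [ ]* rule over and over, as a scanner would at a point
   where no other rule matches. Returns the position reached after
   max_steps matches (a cap added here so the loop ends). */
static size_t scan_with_star_rule(const char *input, int max_steps)
{
    assert(input != NULL);
    size_t pos = 0;
    for (int step = 0; step < max_steps; step++) {
        /* A zero-length match leaves pos unchanged, so the next
           iteration sees exactly the same input again. */
        pos += match_spaces_star(input + pos);
    }
    return pos;
}
```

With the input "  \nfoo" the position reaches 2 after the first match and stays there no matter how many more steps are taken, because every further match at the newline has length zero. Writing the pattern as [ ]+ avoids this, since that pattern either consumes at least one character or does not match at all.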
It's generally not advisable to pass whitespace tokens to the parser; it is better to simply ignore them in the lexical scanner, particularly (as in this case) when they are optional. On the other hand, ignoring unrecognized characters (as the rule . ; does) can easily mask errors.
Finally, . does not match every character; it matches any character other than a newline. In your lexical scanner definition, newline characters are not matched by any rule, and will consequently cause the infinite loop referred to above. If you fix the SPACE rule to only match non-empty sequences, then newlines will fall through to the implicit default rule, which is ECHO. This is also not a good idea.
I strongly recommend using %option nodefault, which will cause flex to provide a diagnostic if any input is not matched by any rule. However, that won't warn you about rules matching empty patterns, so you still need to be careful.
Upvotes: 1