Reputation: 1301
Hi I have a scenario where bison will successfully parse my input if there is a space separating a grammar...
Here is the situation: I am attempting to declare a variable:
int a = 31 ;
This input (with a space before the semicolon) parses successfully.
int a = 31;
This input (with no space before the semicolon) does not parse successfully.
The error I receive is:
syntax error, unexpected $end, expecting TSEMI
Here is the section of the bison code
%token <string> TIDENTIFIER TINTEGER TDOUBLE
%token <token> TCEQUAL TCNE TCLT TCLE TCGT TCGE TASSIGN
%token <token> TLPAREN TRPAREN TLBRACE TRBRACE TCOMMA TDOT TSEMI
%token <token> TPLUS TMINUS TMUL TDIV
...
var_decl : ident ident TSEMI { $$ = new VarDel($1, $2); }
| ident ident TASSIGN expr TSEMI {$$ = new VarDel($1, $2, $4);}
;
ident : TIDENTIFIER { $$ = new Var($1->c_str()); delete $1; }
;
expr : ident { $<ident>$ = $1; }
| numeric
;
numeric : TINTEGER { $$ = new Num(atol($1->c_str())); delete $1; }
| TDOUBLE { $$ = new Num(atof($1->c_str())); delete $1; }
;
And here is a section of my flex file
[ \t\n] ;
[a-zA-Z_][a-zA-Z0-9_]* SAVE_TOKEN; return TIDENTIFIER;
[0-9]+.[0-9]* SAVE_TOKEN; return TDOUBLE;
[0-9]+ SAVE_TOKEN; return TINTEGER;
"=" return TOKEN(TASSIGN);
"==" return TOKEN(TCEQUAL);
"!=" return TOKEN(TCNE);
"<" return TOKEN(TCLT);
"<=" return TOKEN(TCLE);
">" return TOKEN(TCGT);
">=" return TOKEN(TCGE);
"(" return TOKEN(TLPAREN);
")" return TOKEN(TRPAREN);
"{" return TOKEN(TLBRACE);
"}" return TOKEN(TRBRACE);
"." return TOKEN(TDOT);
"," return TOKEN(TCOMMA);
"+" return TOKEN(TPLUS);
"-" return TOKEN(TMINUS);
";" return TOKEN(TSEMI);
"*" return TOKEN(TMUL);
"/" return TOKEN(TDIV);
. printf("Unknown token!\n"); yyterminate();
Why does it parse successfully when there is a space, but not when there isn't one?
Thanks
Upvotes: 1
Views: 107
Reputation: 241841
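[0-9]+.[0-9]* should be [0-9]+\.[0-9]*. As written, the unescaped . matches any character except a newline, so the pattern matches 31; in full. Flex prefers the longest match, so the semicolon is swallowed into the TDOUBLE token and the parser never sees TSEMI. With a space before the semicolon, the . consumes the space instead, which leaves ; to be scanned as TSEMI.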
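To see the difference between the two patterns outside of flex, here is a minimal sketch using std::regex. It is only a stand-in: std::regex is not flex, but for these two patterns the relevant behaviour (an unescaped . matching any non-newline character) is the same. The names matches_whole, kBuggy and kFixed are made up for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole of `text` matches `pattern`.
// std::regex (default ECMAScript syntax) stands in for flex here.
bool matches_whole(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

// The pattern from the question: '.' is unescaped, so it matches any
// single character other than a newline, including ';' and ' '.
const std::string kBuggy = R"([0-9]+.[0-9]*)";

// The corrected pattern: '\.' matches only a literal decimal point.
const std::string kFixed = R"([0-9]+\.[0-9]*)";
```

With these definitions, matches_whole(kBuggy, "31;") is true while matches_whole(kFixed, "31;") is false, which is why the original lexer rule ate the semicolon.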
You would do well to enable flex debugging (the -d command-line flag) to see how the input is tokenised. Also, using atof silently hides the fact that the token is not a valid number. Consider using a safer string→number converter: you'll find one in the C++ standard library; in C, it would be strtod followed by a check that endptr is at the end of the string. (You could also do this conversion in the lexer, avoiding the unnecessary allocation and deallocation of a string.)
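A minimal sketch of the strtod-plus-endptr check, assuming the token text is available as a std::string. The helper name parse_double is hypothetical and not part of the question's code.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: converts `text` to a double and reports success only
// if the entire string was consumed and the value is in range.
bool parse_double(const std::string& text, double& out) {
    const char* begin = text.c_str();
    char* end = nullptr;
    errno = 0;
    const double value = std::strtod(begin, &end);
    if (end == begin) return false;     // nothing was converted
    if (*end != '\0') return false;     // trailing junk such as ';' or ' '
    if (errno == ERANGE) return false;  // overflow or underflow
    out = value;
    return true;
}
```

Had the numeric action used a check like this instead of atof, the token 31; would have been rejected loudly instead of quietly becoming 31.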
Upvotes: 3