smit

Reputation: 1069

Why is a simple grammar rule in bison not working?

I am learning flex and bison, and I cannot figure out why such a simple grammar rule does not work as I expected. Below is the lexer code:

%{

#include <stdio.h>
#include "zparser.tab.h"

%}

%%

[\t\n ]+        //ignore white space

FROM|from           { return FROM;   }
select|SELECT       { return SELECT; }
update|UPDATE       { return UPDATE; }
insert|INSERT       { return INSERT; }
delete|DELETE       { return DELETE; }
[a-zA-Z].*          { return IDENTIFIER; }
\*                  { return STAR;   }

%%

And below is the parser code:

%{
#include<stdio.h>
#include<iostream>
#include<vector>
#include<string>
using namespace std;

extern int yyerror(const char* str);
extern int yylex();


%}

%token SELECT UPDATE INSERT DELETE STAR IDENTIFIER FROM

%%


ZQL     : SELECT STAR FROM  IDENTIFIER { cout<<"Done"<<endl; return 0;}
        ;

%%

Can anyone tell me why it reports an error when I enter "select * from something"?

Upvotes: 0

Views: 590

Answers (2)

user207421

Reputation: 310913

[a-zA-Z].* { return IDENTIFIER; }

The problem is here. This pattern lets anything at all follow an initial alphabetic character and returns the whole lot as IDENTIFIER, which in this case means the entire rest of the line after the initial 's'.

It should be:

[a-zA-Z]+          { return IDENTIFIER; }

or possibly

[a-zA-Z][a-zA-Z0-9]*          { return IDENTIFIER; }

or whatever else you want to allow to follow an initial alpha character in your identifiers.
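To see how far each candidate pattern reaches, here is a rough sketch that uses std::regex as a stand-in for flex's matcher. The helper name prefix_match_length is made up for illustration and is not part of flex or bison. std::regex's default ECMAScript syntax is close enough to flex for these simple patterns, but it is only an approximation.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: length of the match of `pattern` anchored at the
// start of `input`, or 0 if the pattern does not match there.
std::size_t prefix_match_length(const std::string& pattern,
                                const std::string& input) {
    // ECMAScript syntax; like flex's '.', its '.' does not match '\n'.
    std::regex re(pattern);
    std::smatch m;
    // match_continuous forces the match to begin at the first character,
    // which is how a lexer rule is applied at the current input position.
    if (std::regex_search(input, m, re,
                          std::regex_constants::match_continuous)) {
        return static_cast<std::size_t>(m.length(0));
    }
    return 0;
}
```

With the input "select * from something", the original pattern [a-zA-Z].* covers all 23 characters, while [a-zA-Z]+ stops after the 6 characters of "select". On an input such as "table1 x", [a-zA-Z]+ stops at 5 characters and [a-zA-Z][a-zA-Z0-9]* takes 6, which is the practical difference between the two suggestions.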

Upvotes: 1

rici

Reputation: 241721

[a-zA-Z].* will match an alphabetic character followed by any number of arbitrary characters except newline. In other words, it will match from an alphabetic character to the end of the line.

Since flex always accepts the longest match, the line select * from ... will appear to have only one token, IDENTIFIER, and that is a syntax error.
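The longest-match rule can be imitated in a few lines of C++. This is only a sketch of the selection logic, not how flex is implemented (flex compiles all rules into one automaton). The names Rule, make_rules and tokenize are invented for this illustration, and std::regex stands in for flex patterns.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// One lexer rule: token name and pattern. An empty name means "discard".
using Rule = std::pair<std::string, std::string>;

// The rules from the question's lexer, in the same order, with the
// IDENTIFIER pattern passed in so that variants can be compared.
std::vector<Rule> make_rules(const std::string& identifier_pattern) {
    return {
        {"", "[\\t\\n ]+"},
        {"FROM", "FROM|from"},
        {"SELECT", "select|SELECT"},
        {"UPDATE", "update|UPDATE"},
        {"INSERT", "insert|INSERT"},
        {"DELETE", "delete|DELETE"},
        {"IDENTIFIER", identifier_pattern},
        {"STAR", "\\*"},
    };
}

// At each position pick the rule with the longest match; on a tie the
// rule listed first wins, which mirrors flex's documented behaviour.
std::vector<std::string> tokenize(const std::vector<Rule>& rules,
                                  const std::string& input) {
    std::vector<std::string> tokens;
    auto pos = input.cbegin();
    while (pos != input.cend()) {
        std::size_t best_len = 0;
        const Rule* best = nullptr;
        for (const Rule& rule : rules) {
            std::regex re(rule.second);
            std::smatch m;
            if (std::regex_search(pos, input.cend(), m, re,
                                  std::regex_constants::match_continuous)) {
                std::size_t len = static_cast<std::size_t>(m.length(0));
                if (len > best_len) {  // strict '>' keeps the earlier rule on ties
                    best_len = len;
                    best = &rule;
                }
            }
        }
        if (best == nullptr) {  // nothing matched: this sketch skips one character
            ++pos;
            continue;
        }
        if (!best->first.empty()) tokens.push_back(best->first);
        pos += static_cast<std::ptrdiff_t>(best_len);
    }
    return tokens;
}
```

Calling tokenize(make_rules("[a-zA-Z].*"), "select * from something") yields the single token IDENTIFIER, because that rule's 23-character match beats the 6-character match of SELECT. With make_rules("[a-zA-Z]+") the same input yields SELECT, STAR, FROM, IDENTIFIER: the keyword rules tie with IDENTIFIER on length and win because they are listed first.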

Upvotes: 2
