Ruslan R. Laishev

Reputation: 143

Bison + Flex: using short forms of tokens

I'd like to implement a command language ... Is there a way to have the lexer return the token for "CREATE" for any of these inputs:

CREATE  
CRE
CREA
CREAT

Another example:

DELE
DEL
DELET
DELETE

should all produce the token for "DELETE".

I know I can write a separate rule for each form, like this:

"CREATE" { return KWD_CREATE;}
"CRE"    { return KWD_CREATE;}


"DEL"     { return KWD_DELETE;}
"DELET"   { return KWD_DELETE;}

But is there a proper way to recognize abbreviated forms of keywords?

Update: I have tried the suggested trick:

CRE(A(T(E?)?)?)?   { return KWD_CREATE;}
DEL(E(T(E?)?)?)?   { return KWD_DELETE;}

But now I have the following problem:

CREATE - is recognized
CREAT - is recognized
CREA - is **not** recognized

I get "syntax error, unexpected id", where id is the identifier token, defined by this pattern:

identifier  [$_a-zA-Z][$_a-zA-Z0-9\%\*]*

Any ideas? What else do I need to check?

Thanks!

Upvotes: 1

Views: 50

Answers (1)

rici

Reputation: 241671

Flex has no shorthand syntax for this, but you can simply nest optional groups, for example:

CRE(A(TE?)?)?   { return KWD_CREATE;}
DEL(E(TE?)?)?   { return KWD_DELETE;}

That would be easy enough to do programmatically if you were generating your lexer with some kind of generator-generator (a technique I find quite useful).
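
As a rough illustration of generating such patterns programmatically, here is a minimal sketch in C. The function name abbrev_pattern and its signature are hypothetical (they are not part of flex or bison), and the sketch assumes the keyword contains only letters and digits, so no regex escaping is needed. Given a keyword and the shortest abbreviation you want to accept, it writes the nested-optional pattern into a caller-supplied buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build a flex pattern that accepts every prefix of
 * `kw` that is at least `min_len` characters long,
 * e.g. ("CREATE", 3) -> "CRE(A(TE?)?)?".
 * Returns 0 on success, -1 if the arguments are invalid or `out` is too
 * small. Assumes `kw` holds only letters/digits (nothing to escape). */
int abbrev_pattern(const char *kw, size_t min_len, char *out, size_t out_size)
{
    size_t len = strlen(kw);
    if (min_len == 0 || min_len > len)
        return -1;

    /* Upper bound: each optional character costs at most 4 bytes
     * ("(x" to open plus ")?" to close), plus the terminating NUL. */
    size_t need = min_len + 4 * (len - min_len) + 1;
    if (out_size < need)
        return -1;

    size_t pos = min_len;
    size_t groups = 0;
    memcpy(out, kw, min_len);            /* mandatory prefix */

    for (size_t i = min_len; i < len; i++) {
        if (i + 1 < len) {               /* more optional chars follow */
            out[pos++] = '(';
            out[pos++] = kw[i];
            groups++;
        } else {                         /* last char: a bare "x?" suffices */
            out[pos++] = kw[i];
            out[pos++] = '?';
        }
    }
    while (groups--) {                   /* close every group opened above */
        out[pos++] = ')';
        out[pos++] = '?';
    }
    out[pos] = '\0';
    return 0;
}
```

A small driver could loop over a keyword table, call abbrev_pattern for each entry, and print one rule line per keyword into the generated .l file.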

Test:

$ cat abbrev.l
%option noinput nounput noyywrap nodefault 8bit
%%
cre(a(te?)?)?   { fprintf(stderr, "%s\n", "CREATE"); }
del(e(te?)?)?   { fprintf(stderr, "%s\n", "DELETE"); }
[[:alpha:]]+    { fprintf(stderr, "WORD: %s\n", yytext); }
[[:space:]]+    ;
.               { fprintf(stderr, "PUNC: %c\n", *yytext); }
$ flex -o abbrev.c abbrev.l
$ gcc -Wall -o abbrev abbrev.c -lfl
$ ./abbrev
create
CREATE
creat
CREATE
crea
CREATE
cre
CREATE
cr
WORD: cr
delete
DELETE
delet
DELETE
dele
DELETE
del
DELETE
de
WORD: de
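
Note that in the test file the keyword rules come before the general word rule. Flex picks the longest match and, when two rules match the same length, the one that appears first in the file, so an identifier rule placed above the keyword rules would win for inputs like "crea". A different technique, which sidesteps rule order entirely, is to match every word with the single identifier rule and classify it in the action. The following is only a sketch of that alternative, not what the test above does: classify_word, the keyword table, and the token values are all hypothetical stand-ins (real token codes would come from the bison-generated header).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes for illustration only; in a real project
 * these would come from the bison-generated header. */
enum { TOK_ID = 0, KWD_CREATE = 1, KWD_DELETE = 2 };

struct keyword {
    const char *name;     /* full keyword, upper case */
    size_t      min_len;  /* shortest accepted abbreviation */
    int         token;    /* token to return on a match */
};

static const struct keyword keywords[] = {
    { "CREATE", 3, KWD_CREATE },
    { "DELETE", 3, KWD_DELETE },
};

/* Return the keyword token if `text` is a case-insensitive prefix of a
 * keyword and at least min_len characters long; otherwise TOK_ID. */
int classify_word(const char *text)
{
    size_t len = strlen(text);
    for (size_t k = 0; k < sizeof keywords / sizeof keywords[0]; k++) {
        const struct keyword *kw = &keywords[k];
        if (len < kw->min_len || len > strlen(kw->name))
            continue;                    /* too short or too long */
        size_t i = 0;
        while (i < len && toupper((unsigned char)text[i]) == kw->name[i])
            i++;
        if (i == len)                    /* every character matched */
            return kw->token;
    }
    return TOK_ID;
}
```

With this approach the identifier rule's action would be something like { return classify_word(yytext); }, and adding a keyword means adding one table row rather than one pattern.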

Upvotes: 2
