Reputation: 791
For example, I'm supposed to convert "int" to "INT". But if there's the word "integer", I don't think it's supposed to turn into "INTeger".
If I define the rule "int" printf("INT"); then substrings are matched as well. Is there a way to prevent this from happening?
Upvotes: 1
Views: 478
Reputation: 12037
I believe the following captures what you want.
%{
#include <stdio.h>
%}
ws [\t\n ]
%%
{ws}int/{ws} { /* trailing context: the whitespace after int is left unread, so adjacent words still match */ printf("%cINT", *yytext); }
. { printf ("%c", *yytext); }
To expand this beyond the current word boundaries ({ws}, in this case), you will need to either add modifiers to ws or add more specific checks.
Upvotes: 2
Reputation: 1643
Lex will choose the rule with the longest possible match for the current input. To avoid substring matches, you need to include an additional rule that matches a longer string than int. The easiest way to do this is to add a simple rule that picks up any run of one or more letters, i.e. [a-zA-Z]+. When two rules match the same length (as both do for the bare word int), lex picks the rule listed first, so the int rule must come before the catch-all. The entire lex program would look like this:
%%
[\t ]+      /* skip whitespace */
int         { printf("INT"); }
[a-zA-Z]+   /* catch-all to avoid substring matches (discards the word; use ECHO to keep it) */
%%
int main(int argc, char *argv[])
{
    yylex();
    return 0;
}
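If it helps to see the longest-match idea outside of lex, here is a minimal C sketch of the same logic. It is only an illustration under my own assumptions: upcase_int_words is a hypothetical helper name, it treats a word as a run of letters (like [a-zA-Z]+ in the C locale), and unlike the lex program above it copies everything else through unchanged instead of discarding it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of lex): copies `in` to `out`, replacing the
 * whole word "int" with "INT". `out` must hold at least strlen(in) + 1 bytes. */
void upcase_int_words(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    size_t i = 0, o = 0;

    while (in[i] != '\0') {
        if (isalpha((unsigned char)in[i])) {
            /* Longest match: consume the maximal run of letters first. */
            size_t start = i;
            while (isalpha((unsigned char)in[i]))
                i++;
            size_t len = i - start;

            /* Only now decide whether the whole token is the keyword. */
            if (len == 3 && strncmp(in + start, "int", 3) == 0)
                memcpy(out + o, "INT", 3);
            else
                memcpy(out + o, in + start, len);
            o += len;
        } else {
            /* Non-letters are copied through unchanged. */
            out[o++] = in[i++];
        }
    }
    out[o] = '\0';
}
```

Because the full run of letters is consumed before the comparison, "integer" and "print" never reach the keyword check as a bare "int", which is the same effect the [a-zA-Z]+ rule has in the lex program.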
Upvotes: 1
Reputation: 791
Well, here's how I did it:
(("int"([a-z]|[A-Z]|[0-9])+)|(([a-z]|[A-Z]|[0-9])+"int")) ECHO;
"int" printf("INT");
Better suggestions are welcome.
Upvotes: 1