master chief

Reputation: 791

Lex: How do I prevent it from matching substrings?

For example, I'm supposed to convert "int" to "INT". But if there's the word "integer", I don't think it's supposed to turn into "INTeger".

If I define the rule "int" printf("INT"); then "int" is also matched inside longer words. Is there a way to prevent this from happening?

Upvotes: 1

Views: 478

Answers (3)

ezpz

Reputation: 12037

I believe the following captures what you want.

%{
#include <stdio.h>
%}

ws                      [\t\n ]

%%

{ws}int{ws}         { printf ("%cINT%c", *yytext, yytext[4]); }
.                       { printf ("%c", *yytext); }

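To handle boundaries other than whitespace ({ws}, in this case) you will need to either add characters to ws or add more specific checks. Note that the rule needs whitespace on both sides, so an "int" at the very start or end of the input is not converted.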

Upvotes: 2

Andrew O'Reilly

Reputation: 1643

Lex chooses the rule with the longest possible match for the current input. To avoid substring matches you need an additional rule that can match something longer than int. The easiest way to do this is to add a catch-all rule that matches any run of letters, i.e. [a-zA-Z]+. For the input "int" itself both rules match the same length, and lex then picks the rule listed first, so the int rule has to come before the catch-all. The entire lex program would look like this:

%%

[\t ]+          /* skip whitespace */
int { printf("INT"); }
[a-zA-Z]+       /* catch-all to avoid substring matches */

%%

int main(int argc, char *argv[])
   {
   yylex();
   return 0;
   }
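
For readers who want to see the longest-match idea outside of lex, here is a rough sketch in plain C. It is not what lex generates; it is a hand-written loop under a few assumptions: the helper name upcase_int_words is made up for illustration, a "word" is taken to be a run of letters and digits (the same character set the asker's own answer uses), and the caller supplies an output buffer of at least strlen(in) + 1 bytes. Unlike the lex program above, it copies everything that is not a standalone "int" through unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of lex: copies `in` to `out`, replacing
 * "int" with "INT" only when it is a complete run of letters/digits.
 * `out` must have room for strlen(in) + 1 bytes. */
void upcase_int_words(const char *in, char *out)
{
    assert(in != NULL && out != NULL);

    size_t i = 0; /* read position in `in` */
    size_t o = 0; /* write position in `out` */

    while (in[i] != '\0') {
        if (isalnum((unsigned char)in[i])) {
            /* Longest match: consume the whole run of letters and digits
             * before deciding what the token is. */
            size_t start = i;
            while (isalnum((unsigned char)in[i]))
                i++;
            size_t len = i - start;

            if (len == 3 && strncmp(in + start, "int", 3) == 0)
                memcpy(out + o, "INT", 3);        /* exactly the word "int" */
            else
                memcpy(out + o, in + start, len); /* any other word: copy as-is */
            o += len;
        } else {
            /* Whitespace and punctuation pass through untouched. */
            out[o++] = in[i++];
        }
    }
    out[o] = '\0';
}
```

Calling it on "int integer print" with a large enough buffer leaves "integer" and "print" alone because the whole word is consumed before it is compared against "int", which is the same effect the [a-zA-Z]+ catch-all rule has in the lex version.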

Upvotes: 1

master chief

Reputation: 791

Well, here's how I did it:

(("int"([a-z]|[A-Z]|[0-9])+)|(([a-z]|[A-Z]|[0-9])+"int")) ECHO;
"int" printf("INT");

Better suggestions welcome.

Upvotes: 1
