Novak007
Novak007

Reputation: 436

Checking Valid Arithmetic Expression in Lex (in C)

I have to write code for checking if an arithmetic expression is valid or not , in lex. I am aware that I could do this very easily using yacc but doing only in lex is not so easy.

I have written the code below, which for some reason doesn't work. Besides this, i also don't get how to handle binary operators .

My wrong code:

%{
#include <stdio.h>
/* Will be using stack to check the validity of arithetic expressions */
char stack[100];
int top = 0;
int validity =0;S
%}
operand [a-zA-Z0-9_]+

%%
  /* Will consider unary operators (++,--), binary operators(+,-,*,/,^), braces((,)) and assignment operators (=,+=,-=,*=,^=) */
"("                { stack[top++]='(';}
")"                { if(stack[top]!=')') yerror(); else top--;}
[+|"-"|*|/|^|%]    { if(stack[top]!='$') yerror(); else stack[top]=='&';}
"++"          { if(stack[top]!='$') yerror(); else top--;}
[+"-"*^%]?=        { if(top) yerror();}
operand            { if(stack[top]=='&') top--; else stack[top++]='$';}

%%

int yerror()
{
    printf("Invalid Arithmetic Expression\n");
}

Upvotes: 1

Views: 1953

Answers (1)

rici
rici

Reputation: 241901

First, learn how to write regular expressions in Flex. (Patterns, Flex manual).

Inside a character class ([]), neither quotes nor stars nor vertical bars are special. To include a - or a ], you can escape them with a \ or put them at the beginning of the list, or in the case of - at the end.

So in:

[+|"-"|*|/|^|%]

The | is just another character in the list, and including it five times doesn't change anything. "-" is a character range consisting only of the character ", although I suppose the intention was to include a -. Probably you wanted [-+*/^%] or [+\-*/^%].

There is no way that the flex scanner can guess that a + (for example) is a unary operator instead of a binary operator, and putting it twice in the list of rules won't do anything; the first rule will always take effect.

Finally, if you use definitions (like operand) in your patterns, you have to enclose them in braces: {operand}; otherwise, flex will interpret it as a simple keyword.

And a hint for the assignment itself: A valid unparenthesized arithmetic expression can be simplified into the regular expression:

term    {prefix-operator}*{operand}{postfix-operator}*
expr    {term}({infix-operator}{term})*

But you can't use that directly because (a) it doesn't deal with parentheses, (b) you probably need to allow whitespace, and (c) it doesn't correctly reject a+++++b because C insists on the "maximal munch" rule for lexical scans, so that is not the same as the correct expression a++ + ++b.

You can, however, translate the above regular expression into a very simple two-state state machine.

Upvotes: 1

Related Questions