Reputation: 436
I have to write code for checking if an arithmetic expression is valid or not , in lex. I am aware that I could do this very easily using yacc but doing only in lex is not so easy.
I have written the code below, which for some reason doesn't work. Besides this, i also don't get how to handle binary operators .
My wrong code:
%{
#include <stdio.h>
/* Will be using stack to check the validity of arithetic expressions */
char stack[100];
int top = 0;
int validity =0;S
%}
operand [a-zA-Z0-9_]+
%%
/* Will consider unary operators (++,--), binary operators(+,-,*,/,^), braces((,)) and assignment operators (=,+=,-=,*=,^=) */
"(" { stack[top++]='(';}
")" { if(stack[top]!=')') yerror(); else top--;}
[+|"-"|*|/|^|%] { if(stack[top]!='$') yerror(); else stack[top]=='&';}
"++" { if(stack[top]!='$') yerror(); else top--;}
[+"-"*^%]?= { if(top) yerror();}
operand { if(stack[top]=='&') top--; else stack[top++]='$';}
%%
int yerror()
{
printf("Invalid Arithmetic Expression\n");
}
Upvotes: 1
Views: 1953
Reputation: 241901
First, learn how to write regular expressions in Flex. (Patterns, Flex manual).
Inside a character class ([
…]
), neither quotes nor stars nor vertical bars are special. To include a - or a ], you can escape them with a \ or put them at the beginning of the list, or in the case of - at the end.
So in:
[+|"-"|*|/|^|%]
The | is just another character in the list, and including it five times doesn't change anything. "-"
is a character range consisting only of the character ", although I suppose the intention was to include a -. Probably you wanted [-+*/^%]
or [+\-*/^%]
.
There is no way that the flex scanner can guess that a + (for example) is a unary operator instead of a binary operator, and putting it twice in the list of rules won't do anything; the first rule will always take effect.
Finally, if you use definitions (like operand
) in your patterns, you have to enclose them in braces: {operand}
; otherwise, flex will interpret it as a simple keyword.
And a hint for the assignment itself: A valid unparenthesized arithmetic expression can be simplified into the regular expression:
term {prefix-operator}*{operand}{postfix-operator}*
expr {term}({infix-operator}{term})*
But you can't use that directly because (a) it doesn't deal with parentheses, (b) you probably need to allow whitespace, and (c) it doesn't correctly reject a+++++b
because C insists on the "maximal munch" rule for lexical scans, so that is not the same as the correct expression a++ + ++b
.
You can, however, translate the above regular expression into a very simple two-state state machine.
Upvotes: 1