Reputation: 628
I'm writing an interpreter for a small language that deals with vectors. I'm using Flex and Bison.
Vectors are declared with the following notation:
v := <1.4, -2.2, 7>
So, their components can also include negative numbers. Supported operations on those vectors are addition, subtraction and scalar multiplication. So you cannot add a vector and scalar, you cannot subtract a vector and a scalar, but you can multiply a vector and a scalar.
Since negative numbers are supported, my lexical analyzer uses the following regular expression to match numbers:
[+-]?[0-9]+([.][0-9]+)? {yylval.somedub = atof(yytext); return NUMBER;}
The problem occurs when an expression of the following format is to be parsed (exactly the same problem with a plus, of course):
v-2*v
The way it should be parsed is: vector minus (number times vector)
. But, as soon as flex sees -2 it will interpret it as a number, so I get vector (number) times vector
which, of course, makes no sense. On the other hand if you were to write v - 2*v
it works fine, since there's a space between the -
and the 2
. The expression part of my grammar looks like this (I'm not going to copy the entire code because it's pretty big):
expression:
expression '+' level_1
| expression '-' level_1
| level_1
level_1:
NUMBER '*' level_1
| level_1 '*' NUMBER
| level_2
level_2:
'(' expression ')'
| vector //parses the whole <a, b, c, ..> notation, irrelevant for the problem
I've also declared +, - and * as left-associative using
%left '+' '-'
%left '*'
So how would I go about fixing this problem? I don't know whether I need to change the associativity somehow or maybe reconstruct the whole grammar.
Any ideas?
Thanks.
Upvotes: 1
Views: 72
Reputation: 370455
If -2
is recognized as a single NUMBER
token, that means the parser will see v-2
as a name followed by a number and at that point, there's not really anything you can do to get the parse you want. So instead -2
should be recognized as two tokens: a minus sign followed by a number.
To achieve this, you can simply remove [+-]?
from your regex for numbers (I'm assuming you already have a rule to recognize +
and -
on their own).
Now you only need to adjust your grammar to allow -
or +
followed by a number (or by any expression if you also want to allow something like -v
or -(2+3)
).
Upvotes: 2