Reputation:
I don't know why my approach fails:
if_stmt:
    CONTROL_IF '(' expr ')' '{' lines '}'
    {
        if (($3.int_value != 0) || ($3.float_value != 0)) {
            yyerror("hello");
        }
    }
Test Code:
line14 if(b == 1){
line15 print(b);
line16 }
Error Output:
Syntax error at line 15
The yacc file compiles successfully, but when I test it with the if-else code, it always reports a syntax error. The reason I put a yyerror in the action of the if statement is that I failed to execute the lines, which are supposed to be $6, but yacc said it has no declared type. So I am wondering how to get the yyerror to execute and, if that can be solved, how to execute the lines if the expr is true (!= 0).
Upvotes: 0
Views: 840
Reputation: 241721
Note: Your immediate question -- "Why does my grammar produce a syntax error" -- cannot be answered without seeing your grammar. So this answer concentrates on the other question, which you really need to understand: "Why can't I use the semantic value of lines in my action for an if statement?"
The actions associated with a production are executed when the production's right-hand side is recognized. In order to recognise a right-hand side, all of the contained non-terminals must already have been recognised, so their actions have already been performed before the production's action is run.
So you cannot decide in a production whether or not to execute the actions of its children. That would require time travel.
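To make the ordering concrete, here is a hand-written simulation in C of the reductions for the test input if(b == 1){ print(b); }. It is only a sketch, not bison output: the function names (run_action, reduce_print_stmt, reduce_if_stmt, simulate_parse) are made up for illustration, and it assumes a grammar whose print statement prints directly from its action.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical log of which actions ran, in the order they ran. */
static char action_log[128];

static void run_action(const char *name) {
    if (action_log[0] != '\0')
        strcat(action_log, " ");
    strcat(action_log, name);
}

/* Action of an assumed print-statement production: it prints immediately. */
static void reduce_print_stmt(int value) {
    run_action("print_stmt");
    printf("%d\n", value);
}

/* Action of the if-statement production. By the time it runs, the body
   has already been reduced, so the condition arrives too late. */
static void reduce_if_stmt(int condition) {
    run_action("if_stmt");
    (void)condition; /* nothing left to decide */
}

/* Simulates the bottom-up reductions for: if (b == 1) { print(b); } */
const char *simulate_parse(int b) {
    action_log[0] = '\0';
    int cond = (b == 1);
    run_action("expr");     /* the condition is reduced first */
    reduce_print_stmt(b);   /* then the body, which prints */
    reduce_if_stmt(cond);   /* the if statement is reduced last */
    return action_log;
}
```

Calling simulate_parse(0) still prints the value, because the print action has already run by the time the if action sees that the condition is false.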
A non-terminal may have an associated semantic value, and this value is computed, at least initially, by the action for the production which was recognised. The action registers this value by assigning it to $$. If the action does not assign to $$, the non-terminal has no value. If there is no action, { $$ = $1; } is used.
Bison/yacc produces C programs, which must conform to C rules. In C, every value has a type, which must be known to the compiler. Unlike languages like Python, you cannot declare a variable whose type is decided later. int i; means that i is an int. When the program runs, it cannot change the i to be a double. As an int it was born, and as an int it will die.
Bison/yacc non-terminals -- which are sometimes called "grammatical variables" -- are no different (or only slightly different). Non-terminals which hold values must have a type known to the compiler (and to the parser generator). The type of the variable cannot be decided later on, and it cannot vary between two different executions of the parser.
Bison/yacc actually implements this using a C union type, which effectively allows several different variables to use the same memory (but not at the same time). Unions don't really avoid having to know the type of a value at compile-time, since you can only reference the value of a union using a specific union member. That union member has a type, just like any other variable. So when you use unions you actually have two tasks: you must give every member a fixed type, and you need to remember which of the members of a union is currently in use. Since the compiler doesn't know, it won't help you, and it won't fix your mistakes either. As with many aspects of writing C programs, you're on your own. The program will work if you don't make mistakes. If you do make a mistake, almost anything could happen.
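Those two tasks can be sketched in plain C as a tagged union. The member names int_value and float_value are borrowed from the question; everything else (Value, ValueTag, make_int, make_float, value_is_true) is hypothetical and is not something bison generates for you.

```c
#include <assert.h>

/* Task 2: remember which member is in use. The compiler will not do it. */
typedef enum { TAG_INT, TAG_FLOAT } ValueTag;

typedef struct {
    ValueTag tag;
    /* Task 1: every member has a fixed type. */
    union {
        int    int_value;
        double float_value;
    } as;
} Value;

Value make_int(int i) {
    Value v;
    v.tag = TAG_INT;
    v.as.int_value = i;
    return v;
}

Value make_float(double d) {
    Value v;
    v.tag = TAG_FLOAT;
    v.as.float_value = d;
    return v;
}

/* Reads only the member that the tag says is live. */
int value_is_true(Value v) {
    switch (v.tag) {
    case TAG_INT:   return v.as.int_value != 0;
    case TAG_FLOAT: return v.as.float_value != 0.0;
    }
    assert(0 && "unknown tag");
    return 0;
}
```

Reading the member that was not most recently written would not be diagnosed by the compiler, which is why the tag is carried alongside the union.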
Bison/yacc can help with some of the inconveniences of using unions. It is using a union in the first place for an internal purpose: the semantic values of the various active non-terminals are stored in a stack, and a stack is an array of C values of a given type. By using a union type, bison/yacc can use different members of the different union values on the stack, as long as it knows which member of each stack slot is in use. And, as it happens, it does know that because it knows which non-terminal each stack slot corresponds to. All of that means that you can treat bison/yacc's grammatical variables as though they were ordinary C variables, each having a known type.
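Roughly speaking, that is the bookkeeping shown in the following C sketch. It imitates by hand what the generated parser does for a production such as sum: INT '+' INT with the action $$ = $1 + $3;, assuming both INT and sum were declared to use an integer member. The names SemValue, ival, dval, reduce_int_sum and demo_int_sum are invented for the illustration; the real generated code looks different.

```c
/* Hypothetical stand-in for the union type built from a %union declaration. */
typedef union {
    int    ival;   /* member used by integer-valued symbols */
    double dval;   /* member used by float-valued symbols */
} SemValue;

/* Hand-written equivalent of the action  $$ = $1 + $3;
   rhs points at the stack slot of the first right-hand-side symbol.
   The member to use (.ival) is fixed by the declared types of the
   symbols, so nothing has to be checked at run time. */
SemValue reduce_int_sum(const SemValue *rhs) {
    SemValue result;
    result.ival = rhs[0].ival + rhs[2].ival;
    return result;
}

/* Sets up three stack slots as the parser would and performs the reduction. */
int demo_int_sum(int a, int b) {
    SemValue stack[3];
    stack[0].ival = a;   /* slot for $1 */
    stack[1].ival = 0;   /* slot for '+': its value is never read */
    stack[2].ival = b;   /* slot for $3 */
    return reduce_int_sum(stack).ival;
}
```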
Bison/yacc also allows you to have non-terminals without any value at all. Technically speaking, "no value at all" is not a legitimate value. The union will have some value, left over from the last time that particular memory was used. But as long as you never attempt to refer to the value of the non-terminal, the fact that it is uninitialised doesn't matter. It's not an error to fail to initialise a variable. It's an error to try to use the value of an uninitialised variable.
So if you try to use a non-terminal's value and that non-terminal has no declared type (that is, it doesn't appear in any %type declaration), then Bison will complain that you are trying to use a non-terminal with no declared type, which is effectively saying that you've never assigned a value to that non-terminal. Furthermore, if you try to assign a value to a variable with no declared type (by assigning to $$ in a production action for that non-terminal), then bison will complain that you haven't declared a type for that non-terminal. It must do that because it has to compile the assignment of $$ into the assignment of some member of the union, and you haven't told it which member to use.
So that's the problem that bison/yacc is referring to when it complains that $6 has no declared type. $6 is an instance of a lines non-terminal, and you haven't declared a %type for lines. And it's not enough to just declare a type: if you're going to use the value of lines in some production, you must have set the value of lines when it was created, so all the productions which could be used to create a lines must either have an assignment to $$ or must use the default $$ = $1; action (and then $1 must be a value of the correct type).
Unfortunately, bison doesn't know if you have assigned a value during the execution of a production action. Even the C compiler can't always figure that out, and the C compiler understands C code a lot better than bison/yacc does. Bison/yacc just copies the code, changing the $x tokens into expressions which refer to the appropriate union member of the appropriate stack slot. But it does warn you about the things that it can figure out, like using the value of a non-terminal which has no type, or using the default action when $$ and $1 have different types.
OK, so we've established that the value of lines can't be used in the production for an if statement because lines doesn't have a value. But just giving it a value isn't going to fix your problem, because what you want cannot be fixed by just giving lines a value. What you want is for lines not to be evaluated at all until the right-hand side which references it decides whether or not it should be evaluated. And that is not possible, because the lines action is always executed at the moment that lines is recognised. C does not implement lazy execution.
That's not to say that you can't implement lazy execution in C. You can do whatever you want, subject to your ingenuity, because C is Turing complete. You could work out a way to represent the action you want to lazily execute, and make the value of lines the description of some action. That shouldn't be so difficult to understand, because that's precisely what a compiler does, and according to your question tags you are trying to construct a compiler. A calculator can just execute the string you throw at it, but that's not what a compiler does. Compiling a program produces an "executable", which is precisely a representation of how to execute the compiled program.
You have a number of possible approaches to solving this problem in your parser. One way to do it would be to actually compile lines into a fragment of machine code, and make that fragment of machine code the semantic value of lines. But that's going to be very painful, since at the moment that you parse lines you don't really have enough information to fully compile it. (For example, you don't yet know the storage layout for the entire program.) It's much easier to create some kind of intermediate representation, which can later on be converted into an executable program. Or, even easier, to create a parse tree which contains all of the useful information extracted from the parse. (Parse trees are often called ASTs, for "Abstract Syntax Tree" -- or "Annotated Syntax Tree" -- and you can easily search for that term, in your textbook if you have one, or on the internet.)
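As a rough idea of what the parse tree approach could look like, here is a minimal C sketch. All of the names (Node, NodeKind, make_num, make_eq, make_print, make_if, eval, free_tree, run_if_example) are hypothetical, and the variable b is replaced by a plain number to keep it short. In a real grammar, each production's action would call one of the make_ functions and assign the result to $$, with a pointer-to-Node member declared as the type of lines, expr and if_stmt. Nothing is executed while the tree is being built; execution happens later, in eval, which is where the if node gets to decide whether its body runs.

```c
#include <stdio.h>
#include <stdlib.h>

typedef enum { NODE_NUM, NODE_EQ, NODE_PRINT, NODE_IF } NodeKind;

typedef struct Node {
    NodeKind kind;
    int number;          /* NODE_NUM only */
    struct Node *left;   /* EQ: lhs, PRINT: operand, IF: condition */
    struct Node *right;  /* EQ: rhs, IF: body */
} Node;

static Node *new_node(NodeKind kind, int number, Node *left, Node *right) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    n->kind = kind;
    n->number = number;
    n->left = left;
    n->right = right;
    return n;
}

/* Constructors: these only describe the program, they do not run it. */
Node *make_num(int v)                { return new_node(NODE_NUM, v, NULL, NULL); }
Node *make_eq(Node *l, Node *r)      { return new_node(NODE_EQ, 0, l, r); }
Node *make_print(Node *e)            { return new_node(NODE_PRINT, 0, e, NULL); }
Node *make_if(Node *cond, Node *body) { return new_node(NODE_IF, 0, cond, body); }

static int print_count = 0;  /* how many print nodes were actually executed */

/* Walks the tree after parsing; the body of an if runs only if the
   condition evaluates to non-zero. */
int eval(const Node *n) {
    switch (n->kind) {
    case NODE_NUM:
        return n->number;
    case NODE_EQ:
        return eval(n->left) == eval(n->right);
    case NODE_PRINT: {
        int v = eval(n->left);
        printf("%d\n", v);
        print_count++;
        return v;
    }
    case NODE_IF:
        if (eval(n->left) != 0)
            eval(n->right);
        return 0;
    }
    return 0;
}

void free_tree(Node *n) {
    if (n == NULL)
        return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}

/* Builds and runs the tree for: if (b == 1) { print(b); }
   Returns the number of print statements that were executed. */
int run_if_example(int b) {
    Node *tree = make_if(make_eq(make_num(b), make_num(1)),
                         make_print(make_num(b)));
    int before = print_count;
    eval(tree);
    free_tree(tree);
    return print_count - before;
}
```

With this shape, run_if_example(1) executes the print once and run_if_example(0) does not execute it at all, which is the behaviour the question was after.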
Upvotes: 1