Reputation: 1211
I try to create a c++ flex/bison parser. I used this tutorial as a starting point and did not change any bison/flex configurations. I am stuck now to the point of trying to unit test the lexer.
I have a function in my unit tests that directly calls yylex, and checks the result of it:
private: static void checkIntToken(MyScanner &scanner, Compiler *comp, unsigned long expected, unsigned char size, char isUnsigned, unsigned int line, const std::string &label) {
yy::MyParser::location_type loc;
yy::MyParser::semantic_type semantic; // <---- is seems like the destructor of this variable causes the crash
int type = scanner.yylex(&semantic, &loc, comp);
Assert::equals(yy::MyParser::token::INT, type, label + "__1");
MyIntToken* token = semantic.as<MyIntToken*>();
Assert::equals(expected, token->value, label + "__2");
Assert::equals(size, token->size, label + "__3");
Assert::equals(isUnsigned, token->isUnsigned, label + "__4");
Assert::equals(line, loc.begin.line, label + "__5");
//execution comes to this point, and then, program crashes
}
The error message is:
program: ../src/__autoGenerated__/MyParser.tab.hh:190: yy::variant<32>::~variant() [S = 32]: Assertion `!yytypeid_' failed.
I have tried to follow the logic in the auto-generated bison files, and make some sense out of it. But I did not succeed on that and ultimately gave up. I searched then for any advice on the web about this error message but did not find any.
The location indicated by the error has the following code:
~variant (){
YYASSERT (!yytypeid_);
}
EDIT: The problem disappears only if I remove the
%define parse.assert
option from the bison file. But I am not sure if this is a good idea...
What is the proper way to obtain the value of the token generated by flex, for unit testing purposes?
Upvotes: 2
Views: 1140
Reputation: 241791
Note: I've tried to explain bison variant types to the best of my knowledge. I hope it is accurate but I haven't used them aside from some toy experiments. It would be an error to assume that this explanation in any way implies an endorsement of the interface.
The so-called "variant" type provided by bison's C++ interface is not a general-purpose variant type. That was a deliberate decision based on the fact that the parser is always able to figure out the semantic type associated with a semantic value on the parser stack. (This fact also allows a C union
to be used safely within the parser.) Recording type information within the "variant" would therefore be redundant. So they don't. In that sense, it is not really a discriminated union, despite what one might expect of a type named "variant".
(The bison variant type is a template with an integer (non-type) template argument. That argument is the size in bytes of the largest type which is allowed in the variant; it does not in any other way specify the possible types. The semantic_type
alias serves to ensure that the same template argument is used for every bison variant object in the parser code.)
Because it is not a discriminated union, its destructor cannot destruct the current value; it has no way to know how to do that.
This design decision is actually mentioned in the (lamentably insufficient) documentation for the Bison "variant" type. (When reading this, remember that it was originally written before std::variant
existed. These days, it would be std::variant
which was being rejected as "redundant", although it is also possible that the existence of std::variant
might have had the happy result of revisiting this design decision). In the chapter on C++ Variant Types, we read:
Warning: We do not use Boost.Variant, for two reasons. First, it appeared unacceptable to require Boost on the user’s machine (i.e., the machine on which the generated parser will be compiled, not the machine on which bison was run). Second, for each possible semantic value, Boost.Variant not only stores the value, but also a tag specifying its type. But the parser already “knows” the type of the semantic value, so that would be duplicating the information.
Therefore we developed light-weight variants whose type tag is external (so they are really like unions for C++ actually).
And indeed they are. So any use of a bison "variant" must have a definite type:
build
a variant with an argument of the type to build. (This is the only case where you don't need a template parameter, because the type is deduced from the argument. You would have to use an explicit template parameter only if the argument were not of the precise type; for example, an integer of lesser rank.)T
with as<T>
. (This is undefined behaviour if the value has a different type.)T
with destroy<T>
.T
with copy<T>
or move<T>
. (move<T>
involves constructing and then destructing a T()
, so you might not want to do it if T
had an expensive default constructor. On the whole, I'm not convinced by the semantics of the move
method. And its name conflicts semantically with std::move
, but again it came first.)T
with swap<T>
.Now, the generated parser understands all these restrictions, and it always knows the real types of the "variants" it has at its disposal. But you might come along and try to do something with one of these objects in a way that violates a constraint. Since the object really doesn't have any way to check the constraint, you'll end up with undefined behaviour which will probably have some disastrous eventual consequence.
So they also implemented an option which allows the "variant" to check the constraints. Unsurprisingly, this consists of adding a discriminator. But since the discriminator is only used to validate and not to modify behaviour, it is not a small integer which chooses between a small number of known alternatives, but rather a pointer to a std::typeid
(or NULL
if the variant does not yet contain a value.) (To be fair, in most cases alignment constraints mean that using a pointer for this purpose is no more expensive than using a small enum. All the same...)
So that's what you're running into. You enabled assertions with %define parse.assert
; that option was provided specifically to prevent you from doing what you are trying to do, which is let the variant object's destructor run before the variant's value is explicitly destructed.
So the "correct" way to avoid the problem is to insert an explicit call at the end of the scope:
// execution comes to this point, and then, without the following
// call, the program will fail on an assertion
semantic.destroy<MyIntType*>();
}
With the parse assertion enabled, the variant object will be able to verify that the types specified as template parameters to semantic.as<T>
and semantic.destroy<T>
are the same types as the value stored in the object. (Without parse.assert
, that too is your responsibility.)
Warning: opinion follows.
In case anyone reading this cares, my preference for using real std::variant
types comes from the fact that it is actually quite common for the semantic value of an AST node to require a discriminated union. The usual solution (in C++) is to construct a type hierarchy which is, in some ways, entirely artificial, and it is quite possible that std::variant
can better express the semantics.
In practice, I use the C interface and my own discriminated union implementation.
Upvotes: 6