Reputation: 1211
I am trying to write a compiler and am using flex/bison for the scanning and parsing. My question is about how these two communicate, so that lex passes a token type and (if needed) a semantic value.
The problem is that I find different (conflicting?) documentation.
For example, here they say to use yylval subfields for the semantic value and to return the token type (probably an integer).
[0-9]+ {
yylval->build<int> () = text_to_int (yytext);
return yy::parser::token::INTEGER;
}
[a-z]+ {
yylval->build<std::string> () = yytext;
return yy::parser::token::IDENTIFIER;
}
But then, I see (also in the official docs) this:
"-" return yy::calcxx_parser::make_MINUS (loc);
"+" return yy::calcxx_parser::make_PLUS (loc);
"*" return yy::calcxx_parser::make_STAR (loc);
"/" return yy::calcxx_parser::make_SLASH (loc);
"(" return yy::calcxx_parser::make_LPAREN (loc);
")" return yy::calcxx_parser::make_RPAREN (loc);
":=" return yy::calcxx_parser::make_ASSIGN (loc);
{int} {
errno = 0;
long n = strtol (yytext, NULL, 10);
if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE))
driver.error (loc, "integer is out of range");
return yy::calcxx_parser::make_NUMBER (n, loc);
}
{id} return yy::calcxx_parser::make_IDENTIFIER (yytext, loc);
. driver.error (loc, "invalid character");
<<EOF>> return yy::calcxx_parser::make_END (loc);
Here, yylval is not mentioned at all, and what we return is the result of some strange make_??? functions. I fail to understand where they are defined, what parameters they accept, and what they return.
Can somebody clarify the difference between these two approaches and, if I should use the second, give a short explanation of those mysterious make_??? methods?
Thanks in advance!
Upvotes: 1
Views: 1020
Reputation: 241721
The documentation section you link to is the first of two sections which describe alternative APIs. It would be better to start reading at the beginning, where it is explained that:
The actual interface with yylex depends whether you use unions, or variants.
The example you cite uses variants, and therefore uses the complete symbols interface, where the make_* methods are defined. (These are not standard library or Boost variants; they are a simple discriminated union class defined by the bison framework.)
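To make the idea of a "complete symbol" more concrete, here is a rough hand-written sketch of what a make_* function conceptually does: it bundles the token kind, the semantic value (if the token has one), and the location into a single object that the scanner returns. This is only an illustration, not the code bison generates. Every name in it (the sketch namespace, Symbol, Kind, Location, as_number) is hypothetical, and for brevity it stores the possible values side by side instead of overlapping them in a real variant. In an actual bison C++ parser the make_* functions are generated for you from the token declarations in the grammar file, so you do not write them yourself.

```cpp
// Illustrative sketch only: a hand-written stand-in for the idea behind
// bison's complete symbols. All names here are hypothetical.
#include <cassert>
#include <string>

namespace sketch {

// Where the token was found in the input.
struct Location {
  int line;
  int column;
};

// The token kinds this toy scanner can produce.
enum class Kind { END, MINUS, NUMBER, IDENTIFIER };

// A "complete symbol": kind, semantic value, and location in one object.
// A real variant would overlap `number` and `text` and track which is live;
// here they simply sit side by side to keep the sketch short.
struct Symbol {
  Kind kind;
  int number;        // meaningful only when kind == Kind::NUMBER
  std::string text;  // meaningful only when kind == Kind::IDENTIFIER
  Location loc;
};

// One factory per token kind. Tokens without a semantic value take only
// a location; tokens with a value take the value first, then the location.
inline Symbol make_MINUS(Location loc) {
  return Symbol{Kind::MINUS, 0, "", loc};
}

inline Symbol make_NUMBER(int value, Location loc) {
  return Symbol{Kind::NUMBER, value, "", loc};
}

inline Symbol make_IDENTIFIER(const std::string& name, Location loc) {
  return Symbol{Kind::IDENTIFIER, 0, name, loc};
}

// Reading a value checks the kind first, which is the safety a
// discriminated union gives you over a bare C union.
inline int as_number(const Symbol& s) {
  assert(s.kind == Kind::NUMBER);
  return s.number;
}

}  // namespace sketch
```

With this shape, a scanner action never touches yylval directly: it just returns something like sketch::make_NUMBER(n, loc), and the kind and the value cannot get out of sync.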
Which of the APIs you use is entirely up to you; they both have advantages and disadvantages.
There is also a third alternative: build both the parser and the lexer using C interfaces. That doesn't stop you from using C++ datatypes, but you cannot put them directly into the parser stack; you need to use pointers and that makes memory management more manual. (Actually, there are two different C APIs as well: the traditional one, in which the parser automatically calls the scanner when it needs a token, and the "push" interface, where the scanner calls the parser with each token.)
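The difference between the traditional ("pull") and "push" control flow can be sketched with a toy example. This is not bison's API: the "parser" below just sums integer tokens, and all names (flow, Scanner, PushParser, parse_pull, parse_push) are made up for illustration. The point is only who owns the loop. In the pull style the parser asks the scanner for each token; in the push style the parser keeps its state between calls and the scanner side feeds it one token at a time.

```cpp
// Toy illustration of pull versus push control flow.
// Not bison's API; every name here is hypothetical.
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace flow {

struct Token {
  bool is_end;  // true for the end-of-input token
  int value;    // semantic value for ordinary tokens
};

// Stand-in scanner: hands out tokens from a fixed list, then END forever.
class Scanner {
 public:
  explicit Scanner(std::vector<int> values)
      : values_(std::move(values)), pos_(0) {}

  Token next() {
    if (pos_ < values_.size()) return Token{false, values_[pos_++]};
    return Token{true, 0};
  }

 private:
  std::vector<int> values_;
  std::size_t pos_;
};

// Pull style: the parser owns the loop and calls the scanner itself.
inline int parse_pull(Scanner& scanner) {
  int sum = 0;
  for (Token t = scanner.next(); !t.is_end; t = scanner.next()) {
    sum += t.value;
  }
  return sum;
}

// Push style: the parser is an object that remembers its state between
// calls and is handed one token at a time.
class PushParser {
 public:
  PushParser() : sum_(0), done_(false) {}

  // Returns true while the parser wants more input.
  bool push(const Token& t) {
    if (t.is_end) {
      done_ = true;
      return false;
    }
    sum_ += t.value;
    return true;
  }

  bool done() const { return done_; }

  int result() const {
    assert(done_);  // the result is only valid after END was pushed
    return sum_;
  }

 private:
  int sum_;
  bool done_;
};

// The scanner side drives the loop and pushes each token into the parser.
inline int parse_push(Scanner& scanner) {
  PushParser parser;
  while (parser.push(scanner.next())) {
  }
  return parser.result();
}

}  // namespace flow
```

Both functions compute the same result; what changes is which side is in control, which matters when tokens arrive incrementally (for example from an event loop).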
Upvotes: 1