Reputation: 305
The following Spirit x3 grammar for a simple robot command language generates compiler errors in Windows Visual Studio 17. For this project, I am required to compile with the warning level to 4 (/W4) and treat warnings as errors (/WX).
Warning C4127 conditional expression is constant SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\char\detail\cast_char.hpp 29
Error C2039 'insert': is not a member of 'boost::spirit::x3::unused_type' SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\core\detail\parse_into_container.hpp 259 Error C2039 'end': is not a member of 'boost::spirit::x3::unused_type' SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\core\detail\parse_into_container.hpp 259 Error C2039 'empty': is not a member of 'boost::spirit::x3::unused_type' SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\core\detail\parse_into_container.hpp 254 Error C2039 'begin': is not a member of 'boost::spirit::x3::unused_type' SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\core\detail\parse_into_container.hpp 259
Clearly, something is wrong with my grammar, but the error messages are completely unhelpful. I have found that if I remove the Kleene star in the last line of the grammar (*parameter to just parameter) the errors disappear, but then I get lots of warnings like this:
Warning C4459 declaration of 'digit' hides global declaration SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\support\numeric_utils\detail\extract_int.hpp 174 Warning C4127 conditional expression is constant SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\char\detail\cast_char.hpp 29
#include <string>
#include <iostream>
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/home/x3.hpp>
namespace x3 = boost::spirit::x3;
//
// Grammar for simple command language
//
namespace scl
{
using boost::spirit::x3::char_;
using boost::spirit::x3::double_;
using boost::spirit::x3::int_;
using boost::spirit::x3::lexeme;
using boost::spirit::x3::lit;
using boost::spirit::x3::no_case;
auto valid_identifier_chars = char_ ("a-zA-Z_");
auto quoted_string = '"' >> *(lexeme [~char_ ('"')]) >> '"';
auto keyword_value_chars = char_ ("a-zA-Z0-9$_.");
auto qual = lexeme [!(no_case [lit ("no")]) >> +valid_identifier_chars] >> -('=' >> (quoted_string | int_ | double_ | +keyword_value_chars));
auto neg_qual = lexeme [no_case [lit ("no")] >> +valid_identifier_chars];
auto qualifier = lexeme ['/' >> (qual | neg_qual)];
auto verb = +valid_identifier_chars >> *qualifier;
auto parameter = +keyword_value_chars >> *qualifier;
auto command = verb >> *parameter;
}; // End namespace scl
using namespace std; // Must be after Boost stuff!
int
main ()
{
vector <string> input =
{
"show/out=\"somefile.txt\" motors/all cameras/full",
"start/speed=5 motors arm1 arm2/speed=2.5/track arm3",
"rotate camera1/notrack/axis=y/angle=45"
};
//
// Parse each of the strings in the input vector
//
for (string str : input)
{
auto b = str.begin ();
auto e = str.end ();
cout << "Parsing: " << str << endl;
x3::phrase_parse (b, e, scl::command, x3::space);
if (b != e)
{
cout << "Error, only parsed to position: " << b - str.begin () << endl;
}
} // End for
return 0;
} // End main
Upvotes: 1
Views: 356
Reputation: 393029
There is a regression since Boost 1.65 that causes problems with some rules that potentially propagate into container type attributes.
They dispatch to the wrong overload when instantiated without an actual bound attribute. When this happens there is a "mock" attribute type called unused_type
. The errors you are seeing indicate that unused_type
is being treated as if it were a concrete attribute type, and clearly that won't fly.
The regression was fixed in https://github.com/boostorg/spirit/commit/ee4943d5891bdae0706fb616b908e3bf528e0dfa
You can see that it's a regression by compiling with Boost 1.64:
Boost 1.64 compiles it fine GCC and Clang
Now, latest develop is supposed to fix it, but you can simply copy the patched file, even just the 7-line patch.
All of the above was already available when I linked the duplicate question How to make a recursive rule in boost spirit x3 in VS2017, which highlights the same regression
using namespace std; // Must be after Boost stuff!
Actually, it probably needs to be nowhere unless very locally scoped, where you can see the impact of any potential name colisions.
Consider encapsulating the skipper, since it's likely logically part of your grammar spec, not something to be overridden by the caller.
This is a bug:
auto quoted_string = '"' >> *(lexeme[~char_('"')]) >> '"';
You probably meant to assert the whole literal is lexeme, not individual characters (that's... moot because whitespace would never hit the parser anyways, because of the skipper).
auto quoted_string = lexeme['"' >> *~char_('"') >> '"'];
Likewise, you might have intended +keyword_value_chars
to be lexeme, because right now one=two three four
would parse the "qualifier" one
with a "keyword value" of onethreefour
, not one three four
¹
x3::space
skips embedded newlines, if that's not the intent, use x3::blank
Since PEG grammars are parsed left-to-right greedy, you can order the qualifier
production and do without the !(no_case["no"])
lookahead assertion. That not only removes duplication but also makes the grammar simpler and more efficient:
auto qual = lexeme[+valid_identifier_chars] >>
-('=' >> (quoted_string | int_ | double_ | +keyword_value_chars)); // TODO lexeme
auto neg_qual = lexeme[no_case["no"] >> +valid_identifier_chars];
auto qualifier = lexeme['/' >> (neg_qual | qual)];
¹ Note (Post-Scriptum) now that we notice
qualifier
is, itself, already a lexeme, there's no need tolexeme[]
things inside (unless, of course they're reused in contexts with skippers).However, this also gives rise to the question whether whitespace around the
=
operator should be accepted (currently, it is not), or whether qualifiers can be separated with whitespace (likeid /a /b
; currently they can).
Perhaps verb
needed some lexemes[]
as well (unless you really did want to parse "one two three"
as a verb)
If no
prefix for negative qualifiers, then maybe the identifier itself is, too? This could simplify the grammar
The ordering of int_
and double_
makes it so that most doubles are mis-parsed as int
before they could ever be recognized. Consider something more explicit like x3::strict_real_policies<double>>{} | int_
If you're parsing quoted constructs, perhaps you want to recognize escapes too ('\"'
and '\\'
for example):
auto quoted_string = lexeme['"' >> *('\\' >> char_ | ~char_('"')) >> '"'];
If you have a need for "keyword values" consider listing known values in x3::symbols<>
. This can also be used to parse directly into an enum type.
Here's a version that parses into AST types and prints it back for demonstration purposes:
#include <boost/config/warning_disable.hpp>
#include <string>
#include <vector>
#include <boost/variant.hpp>
namespace Ast {
struct Keyword : std::string { // needs to be strong-typed to distinguish from quoted values
using std::string::string;
using std::string::operator=;
};
struct Nil {};
using Value = boost::variant<Nil, std::string, int, double, Keyword>;
struct Qualifier {
enum Kind { positive, negative } kind;
std::string identifier;
Value value;
};
struct Param {
Keyword keyword;
std::vector<Qualifier> qualifiers;
};
struct Command {
std::string verb;
std::vector<Qualifier> qualifiers;
std::vector<Param> params;
};
}
#include <boost/fusion/adapted/struct.hpp>
BOOST_FUSION_ADAPT_STRUCT(Ast::Qualifier, kind, identifier, value)
BOOST_FUSION_ADAPT_STRUCT(Ast::Param, keyword, qualifiers)
BOOST_FUSION_ADAPT_STRUCT(Ast::Command, verb, qualifiers, params)
#include <boost/spirit/home/x3.hpp>
namespace x3 = boost::spirit::x3;
namespace scl {
//
// Grammar for simple command language
//
using x3::char_;
using x3::int_;
using x3::lexeme;
using x3::no_case;
// lexeme tokens
auto keyword = x3::rule<struct _keyword, Ast::Keyword> { "keyword" }
= lexeme [ +char_("a-zA-Z0-9$_.") ];
auto identifier = lexeme [ +char_("a-zA-Z_") ];
auto quoted_string = lexeme['"' >> *('\\' >> x3::char_ | ~x3::char_('"')) >> '"'];
auto value
= quoted_string
| x3::real_parser<double, x3::strict_real_policies<double>>{}
| x3::int_
| keyword;
auto qual
= x3::attr(Ast::Qualifier::positive) >> identifier >> -('=' >> value);
auto neg_qual
= x3::attr(Ast::Qualifier::negative) >> lexeme[no_case["no"] >> identifier] >> x3::attr(Ast::Nil{}); // never a value
auto qualifier
= lexeme['/' >> (neg_qual | qual)];
auto verb
= identifier;
auto parameter = x3::rule<struct _parameter, Ast::Param> {"parameter"}
= keyword >> *qualifier;
auto command = x3::rule<struct _command, Ast::Command> {"command"}
= x3::skip(x3::space) [ verb >> *qualifier >> *parameter ];
} // End namespace scl
// For Demo, Debug: printing the Ast types back
#include <iostream>
#include <iomanip>
namespace Ast {
static inline std::ostream& operator<<(std::ostream& os, Value const& v) {
struct {
std::ostream& _os;
void operator()(std::string const& s) const { _os << std::quoted(s); }
void operator()(int i) const { _os << i; }
void operator()(double d) const { _os << d; }
void operator()(Keyword const& kwv) const { _os << kwv; }
void operator()(Nil) const { }
} vis{os};
boost::apply_visitor(vis, v);
return os;
}
static inline std::ostream& operator<<(std::ostream& os, Qualifier const& q) {
os << "/" << (q.kind==Qualifier::negative?"no":"") << q.identifier;
if (q.value.which())
os << "=" << q.value;
return os;
}
static inline std::ostream& operator<<(std::ostream& os, std::vector<Qualifier> const& qualifiers) {
for (auto& qualifier : qualifiers)
os << qualifier;
return os;
}
static inline std::ostream& operator<<(std::ostream& os, Param const& p) {
return os << p.keyword << p.qualifiers;
}
static inline std::ostream& operator<<(std::ostream& os, Command const& cmd) {
os << cmd.verb << cmd.qualifiers;
for (auto& param : cmd.params) os << " " << param;
return os;
}
}
int main() {
for (std::string const str : {
"show/out=\"somefile.txt\" motors/all cameras/full",
"start/speed=5 motors arm1 arm2/speed=2.5/track arm3",
"rotate camera1/notrack/axis=y/angle=45",
})
{
auto b = str.begin(), e = str.end();
Ast::Command cmd;
bool ok = parse(b, e, scl::command, cmd);
std::cout << (ok?"OK":"FAIL") << '\t' << std::quoted(str) << '\n';
if (ok) {
std::cout << " -- Full AST: " << cmd << "\n";
std::cout << " -- Verb+Qualifiers: " << cmd.verb << cmd.qualifiers << "\n";
for (auto& param : cmd.params)
std::cout << " -- Param+Qualifiers: " << param << "\n";
}
if (b != e) {
std::cout << " -- Remaining unparsed: " << std::quoted(std::string(b,e)) << "\n";
}
}
}
Prints
OK "show/out=\"somefile.txt\" motors/all cameras/full"
-- Full AST: show/out="somefile.txt" motors/all cameras/full
-- Verb+Qualifiers: show/out="somefile.txt"
-- Param+Qualifiers: motors/all
-- Param+Qualifiers: cameras/full
OK "start/speed=5 motors arm1 arm2/speed=2.5/track arm3"
-- Full AST: start/speed=5 motors arm1 arm2/speed=2.5/track arm3
-- Verb+Qualifiers: start/speed=5
-- Param+Qualifiers: motors
-- Param+Qualifiers: arm1
-- Param+Qualifiers: arm2/speed=2.5/track
-- Param+Qualifiers: arm3
OK "rotate camera1/notrack/axis=y/angle=45"
-- Full AST: rotate camera1/notrack/axis=y/angle=45
-- Verb+Qualifiers: rotate
-- Param+Qualifiers: camera1/notrack/axis=y/angle=45
For completeness
- Demo also Live On MSVC (Rextester) - note that RexTester uses Boost 1.60
- Coliru uses Boost 1.66 but the problem doesn't manifest itself because now, there are concrete attribute values bound to parsers
Upvotes: 2