Brian
Brian

Reputation: 305

Unhelpful compiler errors in x3 grammar

The following Spirit x3 grammar for a simple robot command language generates compiler errors in Windows Visual Studio 17. For this project, I am required to compile with the warning level to 4 (/W4) and treat warnings as errors (/WX).

Warning C4127 conditional expression is constant SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\char\detail\cast_char.hpp 29
Error C2039 'insert': is not a member of 'boost::spirit::x3::unused_type' SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\core\detail\parse_into_container.hpp 259 Error C2039 'end': is not a member of 'boost::spirit::x3::unused_type' SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\core\detail\parse_into_container.hpp 259 Error C2039 'empty': is not a member of 'boost::spirit::x3::unused_type' SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\core\detail\parse_into_container.hpp 254 Error C2039 'begin': is not a member of 'boost::spirit::x3::unused_type' SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\core\detail\parse_into_container.hpp 259

Clearly, something is wrong with my grammar, but the error messages are completely unhelpful. I have found that if I remove the Kleene star in the last line of the grammar (*parameter to just parameter) the errors disappear, but then I get lots of warnings like this:

Warning C4459 declaration of 'digit' hides global declaration SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\support\numeric_utils\detail\extract_int.hpp 174 Warning C4127 conditional expression is constant SpiritTest e:\data\boost\boost_1_65_1\boost\spirit\home\x3\char\detail\cast_char.hpp 29

#include <string>
#include <iostream>

#include <boost/config/warning_disable.hpp>
#include <boost/spirit/home/x3.hpp>

namespace x3 = boost::spirit::x3;

//
// Grammar for simple command language
//

namespace scl
    {
    using boost::spirit::x3::char_;
    using boost::spirit::x3::double_;
    using boost::spirit::x3::int_;
    using boost::spirit::x3::lexeme;
    using boost::spirit::x3::lit;
    using boost::spirit::x3::no_case;

    auto    valid_identifier_chars = char_ ("a-zA-Z_");
    auto    quoted_string = '"' >> *(lexeme [~char_ ('"')]) >> '"';
    auto    keyword_value_chars = char_ ("a-zA-Z0-9$_.");

    auto    qual = lexeme [!(no_case [lit ("no")]) >> +valid_identifier_chars] >> -('=' >> (quoted_string | int_ | double_ | +keyword_value_chars));
    auto    neg_qual = lexeme [no_case [lit ("no")] >> +valid_identifier_chars];
    auto    qualifier = lexeme ['/' >> (qual | neg_qual)];

    auto    verb = +valid_identifier_chars >> *qualifier;
    auto    parameter = +keyword_value_chars >> *qualifier;
    auto    command = verb >> *parameter;
    };  // End namespace scl

using namespace std;                                    // Must be after Boost stuff!

int
main ()

{
vector <string> input = 
    {
    "show/out=\"somefile.txt\" motors/all cameras/full",
    "start/speed=5 motors arm1 arm2/speed=2.5/track arm3",
    "rotate camera1/notrack/axis=y/angle=45"
    };

    //
    // Parse each of the strings in the input vector
    //

    for (string str : input)
        {
        auto    b = str.begin ();
        auto    e = str.end ();


        cout << "Parsing: " << str << endl;
        x3::phrase_parse (b, e, scl::command, x3::space);

        if (b != e)
            {
            cout << "Error, only parsed to position: " << b - str.begin () << endl;
            }

        }   // End for

    return 0;
}                           // End main

Upvotes: 1

Views: 356

Answers (1)

sehe
sehe

Reputation: 393029

There is a regression since Boost 1.65 that causes problems with some rules that potentially propagate into container type attributes.

They dispatch to the wrong overload when instantiated without an actual bound attribute. When this happens there is a "mock" attribute type called unused_type. The errors you are seeing indicate that unused_type is being treated as if it were a concrete attribute type, and clearly that won't fly.

The regression was fixed in https://github.com/boostorg/spirit/commit/ee4943d5891bdae0706fb616b908e3bf528e0dfa

You can see that it's a regression by compiling with Boost 1.64:

Now, latest develop is supposed to fix it, but you can simply copy the patched file, even just the 7-line patch.

All of the above was already available when I linked the duplicate question How to make a recursive rule in boost spirit x3 in VS2017, which highlights the same regression

Review

  • using namespace std; // Must be after Boost stuff!

    Actually, it probably needs to be nowhere unless very locally scoped, where you can see the impact of any potential name colisions.

  • Consider encapsulating the skipper, since it's likely logically part of your grammar spec, not something to be overridden by the caller.

  • This is a bug:

    auto quoted_string = '"' >> *(lexeme[~char_('"')]) >> '"';
    

    You probably meant to assert the whole literal is lexeme, not individual characters (that's... moot because whitespace would never hit the parser anyways, because of the skipper).

    auto quoted_string = lexeme['"' >> *~char_('"') >> '"'];
    
  • Likewise, you might have intended +keyword_value_chars to be lexeme, because right now one=two three four would parse the "qualifier" one with a "keyword value" of onethreefour, not one three four¹

  • x3::space skips embedded newlines, if that's not the intent, use x3::blank

  • Since PEG grammars are parsed left-to-right greedy, you can order the qualifier production and do without the !(no_case["no"]) lookahead assertion. That not only removes duplication but also makes the grammar simpler and more efficient:

    auto qual      = lexeme[+valid_identifier_chars] >>
        -('=' >> (quoted_string | int_ | double_ | +keyword_value_chars)); // TODO lexeme
    auto neg_qual  = lexeme[no_case["no"] >> +valid_identifier_chars];
    auto qualifier = lexeme['/' >> (neg_qual | qual)];
    

    ¹ Note (Post-Scriptum) now that we notice qualifier is, itself, already a lexeme, there's no need to lexeme[] things inside (unless, of course they're reused in contexts with skippers).

    However, this also gives rise to the question whether whitespace around the = operator should be accepted (currently, it is not), or whether qualifiers can be separated with whitespace (like id /a /b; currently they can).

  • Perhaps verb needed some lexemes[] as well (unless you really did want to parse "one two three" as a verb)

  • If no prefix for negative qualifiers, then maybe the identifier itself is, too? This could simplify the grammar

  • The ordering of int_ and double_ makes it so that most doubles are mis-parsed as int before they could ever be recognized. Consider something more explicit like x3::strict_real_policies<double>>{} | int_

  • If you're parsing quoted constructs, perhaps you want to recognize escapes too ('\"' and '\\' for example):

    auto quoted_string = lexeme['"' >> *('\\' >> char_ | ~char_('"')) >> '"'];
    
  • If you have a need for "keyword values" consider listing known values in x3::symbols<>. This can also be used to parse directly into an enum type.

Here's a version that parses into AST types and prints it back for demonstration purposes:

Live On Coliru

#include <boost/config/warning_disable.hpp>

#include <string>
#include <vector>
#include <boost/variant.hpp>

namespace Ast {
    struct Keyword : std::string { // needs to be strong-typed to distinguish from quoted values
        using std::string::string;
        using std::string::operator=;
    };

    struct Nil {};
    using Value = boost::variant<Nil, std::string, int, double, Keyword>;

    struct Qualifier {
        enum Kind { positive, negative } kind;
        std::string identifier;
        Value       value;
    };

    struct Param {
        Keyword                keyword;
        std::vector<Qualifier> qualifiers;
    };

    struct Command {
        std::string            verb;
        std::vector<Qualifier> qualifiers;
        std::vector<Param>     params;
    };
}

#include <boost/fusion/adapted/struct.hpp>
BOOST_FUSION_ADAPT_STRUCT(Ast::Qualifier, kind, identifier, value)
BOOST_FUSION_ADAPT_STRUCT(Ast::Param, keyword, qualifiers)
BOOST_FUSION_ADAPT_STRUCT(Ast::Command, verb, qualifiers, params)

#include <boost/spirit/home/x3.hpp>

namespace x3 = boost::spirit::x3;
namespace scl {
    //
    // Grammar for simple command language
    //
    using x3::char_;
    using x3::int_;
    using x3::lexeme;
    using x3::no_case;

    // lexeme tokens
    auto keyword       = x3::rule<struct _keyword, Ast::Keyword> { "keyword" }
                       = lexeme [ +char_("a-zA-Z0-9$_.") ];
    auto identifier    = lexeme [ +char_("a-zA-Z_") ];
    auto quoted_string = lexeme['"' >> *('\\' >> x3::char_ | ~x3::char_('"')) >> '"'];

    auto value
        = quoted_string
        | x3::real_parser<double, x3::strict_real_policies<double>>{}
        | x3::int_
        | keyword;

    auto qual
        = x3::attr(Ast::Qualifier::positive) >> identifier >> -('=' >> value);
    auto neg_qual
        = x3::attr(Ast::Qualifier::negative) >> lexeme[no_case["no"] >> identifier] >> x3::attr(Ast::Nil{}); // never a value
    auto qualifier 
        = lexeme['/' >> (neg_qual | qual)];

    auto verb      
        = identifier;

    auto parameter = x3::rule<struct _parameter, Ast::Param> {"parameter"}
        = keyword >> *qualifier;

    auto command = x3::rule<struct _command, Ast::Command> {"command"} 
        = x3::skip(x3::space) [ verb >> *qualifier >> *parameter ];

} // End namespace scl

// For Demo, Debug: printing the Ast types back
#include <iostream>
#include <iomanip>

namespace Ast {
    static inline std::ostream& operator<<(std::ostream& os, Value const& v) {
        struct {
            std::ostream& _os;
            void operator()(std::string const& s) const { _os << std::quoted(s); }
            void operator()(int i)                const { _os << i; } 
            void operator()(double d)             const { _os << d; } 
            void operator()(Keyword const& kwv)   const { _os << kwv; } 
            void operator()(Nil)                  const { }
        } vis{os};

        boost::apply_visitor(vis, v);
        return os;
    }

    static inline std::ostream& operator<<(std::ostream& os, Qualifier const& q) {
        os << "/" << (q.kind==Qualifier::negative?"no":"") << q.identifier;
        if (q.value.which())
           os << "=" << q.value;
        return os;
    }

    static inline std::ostream& operator<<(std::ostream& os, std::vector<Qualifier> const& qualifiers) {
        for (auto& qualifier : qualifiers)
            os << qualifier;
        return os;
    }

    static inline std::ostream& operator<<(std::ostream& os, Param const& p) {
        return os << p.keyword << p.qualifiers;
    }

    static inline std::ostream& operator<<(std::ostream& os, Command const& cmd) {
        os << cmd.verb << cmd.qualifiers;
        for (auto& param : cmd.params)         os << " " << param;
        return os;
    }
}

int main() {
    for (std::string const str : {
            "show/out=\"somefile.txt\" motors/all cameras/full",
            "start/speed=5 motors arm1 arm2/speed=2.5/track arm3",
            "rotate camera1/notrack/axis=y/angle=45",
        })
    {
        auto b = str.begin(), e = str.end();

        Ast::Command cmd;
        bool ok = parse(b, e, scl::command, cmd);
        std::cout << (ok?"OK":"FAIL") << '\t' << std::quoted(str) << '\n';

        if (ok) {
            std::cout << " -- Full AST: " << cmd << "\n";
            std::cout << " -- Verb+Qualifiers: " << cmd.verb << cmd.qualifiers << "\n";
            for (auto& param : cmd.params)
                std::cout << "     -- Param+Qualifiers: " << param << "\n";
        }

        if (b != e) {
            std::cout << " -- Remaining unparsed: " << std::quoted(std::string(b,e)) << "\n";
        }
    }
}

Prints

OK  "show/out=\"somefile.txt\" motors/all cameras/full"
 -- Full AST: show/out="somefile.txt" motors/all cameras/full
 -- Verb+Qualifiers: show/out="somefile.txt"
     -- Param+Qualifiers: motors/all
     -- Param+Qualifiers: cameras/full
OK  "start/speed=5 motors arm1 arm2/speed=2.5/track arm3"
 -- Full AST: start/speed=5 motors arm1 arm2/speed=2.5/track arm3
 -- Verb+Qualifiers: start/speed=5
     -- Param+Qualifiers: motors
     -- Param+Qualifiers: arm1
     -- Param+Qualifiers: arm2/speed=2.5/track
     -- Param+Qualifiers: arm3
OK  "rotate camera1/notrack/axis=y/angle=45"
 -- Full AST: rotate camera1/notrack/axis=y/angle=45
 -- Verb+Qualifiers: rotate
     -- Param+Qualifiers: camera1/notrack/axis=y/angle=45

For completeness

  • Demo also Live On MSVC (Rextester) - note that RexTester uses Boost 1.60
  • Coliru uses Boost 1.66 but the problem doesn't manifest itself because now, there are concrete attribute values bound to parsers

Upvotes: 2

Related Questions