Markus

Reputation: 373

Boost Spirit wide char rule creates NUL chars

With this rule

name_valid %= (lexeme[+(boost::spirit::standard_wide::alpha | lit('_'))]);

of type

typedef qi::rule<Iterator, std::wstring()> name_valid;

When running in debug mode, all is fine: name_valid contains the correct string. When switching to release mode in VC2017, I get a NUL char on inputs like this:

Input  : a_b  
Output : a(NULL)b

I found out that I have to rewrite the rule like this. I can't see a way to use lit as a wide char operation. Am I missing something here?

 name_valid %= +(boost::spirit::standard_wide::alpha | wide::char_(L'_'));

Upvotes: 1

Views: 188

Answers (1)

sehe

Reputation: 393114

I found out that I have to rewrite the rule like this

Well, if the goal was to match '_' as part of a name, then you NEED to write that anyway. Because +(alpha | '_') exposes an attribute which is the character sequence of all alpha characters, but not '_' since literals do not expose an attribute.

I can't see a way to use lit as a wide char operation.

That's qi::lit(L'_')

Am I missing something here?

What I think is happening is that alpha|'_' synthesizes an optional<char>. Apparently, the propagation rules are so relaxed that an optional<char> can be assigned to a char through its conversion-to-bool operation (resulting in a NUL character). Wide characters have nothing to do with it:

Live On Coliru

#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;
namespace enc = boost::spirit::standard;

int main() {
    std::string const input = "A.B";
    auto f = input.begin(), l = input.end();

    std::string output;
    if (qi::parse(f, l, +(enc::alpha | '.'), output)) {
        std::cout << "Parsed: '" << output << "'\n";
    } else {
        std::cout << "Failed\n";
    }

    if (f!=l)
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}

Prints (as a hex dump; note the 00 byte between 41 'A' and 42 'B'):

00000000: 5061 7273 6564 3a20 2741 0042 270a       Parsed: 'A.B'.
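The suspected mechanism can be modelled in isolation, without Spirit. This is only a sketch of the hypothesis for the empty case: it uses std::optional (C++17) as a stand-in for boost::optional, the conversion is spelled out with explicit casts, and the helper names via_bool and via_value are hypothetical. It does not claim to reproduce what Spirit does internally.

```cpp
#include <optional>

// Hypothetical helper: collapse an optional<char> to a char by going
// through bool. An empty optional yields false, which becomes the NUL
// character; an engaged optional yields true, which becomes '\x01'.
// Either way the stored character is lost.
char via_bool(std::optional<char> const& o) {
    return static_cast<char>(static_cast<bool>(o));
}

// Hypothetical helper: read the stored character, with an explicit
// fallback for the empty case instead of a silent conversion.
char via_value(std::optional<char> const& o, char fallback) {
    return o.value_or(fallback);
}
```

Calling via_bool(std::nullopt) gives a NUL character, which is the kind of value that shows up between A and B in the dump above.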

Testing The Hypothesis:

Splitting it up into separate rules makes it visible: Live On Coliru

qi::rule<It, char()> c = enc::alpha | '.';
qi::rule<It, std::string()> s = +c;
BOOST_SPIRIT_DEBUG_NODES((s)(c))

Prints

<s>
  <try>A.B</try>
  <c>
    <try>A.B</try>
    <success>.B</success>
    <attributes>[A]</attributes>
  </c>
  <c>
    <try>.B</try>
    <success>B</success>
    <attributes>[NUL]</attributes>
  </c>
  <c>
    <try>B</try>
    <success></success>
    <attributes>[B]</attributes>
  </c>
  <c>
    <try></try>
    <fail/>
  </c>
  <success></success>
  <attributes>[[A, NUL, B]]</attributes>
</s>

This highlights that the char exposed by c indeed becomes the NUL char. The following, however, makes clear that wasn't completely intentional: Live On Coliru

qi::rule<It, boost::optional<char>()> c = enc::alpha | '.';
qi::rule<It, std::string()> s = +c;
BOOST_SPIRIT_DEBUG_NODES((s)(c))

which will abort with an assertion:

sotest: /home/sehe/custom/boost_1_65_0/boost/optional/optional.hpp:1106: boost::optional::reference_const_type boost::optional<char>::get() const [T = char]: Assertion `this->is_initialized()' failed.

Out of curiosity: this fixes it: Live On Coliru

qi::rule<It, std::string()> c = enc::alpha | '.';
qi::rule<It, std::string()> s = +c;

Prints

Parsed: 'AB'

This is fully as expected: the literal '.' exposes no attribute, so only the alpha characters end up in the output.

Summary

Automatic attribute propagation rules are powerful, but can be surprising.

Don't play fast and loose with attribute compatibility: say what you mean. In your case alpha | char_('_') is conceptually the only thing that SHOULD do what you expect.

Upvotes: 2
