Boost.Spirit is not parsing whole input

Question

I have the following boost::spirit::qi rules:

auto dquote = qi::char_('"');
auto comma = qi::char_(',');
auto newline = qi::char_('\n');
auto nonEscaped = *(qi::char_ - newline - comma - dquote);
auto escaped = *qi::blank >> dquote >> *((qi::char_ - dquote) | (dquote >> dquote)) >> dquote >> *qi::blank;
auto field = nonEscaped | escaped;

When I try to parse an input:

string input(" \"e\"\"e\" ");
auto first = begin(input);  // phrase_parse takes the start iterator by reference
qi::phrase_parse(first, end(input), field, qi::char_('\n'));

I expected the escaped rule to match the whole input, but only the nonEscaped rule is applied, so only the first space is matched. How do I convince Spirit to parse the whole input, or at least as much as possible?

When I change the order of variants in the field rule to the following, then it works. But is that the right solution?

auto field = escaped | nonEscaped;

sehe · Accepted Answer

Yes, reordering is the right solution.

Boost Spirit generates recursive-descent parsers (its grammars follow PEG rules), which work in the manner of what's known as LL parsers, which means

It parses the input from Left to right, and constructs a Leftmost derivation of the sentence (hence LL, compared with LR parser)

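In simple words, the alternatives of a | b are tried in the order written, the first one that succeeds wins, and Spirit backtracks only when a parser fails. Because nonEscaped is a Kleene star, it can match zero characters and therefore never fails, so with nonEscaped first the escaped alternative is never tried. You could instead 'assert' a post-condition of sorts at the end of the nonEscaped rule to force a failure; see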

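qi::eps(condition), which fails when the condition is false (qi::eps(false) always fails)
& parser operator (zero-width positive lookahead)
! parser operator (zero-width negative lookahead)
- parser operator (difference: a - b matches a only where b does not match)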

Using Semantic Actions:

assigning to _pass in semantic actions
use a semantic action function object, returning bool (false to fail)

However, in practice this leads to suboptimal parsers (e.g. unnecessary backtracking), so reordering the alternatives remains the better fix.

HTH
