Reputation: 1332
Assuming I have the following rule:
identifier %=
lexeme[
char_("a-zA-Z")
>> -(*char_("a-zA-Z_0-9")
>> char_("a-zA-Z0-9"))
]
;
qi::rule<Iterator, std::string(), Skipper> identifier;
and the following input:
// identifier
This_is_a_valid123_Identifier
As the trace shows, the identifier is parsed properly and the attribute is set, but parsing then resumes (starting with the skipper) just one character after the start of the string:
<identifier>
<try>This_is_a_valid123_I</try>
<skip>
<try>This_is_a_valid123_I</try>
<emptylines>
<try>This_is_a_valid123_I</try>
<fail/>
</emptylines>
<comment>
<try>This_is_a_valid123_I</try>
<fail/>
</comment>
<fail/>
</skip>
<success>his_is_a_valid123_Id</success>
<attributes>[[T, h, i, s, _, i, s, _, a, _, v, a, l, i, d, 1, 2, 3, _, I, d, e, n, t, i, f, i, e, r]]</attributes>
</identifier>
<skip>
<try>his_is_a_valid123_Id</try>
<emptylines>
<try>his_is_a_valid123_Id</try>
<fail/>
</emptylines>
<comment>
<try>his_is_a_valid123_Id</try>
<fail/>
</comment>
<fail/>
</skip>
I've already tried using as_string inside the lexeme expression, which did not help.
Upvotes: 2
Views: 633
Reputation: 393064
I don't see why you complicate the expression. Can you try
identifier %=
char_("a-zA-Z")
>> *char_("a-zA-Z_0-9")
;
qi::rule<Iterator, std::string()> identifier;
This is about the most standard expression you can get. Even if you don't want to allow identifiers ending in _, I'm quite sure you don't want such a trailing _ to be parsed as 'the next token'. In such a case, I'd just add validation after the parse.
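A minimal sketch of what "validation after the parse" could look like, in plain C++ and independent of Spirit. The helper name is_valid_identifier is hypothetical, and it assumes its argument is the std::string attribute produced by the simple rule above, so the middle characters have already been checked by the parser:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of Boost.Spirit.
// Assumes `id` was produced by the rule
//   char_("a-zA-Z") >> *char_("a-zA-Z_0-9")
// so only the first and last characters are inspected here.
bool is_valid_identifier(const std::string& id)
{
    if (id.empty())
        return false;

    // Cast before calling std::isalpha: passing a negative char is undefined.
    const unsigned char first = static_cast<unsigned char>(id.front());

    // Must start with a letter (already guaranteed by the rule, repeated so
    // the function is safe on its own) and must not end in an underscore.
    return std::isalpha(first) != 0 && id.back() != '_';
}
```

You would call this on the attribute once the parse has succeeded and report an error for something like "trailing_", rather than letting the grammar split off the underscore.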
Update, in response to the comment:
Here is the analysis:
First off: -(*x) is a red flag. It is never a useful pattern, because *x already matches an empty sequence; you can't make it "more optional". (In fact, if *x were made to allow partial backtracking as in regular expressions, you'd likely have seen exponential performance or even infinite runtime; "luckily", *x is always greedy in Spirit Qi.)
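This indeed facilitates your bug. Let's look at the parser expression in the OP as lines 1, 2 and 3 (the three char_ lines inside lexeme[]).
1. Line 1 matches T
2. Line 2 greedily matches his_is_a_valid123_Identifier
3. Line 3 cannot match, because the greedy Kleene star on line 2 has already consumed the last character
4. -(...) kicks in and everything after line 1 is backtracked. However: Qi does not roll back changes to a container attribute when it backtracks.
Yes. You guessed it. std::string is a container attribute.
So what you end up with is a successful match of length 1, plus the residue of a failed optional sequence in the attribute.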
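The steps above can be illustrated with a small hand-written model. This is a toy in plain C++, not Spirit code: ToyResult and toy_parse_identifier are hypothetical names, and the function only mimics the behaviour described here (input position rewound on failure of the optional group, container attribute left untouched).

```cpp
#include <cstddef>
#include <string>

// Hypothetical toy model of the OP's rule; it does not use Boost.Spirit.
struct ToyResult {
    std::size_t consumed;   // how far the input position advanced
    std::string attribute;  // what ended up in the container attribute
};

static bool is_alpha(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
static bool is_alnum(char c) { return is_alpha(c) || (c >= '0' && c <= '9'); }
static bool is_word(char c)  { return is_alnum(c) || c == '_'; }

ToyResult toy_parse_identifier(const std::string& in)
{
    ToyResult r{0, ""};
    std::size_t pos = 0;

    // Line 1: char_("a-zA-Z")
    if (pos >= in.size() || !is_alpha(in[pos]))
        return r;                       // the whole rule fails
    r.attribute += in[pos++];

    // Entering -( ... ): remember where to rewind the INPUT to.
    const std::size_t save = pos;

    // Line 2: greedy *char_("a-zA-Z_0-9"), appending straight to the attribute.
    while (pos < in.size() && is_word(in[pos]))
        r.attribute += in[pos++];

    // Line 3: char_("a-zA-Z0-9"). After the greedy loop the next character
    // is never a word character, so this branch cannot succeed.
    if (pos < in.size() && is_alnum(in[pos])) {
        r.attribute += in[pos++];
    } else {
        pos = save;                     // input position is rewound...
        // ...but r.attribute is deliberately NOT restored, mirroring the
        // container-attribute behaviour described above.
    }

    r.consumed = pos;
    return r;
}
```

Feeding it the input from the question gives a consumed length of 1 while the attribute holds the whole identifier, which is the same combination the trace in the question shows.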
Some other backgrounders on how to resolve this kind of backtracking issue:
Upvotes: 4