user3715831

Reputation: 13

Boost Spirit parses input incorrectly

I have the following code:

namespace qi = boost::spirit::qi;
std::string input("d:/std/prte/type.spr#12");
std::string::iterator strbegin = input.begin();
std::pair<std::string, int> p;
qi::parse(
    strbegin, 
    input.end(),
    *qi::char_ >> '#' >> qi::int_,       // parser grammar 
    p
);

I want to get ("d:/std/prte/type.spr", 12), but I got ("d:/std/prte/type.spr#12", 0). What is wrong?

Upvotes: 1

Views: 98

Answers (2)

sehe

Reputation: 392833

You forgot to check the result of parsing. The Live On Coliru demo clearly prints:

Parse failed
Remaining unparsed: 'd:/std/prte/type.spr#12'

Here's the error handling code for future reference:

#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

int main()
{
    std::string input("d:/std/prte/type.spr#12");
    std::string::const_iterator iter = input.begin(), end = input.end();
    std::pair<std::string, int> p;
    bool ok = qi::parse(iter, end,
            *qi::char_ >> '#' >> qi::int_,       // parser grammar 
            p
        );

    if (ok)
        std::cout << "Parse success\n";
    else
        std::cout << "Parse failed\n";


    if (iter != end)
        std::cout << "Remaining unparsed: '" << std::string(iter,end) << "'\n";
}

Of course, what you want is to match everything up to the #, so exclude '#' from the repeated parser:

        *(qi::char_ - '#') >> '#' >> qi::int_,       // parser grammar 

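Also look at BOOST_SPIRIT_DEBUG_NODE to debug your rules (define BOOST_SPIRIT_DEBUG before including the Spirit headers to get the trace output):

typedef std::string::const_iterator It; // iterator type, assumed to match the example above
qi::rule<It, std::string()> srule = *qi::char_;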
qi::rule<It, int()> irule         = qi::int_;
BOOST_SPIRIT_DEBUG_NODES((srule)(irule))

It iter = input.begin(), end = input.end();
bool ok = qi::parse(iter, end,
        srule >> '#' >> irule,       // parser grammar 
        p
    );

This would print:

<srule>
  <try>d:/std/prte/type.spr#12</try>
  <success></success>
  <attributes>[[d, :, /, s, t, d, /, p, r, t, e, /, t, y, p, e, ., s, p, r, #, 1, 2]]</attributes>
</srule>
Parse failed
Remaining unparsed: 'd:/std/prte/type.spr#12'

You can see this Live On Coliru as well.

Upvotes: 1

Cort Ammon

Reputation: 10863

Spirit does not backtrack the way regular expressions do. *char_ will capture as many characters as possible, hence grabbing the entire string. If you check the return value of parse, it should indicate a failed match because, after consuming d:/std/prte/type.spr#12 as *char_, it could not find a #.

The solution should be to change it to *(char_ - '#') >> '#' >> int_
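
To make the difference between the two grammars concrete, here is a rough hand-written model of what they do. It uses only the standard library and is not Spirit code. The function parse_path_and_number and its exclude_hash flag are hypothetical names invented for this sketch, and the integer step is a simplified stand-in for int_ (digits only, no sign or overflow handling).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical model (not part of Boost.Spirit) of the sequence
//   star >> '#' >> int_
// where the star is *char_ when exclude_hash is false and
// *(char_ - '#') when exclude_hash is true.
// Unlike real Spirit attributes, `out` is only written on success.
bool parse_path_and_number(const std::string& s, bool exclude_hash,
                           std::pair<std::string, int>& out)
{
    // Step 1: the star consumes as long as it can and never gives
    // characters back once a later step fails.
    std::size_t pos = 0;
    while (pos < s.size() && !(exclude_hash && s[pos] == '#'))
        ++pos;
    const std::string path = s.substr(0, pos);

    // Step 2: the literal '#'. If the star already ate it, this fails
    // and the whole sequence fails; there is no retry with a shorter match.
    if (pos == s.size() || s[pos] != '#')
        return false;
    ++pos;

    // Step 3: simplified stand-in for int_: one or more decimal digits.
    const std::size_t first_digit = pos;
    int value = 0;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
        value = value * 10 + (s[pos] - '0');
        ++pos;
    }
    if (pos == first_digit)
        return false;

    out = std::make_pair(path, value);
    return true;
}
```

With the input from the question, calling this with exclude_hash set to false fails at step 2, because step 1 has already consumed the whole string. With exclude_hash set to true, step 1 stops in front of the '#', and the call succeeds with ("d:/std/prte/type.spr", 12).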

Upvotes: 2
