stix

Reputation: 1146

Why does this boost::spirit::qi rule fail to parse a string?

I'm writing a parser for PureData patches using Boost.Spirit and C++.

I have the following simple test of parsing canvas records:

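#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;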

struct PdCanvas {
    int topX;
    int topY;
    int wX;
    int wY;
    std::string name;
    int openOnLoad; 
};

BOOST_FUSION_ADAPT_STRUCT(
    PdCanvas,
    (int, topX)
    (int, topY)
    (int, wX) 
    (int, wY)
    (std::string, name)
    (int, openOnLoad));


template <typename Iterator>
struct PdCanvasGrammar : qi::grammar<Iterator, PdCanvas(), ascii::space_type> {
    PdCanvasGrammar() : PdCanvasGrammar::base_type(canvasRule){
        
        canvasRule = qi::lit("#N canvas") >> qi::int_ >> qi::int_ >> qi::int_ >> qi::int_ >> +(qi::char_ - qi::space) >> qi::int_ >> ";";        

    }
    qi::rule<Iterator, PdCanvas(), ascii::space_type> canvasRule; 
   
};



int main(int argc, char** argv)
{
    if(argc != 2)
    {
        std::cout << "Usage: "  <<argv[0] << " <PatchFile>" << std::endl;
        exit(1); 
    }

    std::ifstream inputFile(argv[1]); 
    std::string inputString(std::istreambuf_iterator<char>(inputFile), {}); 

    PdCanvas root;
    PdCanvasGrammar<std::string::iterator> parser;
    std::cout << "Loaded file:\n " << inputString << std::endl;

    bool success = qi::phrase_parse(inputString.begin(), inputString.end(), parser, boost::spirit::ascii::space, root); 
    std::cout << "Success: " << success << std::endl;

    
    return 0; 

}

As one can see, the format of a canvas record is

#N canvas <int> <int> <int> <int> <string> <int>;

And that's what the rule should expect, but when I try to parse the following:

#N canvas 0 0 400 300 moo 1;

qi::phrase_parse returns false, indicating an unsuccessful parse.

As an aside, there is another form of the canvas grammar in PD, specifically for the root, which is of the form:

#N canvas <int> <int> <int> <int> <int>;

I have successfully parsed that form using a different rule, so I assume the problem comes from parsing the string in the middle of the integers.

So my question is: what is wrong with my qi::rule, and how can I change it so that it parses this input correctly?

Upvotes: 1

Views: 139

Answers (1)

sehe

Reputation: 392911

Two things:

Greedy Parsing

Note that PEG grammars are "greedy left-to-right", so you will want to make sure that the int_ >> ";" is not parsed into the name:

#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;

struct PdCanvas { int topX, topY, wX, wY, openOnLoad; std::string name; };
BOOST_FUSION_ADAPT_STRUCT(PdCanvas, topX, topY, wX, wY, name, openOnLoad);

template <typename Iterator> struct PdCanvasGrammar : qi::grammar<Iterator, PdCanvas(), qi::space_type> {
    PdCanvasGrammar() : PdCanvasGrammar::base_type(canvasRule) {

        canvasRule =                            //
            "#N canvas" >> qi::int_ >> qi::int_ //
            >> qi::int_ >> qi::int_             //
            >> *(qi::char_ - (qi::int_ >> ';')) //
            >> qi::int_ >> ';'                  //
            ;
    }

    qi::rule<Iterator, PdCanvas(), qi::space_type> canvasRule;
};

int main() {
    PdCanvasGrammar<std::string::const_iterator> const parser;

    for (std::string const input :
         {
             "#N canvas 0 0 400 300 moo 1;",
             "#N canvas -10 -10 390 290 42 answers LtUaE -9;",
             R"(#N canvas -10 -10 390 290 To be, or not to be, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take Arms against a Sea of troubles,

-9;)",
         }) //
    {
        // std::cout << "Input:\n " << quoted(input) << std::endl;

        if (PdCanvas root; phrase_parse(input.begin(), input.end(), parser, qi::space, root))
            std::cout << "Success -> " << boost::fusion::as_vector(root) << "\n";
        else
            std::cout << "Failed\n";
    }
}

Prints:

Success -> (0 0 400 300 moo 1)
Success -> (-10 -10 390 290 42answersLtUaE -9)
Success -> (-10 -10 390 290 Tobe,ornottobe,thatisthequestion:Whether'tisnoblerinthemindtosufferTheslingsandarrowsofoutrageousfortune,OrtotakeArmsagainstaSeaoftroubles, -9)

Skipping Whitespace

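I chose some outrageous "names" on purpose: the output above shows their whitespace being silently dropped.

Your rule has a skipper: space_type. By definition, this means that +(qi::char_ - qi::space) is equivalent to +qi::char_, because the skipper consumes whitespace before the expression ever sees it. In your rule the name therefore swallows the rest of the input, and the trailing qi::int_ >> ";" has nothing left to match.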

To fix this, make sure that the space-sensitive expression does not execute under the skipper; see "Boost spirit skipper issues".

Using lexeme[] here is the quickest solution:

    canvasRule =                                        //
        "#N canvas" >> qi::int_ >> qi::int_             //
        >> qi::int_ >> qi::int_                         //
        >> qi::lexeme[*(qi::char_ - (qi::int_ >> ';'))] //
        >> qi::int_ >> ';'                              //
        ;

Prints Live:

Success -> (0 0 400 300 moo  1)
Success -> (-10 -10 390 290 42 answers LtUaE  -9)
Success -> (-10 -10 390 290 To be, or not to be, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take Arms against a Sea of troubles,

 -9)

To also disallow space in the name, use qi::graph instead of qi::char_:

Prints Live:

Success -> (0 0 400 300 moo 1)
Failed
Failed

Bonus Tips

To make things easier to maintain, debug (!!) and also express intent, I'd

  • restructure the grammar using rules - some of which can be implicit lexemes
  • also encapsulate the skipper (the caller should not be dictating that)
  • make sure the entire input is matched (qi::eoi)

// #define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;

struct PdCanvas { int topX, topY, wX, wY, openOnLoad; std::string name; };
BOOST_FUSION_ADAPT_STRUCT(PdCanvas, topX, topY, wX, wY, name, openOnLoad);

template <typename Iterator> struct PdCanvasGrammar : qi::grammar<Iterator, PdCanvas()> {
    PdCanvasGrammar() : PdCanvasGrammar::base_type(start) {
        using namespace qi;
        start      = skip(space)[canvasRule >> eoi];
        name       = +graph;
        canvasRule = "#N canvas" >> int_ >> int_ >> int_ >> int_ >> name >> int_ >> ';';

        BOOST_SPIRIT_DEBUG_NODES((start)(canvasRule)(name))
    }

  private:
    qi::rule<Iterator, PdCanvas()>                 start;
    qi::rule<Iterator, PdCanvas(), qi::space_type> canvasRule;
    qi::rule<Iterator, std::string()> name;
};

int main() {
    PdCanvasGrammar<std::string::const_iterator> const parser;

    for (std::string const input :
         {
             "#N canvas 0 0 400 300 foo 1;",
             "#N canvas 0 0 400 300 bar 1;",
             "#N canvas 0 0 400 300 qux1 1;",
             "#N canvas 0 0 400 300 qux23 1;",
             "#N canvas 0 0 400 300 qux23;funky 1;",
             "#N canvas 0 0 400 300 trailing 1; junk",
         }) //
    {
        std::cout << "Input: " << quoted(input) << std::endl;

        if (PdCanvas root; parse(input.begin(), input.end(), parser, root))
            std::cout << "    Success -> " << boost::fusion::as_vector(root) << "\n";
        else
            std::cout << "    Failed\n";
    }
}

Prints

Input: "#N canvas 0 0 400 300 foo 1;"
    Success -> (0 0 400 300 foo 1)
Input: "#N canvas 0 0 400 300 bar 1;"
    Success -> (0 0 400 300 bar 1)
Input: "#N canvas 0 0 400 300 qux1 1;"
    Success -> (0 0 400 300 qux1 1)
Input: "#N canvas 0 0 400 300 qux23 1;"
    Success -> (0 0 400 300 qux23 1)
Input: "#N canvas 0 0 400 300 qux23;funky 1;"
    Success -> (0 0 400 300 qux23;funky 1)
Input: "#N canvas 0 0 400 300 trailing 1; junk"
    Failed

Or, with debug enabled (BOOST_SPIRIT_DEBUG defined), e.g.:

Input: "#N canvas 0 0 400 300 foo 1;"
<start>
  <try>#N canvas 0 0 400 30</try>
  <canvasRule>
    <try>#N canvas 0 0 400 30</try>
    <name>
      <try>foo 1;</try>
      <success> 1;</success>
      <attributes>[[f, o, o]]</attributes>
    </name>
    <success></success>
    <attributes>[[0, 0, 400, 300, [f, o, o], 1]]</attributes>
  </canvasRule>
  <success></success>
  <attributes>[[0, 0, 400, 300, [f, o, o], 1]]</attributes>
</start>
    Success -> (0 0 400 300 foo 1)

Upvotes: 1
