boost::spirit::qi performance

Question

I have the following snippet.

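#include <cassert>
#include <chrono>
#include <iostream>
#include <sstream>

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/classic.hpp>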

namespace qi = boost::spirit::qi;
namespace classic = boost::spirit::classic;

template <typename T>
void output_time(const T& end, const T& begin)
{
   // Print the elapsed time in whole seconds.
   std::cout << std::chrono::duration_cast<std::chrono::seconds>(
         end - begin).count() << std::endl;
}

template <typename Iterator>
struct qi_grammar : public qi::grammar<Iterator>
{
   qi_grammar():qi_grammar::base_type(rule_)
   {
      rule_ = *string_;
      string_ = qi::char_('"') >> *(qi::char_ - '"') >> qi::char_('"');
   }
   qi::rule<Iterator> rule_;
   qi::rule<Iterator> string_;
};

template <typename Iterator>
struct classic_grammar : public classic::grammar<classic_grammar<Iterator>>
{
   template <typename Scanner>
   struct definition
   {
      definition(const classic_grammar&)
      {
         rule = *string_;
         string_ = classic::ch_p('"') >> *(classic::anychar_p - '"') >> classic::ch_p('"');
      }
      classic::rule<Scanner> rule, string_;
      const classic::rule<Scanner>& start() const { return rule; }
   };
};

template <typename Iter>
void parse(Iter first, Iter last, const qi_grammar<Iter>& prs)
{
   auto start = std::chrono::system_clock::now();
   for (int i = 0; i < 100; ++i)
   {
      Iter next = first;
      if (!qi::parse(next, last, prs) || next != last)
      {
         assert(false);
      }
   }
   auto finish = std::chrono::system_clock::now();
   output_time(finish, start);
}

template <typename Iter>
void parse_c(Iter first, Iter last, const classic_grammar<Iter>& prs)
{
   auto start = std::chrono::system_clock::now();
   for (int i = 0; i < 100; ++i)
   {
      auto info = classic::parse(first, last, prs);
      if (!info.hit) assert(false);
   }
   auto finish = std::chrono::system_clock::now();
   output_time(finish, start);
}

int main()
{
   qi_grammar<std::string::const_iterator> qi_lexeme;
   classic_grammar<std::string::const_iterator> classic_lexeme;
   std::stringstream ss;
   for (int i = 0; i < 1024 * 500; ++i)
   {
      ss << "\"name\"";
   }
   const std::string s = ss.str();
   std::cout << "Size: " << s.size() << std::endl;
   std::cout << "Qi" << std::endl;
   parse(s.begin(), s.end(), qi_lexeme);
   std::cout << "Classic" << std::endl;
   parse_c(s.begin(), s.end(), classic_lexeme);
}

The results are:

forever@pterois:~/My_pro1/cpp_pro$ ./simple_j 
Size: 3072000
Qi
0
Classic
1

So Qi parses faster than Classic. But when I change the attribute of the string_ rule to std::string() (i.e. qi::rule<Iterator, std::string()> string_;), I get:

forever@pterois:~/My_pro1/cpp_pro$ ./simple_j 
Size: 3072000
Qi
19
Classic
1

It's very, very slow. Am I doing something wrong? Thanks.

Compiler: gcc 4.6.3. Boost: 1.48.0. Flags: -std=c++0x -O2. On LWS the results are the same.

Using semantic actions for char_, i.e.

string_ = qi::char_('"') >> *(qi::char_[boost::bind(&some_f, _1)] - '"')
 >> qi::char_('"')[boost::bind(&some_clear_f, _1)];

improves performance, but I'm also looking for another solution, if one exists.

sehe · Accepted Answer

I think I answered a very similar question once before on SO. Sadly, I can't find it.

In short, you might prefer to use iterators into the source data instead of allocating (and copying) strings on each match.

When using

qi::rule<Iterator, boost::iterator_range<Iterator>()> string_;
string_ = qi::raw [ qi::char_('"') >> *(qi::char_ - '"') >> qi::char_('"') ];

I got (with a considerably larger, 16x, data set):

Size: 49152000
Qi
12
Classic
11

In fact, after changing the rule itself to

  string_ = qi::raw [ qi::lit('"') >> *~qi::char_('"') >> '"' ];

I got

Size: 49152000
Qi
7
Classic
11

So... that's pretty decent, I suppose. See it on LWS: http://liveworkspace.org/code/opA5s$0

For completeness, you can obviously get a string from the iterator_range by doing something like:
const std::string dummy("hello world");
auto r = boost::make_iterator_range(begin(dummy), end(dummy));
std::string asstring(r.begin(), r.end());

The trick is to delay actual string construction to when it's needed. You might want to have this trick happen automatically. This is what Spirit Lex does for token attributes. You might want to look into that.
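The same idea can be shown without Spirit at all. The sketch below is a hand-written scanner (not Qi, and not what Spirit Lex does internally) that records only where each quoted token begins and ends, and builds a std::string only when a caller asks for one. Span, scan_quoted and materialize are illustrative names, not part of any library.

```cpp
#include <string>
#include <utility>
#include <vector>

typedef std::string::const_iterator It;
typedef std::pair<It, It> Span;  // half-open range [first, second) into the input

// Scan a sequence of "..." tokens and record each one as a pair of
// iterators. No characters are copied. Returns false on malformed input.
bool scan_quoted(It first, It last, std::vector<Span>& out)
{
    while (first != last) {
        if (*first != '"')
            return false;               // every token must open with a quote
        It begin = first++;
        while (first != last && *first != '"')
            ++first;                    // skip the token body
        if (first == last)
            return false;               // unterminated token
        ++first;                        // consume the closing quote
        out.push_back(Span(begin, first));
    }
    return true;
}

// Pay for the allocation and copy only when a string is really needed.
std::string materialize(const Span& s)
{
    return std::string(s.first, s.second);
}
```

A caller that only counts or filters tokens never calls materialize, so it never allocates per token; that is the cost the std::string() attribute was adding on every match.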
