joshu

Reputation: 463

Spirit Grammar For Path Verification

I am trying to write a simple grammar using boost spirit to validate that a string is a valid directory. I am using these tutorials since this is the first grammar I have attempted: http://www.boost.org/doc/libs/1_36_0/libs/spirit/doc/html/spirit/qi_and_karma.html http://www.boost.org/doc/libs/1_48_0/libs/spirit/doc/html/spirit/qi/reference/directive/lexeme.html http://www.boost.org/doc/libs/1_44_0/libs/spirit/doc/html/spirit/qi/tutorials/employee___parsing_into_structs.html

Currently, what I have come up with is:

// I want these to be valid matches
std::string valid1 = "./";
// This string could be any number of sub dirs i.e. /home/user/test/ is valid
std::string valid2 = "/home/user/";

using namespace boost::spirit::qi;
bool match = phrase_parse(valid1.begin(), valid1.end(), lexeme[
    (char_('.') | char_('/')) >> +char_ >> char_('/')],
    ascii::space);
if (match)
{
    std::cout << "Match!" << std::endl;
} 

However, this matches nothing. I had a few ideas as to why, but after some research I haven't found the answers. For example, I assume +char_ will consume all of the remaining characters? If so, how can I check that a sequence of characters ends with /?

Essentially, my thinking behind the code above was that directories starting with . or / should be valid, and the last character has to be a /. Could someone help me with my grammar, or point me to an example closer to what I want to do? This is purely an exercise to learn how to use Spirit.

Edit: I have got the parser to match using:

bool match = phrase_parse(valid1.begin(), valid1.end(), lexeme[
    (char_('.') | char_('/')) >> *(+char_ >> char_('/'))],
    ascii::space);
if (match)
{
    std::cout << "Match!" << std::endl;
} 

I'm not sure whether that is correct, or whether it matches for some other reason. Also, should ascii::space be used here? I read in a tutorial that it makes the parser whitespace-agnostic, i.e. a b is equivalent to ab, which I wouldn't want in a path name. If it isn't the correct thing to use, what would be?

SSCCE:

#include <string>
#include <iostream>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_char.hpp>
#include <boost/spirit/include/qi_eoi.hpp>

int main()
{
  namespace qi = boost::spirit::qi;
  std::string valid1 = "./";
  std::string valid2 = "/home/blah/";
  bool match = qi::parse(valid2.begin(), valid2.end(), &((qi::lit("./")|'/') >> (+~qi::char_('/') % '/') >> qi::eoi));

  if (match)
  {
    std::cout << "Match" << std::endl;
  }
}

Upvotes: 1

Views: 88

Answers (1)

sehe

Reputation: 392979

If you don't want to ignore space differences (and for a path you shouldn't), use parse instead of phrase_parse. Wrapping the whole expression in lexeme inhibits the skipper again anyway (so you were just stripping leading/trailing space). See also stackoverflow.com/questions/17072987/boost-spirit-skipper-issues/17073965#17073965

Use char_("ab") instead of char_('a')|char_('b').

*char_ matches everything. You may have meant *~char_('/').

I'd suggest something like

 bool ok = qi::parse(b, f, &(lit("./")|'/') >> (*~char_('/') % '/'));

This won't expose the matched input. Add raw[] around it to achieve that.

Add > qi::eoi to assert all of the input was consumed.

Upvotes: 2
