tgoodhart

Reputation: 3266

Parser description with repetition "meta-token"

Is there a well-known parser description language (like Backus-Naur) that allows for repetitions where the number of repetitions is extracted from the token stream? For bonus points, are there any C++ libraries that support this syntax?

Example:

Let's call the "meta-token" #. I'm looking for a description language that would treat a production rule of the following form:

RULE = # EXPRESSION

As:

RULE = '1' EXPRESSION
     | '2' EXPRESSION EXPRESSION
     | '3' EXPRESSION EXPRESSION EXPRESSION
     | '4' EXPRESSION EXPRESSION EXPRESSION EXPRESSION
     | ...

Note that the counts are actual character literals. This is in contrast to augmented Backus-Naur form, where we can have rules of the form:

RULE = 2*3EXPRESSION

Which are equivalent to:

RULE = EXPRESSION EXPRESSION
     | EXPRESSION EXPRESSION EXPRESSION

Response to dgarant:

I'm not sure that's quite what I want. I'm thinking something along the following lines:

int i;

bool r = phrase_parse(first, last,
     (
       // pass the count by reference so repeat reads it at parse time
       int_[phoenix::ref(i) = _1] >> repeat(phoenix::ref(i))[/*EXPRESSION*/]
     ),
     space);

More importantly, though, I was hoping for some formalized schema that could describe this idea. On a side note, Spirit does take some getting used to, but it is pretty awesome. I'm a fan.

Upvotes: 0

Views: 120

Answers (1)

Dan Garant

Reputation: 733

I can't think of a formal language which allows rule = # EXPRESSION to specify repetition where # is a character literal. In my opinion, it shouldn't be a problem to abuse the formal language specification provided you make a comment to clarify what you mean. If you really want to stick to standards, you could do the following in ABNF:

rule = "3" 3EXPRESSION
     / "4" 4EXPRESSION
     / "5" 5EXPRESSION

It doesn't look exactly like what you want but it gets the job done.

I believe boost::spirit::qi can suit your needs for parsing. Have a look at the repeat directive.

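Spirit would allow you to write rules such as the following, where qi::_a is a local variable of the rule (the rule type must declare it with qi::locals<int>):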

rule = char_("'") >> int_[qi::_a = qi::_1] >> char_("'") >> repeat(qi::_a)[EXPRESSION]

If you're interested in determining the number of repetitions that were parsed, you can append another action to the rule: [phoenix::ref(pCt) = qi::_a]. In an inline parser there is no rule to hold the local qi::_a, so store the count in the external variable directly and pass it to repeat by reference:

std::vector<double> v;
int pCt = 0;

bool r = phrase_parse(first, last,
         (
           // parse a quoted count, then that many double expressions
           char_("'") >> int_[phoenix::ref(pCt) = _1] >> char_("'")
             >> repeat(phoenix::ref(pCt))[double_[push_back(phoenix::ref(v), _1)]]
         ),
         space);
if (r)  // report only if the parse was successful
    std::cout << "Parsed " << pCt << " elements" << std::endl;

The style of Spirit::Qi parsers takes a while to get used to, but they're very powerful since you can integrate them directly into your code.
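
For comparison, here is a minimal sketch of the same count-prefixed rule written by hand with only the standard library. It is not Spirit code and not a formal notation. The function name parse_counted is hypothetical, and EXPRESSION is assumed to be a whitespace-separated double and the count an unquoted integer. It just shows the idea of reading the count from the input and then requiring exactly that many expressions.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper for RULE = # EXPRESSION.
// Assumption: EXPRESSION is a whitespace-separated double and the count
// is a plain, unquoted, non-negative integer at the start of the input.
// Returns true and fills `out` on success; returns false if the count is
// missing or negative, or if fewer expressions follow than it promised.
bool parse_counted(const std::string& input, std::vector<double>& out) {
    std::istringstream in(input);

    int count = 0;
    if (!(in >> count) || count < 0) {
        return false;  // no usable repetition count
    }

    std::vector<double> values;
    for (int i = 0; i < count; ++i) {
        double value = 0.0;
        if (!(in >> value)) {
            return false;  // ran out of expressions before reaching count
        }
        values.push_back(value);
    }

    out.swap(values);  // commit results only after the whole rule matched
    return true;       // any trailing input is left unconsumed
}
```

Calling parse_counted("3 1.5 2.5 3.5", v) should succeed and leave three values in v, while parse_counted("2 1.0", v) should fail because only one expression follows the count.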

Upvotes: 0
