0xFFFFFFFF

Reputation: 852

Looking for a parser grammar for chained postfix expressions, preferably using boost::spirit::qi

I am trying to use boost::spirit::qi to parse an expression.

The expression is simple. It can be:

  1. an id, like x
  2. a member of an object, like obj.x
  3. an element of an array, like arr[2]
  4. the result of a function call, like func(x, y)

A member of an object can itself be an array or a function, so x.y[2] and x.y() are legal.

A function result can be an array or an object, so func(x,y).value and func(x)[4] are legal too.

An array element can be an object or a function, so arr[5].y and arr[3](x, y) are legal.

Combined together, the following expression should be legal:

x[1]().y(x, y, x.z, z[4].y)().q[2][3].fun()[5].x.y.z

The [...], (...) and . operators all have the same precedence and associate from left to right.

My grammar looks like this:

expression
    = postfix_expr
    | member_expr
    ;

postfix_expr = elem_expr | call_expr | id;
elem_expr = postfix_expr >> "[" >> qi::int_ >> "]";
call_expr = postfix_expr >> "(" >> expression_list >> ")";
member_expr = id >> *("." >> member_expr);

expression_list
    = -(expression % ",")
    ;

But it always crashes. I suspect there is an infinite loop somewhere.

Any suggestions on how to parse this grammar?

EDIT, FOLLOW-UP QUESTION: Thanks cadrian, it works!

Now expression parses correctly, but I want to introduce a new rule, ref_exp. It is an expression too, but it must not end with (), because a function result cannot appear on the left-hand side of an assignment.

My definition is:

    ref_exp
        = id
        | (id >> *postfix_exp >> (memb_exp | elem_exp))
        ;

    postfix_exp
        = memb_exp
        | elem_exp
        | call_exp
        ;

    memb_exp = "." >> id;
    elem_exp = "[" >> qi::uint_ >> "]";
    call_exp = ("(" >> expression_list >> ")");

But boost::spirit::qi cannot parse this. I think the reason is that (memb_exp | elem_exp) is also part of postfix_exp, so *postfix_exp consumes every suffix and nothing is left for the final (memb_exp | elem_exp). How can I stop it from consuming everything, so that the very last suffix is left to match (memb_exp | elem_exp)?

Examples of ref_exp: x, x.y, x()[12][21], f(x, y, z).x[2]

Not ref_exp: f(), x.y(), x[12]()

Upvotes: 3

Views: 174

Answers (2)

0xFFFFFFFF

Reputation: 852

I think I have finally solved this problem, but the solution has a side effect: it changes the operator associativity.

    lvalue_exp
        = id >> -(ref_exp)
        ;

    ref_exp
        = (postfix_exp >> ref_exp)
        | memb_exp
        | elem_exp
        ;

    postfix_exp
        = call_exp
        | memb_exp
        | elem_exp
        ;

    memb_exp
        = ("." >> id)
        ;

    elem_exp
        = ("[" >> qi::uint_ >> "]")
        ;

    call_exp
        = ("(" >> expression_list >> ")")
        ;

With this grammar, the expression f().y()[0] is parsed like this:

  1. f and the ref_exp ().y()[0]
  2. ().y()[0] parsed as ().y() and [0]
  3. ().y() parsed as () and .y()
  4. .y() parsed as .y and ()

If I do not distinguish lvalues, f().y()[0] is parsed like this:

  1. f and ().y()[0]
  2. () and .y()[0]
  3. .y and ()[0]
  4. () and [0]

So I will use the second grammar and check whether the expression is a reference when I generate the AST.
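The idea of parsing every suffix uniformly and checking for a reference afterwards can be sketched without Boost. This is only an illustration, not the code I actually use: Kind, Node, parse_chain and is_lvalue are made-up names, whitespace outside call arguments is not handled, and call arguments are kept as raw text instead of being parsed. The loop wraps the node built so far in each new suffix, so the tree comes out left-associative, and the lvalue check only looks at the outermost node.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

enum class Kind { Id, Member, Elem, Call };

struct Node {
    Kind kind;
    std::string text;            // identifier, member name, index digits, or raw argument text
    std::unique_ptr<Node> base;  // expression the suffix applies to (null for Id)
};

// Reads [A-Za-z_][A-Za-z0-9_]* starting at pos; returns "" if there is none.
static std::string read_id(const std::string& s, std::size_t& pos) {
    std::size_t start = pos;
    if (pos < s.size() && (std::isalpha(static_cast<unsigned char>(s[pos])) || s[pos] == '_')) {
        ++pos;
        while (pos < s.size() &&
               (std::isalnum(static_cast<unsigned char>(s[pos])) || s[pos] == '_')) ++pos;
    }
    return s.substr(start, pos - start);
}

// Wraps the expression built so far in a new suffix node.
static std::unique_ptr<Node> wrap(Kind kind, const std::string& text, std::unique_ptr<Node> base) {
    return std::unique_ptr<Node>(new Node{kind, text, std::move(base)});
}

// Parses id followed by any number of suffixes; returns null on a syntax error.
// Assumption: call arguments are not validated, only matched up to the closing ')'.
std::unique_ptr<Node> parse_chain(const std::string& s) {
    std::size_t pos = 0;
    std::string name = read_id(s, pos);
    if (name.empty()) return nullptr;
    std::unique_ptr<Node> node = wrap(Kind::Id, name, nullptr);
    while (pos < s.size()) {
        char c = s[pos];
        if (c == '.') {
            ++pos;
            std::string member = read_id(s, pos);
            if (member.empty()) return nullptr;
            node = wrap(Kind::Member, member, std::move(node));
        } else if (c == '[') {
            std::size_t start = ++pos;
            while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) ++pos;
            if (pos == start || pos >= s.size() || s[pos] != ']') return nullptr;
            std::string index = s.substr(start, pos - start);
            ++pos;  // consume ']'
            node = wrap(Kind::Elem, index, std::move(node));
        } else if (c == '(') {
            std::size_t start = ++pos;
            int depth = 1;
            while (pos < s.size() && depth > 0) {
                if (s[pos] == '(') ++depth;
                else if (s[pos] == ')') --depth;
                ++pos;
            }
            if (depth != 0) return nullptr;  // unbalanced parentheses
            node = wrap(Kind::Call, s.substr(start, pos - 1 - start), std::move(node));
        } else {
            return nullptr;
        }
    }
    return node;
}

// An expression can be assigned to only if its outermost operation is not a call.
bool is_lvalue(const Node& root) { return root.kind != Kind::Call; }
```

For f().y()[0] the outermost node is the [0] element access, its base is the y() call, and so on down to f, which is the left-to-right grouping I wanted.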

Thanks @cadrian

Upvotes: 2

cadrian

Reputation: 7376

boost::spirit::qi generates recursive-descent (top-down) parsers, so your grammar must not be left-recursive.

See this question.

Here you definitely have a left-recursive grammar: postfix_expr -> elem_expr -> postfix_expr
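To see why that cycle is fatal, here is a small stand-alone sketch (plain C++, not Spirit) that models only the first alternative of postfix_expr. Trace, run_left_recursive and the depth limit are made up for the demonstration; a real Qi parser has no such guard and simply overflows the stack. The point is that postfix_expr re-enters itself through elem_expr before a single character has been consumed.

```cpp
#include <cstddef>
#include <string>

// Bookkeeping for the demonstration; all names are hypothetical.
struct Trace {
    std::string input;
    std::size_t pos;           // current read position
    int depth;                 // current nesting of postfix_expr calls
    int limit;                 // artificial guard so the demo terminates
    bool guard_hit;            // set when the guard stops the recursion
    std::size_t pos_at_guard;  // read position at the moment the guard fired
};

bool postfix_expr(Trace& t);

// elem_expr = postfix_expr >> "[" >> ...
bool elem_expr(Trace& t) {
    if (!postfix_expr(t)) return false;  // recursion happens before any input is read
    if (t.pos >= t.input.size() || t.input[t.pos] != '[') return false;
    ++t.pos;
    return true;  // index and "]" are omitted in this sketch
}

// postfix_expr = elem_expr | ...   (only the first alternative is modeled)
bool postfix_expr(Trace& t) {
    if (t.depth >= t.limit) {
        t.guard_hit = true;
        t.pos_at_guard = t.pos;
        return false;
    }
    ++t.depth;
    bool ok = elem_expr(t);
    --t.depth;
    return ok;
}

Trace run_left_recursive(const std::string& input, int limit) {
    Trace t{input, 0, 0, limit, false, 0};
    postfix_expr(t);
    return t;
}
```

Calling run_left_recursive("x[2]", 1000) hits the guard with the read position still at 0, which is exactly the endless descent that crashes the real parser.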

EDIT One way to fix this.

As I see it, your expression is an id followed by a chain of possible suffixes: [], () and .

expression = id >> *cont_expr;
cont_expr = elem_expr | call_expr | member_expr;
elem_expr = "[" >> qi::int_ >> "]";
call_expr = "(" >> expression_list >> ")";
member_expr = "." >> expression;
expression_list = -(expression % ",");
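The same shape can be checked with a hand-written recognizer, shown here as a rough sketch in plain C++ rather than Spirit. SuffixChainParser and is_expression are made-up names, whitespace skipping is added by hand, and member access is modeled as "." followed by an id (an assumption; the suffix loop then picks up whatever follows). Every rule consumes at least one character before it recurses, so there is no left recursion.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Recognizer for:
//   expression      = id >> *cont_expr
//   cont_expr       = "[" int "]" | "(" expression_list ")" | "." id
//   expression_list = -(expression % ",")
class SuffixChainParser {
public:
    explicit SuffixChainParser(const std::string& text) : s_(text), pos_(0) {}

    // True if the whole input is exactly one expression.
    bool parse_all() {
        pos_ = 0;
        if (!expression()) return false;
        skip_ws();
        return pos_ == s_.size();
    }

private:
    std::string s_;
    std::size_t pos_;

    bool at(std::size_t i, int (*pred)(int)) const {
        return i < s_.size() && pred(static_cast<unsigned char>(s_[i])) != 0;
    }
    void skip_ws() { while (at(pos_, std::isspace)) ++pos_; }

    bool lit(char c) {
        skip_ws();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    bool id() {
        skip_ws();
        if (!at(pos_, std::isalpha) && !(pos_ < s_.size() && s_[pos_] == '_')) return false;
        while (at(pos_, std::isalnum) || (pos_ < s_.size() && s_[pos_] == '_')) ++pos_;
        return true;
    }
    bool integer() {
        skip_ws();
        std::size_t start = pos_;
        while (at(pos_, std::isdigit)) ++pos_;
        return pos_ > start;
    }
    // Optional comma-separated list; always succeeds, possibly matching nothing.
    bool expression_list() {
        std::size_t save = pos_;
        if (!expression()) { pos_ = save; return true; }
        for (;;) {
            std::size_t before_comma = pos_;
            if (!lit(',')) return true;
            if (!expression()) { pos_ = before_comma; return true; }  // leave the stray ','
        }
    }
    // One suffix; restores the position if no alternative matches.
    bool cont_expr() {
        std::size_t save = pos_;
        if (lit('[') && integer() && lit(']')) return true;
        pos_ = save;
        if (lit('(') && expression_list() && lit(')')) return true;
        pos_ = save;
        if (lit('.') && id()) return true;
        pos_ = save;
        return false;
    }
    bool expression() {
        if (!id()) return false;
        while (cont_expr()) {}
        return true;
    }
};

bool is_expression(const std::string& text) {
    return SuffixChainParser(text).parse_all();
}
```

With this, is_expression accepts the long example from the question and rejects inputs such as x[ or f(x,).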

EDIT 2 If you want to be able to force precedence – for instance with parentheses:

expression = prefix_expr >> *cont_expr;
prefix_expr = id | par_expr;
par_expr = "(" >> expression >> ")";

This way you could even write expressions like x.(y[3].foo)[5](fun(), foo(bar)) – if that makes sense.

EDIT 3 Answering your comment here.

You need the left side of assignments not to be functions. That means that you have a specific suffix for left-hand expressions. Let's call that rule ref_exp as in your comment.

ref_exp = id >> -( *cont_expr >> cont_ref );
cont_ref = elem_expr | member_expr;

Upvotes: 3
