Reputation: 61
I am using LALR(1) parsing from lark-parser library. I have written a grammar to parse an ORM like language. An example of my language is pasted below:
Table1
.join(table=Table2, left_on=[column1], right_on=[column_2])
.group_by(col=[column1], agg=[sum])
.join(table=Table3, left_on=[column1], right_on=[column_3])
.some_column
My grammar is:
start: [CNAME (object)*]
object: "." (CNAME|operation)
operation: [(join|group) (object)*]
join: "join" "(" [(join_args ",")* join_args] ")"
join_args: "table" "=" CNAME
| "left_on" "=" list
| "right_on" "=" list
group: "group_by" "(" [(group_args ",")* group_args] ")"
group_args: "col" "=" list
| "agg" "=" list
list: "[" [CNAME ("," CNAME)*] "]"
%import common.CNAME //# Variable name declaration
%import common.WS //# White space declaration
%ignore WS
When I parse the language, it gets parsed correctly but I get shift-reduce conflict warning. I believe that this is due to collision at object: "." (CNAME|operation)
, but I may be wrong. Is there any other way to write this grammar ?
Upvotes: 0
Views: 304
Reputation: 241921
I think you should replace
operation: [(join|group) (object)*]
With just
operation: join | group
You've already allowed repetition of object
in
start: [CNAME (object)*]
So also allowing object*
at the end of operation
is ambiguous, leading to the conflict.
Personally, I would have gone for something like:
start : [ CNAME ("." qualifier)* ]
qualifier: CNAME | join | group
Because I don't see the point of object
. But that's just a minor style difference.
Upvotes: 3