user3139545

Reputation: 7394

Unable to write a parser whose AST can be turned into Clojure code

Given the example "~a{b=1}&(a{b=1}|a{b=1})|a{b=1}|a{b=1}", I have written the following parser using Instaparse:

((insta/parser
  "
S = (group | exp)+
group = '~'? <'('> exp+ <')'> op?
exp = '~'? path <'='> (v | r) <'}'> op?
path = (p <'{'>)* p
op = '|' | '&'
<p> = #'[a-z]'
v = #'[a-z0-9]'
r = <'\\''> #'[^\\']*' <'\\''>
")
 "~a{b=1}&(a{b=1}|a{b=1})|a{b=1}|a{b=1}")

Running the above produces the following output:

[:S
 [:exp "~" [:path "a" "b"] [:v "1"] [:op "&"]]
 [:group
  [:exp [:path "a" "b"] [:v "1"] [:op "|"]]
  [:exp [:path "a" "b"] [:v "1"] [:op "|"]]
  [:exp [:path "a" "b"] [:v "1"]]
  [:op "|"]]
 [:exp [:path "a" "b"] [:v "1"]]]

However, I am having a very hard time writing a transformation from this output into Clojure expressions. For a more straightforward transformation I would need something more like:

[:S
 [:op "|"
  [:op "&"
   [:exp "~" [:path "a" "b"] [:v "1"]]
   [:group
    [:op "|"
     [:exp [:path "a" "b"] [:v "1"]]
     [:op "|"
      [:exp [:path "a" "b"] [:v "1"]]
      [:exp [:path "a" "b"] [:v "1"]]]]]]
  [:exp [:path "a" "b"] [:v "1"]]]]

Given this structure, it would be much easier to transform the tree into Clojure.

How would you write a generic parser that can parse structures like the one above (and similar ones) into an AST that can then be turned into Clojure code using a simple insta/transform?

Upvotes: 1

Views: 84

Answers (1)

cfrick

Reputation: 37063

I'd follow the example from Operator-precedence parser, with one grammar rule per precedence level. This will give you and/or nodes that wrap only a "single" term, but those should be easy to trim/optimize away in your following steps.

S = expr
<expr> = and
and = or ( <'&'> or ) *
or = primary ( <'|'> primary ) *
<primary> = ( group | not | term )
<group> = <'('> expr <')'>
not = <'~'> term
term = #'a\\{b=[0-9]\\}'

E.g.

((insta/parser
   "
   S = expr
   <expr> = and
   and = or ( <'&'> or ) *
   or = primary ( <'|'> primary ) *
   <primary> = ( group | not | term )
   <group> = <'('> expr <')'>
   not = <'~'> term
   term = #'a\\{b=[0-9]\\}'
   ")
    "~a{b=1}&(a{b=2}|a{b=3})|a{b=4}|a{b=5}")
; →
; [:S
;  [:and
;    [:or [:not [:term "a{b=1}"]]]
;    [:or
;       [:and [:or [:term "a{b=2}"] [:term "a{b=3}"]]]
;       [:term "a{b=4}"]
;       [:term "a{b=5}"]]]]
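
The "trim away single terms" step could be sketched roughly as follows. This is only an illustration, not part of the original answer: `handlers`, `nary`, and the local `transform` are hypothetical names, and `matches?` is a made-up placeholder for whatever a term should expand to. The handler map has the shape insta/transform accepts (a tag mapped to a function of the already transformed children), so the tiny walker here just stands in for it to keep the snippet free of dependencies.

```clojure
;; Hypothetical helper: builds (op a b ...) but returns a lone child
;; unwrapped, which removes the single-child :and / :or nodes.
(defn nary [op]
  (fn [& children]
    (if (= 1 (count children))
      (first children)
      (cons op children))))

;; One handler per AST tag; each receives the transformed children.
(def handlers
  {:S    identity
   :and  (nary 'and)
   :or   (nary 'or)
   :not  (fn [x] (list 'not x))
   ;; Assumption: a term becomes a call to a placeholder `matches?` fn.
   :term (fn [s] (list 'matches? s))})

;; Minimal stand-in for insta/transform: walks the hiccup-style tree
;; bottom-up and leaves nodes without a handler as they are.
(defn transform [handlers tree]
  (if (vector? tree)
    (let [[tag & children] tree
          kids (map #(transform handlers %) children)]
      (if-let [f (get handlers tag)]
        (apply f kids)
        (into [tag] kids)))
    tree))

;; The parse tree shown above, copied by hand.
(def ast
  [:S
   [:and
    [:or [:not [:term "a{b=1}"]]]
    [:or
     [:and [:or [:term "a{b=2}"] [:term "a{b=3}"]]]
     [:term "a{b=4}"]
     [:term "a{b=5}"]]]])

(def code (transform handlers ast))
;; code =>
;; (and (not (matches? "a{b=1}"))
;;      (or (or (matches? "a{b=2}") (matches? "a{b=3}"))
;;          (matches? "a{b=4}")
;;          (matches? "a{b=5}")))
```

With Instaparse on the classpath, the same map should work as `(insta/transform handlers parse-tree)`. Note that in the grammar as written `|` binds more tightly than `&`, which is why the trailing terms end up inside the second `or`; if `&` should bind tighter, the `and` and `or` rules would need to trade places.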

Upvotes: 2
