Jdv

Reputation: 993

Clojure: Transform nested maps into custom map keeping only specific attributes

I have a vector of nested maps (the result of xml/parse), from which I have already removed the parts I don't want to keep:

[
{:tag :SoapObject, :attrs nil, :content [
    {:tag :ObjectData, :attrs nil, :content [
        {:tag :FieldName, :attrs nil, :content ["ID"]}
        {:tag :FieldValue, :attrs nil, :content ["8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1"]}
    ]}
    {:tag :ObjectData, :attrs nil, :content [
        {:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
        {:tag :FieldValue, :attrs nil, :content ["Value_1a"]}
    ]} 
    {:tag :ObjectData, :attrs nil, :content [
        {:tag :FieldName, :attrs nil, :content ["Attribute_2"]}
        {:tag :FieldValue, :attrs nil, :content ["Value_2a"]}
    ]} 
]}
{:tag :SoapObject, :attrs nil, :content [
    {:tag :ObjectData, :attrs nil, :content [
        {:tag :FieldName, :attrs nil, :content ["ID"]}
        {:tag :FieldValue, :attrs nil, :content ["90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]}
    ]}
    {:tag :ObjectData, :attrs nil, :content [
        {:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
        {:tag :FieldValue, :attrs nil, :content ["Value_1b"]}
    ]}
    {:tag :ObjectData, :attrs nil, :content [
        {:tag :FieldName, :attrs nil, :content ["Attribute_2"]}
        {:tag :FieldValue, :attrs nil, :content ["Value_2b"]}
    ]}
]}
]

Now I want to extract only some specific data from this structure, producing a result which looks like this:

[
{"ID" "8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1",
"Attribute_1" "Value_1a",
"Attribute_2" "Value_2a"}

{"ID" "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1",
"Attribute_1" "Value_1b",
"Attribute_2" "Value_2b"}
]

Which Clojure tool could help me accomplish this?

I've found another question that is somewhat similar, but whenever I tried some variation of a map call, the result was some kind of clojure.lang.LazySeq or clojure.core$map, which I couldn't get to print properly to verify the result.

Upvotes: 1

Views: 141

Answers (4)

akond

Reputation: 16060

No need for fancy tools here. You can get away with the simplest chunk of code. The fn->> macro used below comes from the plumbing library.

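(use '[plumbing.core])   ; provides fn->>

(let [A ...your-data...]
  (map (fn->> :content            ; ObjectData nodes of one SoapObject
              (mapcat :content)   ; FieldName / FieldValue nodes
              (mapcat :content)   ; name and value strings, alternating
              (apply hash-map))   ; pair them up into a map
       A))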
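
If you would rather not depend on plumbing, the same pipeline can be sketched with clojure.core alone. This is only an illustration: soap-objects and soap-object->map are made-up names, and the data is trimmed to a single record.

```clojure
;; Trimmed sample in the shape shown in the question (hypothetical name).
(def soap-objects
  [{:tag :SoapObject, :attrs nil, :content
    [{:tag :ObjectData, :attrs nil, :content
      [{:tag :FieldName, :attrs nil, :content ["ID"]}
       {:tag :FieldValue, :attrs nil, :content ["8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1"]}]}
     {:tag :ObjectData, :attrs nil, :content
      [{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
       {:tag :FieldValue, :attrs nil, :content ["Value_1a"]}]}]}])

(defn soap-object->map
  "Flattens one SoapObject into a map of field name to field value."
  [soap-object]
  (->> (:content soap-object)   ; ObjectData nodes
       (mapcat :content)        ; FieldName / FieldValue nodes
       (mapcat :content)        ; name and value strings, alternating
       (apply hash-map)))       ; pair them up into a map

;; mapv returns a vector, so the result prints directly at the REPL.
(def result (mapv soap-object->map soap-objects))
```

Note that this relies on every FieldName and FieldValue holding exactly one string. If one of them were empty, the flattened list would have an odd number of items and hash-map would throw.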

Upvotes: 0

Simon Polak

Reputation: 1989

You can also compose transducers. The other day I read a post on the JUXT blog about building XPath-like functionality with transducers.

(def children (map :content))

(defn tagp [pred]
  (filter (comp pred :tag)))

(defn tag= [tag-name]
  (tagp (partial = tag-name)))

(def text (comp (mapcat :content) (filter string?)))

(defn fields [obj-datas]
  (sequence (comp
             (tag= :ObjectData)
             (mapcat :content)
             text)
            obj-datas))

(defn clean [xml-map]
  (let [fields-list (sequence (comp
                               (tag= :SoapObject)
                               children
                               (map fields))
                              xml-map)]
    (map (partial apply hash-map) fields-list)))
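
The code above flattens all strings and pairs them up with hash-map, which assumes every FieldName and FieldValue contains exactly one string. A possible variation, sketched here under the data shape from the question, pairs the two children of each ObjectData by position and pours the result into a collection with into and a transducer. The names object-data->pair, soap->map, and clean-paired are hypothetical, as is the sample with an empty FieldValue.

```clojure
(defn tag= [tag-name]
  ;; Transducer that keeps only nodes with the given :tag.
  (filter #(= tag-name (:tag %))))

(defn object-data->pair
  "Turns one ObjectData node into a [name value] pair."
  [{[field-name field-value] :content}]
  [(first (:content field-name)) (first (:content field-value))])

(defn soap->map [soap-object]
  ;; [k v] pairs conj'd into {} become map entries.
  (into {}
        (comp (tag= :ObjectData) (map object-data->pair))
        (:content soap-object)))

(defn clean-paired [xml-maps]
  (into [] (comp (tag= :SoapObject) (map soap->map)) xml-maps))

;; Hypothetical sample: the second field has no value at all.
(def sample
  [{:tag :SoapObject, :attrs nil, :content
    [{:tag :ObjectData, :attrs nil, :content
      [{:tag :FieldName, :attrs nil, :content ["ID"]}
       {:tag :FieldValue, :attrs nil, :content ["1"]}]}
     {:tag :ObjectData, :attrs nil, :content
      [{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
       {:tag :FieldValue, :attrs nil, :content nil}]}]}])

(def result (clean-paired sample))
```

With this approach an empty FieldValue simply maps to nil instead of shifting every later key and value by one position.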

Upvotes: 0

Alan Thompson

Reputation: 29984

You can easily solve tree-based problems using the Tupelo Forest library. You can see a video introduction from last year's Clojure Conj here.

For your problem, I'd approach it as follows. First, the data:

(dotest
  (let [data-enlive
        {:tag   :root
         :attrs nil
         :content
            [{:tag     :SoapObject, :attrs nil,
              :content 
                 [{:tag     :ObjectData, :attrs nil,
                   :content [{:tag :FieldName, :attrs nil, :content ["ID"]}
                             {:tag :FieldValue, :attrs nil, :content ["8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1"]}]}
                  {:tag     :ObjectData, :attrs nil,
                   :content [{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
                             {:tag :FieldValue, :attrs nil, :content ["Value_1a"]}]}
                  {:tag     :ObjectData, :attrs nil,
                   :content [{:tag :FieldName, :attrs nil, :content ["Attribute_2"]}
                             {:tag :FieldValue, :attrs nil, :content ["Value_2a"]}]}]}
             {:tag     :SoapObject, :attrs nil,
              :content
                 [{:tag     :ObjectData, :attrs nil,
                   :content [{:tag :FieldName, :attrs nil, :content ["ID"]}
                             {:tag :FieldValue, :attrs nil, :content ["90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]}]}
                  {:tag     :ObjectData, :attrs nil,
                   :content [{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
                             {:tag :FieldValue, :attrs nil, :content ["Value_1b"]}]}
                  {:tag     :ObjectData, :attrs nil,
                   :content [{:tag :FieldName, :attrs nil, :content ["Attribute_2"]}
                             {:tag :FieldValue, :attrs nil, :content ["Value_2b"]}]}]}]}]

and then the code:

(with-debug-hid
  (with-forest (new-forest)
    (let [root-hid     (add-tree-enlive data-enlive)
          soapobj-hids (find-hids root-hid [:root :SoapObject])
          objdata->map (fn [objdata-hid]
                         (let [fieldname-node  (hid->node (find-hid objdata-hid [:ObjectData :FieldName]))
                               fieldvalue-node (hid->node (find-hid objdata-hid [:ObjectData :FieldValue]))]
                           { (grab :value fieldname-node) (grab :value fieldvalue-node) }))
          soapobj->map (fn [soapobj-hid]
                         (apply glue
                           (for [objdata-hid (hid->kids soapobj-hid)]
                             (objdata->map objdata-hid))))
          results      (mapv soapobj->map soapobj-hids)]

with intermediate results:

          (is= (hid->bush root-hid)
            [{:tag :root}
             [{:tag :SoapObject}
              [{:tag :ObjectData}
               [{:tag :FieldName, :value "ID"}]
               [{:tag :FieldValue, :value "8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1"}]]
              [{:tag :ObjectData}
               [{:tag :FieldName, :value "Attribute_1"}]
               [{:tag :FieldValue, :value "Value_1a"}]]
              [{:tag :ObjectData}
               [{:tag :FieldName, :value "Attribute_2"}]
               [{:tag :FieldValue, :value "Value_2a"}]]]
             [{:tag :SoapObject}
              [{:tag :ObjectData}
               [{:tag :FieldName, :value "ID"}]
               [{:tag :FieldValue, :value "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"}]]
              [{:tag :ObjectData}
               [{:tag :FieldName, :value "Attribute_1"}]
               [{:tag :FieldValue, :value "Value_1b"}]]
              [{:tag :ObjectData}
               [{:tag :FieldName, :value "Attribute_2"}]
               [{:tag :FieldValue, :value "Value_2b"}]]]])
          (is= soapobj-hids [:0009 :0013])

and the final results:

          (is= results
            [{"ID"          "8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1",
              "Attribute_1" "Value_1a",
              "Attribute_2" "Value_2a"}
             {"ID"          "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1",
              "Attribute_1" "Value_1b",
              "Attribute_2" "Value_2b"}]))))))

Further documentation is still in progress, but you can see API docs here and a live example of your problem here.
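
For comparison, the same walk can be sketched with nothing but tree-seq from clojure.core. This is not the Forest API, just a plain-Clojure illustration of the tree traversal; nodes-tagged, soap-object->map, and root->maps are hypothetical names, and data-enlive-small is a shortened copy of the data above.

```clojure
(def data-enlive-small
  {:tag :root, :attrs nil, :content
   [{:tag :SoapObject, :attrs nil, :content
     [{:tag :ObjectData, :attrs nil, :content
       [{:tag :FieldName, :attrs nil, :content ["ID"]}
        {:tag :FieldValue, :attrs nil, :content ["8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1"]}]}
      {:tag :ObjectData, :attrs nil, :content
       [{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
        {:tag :FieldValue, :attrs nil, :content ["Value_1a"]}]}]}
    {:tag :SoapObject, :attrs nil, :content
     [{:tag :ObjectData, :attrs nil, :content
       [{:tag :FieldName, :attrs nil, :content ["ID"]}
        {:tag :FieldValue, :attrs nil, :content ["90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]}]}]}]})

(defn nodes-tagged
  "All nodes with the given tag at or below node, depth-first."
  [tag node]
  (filter #(= tag (:tag %))
          (tree-seq map? :content node)))   ; maps are branches, strings are leaves

(defn soap-object->map [soap-object]
  (into {}
        (for [{[field-name field-value] :content} (nodes-tagged :ObjectData soap-object)]
          [(first (:content field-name)) (first (:content field-value))])))

(defn root->maps [root]
  (mapv soap-object->map (nodes-tagged :SoapObject root)))

(def results (root->maps data-enlive-small))
```

Because tree-seq searches at any depth, this sketch would also pick up SoapObject nodes nested further down, which may or may not be what you want for a real SOAP response.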

Upvotes: 0

leetwinski

Reputation: 17849

Usually you can start from the bottom and gradually work your way up.

First, parse a single attribute item:

(def first-content (comp first :content))

(defn get-attr [{[k v] :content}]
  [(first-content k)
   (first-content v)])

user> (get-attr {:tag :ObjectData, :attrs nil, :content [
        {:tag :FieldName, :attrs nil, :content ["ID"]}
        {:tag :FieldValue, :attrs nil, :content ["90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]}
        ]})
;;=> ["ID" "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]
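
In case the argument list of get-attr looks cryptic: {[k v] :content} is nested destructuring. It takes the :content of the map and binds k and v to its first two elements. A small sketch of the equivalence, using the hypothetical names object-data, pair-destructured, and pair-explicit:

```clojure
(def object-data
  {:tag :ObjectData, :attrs nil, :content
   [{:tag :FieldName, :attrs nil, :content ["ID"]}
    {:tag :FieldValue, :attrs nil, :content ["90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]}]})

;; Destructured form, as used in get-attr.
(def pair-destructured
  (let [{[k v] :content} object-data]
    [(first (:content k)) (first (:content v))]))

;; The same thing spelled out step by step.
(def pair-explicit
  (let [content (:content object-data)
        k       (nth content 0 nil)   ; FieldName node
        v       (nth content 1 nil)]  ; FieldValue node
    [(first (:content k)) (first (:content v))]))
```

Both forms yield the same [name value] pair; the destructured one just avoids the intermediate bindings.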

Then turn every item into a map of attributes:

(defn parse-item [item]
  (into {} (map get-attr (:content item))))

(parse-item {:tag :SoapObject, :attrs nil, :content [
    {:tag :ObjectData, :attrs nil, :content [
        {:tag :FieldName, :attrs nil, :content ["ID"]}
        {:tag :FieldValue, :attrs nil, :content ["90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]}
    ]}
    {:tag :ObjectData, :attrs nil, :content [
        {:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
        {:tag :FieldValue, :attrs nil, :content ["Value_1b"]}
    ]}
    {:tag :ObjectData, :attrs nil, :content [
        {:tag :FieldName, :attrs nil, :content ["Attribute_2"]}
        {:tag :FieldValue, :attrs nil, :content ["Value_2b"]}
    ]}
]})

;;=> {"ID" "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1", "Attribute_1" "Value_1b", "Attribute_2" "Value_2b"}

So the last thing you need to do is map over the top-level form (data here is the vector from the question), producing the required result:

(mapv parse-item data)

;;=> [{"ID" "8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1", "Attribute_1" "Value_1a", "Attribute_2" "Value_2a"} 
;;    {"ID" "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1", "Attribute_1" "Value_1b", "Attribute_2" "Value_2b"}]
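
Regarding the clojure.lang.LazySeq output mentioned in the question: map returns a lazy sequence, and one common cause of that output is passing the sequence through str. mapv returns a vector instead, and pr-str (or prn) prints the contents in either case. A minimal sketch with made-up names, using numbers instead of the XML data:

```clojure
(def lazy-result  (map inc [1 2 3]))    ; a lazy sequence
(def eager-result (mapv inc [1 2 3]))   ; a vector, realized immediately

;; str on a lazy seq typically shows the class name and a hash,
;; something like "clojure.lang.LazySeq@7861", not the contents.
(def lazy-as-str (str lazy-result))

;; pr-str (and prn, which prints the same text) shows the contents.
(def lazy-printed  (pr-str lazy-result))    ; "(2 3 4)"
(def eager-printed (pr-str eager-result))   ; "[2 3 4]"

;; vec or doall can be used to realize an existing lazy seq.
(def realized (vec lazy-result))
```

Seeing clojure.core$map usually means the map function itself was printed rather than the result of calling it.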

Upvotes: 3
