Reputation: 10695
Are there secondary Clojure XML parsing projects that could be used after or in conjunction with clojure.xml/parse, and, if so, what are they?
clojure.xml/parse works wonderfully, but the map it returns is deeply nested, at least after parsing one of our water cuts/tampers XML files. I am wondering if a secondary library exists that would allow me to parse further.
Here is just part of our xml file deliberately folded so you do not have to scroll.
:content [{:tag :Header, :attrs nil, :content [{:tag :ExportType,
:attrs nil, :content ["Tamper Export"]}
{:tag :CurrentDateTime, :attrs nil, :content ["
Notice the vector with embedded maps.
I can certainly develop something that could be used to parse this further, but I was just wondering if a module already exists.
Thank You.
Upvotes: 2
Views: 725
Reputation: 4014
The library to "parse" the content further is clojure.core. The functions and macros there can do a very good job of transforming the data structure generated from the XML into something useful. My personal favorite technique is using the two threading macros while making use of first and the keyword functions. If I need to do more than just digging deep, I'll write a quick function I can use map on.
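As a rough sketch of that threading technique, here is how it might look on data shaped like the snippet in the question. The root tag :Export and the names parsed, export-type and header-tags are made up for illustration; only the :Header and :ExportType elements come from the question.

```clojure
;; Sample data shaped like the parsed output in the question.
;; The root tag :Export is a made-up placeholder.
(def parsed
  {:tag :Export
   :attrs nil
   :content [{:tag :Header
              :attrs nil
              :content [{:tag :ExportType
                         :attrs nil
                         :content ["Tamper Export"]}]}]})

;; -> passes the value along as the first argument of each step:
;; root, its children, first child (Header), its children, first child, its text.
(def export-type
  (-> parsed :content first :content first :content first))

;; ->> passes the value along as the last argument, which suits
;; sequence functions such as map.
(def header-tags
  (->> parsed :content first :content (map :tag)))
```

Here export-type is the string "Tamper Export" and header-tags is a sequence holding the single keyword :ExportType.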
The data structure you get back from clojure.xml/parse is just as deep as the XML: each element is one map with three items, the content being a vector of child elements and strings. It may look a little bit deeper, but it's just an open representation of what would be stored, say, in the Java XML objects. Its biggest advantage is that you don't need a special API to work with it; the functions you use on normal data work on the XML just as well. If anything, you write a few functions to translate into your domain and that's it.
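To illustrate that ordinary sequence functions are enough, here is a minimal sketch of looking up child elements by tag with filter. The helpers children-by-tag and child-text are hypothetical names, not part of clojure.xml or clojure.core.

```clojure
;; Hypothetical helper: the child elements of `element` with the given tag.
;; String children have no :tag, so the keyword lookup returns nil and
;; they are filtered out.
(defn children-by-tag [element tag]
  (filter #(= tag (:tag %)) (:content element)))

;; Hypothetical helper: the first piece of content (usually the text)
;; of the first child with the given tag, or nil if there is none.
(defn child-text [element tag]
  (-> (children-by-tag element tag) first :content first))

(def item
  {:tag :item
   :attrs nil
   :content [{:tag :key :attrs nil :content ["Key one"]}
             {:tag :value :attrs nil :content ["Item one"]}]})

(child-text item :value)
```

Looking children up by tag rather than by position keeps working if the order of the child elements changes.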
Say you have something like the following (I'm leaving out attrs for brevity):
{:tag :stuff
 :content [{:tag :item
            :content [{:tag :key :content ["Key one"]}
                      {:tag :value :content ["Item one"]}]}
           {:tag :item
            :content [{:tag :key :content ["Key two"]}
                      {:tag :value :content ["Item two"]}]}]}
It's nested, but you can write a utility function that transforms each item into something usable.
(defn transform-item [item]
  (let [key-element (-> item :content first)
        value-element (-> item :content second)]
    [(-> key-element :content first)
     (-> value-element :content first)]))
And then map that on the content of the root element.
(defn transform-stuff [stuff-xml]
  (into {} (map transform-item (:content stuff-xml))))
And you should end up with some data which actually represents your domain.
{"Key one" "Item one", "Key two" "Item two"}
The key is to not think of it as parsing, but just as translating one data structure into another.
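For completeness, here is a sketch of the whole round trip, from an XML string to a domain map. The XML string, the parse-string helper and the result name are assumptions made for the example. The XML is written without whitespace between elements so that the content vectors hold only child elements.

```clojure
(require '[clojure.xml :as xml])

;; Made-up sample document matching the structure used above.
(def xml-string
  (str "<stuff>"
       "<item><key>Key one</key><value>Item one</value></item>"
       "<item><key>Key two</key><value>Item two</value></item>"
       "</stuff>"))

;; Hypothetical helper: clojure.xml/parse accepts an InputStream,
;; so wrap the string's bytes in one.
(defn parse-string [^String s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

;; Turn one item element into a [key value] pair.
(defn transform-item [item]
  (let [[key-element value-element] (:content item)]
    [(-> key-element :content first)
     (-> value-element :content first)]))

;; Collect the pairs from every child of the root element into a map.
(defn transform-stuff [stuff-xml]
  (into {} (map transform-item (:content stuff-xml))))

(def result (transform-stuff (parse-string xml-string)))
```

With this input, result maps the text of each key element to the text of its value element.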
Upvotes: 3