Shile

Reputation: 1063

Parsing XML in Clojure

I am parsing a live RSS feed using the zipper method. Now I need to convert my zipped XML to a map with values something like this:

 {{:title "TITLE1" :description "DESCRIPTION1" :pubDate "PUBDATE1"}{:title "TITLE2" :description "DESCRIPTION2" :pubDate "PUBDATE2"}{:title "TITLE3" :description "DESCRIPTION3" :pubDate "PUBDATE3"} }

Here is my current code. I can get all the values individually, but I want them grouped together for each item, and I want to do it in one traversal:

 (def xml (xml/parse "http://www.link.com/"))
 (def zipped (zip/xml-zip xml))
 (xml-> zipped :channel :item :title text)
 (xml-> zipped :channel :item :description text)
 (xml-> zipped :channel :item :pubDate text)

Here is an example that looks like my XML document:

 <?xml version="1.0"?><rss version="2.0"><channel>
 <item><title>Title 1</title><description>Description 1</description> <pubDate>pubdate 1</pubDate></item>
 <item><title>Title 2</title><description>Description 2</description> <pubDate>pubdate 2</pubDate></item>
 <item><title>Title 3</title><description>Description 3</description> <pubDate>pubdate 3</pubDate></item>

 </channel></rss>

Any help would be appreciated!

Upvotes: 1

Views: 367

Answers (4)

Aleksei Sotnikov

Reputation: 643

Alternatively, the Buran library can be used to parse RSS/Atom feeds into a map.

(consume-http "https://stackoverflow.com/feeds/tag?tagnames=clojure")

=> 
{:info {:description "most recent 30 from stackoverflow.com",
        :encoding nil,
        :feed-type "atom_1.0",
        :style-sheet nil,
        :docs nil,
        :copyright nil,
        :published-date #inst"2018-08-20T08:03:33.000-00:00",
        :icon nil,
        :title "Active questions tagged clojure - Stack Overflow",
        :author nil,
        :categories (),
        :language nil,
        :link "https://stackoverflow.com/questions/tagged/?tagnames=clojure&sort=active",
        :contributors (),
        :web-master nil,
        :generator nil,
        :image nil,
        :managing-editor nil,
        :uri "https://stackoverflow.com/feeds/tag?tagnames=clojure",
        :authors (),
        :links ({:hreflang nil,
                 :title nil,
                 :href "https://stackoverflow.com/questions/tagged/?tagnames=clojure&sort=active",
                 :type "text/html",
                 :rel "alternate",
                 :length 0}, ...)},
 :entries ({:description {:mode nil,
                          :type "html",
                          :value "<p>..."},
            :updated-date #inst"2018-08-20T06:16:12.000-00:00",
            :comments nil,

Upvotes: 0

redhands

Reputation: 357

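(ns parser
  (:require [clojure.xml :as xml]
            [clojure.zip :as zip]
            ;; clojure.contrib was retired after Clojure 1.2; this namespace
            ;; now lives in clojure.data.zip.xml (library org.clojure/data.zip).
            [clojure.contrib.zip-filter.xml :as zf]))

;; Text of the first `child` element under `element`, or nil if absent.
(defn get-field [element child]
  (zf/xml1-> element child zf/text))

;; Build one map per <item> from the fields of interest.
(defn parse-record [rec-xml]
  (into {}
        (map #(vector % (get-field rec-xml %))
             [:title :description :pubDate])))

;; Walk to every <item> under <channel> and parse each one.
(defn get-records [xml]
  (map parse-record
       (zf/xml-> (zip/xml-zip xml) :channel :item)))

(doall (get-records (xml/parse "sample.xml")))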

Upvotes: 1

ponzao

Reputation: 20934

To get a list of maps, this will work:

(for [item (xml-> zipped :channel :item)]
  {:title (xml1-> item :title text)
   :description (xml1-> item :description text)
   :pubDate (xml1-> item :pubDate text)})
;=> ({:title "Title 1", :description "Description 1", :pubDate "pubdate 1"}
;    {:title "Title 2", :description "Description 2", :pubDate "pubdate 2"}
;    {:title "Title 3", :description "Description 3", :pubDate "pubdate 3"})

As already commented, I am not sure what keys you expect your map to contain, so I cannot provide a way to do that transformation.
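
For anyone who wants to try the same per-item grouping without the zip-filter dependency, here is a minimal sketch that uses only clojure.xml and core functions. Note that it swaps the zipper navigation for a plain walk over :content. The namespace name, sample-xml, and the helpers parse-str, children-tagged, item->map and items are all made up for illustration, and the sample document is inlined as a string instead of being fetched from a feed.

```clojure
(ns rss-items-sketch
  (:require [clojure.xml :as xml])
  (:import (java.io ByteArrayInputStream)))

;; Hypothetical sample, mirroring the document in the question.
(def sample-xml
  (str "<?xml version=\"1.0\"?><rss version=\"2.0\"><channel>"
       "<item><title>Title 1</title><description>Description 1</description>"
       "<pubDate>pubdate 1</pubDate></item>"
       "<item><title>Title 2</title><description>Description 2</description>"
       "<pubDate>pubdate 2</pubDate></item>"
       "<item><title>Title 3</title><description>Description 3</description>"
       "<pubDate>pubdate 3</pubDate></item>"
       "</channel></rss>"))

;; clojure.xml/parse accepts an InputStream, so wrap the string's bytes.
(defn parse-str [s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

;; Direct child elements of `node` whose tag is `tag`.
(defn children-tagged [node tag]
  (filter #(= tag (:tag %)) (:content node)))

;; One pass over an <item>'s children, keeping only the wanted fields.
(defn item->map [item]
  (into {}
        (for [child (:content item)
              :when (#{:title :description :pubDate} (:tag child))]
          [(:tag child) (apply str (:content child))])))

;; Every <item> under every <channel>, as a sequence of maps.
(defn items [rss]
  (for [channel (children-tagged rss :channel)
        item    (children-tagged channel :item)]
    (item->map item)))

(def result (vec (items (parse-str sample-xml))))
```

Each item is visited once, and a field that is missing from an item is simply left out of that item's map.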

Upvotes: 1

ymonad

Reputation: 12090

Here's the code. It may be a little hard to read, but it is just a combination of basic functions.

I don't think this is the simplest solution, but it works.

(ns zp
  (:require [clojure.zip :as zip]
            [clojure.xml :as xml])
  (:use clojure.contrib.zip-filter.xml))

(def xml (xml/parse "sample.xml"))
(def zipped (zip/xml-zip xml))

;; For each <item>, pair every tag with its text and build a map from the pairs.
(print
 (map (fn [elem]
        (apply array-map
               (flatten (map #(cons % (xml-> elem % text))
                             '(:pubDate :description :title)))))
      (xml-> zipped :channel :item)))

Upvotes: 2
