Reputation: 345
I'm trying to parse some XML using clj-xpath, and basically I want to write a function that looks like this:
(map
(fn [item]
{:title ($x:text "./title" item)
:url ($x:text "./url" item)})
(take 5
($x "/search/events/event" (xmldoc))))
But with arbitrary tags. So far, I have this
(ns mashup-dsl.datamodel
  (:use
    [clj-xpath.core]))
(def data-url "http://api.eventful.com/rest/events/search?app_key=4H4Vff4PdrTGp3vV&keywords=music&location=Belgrade&date=Future")
(def events-xml
(fn [] (slurp data-url)))
(def xmldoc
(fn [] (xml->doc (events-xml))))
(def item (take 5 ($x "/search/events/event" (xmldoc))))
(defn create-xpath [tag] (str "./" tag))
(def tags ["title" "url"])
(defn parse [item]
(doseq [tag tags])(into {} (keyword tag) ($x:text (create-xpath tag) item)))
But I'm getting this error: TransformerException Extra illegal tokens: '$', 'tag', '@', '64516c52' org.apache.xpath.compiler.XPathParser.error (XPathParser.java:610). So the problem is in the parse function. Any ideas?
Upvotes: 2
Views: 395
Reputation: 14197
The simplest form would be:
(def url
(str
"http://api.eventful.com/rest/events/search?"
"app_key=4H4Vff4PdrTGp3vV&"
"keywords=music&"
"location=Tokyo&"
"date=Future"))
(def xml (slurp url))
(def event-titles (map #($x:text "./title" %) ($x "//event" xml)))
And the printout of event-titles would be:
("FLOPPY 10th Anniversary 「This is computer music」" "IN BUSINESS" "UNIT 10th Anniversary Erection" "In The Mix at 0" "\" 20140530 - Sick Team Release Party \"" "Fanfare Ciocarlia @ World Beat Festival" "Fanfare Ciocarlia @ Musashino Hall" "DBS presents PINCH Birthday Bash!!!" "BLUES SISTERS (from RESPECT)" "UNIST 2nd Album「Acoustic」リリースパーティー 「リリースしちゃってウカれNight(ドヤッ)☆」")
EDIT: For a more versatile function, you could define:
(defn search-for [tag local-path]
  (map #($x:text local-path %) ($x (str "//" tag) xml)))
and use it like:
(search-for "event" "@id")
or
(search-for "event" "./title")
or
(search-for "image" "./url")
Upvotes: 3
Reputation: 3951
Here is how to extract the first 5 titles:
user=> (map #($x:text "./title" %) (take 5 ($x "//event" (xmldoc))))
("9th International Belgrade Early Music Festival" "Belgrade Baroque Academy, Mijanovic, Gosta / 9th Belgrade Early Music Festival / Monteverdi: \"L'Incoronazione di Poppea\"" "Belgrade Baroque Academy, Mijanovic, Gosta / 9th Belgrade Early Music Festival / Monteverdi: \"L'Incoronazione di Poppea\"" "ICTM Study Group on Music and Dance in Southeastern Europe Conference" "New Belgrade Opera, Madlenianum Opera-Theatre, New Trinity Baroque; Mijanovic, Gosta / 9th Belgrade Early Music Festival / Monteverdi: \"L'incoronazione di Poppea\"")
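In your example the doseq is improperly closed, and you need to compile the expression to use it against the xml->doc result. Because the doseq form ends before the into call, tag there no longer refers to the loop binding; judging by the tokens in the error message ('$', 'tag', '@'), it resolves to a function object instead, and its printed form ends up inside the XPath string.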
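To illustrate the doseq point without any XML involved, here is a minimal sketch in plain Clojure. The fake-text function and the sample item map are hypothetical stand-ins for ($x:text (str "./" tag) item) and a real event node; only the shape of the loop matters here.

```clojure
;; Hypothetical stand-in for ($x:text (str "./" tag) item):
;; it just looks the tag up in a plain map.
(defn fake-text [tag item]
  (get item tag))

(def tags ["title" "url"])

;; Hypothetical sample item, not real API data.
(def item {"title" "First" "url" "http://example.com/1"})

;; doseq runs its body for side effects only and always returns nil,
;; so it cannot be used to build a collection.
(def doseq-result
  (doseq [tag tags]
    (fake-text tag item)))

;; for returns a lazy sequence; emitting [key value] pairs lets
;; into pour them into a map.
(defn parse [item]
  (into {}
        (for [tag tags]
          [(keyword tag) (fake-text tag item)])))
```

With a real event node, the body of the for would presumably call $x:text instead of fake-text; the rest of parse stays the same.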
You can create a helper function that will return function to extract text from tag:
(defn tag-fn [tag] (partial $x:text tag))
Now, you can generate functions for "title" and "url":
user=> (tag-fn "title")
#<core$partial$fn__4190 clojure.core$partial$fn__4190@71cc2b7a>
and
user=> (map (tag-fn "title") (take 5 ($x "//event" (xmldoc))))
("9th International Belgrade Early Music Festival" "Belgrade Baroque Academy, Mijanovic, Gosta / 9th Belgrade Early Music Festival / Monteverdi: \"L'Incoronazione di Poppea\"" "Belgrade Baroque Academy, Mijanovic, Gosta / 9th Belgrade Early Music Festival / Monteverdi: \"L'Incoronazione di Poppea\"" "ICTM Study Group on Music and Dance in Southeastern Europe Conference" "New Belgrade Opera, Madlenianum Opera-Theatre, New Trinity Baroque; Mijanovic, Gosta / 9th Belgrade Early Music Festival / Monteverdi: \"L'incoronazione di Poppea\"")
or url and title:
user=> (map (juxt (tag-fn "url") (tag-fn "title")) (take 2 ($x "//event" (xmldoc))))
(["http://eventful.com/belgrade/events/9th-international-belgrade-/E0-001-064654999-7@2014061420?utm_source=apis&utm_medium=apim&utm_campaign=apic" "9th International Belgrade Early Music Festival"] ["http://eventful.com/belgrade/events/belgrade-baroque-academy-mijanovic-gosta-9th-belg-/E0-001-059734872-8?utm_source=apis&utm_medium=apim&utm_campaign=apic" "Belgrade Baroque Academy, Mijanovic, Gosta / 9th Belgrade Early Music Festival / Monteverdi: \"L'Incoronazione di Poppea\""])
or the same pairs, building the extractor from a vector of tag names:
user=> (map (apply juxt (map tag-fn ["url" "title"])) (take 2 ($x "//event" (xmldoc))))
(["http://eventful.com/belgrade/events/9th-international-belgrade-/E0-001-064654999-7@2014061420?utm_source=apis&utm_medium=apim&utm_campaign=apic" "9th International Belgrade Early Music Festival"] ["http://eventful.com/belgrade/events/belgrade-baroque-academy-mijanovic-gosta-9th-belg-/E0-001-059734871-9?utm_source=apis&utm_medium=apim&utm_campaign=apic" "Belgrade Baroque Academy, Mijanovic, Gosta / 9th Belgrade Early Music Festival / Monteverdi: \"L'Incoronazione di Poppea\""])
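If maps keyed by tag name are wanted, as in the question, rather than vectors, the juxt result can be zipped with the keywordized tag names. The following is only a sketch under some assumptions: sample-xml is a made-up document standing in for the Eventful response, parse-with is a hypothetical helper name, and clj-xpath is assumed to be on the classpath.

```clojure
(require '[clj-xpath.core :refer [$x $x:text]])

;; Made-up sample document standing in for the Eventful response.
(def sample-xml
  (str "<search><events>"
       "<event id=\"e1\"><title>First</title><url>http://example.com/1</url></event>"
       "<event id=\"e2\"><title>Second</title><url>http://example.com/2</url></event>"
       "</events></search>"))

;; Same helper as above: returns a function that extracts the text of tag.
(defn tag-fn [tag]
  (partial $x:text tag))

;; Hypothetical helper: builds one map per event, keyed by keywordized tag name.
;; tags must be non-empty, because juxt needs at least one function.
(defn parse-with [tags item]
  (zipmap (map keyword tags)
          ((apply juxt (map tag-fn tags)) item)))

(def parsed
  (map #(parse-with ["title" "url"] %)
       ($x "//event" sample-xml)))
```

Passing a different vector of tag names to parse-with changes which child elements are extracted, without touching the function itself.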
Upvotes: 3