Aaron Lee

Reputation: 5467

How to get namespaced nodes from scala.xml?

An RSS feed such as Craigslist's (http://chambana.craigslist.org/cta/index.rss) contains both namespaced and non-namespaced nodes that share the same local name.

For example:

<item rdf:about="http://dallas.craigslist.org/sdf/cto/4206532641.html">
<title>
<![CDATA[ 1965 Pontiac Tempest GTO tribute ]]>
</title>
...tl;dr...
<dc:title>
<![CDATA[ 1965 Pontiac Tempest GTO tribute ]]>
</dc:title>
</item>

An expression such as:

(item \ "title").text

returns the title text twice, because it matches both <title> and <dc:title>. How do you access only the namespaced node?

Upvotes: 1

Views: 1346

Answers (1)

Travis Brown

Reputation: 139048

The \ selector matches on an element's local name only, so you'll need to filter the resulting NodeSeq by prefix:

val unprefixedTitle = (item \ "title").filter(_.prefix == null)
val dublinCoreTitle = (item \ "title").filter(_.prefix == "dc")

Each of these filtered sequences will contain a single element.
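
As a rough, self-contained illustration of the prefix filter, the sketch below parses a trimmed-down item. The XML string and its titles are made up for the example (they are not taken from the feed), and it assumes the scala-xml module is on the classpath.

```scala
import scala.xml.XML

// Hypothetical, trimmed-down item modelled on the one in the question.
val item = XML.loadString(
  """<item xmlns:dc="http://purl.org/dc/elements/1.1/">
    |  <title>plain title</title>
    |  <dc:title>dc title</dc:title>
    |</item>""".stripMargin)

// `\ "title"` compares local names only, so both elements are selected.
val allTitles = item \ "title"

// An unprefixed element reports a null prefix.
val unprefixedTitle = allTitles.filter(_.prefix == null)
val dublinCoreTitle = allTitles.filter(_.prefix == "dc")
```

Here allTitles holds two elements, while each filtered sequence holds one, so calling .text on either of them yields a single title.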

If you have the entire document (or at least the part with the namespace declarations) you can filter by namespace instead of prefix, which is more robust:

val dublinCoreTitle = (item \ "title").filter(
  _.namespace == "http://purl.org/dc/elements/1.1/"
)

Now you'll get the desired element even if you're working with a document that happens to map this namespace to a different prefix.
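
To make the robustness point concrete, here is a small sketch in which the same namespace URI is bound to two different prefixes. The helper childrenInNamespace, the meta prefix, and both sample documents are hypothetical names invented for this example; it again assumes scala-xml is available.

```scala
import scala.xml.{Node, NodeSeq, XML}

val DublinCore = "http://purl.org/dc/elements/1.1/"

// Hypothetical helper: children of `parent` with the given local name
// whose resolved namespace URI equals `uri`.
def childrenInNamespace(parent: Node, label: String, uri: String): NodeSeq =
  (parent \ label).filter(_.namespace == uri)

// Same namespace, conventional "dc" prefix.
val withDc = XML.loadString(
  """<item xmlns:dc="http://purl.org/dc/elements/1.1/">
    |  <title>plain</title>
    |  <dc:title>dublin core</dc:title>
    |</item>""".stripMargin)

// Same namespace, but bound to a made-up "meta" prefix.
val withOtherPrefix = XML.loadString(
  """<item xmlns:meta="http://purl.org/dc/elements/1.1/">
    |  <title>plain</title>
    |  <meta:title>dublin core</meta:title>
    |</item>""".stripMargin)

val fromDc         = childrenInNamespace(withDc, "title", DublinCore)
val fromOther      = childrenInNamespace(withOtherPrefix, "title", DublinCore)
// The prefix-based filter finds nothing in the second document.
val byPrefixOnly   = (withOtherPrefix \ "title").filter(_.prefix == "dc")
```

The namespace filter finds the Dublin Core title in both documents, whereas the prefix filter comes back empty for the second one.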

Upvotes: 5
