Reputation: 494
Just when I thought I understood XPath! I must be missing something really simple, but I can't select the value of the node "citedby-count" in the following:
library(XML)
xml <- "<?xml version='1.0' encoding='UTF-8'?>
<search-results xmlns='http://www.w3.org/2005/Atom' xmlns:cto='http://www.elsevier.com/xml/cto/dtd' xmlns:atom='http://www.w3.org/2005/Atom' xmlns:prism='http://prismstandard.org/namespaces/basic/2.0/' xmlns:opensearch='http://a9.com/-/spec/opensearch/1.1/' xmlns:dc='http://purl.org/dc/elements/1.1/'>
<entry>
<prism:url>http://api.elsevier.com/content/abstract/scopus_id/111111</prism:url>
<dc:title>Paper Title</dc:title>
<citedby-count>1</citedby-count>
</entry>
</search-results>"
doc <- xmlParse(xml)
I've tried
doc["//citedby-count"]
and
doc["//{'citedby-count'}"]
and
doc["//entry"]
but all return
list()
attr(,"class")
[1] "XMLNodeSet"
however,
doc["//dc:title"]
works just fine.
Have I just been looking at this too long? Please help!
**Edit:** I thought this was because of the hyphen, but it can't be, because
doc["//entry"]
doesn't work either.
Upvotes: 0
Views: 402
Reputation: 89285
A regular namespace prefix is declared as xmlns:foo="...", where foo is the prefix, and it is used explicitly in element names, as in <foo:bar>, where bar is the element's local name. Apart from that, there is the default namespace: a namespace declared without a prefix, as in xmlns="...". It applies implicitly to the element on which it is declared and to all of that element's descendants, unless something overrides the inheritance, i.e. a descendant declares its own default namespace or uses an explicit prefix in its name.
That's the first part of the story, which is about namespaces in XML. XPath, on the other hand, has no notion of a default namespace: in XPath, an element name without a prefix is always treated as being in no namespace. To bridge this difference between XML and XPath, when you need to query an element in the default namespace you usually have to define a prefix that points to the XML's default namespace URI and use that prefix in the XPath expression. That's basically what @hrbrmstr suggested in the first comment, something like the following (the prefix can be anything, as long as it is mapped to the correct default namespace URI):
doc["//d:citedby-count", namespaces=c(d="http://www.w3.org/2005/Atom")]
But it turns out that your XML already declares an explicit prefix, atom, which points to the same namespace URI and can be used directly.
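To make the two options concrete, here is a minimal sketch using a trimmed-down copy of your XML. It assumes the XML package and uses xpathSApply with xmlValue to pull out the text; the variable names (atom_uri, no_prefix, with_d, with_atom) are only illustrative.

```r
library(XML)

# Trimmed-down copy of the document from the question: the default
# namespace and the atom prefix both point to the same URI.
xml <- "<search-results xmlns='http://www.w3.org/2005/Atom' xmlns:atom='http://www.w3.org/2005/Atom'>
<entry>
<citedby-count>1</citedby-count>
</entry>
</search-results>"
doc <- xmlParse(xml, asText = TRUE)

atom_uri <- "http://www.w3.org/2005/Atom"

# An unprefixed name is looked up in no namespace, so nothing matches.
no_prefix <- getNodeSet(doc, "//citedby-count")

# Option 1: map a prefix of our own choosing (d) to the default namespace URI.
with_d <- xpathSApply(doc, "//d:citedby-count", xmlValue,
                      namespaces = c(d = atom_uri))

# Option 2: use the atom prefix, which maps to the same URI.
with_atom <- xpathSApply(doc, "//atom:citedby-count", xmlValue,
                         namespaces = c(atom = atom_uri))

print(length(no_prefix))
print(with_d)
print(with_atom)
```

What XPath matches on is the namespace URI and the local name, not the prefix itself, which is why d and atom are interchangeable here.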
Upvotes: 1
Reputation: 97
You can also do doc["//x:citedby-count", namespace = "x"] to deal with default namespaces (this is from the examples of xpathApply).
Upvotes: 0