Reputation: 6755
I have this XML document:
require(XML)
url <- "http://stats.oecd.org/restsdmx/sdmx.ashx/GetData/SNA_TABLE1/NOR+CAN+FRA+DEU+GBR+USA+ITA+JAP.B1_GA+B1G_P119+B1G+B1GVA+B1GVB_E+B1GVC+B1GVF+B1GVG_I+B1GVJ+B1GVK+B1GVL+B1GVM_N+B1GVO_Q+B1GVR_U+D21_D31+D21S1+D31S1+DB1_GA.CXC/all?startTime=1950&endTime=2013"
xml <- xmlParse(url)
which I am trying to access.
I can access the root element with
getNodeSet(xml, "//message:MessageGroup")
but then I can't descend the tree to select all the DataSet/Series elements.
getNodeSet(xml, "//message:MessageGroup/DataSet/Series")
returns an empty list. Is it a problem with the namespace of the document?
Upvotes: 2
Views: 74
Reputation: 206207
Yes. The problem is the default namespace. In XPath an unprefixed element name only matches nodes that are in no namespace, so you need to give the default namespace a prefix before you can select nodes from it. You can do something like this:
xml <- xmlParse(url)
ns <- xmlNamespaceDefinitions(xml, simplify = TRUE)
names(ns)[1] <- "def"  # give the first entry (the default namespace here) the prefix "def"
Then you can do
getNodeSet(xml, "//message:MessageGroup/def:DataSet/def:Series", namespaces=ns)
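If you want to try this without hitting the OECD server, here is a minimal sketch on a made-up document. The namespace URIs, the `id` attributes and the `doc_text` string are all hypothetical stand-ins, not what the real feed contains. It only illustrates that unprefixed steps miss elements in a default namespace, and that a self-chosen prefix such as `def` finds them.

```r
library(XML)

# Hypothetical miniature document: prefixed root, children in a default namespace.
# The example.org URIs are placeholders, not the real SDMX namespace URIs.
doc_text <- '<message:MessageGroup xmlns:message="http://example.org/message" xmlns="http://example.org/generic">
  <DataSet>
    <Series id="a"/>
    <Series id="b"/>
  </DataSet>
</message:MessageGroup>'
doc <- xmlParse(doc_text, asText = TRUE)

# Unprefixed steps only match nodes in no namespace, so nothing is selected.
no_prefix <- getNodeSet(doc, "//message:MessageGroup/DataSet/Series",
                        namespaces = c(message = "http://example.org/message"))

# Map a prefix of our own choosing ("def") to the default namespace URI.
ns <- c(message = "http://example.org/message",
        def     = "http://example.org/generic")
series <- getNodeSet(doc, "//message:MessageGroup/def:DataSet/def:Series",
                     namespaces = ns)

length(no_prefix)
length(series)
sapply(series, xmlGetAttr, "id")
```

The prefix in the XPath does not have to match anything in the document. Only the URI it maps to matters, which is why renaming the default namespace entry to "def" works.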
Upvotes: 2