CptNemo

Reputation: 6755

Accessing children elements of XML

I have this XML document

require(XML)
url <- "http://stats.oecd.org/restsdmx/sdmx.ashx/GetData/SNA_TABLE1/NOR+CAN+FRA+DEU+GBR+USA+ITA+JAP.B1_GA+B1G_P119+B1G+B1GVA+B1GVB_E+B1GVC+B1GVF+B1GVG_I+B1GVJ+B1GVK+B1GVL+B1GVM_N+B1GVO_Q+B1GVR_U+D21_D31+D21S1+D31S1+DB1_GA.CXC/all?startTime=1950&endTime=2013"
xml <- xmlParse(url)

which I am trying to access.

I can access the root element with

getNodeSet(xml, "//message:MessageGroup")

but then I can't descend the tree to reach all the DataSet/Series elements.

getNodeSet(xml, "//message:MessageGroup/DataSet/Series")

returns an empty list. Is it a problem with the namespace of the document?

Upvotes: 2

Views: 74

Answers (1)

MrFlick

Reputation: 206207

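Yes. The problem is the default namespace. DataSet and Series carry no prefix in the document, but they still belong to the default namespace, and XPath treats an unprefixed name as being in no namespace at all. You need to bind a prefix to the default namespace in order to select nodes from it. You can do something like this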

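xml <- xmlParse(url)
# collect the namespace definitions as a named character vector (prefix -> URI)
ns <- xmlNamespaceDefinitions(xml, simplify = TRUE)
# the default namespace comes first here and has an empty name; call it "def"
names(ns)[1] <- "def"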

Then you can do

getNodeSet(xml, "//message:MessageGroup/def:DataSet/def:Series", namespaces=ns)
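
If you want to try this without hitting the OECD server, here is a minimal offline sketch of the same idea. The tiny document and its namespace URIs are made up for illustration (they are not the real SDMX ones), and the namespace vector is written by hand rather than pulled from xmlNamespaceDefinitions, so treat it as an approximation of the real case.

```r
library(XML)

# Hypothetical miniature document: a prefixed root element plus a default
# namespace, roughly the shape of the OECD response. The URIs are invented.
doc_text <- '<message:MessageGroup
    xmlns="http://example.org/default"
    xmlns:message="http://example.org/message">
  <DataSet>
    <Series><Obs value="1"/></Series>
    <Series><Obs value="2"/></Series>
  </DataSet>
</message:MessageGroup>'

doc <- xmlParse(doc_text, asText = TRUE)

# Only the "message" prefix is bound, so the unprefixed DataSet/Series steps
# look for elements in no namespace and match nothing.
no_prefix <- getNodeSet(
  doc, "//message:MessageGroup/DataSet/Series",
  namespaces = c(message = "http://example.org/message")
)

# Bind the (arbitrary) prefix "def" to the default namespace URI and use it
# in the path for every element that lives in that namespace.
ns <- c(
  def     = "http://example.org/default",
  message = "http://example.org/message"
)
with_prefix <- getNodeSet(
  doc, "//message:MessageGroup/def:DataSet/def:Series",
  namespaces = ns
)

length(no_prefix)    # number of matches without the prefix
length(with_prefix)  # number of matches with the prefix
```

The prefix name is only a label on the R side. What has to match the document is the URI it points to, so "def" could just as well be any other name.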

Upvotes: 2
