Brideau

Reputation: 4761

Issues Parsing XML File Using xpathSApply, R v3.1.1, XML v3.98-1.1

I'm attempting to parse the following XML file in R: http://reports.ieso.ca/public/GenOutputCapability/PUB_GenOutputCapability_20140517_v24.xml

My script is dead simple so far:

library(XML)

file <- "http://reports.ieso.ca/public/GenOutputCapability/PUB_GenOutputCapability_20140517_v24.xml"
doc <- xmlTreeParse(file, useInternalNodes = TRUE)
rootNode <- xmlRoot(doc)
xpathSApply(rootNode, "//GeneratorName", xmlValue)

Whenever I run this, my output is simply an empty list.

The same approach extracts values from other XML files without trouble, but for this particular file I can't extract anything. I've tried a number of different nodes, different capitalizations, useInternal=FALSE, and every other combination I could think of, but still no luck.

I can access parts using the rootNode[["IMODocBody"]][["Date"]] syntax to get the date, for example, so I know the file is loaded. Any ideas?

Upvotes: 2

Views: 275

Answers (1)

jdharrison

Reputation: 30425

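You need to use the appropriate namespace. The document declares a default namespace (http://www.theIMO.com/schema), and in XPath an unprefixed name such as //GeneratorName only matches elements that are in no namespace, so it finds nothing here. Bind the namespace URI to a prefix of your choosing and use that prefix in the query: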

> head(xpathSApply(doc, "//ns:GeneratorName", xmlValue
                   , namespaces = c(ns = "http://www.theIMO.com/schema")))
[1] "BRUCEA-G1" "BRUCEA-G2" "BRUCEA-G3" "BRUCEA-G4" "BRUCEB-G5" "BRUCEB-G6"
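
To see the difference without fetching the report, here is a minimal sketch. It assumes a small stand-in document whose element layout is a guess rather than a copy of the real file; only the default namespace URI is taken from the report. The prefix name ns is arbitrary.

```r
library(XML)

# Hypothetical stand-in for the report: same default namespace, made-up layout
xml_text <- '<IMODocument xmlns="http://www.theIMO.com/schema">
  <IMODocBody>
    <Generators>
      <Generator><GeneratorName>BRUCEA-G1</GeneratorName></Generator>
      <Generator><GeneratorName>BRUCEA-G2</GeneratorName></Generator>
    </Generators>
  </IMODocBody>
</IMODocument>'

doc <- xmlParse(xml_text, asText = TRUE)

# Unprefixed name: looks for GeneratorName in no namespace, so nothing matches
no_ns <- xpathSApply(doc, "//GeneratorName", xmlValue)

# Prefixed name, with the prefix bound to the document's default namespace URI
with_ns <- xpathSApply(doc, "//ns:GeneratorName", xmlValue,
                       namespaces = c(ns = "http://www.theIMO.com/schema"))

print(length(no_ns))
print(with_ns)
```

The prefix only has to agree between the XPath expression and the namespaces argument; it does not need to appear in the XML itself.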

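See ?xmlNamespaceDefinitions, which lists the namespaces the document declares. The entry with an empty id is the default namespace: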

> xmlNamespaceDefinitions(doc)
[[1]]
$id
[1] ""

$uri
[1] "http://www.theIMO.com/schema"

$local
[1] TRUE

attr(,"class")
[1] "XMLNamespaceDefinition"

$xsi
$id
[1] "xsi"

$uri
[1] "http://www.w3.org/2001/XMLSchema-instance"

$local
[1] TRUE

attr(,"class")
[1] "XMLNamespaceDefinition"

attr(,"class")
[1] "XMLNamespaceDefinitions"
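
Rather than typing the URI by hand, it can be read from that listing. The following is a rough sketch under the assumption that each entry carries the id and uri fields shown in the output above; the inline document and its Date value are stand-ins, and the prefix d is arbitrary.

```r
library(XML)

# Hypothetical stand-in document declaring a default namespace and xsi
xml_text <- paste0(
  '<IMODocument xmlns="http://www.theIMO.com/schema" ',
  'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">',
  '<IMODocBody><Date>2014-05-17</Date></IMODocBody>',
  '</IMODocument>'
)
doc <- xmlParse(xml_text, asText = TRUE)

# Collect the declared namespaces and pick the one with an empty id (the default)
ns_defs <- xmlNamespaceDefinitions(doc)
ids <- vapply(ns_defs, function(d) d$id, character(1))
default_uri <- ns_defs[[which(ids == "")[1]]]$uri

# Bind the default namespace URI to an arbitrary prefix for the query
date_value <- xpathSApply(doc, "//d:Date", xmlValue,
                          namespaces = c(d = default_uri))
print(default_uri)
print(date_value)
```

Looking the URI up this way keeps the query working if the publisher changes the namespace, at the cost of assuming the document still declares a default one.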

Upvotes: 6
