Reputation: 1311
I want to extract exchange rates from the ECB website to convert my local currencies data. However, I am struggling a lot with using xpath (although this helped me a lot).
library(XML)
fileURL <- "https://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml?93aad09b8f8b7bdb69cd1574b5b2665f"
download.file(fileURL, destfile=tf <- tempfile(fileext=".xml"))
xml_file <- xmlParse(tf)
xml_data <- xmlRoot(xml_file)
currency <- xml_data[["number(//Cube/@currency)"]]
rate <- xml_data[["number(//Cube/@rate)"]]
Then I just want to create simple data frame:
df <- data.frame(currency, rate)
Upvotes: 4
Views: 912
Reputation: 269556
1) xpathSApply The following line gives a character matrix m
with currency and rate columns:
m <- t(xpathSApply(xml_data, "//*[@rate]", xmlAttrs))
If needed in the form of a data frame with character and numeric columns add this:
read.table(text = paste(m[, 1], m[, 2]), as.is = TRUE)
Note: We avoided having to deal with namespaces by using *
in the XPath expression but if it were desired to explicitly refer to Cube
, as in the question, then it would be done like this:
m <- xpathSApply(xml_data, "//x:Cube[@rate]", xmlAttrs, namespaces = "x")
2) read.pattern An alternative way is to parse the XML file using read.pattern
in gsubfn. (This does not use the XML package.)
library(gsubfn)
read.pattern(tf, pattern = "'(...)' rate='([0-9.]+)'", col.names = c("currency", "rate"))
Upvotes: 4
Reputation: 78792
Alternate version of G's answer in the Hadleyverse:
library(xml2)
library(purrr)
library(readr)
fileURL <- "https://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml?93aad09b8f8b7bdb69cd1574b5b2665f"
doc <- read_xml(fileURL)
dat <- type_convert(map_df(xml_attrs(xml_find_all(doc, "//*[@rate]")), as.list))
dat
## Source: local data frame [31 x 2]
##
## currency rate
## (chr) (dbl)
## 1 USD 1.0933
## 2 JPY 130.5800
## 3 BGN 1.9558
## 4 CZK 27.0250
## 5 DKK 7.4623
## 6 GBP 0.7533
## 7 HUF 310.5500
## 8 PLN 4.3982
## 9 RON 4.5093
## 10 SEK 9.3580
## .. ... ...
You get typed columns in a data frame with this approach, too.
Upvotes: 4