larry77

Reputation: 1533

R XML and Eurostat Data Download

I know that many people need to download data from the Eurostat website (see e.g. http://bit.ly/HrDTgT ), but what I am looking for is NOT a bulk download. I want something closer to downloading a small, properly formatted CSV file. Consider, for instance, the following snippet:

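library(XML)

# Read the page through an explicit connection so that only this one gets closed
con <- url("http://bit.ly/1czdbRq")
mylines <- readLines(con)
close(con)

# Parse all HTML tables in the page and keep the element named "xTable"
mylist <- readHTMLTable(mylines, asText = TRUE)  # optionally add stringsAsFactors = FALSE
mytable <- mylist$xTable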

That is already close to what I need, but there are two things I cannot fix: 1) the column names are lost, and 2) only the numerical values are left. I lose all the information about the countries the numbers refer to and about any levels/units of the statistical indicator.

Any ideas on how to improve this (preferably in R)? Cheers

Lorenzo

Upvotes: 1

Views: 892

Answers (1)

eblondel

Reputation: 603

As indicated by @Sergey, you can use SDMX web services to query data from Eurostat. With the Eurostat SDMX REST API, a data request, even a filtered one, is expressed as a single web URL (see the Eurostat guidance on building an SDMX data query).

In R, you can use the rsdmx package to read the data. See the example below:

# in case you want to install rsdmx from GitHub
# (otherwise install it from CRAN with install.packages("rsdmx"))
library(devtools)
install_github("opensdmx/rsdmx")
library(rsdmx)

# read a Eurostat dataset through the SDMX REST API
dataURL <- "http://ec.europa.eu/eurostat/SDMX/diss-web/rest/data/cdh_e_fos/..PC.FOS1.BE/?startPeriod=2005&endPeriod=2011"
sdmx <- readSDMX(dataURL)

# convert the SDMX object to a data frame and inspect the first rows
stats <- as.data.frame(sdmx)
head(stats)

Note: rsdmx is available on CRAN, or you can install it directly from its GitHub repository: https://github.com/opensdmx/rsdmx

I invite you to check the rsdmx wiki if you want more examples.
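
If you need several such queries, it may help to assemble the URL from its parts rather than typing it by hand. The sketch below only does string manipulation. build_sdmx_url is a hypothetical helper written for this answer (it is not part of rsdmx), and it assumes the URL layout used in the example above: a dataflow id, then a dot-separated key with one slot per dimension (an empty slot leaves that dimension unfiltered), then optional startPeriod/endPeriod parameters.

```r
# Hypothetical helper, not part of rsdmx: builds a Eurostat SDMX data query URL.
# The base URL is taken from the example above; treat it as an assumption.
build_sdmx_url <- function(flow, key, start = NULL, end = NULL,
                           base = "http://ec.europa.eu/eurostat/SDMX/diss-web/rest/data") {
  # Join the dimension values with "."; "" means "no filter on this dimension"
  key_str <- paste(key, collapse = ".")

  # Collect only the time filters that were actually supplied
  query <- c(
    if (!is.null(start)) paste0("startPeriod=", start),
    if (!is.null(end)) paste0("endPeriod=", end)
  )

  url <- paste0(base, "/", flow, "/", key_str, "/")
  if (length(query) > 0) {
    url <- paste0(url, "?", paste(query, collapse = "&"))
  }
  url
}

# Rebuild the query used in the example: first two dimensions left open
dataURL <- build_sdmx_url("cdh_e_fos", c("", "", "PC", "FOS1", "BE"),
                          start = 2005, end = 2011)
print(dataURL)
```

The resulting string can be passed to readSDMX() exactly like the hand-written URL. The order and meaning of the key slots depend on the dataset, so check its structure definition before changing them.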

Upvotes: 1
