Mario GS

Reputation: 879

Eurostat Bulk SDMX data download into R?

I'm having trouble downloading bulk data from Eurostat and hope you can help me out. I based my code on this post.

library(devtools)
# Install rsdmx from GitHub (current devtools expects "user/repo")
install_github("opensdmx/rsdmx")
library(rsdmx)

# Make a temporary file (tf) and a temporary folder (tdir)
tf <- tempfile(tmpdir = tdir <- tempdir())

## Download the zip file 
download.file("http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Frd_e_gerdsc.sdmx.zip", tf)

## Unzip it in the temp folder
test <- unzip(tf, exdir = tdir)

sdmx <- readSDMX(test)

stats <- as.data.frame(sdmx)
head(stats)

I'm receiving this warning, and the dataframe is empty:

Warning message:
In if (attr(regexpr("<!DOCTYPE html>", content), "match.length") ==  :
  the condition has length > 1 and only the first element will be used

Upvotes: 1

Views: 451

Answers (1)

eblondel

Reputation: 603

In Eurostat, the result of an extraction consists of two separate XML files:

  • the DSD (data structure definition), which describes the SDMX dataset
  • the dataset itself

Based on your code, try this:

testfile <- test[2]  # path to the dataset (the other extracted file is the DSD)
sdmx <- readSDMX(testfile, isURL = FALSE)  # isURL = FALSE reads a local file instead of a URL
stats <- as.data.frame(sdmx)
head(stats)
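
Picking test[2] relies on the order in which unzip() returns the files. As a minimal sketch, the dataset could be selected by name instead. This assumes the archive names the DSD "<code>.dsd.xml" and the dataset "<code>.sdmx.xml"; pick_sdmx_dataset is a hypothetical helper, not part of rsdmx, and the example paths are made up:

```r
# Hypothetical helper (not part of rsdmx): choose the dataset file from the
# character vector of paths returned by unzip().
# Assumption: the dataset file name ends in ".sdmx.xml" and the DSD in ".dsd.xml".
pick_sdmx_dataset <- function(paths) {
  is_dataset <- grepl("\\.sdmx\\.xml$", basename(paths), ignore.case = TRUE)
  if (sum(is_dataset) != 1L) {
    stop("Expected exactly one '.sdmx.xml' file, found ", sum(is_dataset))
  }
  paths[is_dataset]
}

# Made-up paths standing in for the result of unzip(tf, exdir = tdir)
example_paths <- c("/tmp/x/rd_e_gerdsc.dsd.xml", "/tmp/x/rd_e_gerdsc.sdmx.xml")
dataset_path <- pick_sdmx_dataset(example_paths)
print(dataset_path)
```

With the code from the question, this would be used as testfile <- pick_sdmx_dataset(test) before calling readSDMX(testfile, isURL = FALSE). The helper stops with an error rather than guessing if the archive layout differs from the assumption.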

Note: calling as.data.frame might take some time to complete, depending on the size of the dataset. I have been performing more tests in order to further improve the performance of reading large SDMX datasets.

Your use case is very interesting. I will add it to the rsdmx documentation, as it shows how to use the Eurostat bulk download service together with rsdmx.

Hope this helps!

Upvotes: 1
