Reputation: 879
I'm having some trouble downloading bulk data from Eurostat and hope you can help me out. I based my code on this post.
library(devtools)
# Install rsdmx from GitHub (current devtools expects "user/repo")
install_github("opensdmx/rsdmx")
library(rsdmx)
# Make a temporary file (tf) and a temporary folder (tdir)
tf <- tempfile(tmpdir = tdir <- tempdir())
## Download the zip file
download.file("http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Frd_e_gerdsc.sdmx.zip", tf)
## Unzip it in the temp folder
test <- unzip(tf, exdir = tdir)
sdmx <- readSDMX(test)
stats <- as.data.frame(sdmx)
head(stats)
I'm receiving this warning, and the dataframe is empty:
Warning message:
In if (attr(regexpr("<!DOCTYPE html>", content), "match.length") == :
the condition has length > 1 and only the first element will be used
Upvotes: 1
Views: 451
Reputation: 603
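In Eurostat, the result of an extraction is made of two separate XML files: the DSD (data structure definition), which describes the SDMX dataset, and the dataset itself.
In your code, test holds the paths to both files, so readSDMX(test) is handed two paths at once, which is most likely what triggers the warning. Based on your code, try this: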
testfile <- test[2] #path for the dataset
sdmx <- readSDMX(testfile, isURL = FALSE) # isURL = FALSE (to read a local file)
stats <- as.data.frame(sdmx)
head(stats)
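If you would rather not rely on the dataset being the second element of test, here is a minimal sketch that picks it by file name instead. It assumes the DSD file name ends in ".dsd.xml" (I have not checked this for every Eurostat archive), and pick_dataset_file is a hypothetical helper, not part of rsdmx. The paths below are stand-ins for what unzip() returns.

```r
# Hypothetical helper (not part of rsdmx): drop the paths that look like a DSD
# file and return the remaining one. Assumes DSD file names end in ".dsd.xml".
pick_dataset_file <- function(paths) {
  is_dsd <- grepl("\\.dsd\\.xml$", paths, ignore.case = TRUE)
  dataset <- paths[!is_dsd]
  if (length(dataset) != 1L) {
    stop("Expected exactly one dataset file, found ", length(dataset))
  }
  dataset
}

# Stand-in for the character vector returned by unzip(); file names are assumed.
test <- c("/tmp/rd_e_gerdsc.dsd.xml", "/tmp/rd_e_gerdsc.sdmx.xml")

# With two paths, the regexpr() check from the warning yields two values,
# so a condition built on it has length > 1.
match_len <- attr(regexpr("<!DOCTYPE html>", test), "match.length")
length(match_len)

# Keep only the dataset path.
testfile <- pick_dataset_file(test)
```

testfile can then be passed to readSDMX(testfile, isURL = FALSE) exactly as above.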
Note: calling as.data.frame might take some time to complete, depending on the size of the dataset. I have been running more tests to further improve the performance of reading large SDMX datasets.
Your use case is very interesting. I will add it to the rsdmx documentation, as it shows how to use the Eurostat bulk download service together with rsdmx.
Hope this helps!
Upvotes: 1