Download ASPX page with R

Question

There are a number of fairly detailed answers on SO that cover authenticated login to an ASPX site and downloading from it. As a complete n00b, I haven't been able to find a simple explanation of how to get data from a web form.

The following MWE is intended as an example only; the question is really meant to teach me how to do this for a wider collection of webpages.

Website:

http://data.un.org/Data.aspx?d=SNA&f=group_code%3a101

What I tried (and, obviously, failed with):

test=read.csv('http://data.un.org/Handlers/DownloadHandler.ashx?DataFilter=group_code:101;country_code:826&DataMartId=SNA&Format=csv&c=2,3,4,6,7,8,9,10,11,12,13&s=_cr_engNameOrderBy:asc,fiscal_year:desc,_grIt_code:asc')

This gives me gobbledygook when I run View(test).

Anything that steps me through this or points me in the right direction would be very gratefully received.

user1609452 · Accepted Answer

The URL you are accessing with read.csv returns a zipped file, which is why the result looks like garbage. You could download it using, say, httr, write the contents to a temporary file, and unzip it before reading:

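 library(httr)
 urlUN <- "http://data.un.org/Handlers/DownloadHandler.ashx?DataFilter=group_code:101;country_code:826&DataMartId=SNA&Format=csv&c=2,3,4,6,7,8,9,10,11,12,13&s=_cr_engNameOrderBy:asc,fiscal_year:desc,_grIt_code:asc"
 response <- GET(urlUN)
 stop_for_status(response)  # fail early on an HTTP error
 # Save the raw bytes of the response as a temporary zip file
 zipFile <- tempfile(fileext = ".zip")
 writeBin(content(response, as = "raw"), zipFile)
 # List the archive contents, extract them, and read the first file as CSV
 fName <- unzip(zipFile, list = TRUE)$Name
 unzip(zipFile, exdir = tempdir())
 unData <- read.csv(file.path(tempdir(), fName[1]))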

Alternatively, Hmisc has a useful getZip function:

 library(Hmisc)
 urlUN <- "http://data.un.org/Handlers/DownloadHandler.ashx?DataFilter=group_code:101;country_code:826&DataMartId=SNA&Format=csv&c=2,3,4,6,7,8,9,10,11,12,13&s=_cr_engNameOrderBy:asc,fiscal_year:desc,_grIt_code:asc"
 unData <- read.csv(getZip(urlUN))
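
If you would rather not depend on a package, the same idea can be sketched in base R. This is only a sketch under a few assumptions: the helper names is_zip and read_maybe_zipped_csv are made up for illustration, the archive is assumed to hold a single CSV file, and the check relies on the standard zip signature (the bytes "PK" followed by 0x03 0x04) at the start of the file. Checking the signature first is also a quick way to find out why read.csv returned garbage for a given URL.

```r
# Return TRUE if the file at `path` starts with the zip signature "PK\003\004".
is_zip <- function(path) {
  sig <- readBin(path, what = "raw", n = 4L)
  identical(sig, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Download `url` and read it as CSV, unzipping first if the file is a zip archive.
# Assumption: an archive contains exactly one CSV file (only the first is read).
read_maybe_zipped_csv <- function(url, ...) {
  tmp <- tempfile()
  on.exit(unlink(tmp), add = TRUE)
  # mode = "wb" keeps the bytes intact on Windows
  download.file(url, tmp, mode = "wb", quiet = TRUE)

  if (!is_zip(tmp)) {
    return(read.csv(tmp, ...))
  }

  exdir <- tempfile("unzipped")
  on.exit(unlink(exdir, recursive = TRUE), add = TRUE)
  extracted <- unzip(tmp, exdir = exdir)  # returns the paths of the extracted files
  read.csv(extracted[1], ...)
}
```

With these defined, unData <- read_maybe_zipped_csv(urlUN) should behave much like the httr version above, as long as the server still answers that URL with a zip containing one CSV.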
