Wilcar

Reputation: 2513

Downloading xls files with a loop through URLs gives me corrupted files

I am downloading xls files from this page by looping through URLs with R (based on this first step):

getURLFilename <- function(url){
  # Parse the Content-Disposition header to recover the server-reported filename
  require(stringi)
  hdr <- paste(curlGetHeaders(url), collapse = '')
  fname <- as.vector(stri_match(hdr, regex = '(?<=filename=\\").*(?=\\")'))
  fname
}
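For illustration, the helper can be called on one of the URLs from the loop below to recover the server-reported filename (a minimal sketch; the returned name is whatever the server advertises, or NA if no Content-Disposition header is present):

# Example call (assumes i = 8 from the loop below); returns the advertised filename or NA
getURLFilename("https://journals.openedition.org/acrh/2908?file=1")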


for (i in 8:56) {
  i1 <- sprintf('%02d', i)
  url <- paste0("https://journals.openedition.org/acrh/29", i1, "?file=1")
  file <- paste0("myExcel_", i, ".xls")
  if (!file.exists(file)) download.file(url, file)
}

The files are downloaded but corrupted.

Upvotes: 0

Views: 155

Answers (2)

Marco Sandri

Reputation: 24272

You should use mode="wb" in download.file to write the file in binary mode.

library(readxl)
for (i in 8:55) {
  i1 <- sprintf('%02d', i)
  url <- paste0("https://journals.openedition.org/acrh/29", i1, "?file=1")
  # Some of the linked documents are PDFs rather than Excel files;
  # format_from_signature() returns NA when the content is not an Excel format
  if (is.na(format_from_signature(url))) {
    file <- paste0("myPdf_", i, ".pdf")
  } else {
    file <- paste0("myExcel_", i, ".xls")
  }
  # mode = "wb" writes the download in binary mode and prevents the corruption
  if (!file.exists(file)) download.file(url, file, mode = "wb")
}

Now the downloaded Excel files are not corrupted.
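As a quick sanity check (assuming the file names produced by the loop above), a re-downloaded workbook should now open without error:

# Spot-check one downloaded file; read_excel() errors out on a corrupted download
readxl::read_excel("myExcel_8.xls")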

Upvotes: 2

Ronak Shah

Reputation: 389325

Here is a slightly different approach using rvest to scrape the URLs to download and the filenames to save, keeping only the XLS files and not the PDFs.

library(rvest)
url <- "https://journals.openedition.org/acrh/2906"

#Scrape the nodes we are interested in
target_nodes <- url %>%
                  read_html() %>%
                  html_nodes(xpath = '//*[@id="annexes"]') %>%
                  html_nodes("a")

#Get the indices of the links whose text ends with xls
inds <- target_nodes %>% html_text() %>% grep("xls$", .)

#Get the corresponding URLs for the xls files and prepend the site prefix
target_urls <- target_nodes %>% 
                    html_attr("href") %>% .[inds] %>% 
                    paste0("https://journals.openedition.org/acrh/", .)

#Get the target name to save file
target_name <- target_nodes %>% 
                    html_text() %>% 
                    grep("xls$", ., value = TRUE) %>% 
                    sub("\\s+", ".", .) %>% 
                    paste0("/folder_path/to/storefiles/", .)

#Download the files and store them at target_name location
mapply(download.file, target_urls, target_name)

I manually verified 3-4 files on my system; I am able to open them, and the data match what I get when I download the files manually from the URL.
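One caveat: download.file defaults to text mode, so on Windows the binary-mode fix from the first answer may also be needed here; a hedged variant of the final call:

# Pass mode = "wb" to every download.file() call through MoreArgs
mapply(download.file, target_urls, target_name, MoreArgs = list(mode = "wb"))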

Upvotes: 2
