Reputation: 3152
I am trying to create a package to download, import and clean data from the Dominican Republic Central Bank web page. I have done all the coding in Rstudio.cloud and everything works just fine, but when I try the functions on my local machine they do not work.
After digging a bit into each function, I realized that the problem was the downloaded file: it is corrupt.
I am including the first steps of a function just to illustrate my issue.
# Packages
library(readxl)
# file url.
url <- paste0("https://cdn.bancentral.gov.do/documents/",
              "estadisticas/precios/documents/",
              "ipc_base_2010.xls?v=1570116997757")
# temporary path
file_path <- tempfile(pattern = "", fileext = ".xls")
# downloading
download.file(url, file_path, quiet = TRUE)
# reading the file
ipc_general <- readxl::read_excel(
  file_path,
  sheet = 1,
  col_names = FALSE,
  skip = 7
)
Error:
filepath: C:\Users\Johan Rosa\AppData\Local\Temp\RtmpQ1rOT3\2a74778a1a64.xls
libxls error: Unable to open file
I am using temporary files, but that is not the problem: you can try to download the file to your working directory and the problem persists.
I want to know:
By the way, I am using Windows 10
Edit
Answer:
1- Rstudio.cloud runs on Linux, but for Windows I need to make some adjustments to the download.file() command.
2- download.file(url, file_path, quiet = TRUE, mode = "wb")
This is what I was looking for.
Now I have a different problem. I have to think of a way to detect whether the function is running on Linux or Windows, to set that argument accordingly.
I can write a new download function using if/else calls on the .Platform$OS.type result.
Or, can I set mode = "wb" for all download.file() calls?
Do you have any recommendations?
Upvotes: 1
Views: 720
Reputation: 1898
From the Documentation of download.file()
The choice of binary transfer (mode = "wb" or "ab") is important on Windows, since unlike Unix-alikes it does distinguish between text and binary files and for text transfers changes \n line endings to \r\n (aka CRLF).
Code written to download binary files must use mode = "wb" (or "ab"), but the problems incurred by a text transfer will only be seen on Windows.
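To make the quoted passage concrete, here is a rough sketch of what a text-mode write does to binary data on Windows. The helper `simulate_text_mode()` and the six sample bytes are made up for illustration; it only imitates the LF to CRLF translation described in the documentation and is not what download.file() runs internally.

```r
# Hypothetical helper: imitate a Windows text-mode write, where every
# LF byte (0x0a) is expanded to CR LF (0x0d 0x0a).
simulate_text_mode <- function(bytes) {
  expanded <- lapply(as.integer(bytes), function(b) {
    if (b == 0x0a) c(0x0d, 0x0a) else b
  })
  as.raw(unlist(expanded))
}

# Made-up binary payload containing two LF bytes
original <- as.raw(c(0xd0, 0xcf, 0x0a, 0x11, 0x0a, 0xe0))
mangled  <- simulate_text_mode(original)

length(original)              # 6
length(mangled)               # 8
identical(original, mangled)  # FALSE
```

A binary format such as .xls stores offsets and lengths, so even a couple of extra bytes is enough for a reader like libxls to reject the file.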
From the source of download.file
head(print(download.file),12)
function (url, destfile, method, quiet = FALSE, mode = "w", cacheOK = TRUE,
    extra = getOption("download.file.extra"), headers = NULL,
    ...)
{
    destfile
    method <- if (missing(method))
        getOption("download.file.method", default = "auto")
    else match.arg(method, c("auto", "internal", "wininet", "libcurl",
        "wget", "curl", "lynx"))
    if (missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rd[as]|RData)$",
        URLdecode(url))))
        mode <- "wb"
So looking at the source, if you do not set mode, the function uses "w" automatically, unless the URL ends in one of the listed extensions (gz, bz2, xz, etc.). Your URL ends in .xls?v=..., so it does not match and the file is written in text mode (that is why you get the first error).
As far as I can tell, on Unix-alikes (e.g. Linux) "w" and "wb" behave the same, because they do not differentiate between text and binary files, but Windows does.
So you could set mode = "wb" for all download.file calls (as long as it is not a text transfer under Windows); this will not affect the function on Linux.
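You can check this yourself by running the same test the function applies to the URL. The snippet below is only a sketch: the pattern is copied from the source excerpt (written with R's usual single level of string escaping), and the example.com URL is a made-up address used for comparison.

```r
url <- paste0("https://cdn.bancentral.gov.do/documents/",
              "estadisticas/precios/documents/",
              "ipc_base_2010.xls?v=1570116997757")

# Extensions for which download.file() switches to "wb" on its own
binary_ext <- "\\.(gz|bz2|xz|tgz|zip|rd[as]|RData)$"

# The question's URL: ends in a query string, and .xls is not listed anyway
matches_question_url <- grepl(binary_ext, utils::URLdecode(url))

# A hypothetical URL that ends in .zip, for comparison
matches_zip_url <- grepl(binary_ext, "https://example.com/data.zip")

matches_question_url  # FALSE
matches_zip_url       # TRUE
```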
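If you do not want to repeat the argument in every call, one option is a small wrapper. This is a minimal sketch: `download_binary()` is a hypothetical name, not part of any package. The round trip below uses a local file:// URL and made-up bytes so it runs offline, and it assumes the temporary path contains no characters that need URL encoding.

```r
# Hypothetical wrapper: always request a binary transfer, on any platform
download_binary <- function(url, destfile, ...) {
  utils::download.file(url, destfile, mode = "wb", ...)
}

# Offline round trip: write a few bytes, fetch them through a file:// URL
payload <- as.raw(c(0xd0, 0xcf, 0x0a, 0x11, 0x0a, 0xe0))
src <- tempfile(fileext = ".bin")
writeBin(payload, src)

# Build file:///... for both Unix-style and Windows-style paths
src_url <- paste0("file:///",
                  sub("^/", "", normalizePath(src, winslash = "/")))
dest <- tempfile(fileext = ".bin")
download_binary(src_url, dest, quiet = TRUE)

# Compare the downloaded bytes with the original payload
copied <- readBin(dest, what = "raw", n = file.size(dest))
identical(copied, payload)
```

For the file in the question, the call would be download_binary(url, file_path, quiet = TRUE), with no need to branch on .Platform$OS.type.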
Upvotes: 2