Reputation: 1981
R version 3.0.1 (2013-05-16) for Windows 8, knitr version 1.5, RStudio 0.97.551
I am using knitr to do the markdown of my R code.
As part of my analysis I downloaded various data sets from the web. knitr is totally fine with getting data from http sites, but for https ones it generates an "unsupported URL scheme" message.
I know that when using the download.file function on a Mac, the method parameter has to be set to curl to get data from an https URL; however, this doesn't help when using knitr.
What do I need to do so that knitr will gather data from HTTPS websites?
Edit: Here is the code chunk that returns an error in knitr but works without error when run directly in R.
```{r}
fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
download.file(fileurl, destfile = "C:/Users/xxx/yyy")
```
Upvotes: 11
Views: 25992
Reputation: 44525
Edit (May 2016): As of R 3.3.0, download.file() should handle SSL websites automatically on all platforms, making the rest of this answer moot.
You want something like this:
library(RCurl)
data <- getURL("https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv",
ssl.verifypeer=0L, followlocation=1L)
That reads the data into memory as a single string. You'll still have to parse it into a dataset in some way. One strategy is:
writeLines(data,'temp.csv')
read.csv('temp.csv')
You can also parse the data directly without writing to a file:
read.csv(text=data)
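To illustrate the in-memory parsing step without needing a network connection, here is a minimal sketch. The `csv_text` string is a made-up stand-in for whatever `getURL()` returns; its column names and values are assumptions for illustration only.

```r
# Hypothetical stand-in for the single string returned by getURL()
csv_text <- "id,value\n1,10\n2,20\n3,30"

# Option 1: hand the string to read.csv() through its 'text' argument
df_text <- read.csv(text = csv_text)

# Option 2: wrap the string in a text connection and read from that
df_conn <- read.csv(textConnection(csv_text))

# Both approaches should give the same data.frame
str(df_text)
```

Either form avoids the temporary file; the `text` argument is simply the shorter of the two.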
Edit: A much easier option is actually to use the rio package:
library("rio")
import("https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv")
This will read directly from the HTTPS URL and return a data.frame.
Upvotes: 9
Reputation: 2100
Using the R downloader package takes care of the quirky details typically associated with file downloads. For your example, all you would have needed to do is:
```{r}
library(downloader)  # provides download(), a wrapper around download.file()
fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
download(fileurl, destfile = "C:/Users/xxx/yyy")
```
Upvotes: 1
Reputation: 576
Use setInternet2(use = TRUE) before using the download.file() function. It works on Windows 7.
setInternet2(use = TRUE)
download.file(url, destfile = "test.csv")
Upvotes: 9
Reputation: 140
I am sure you have already found a solution to your problem by now.
I was working on an assignment just now and ran into the same error. I tried some of the tricks, but they did not work for me, maybe because I am working on a Windows machine.
Anyhow, I changed the link to http: rather than https: and that did the trick.
The following is a chunk of my code:
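# Create the working directory if it does not exist yet
if (!file.exists("./PeerAssessment2")) {dir.create("./PeerAssessment2")}
fileURL <- "http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# mode = "wb" keeps the compressed (binary) file intact on Windows
download.file(fileURL, destfile = "./PeerAssessment2/Data.zip", mode = "wb")
# install.packages("R.utils")  # run once if R.utils is not installed
library(R.utils)
# The file is bzip2-compressed despite the .zip name, so use bunzip2()
if (!file.exists("./PeerAssessment2/Data")) {
  bunzip2("./PeerAssessment2/Data.zip", destname = "./PeerAssessment2/Data")
}
list.files("./PeerAssessment2")
noaaData <- read.csv("./PeerAssessment2/Data")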
Hope this helps.
Upvotes: 5
Reputation: 111
I had the same problem with an https URL: the following code runs perfectly in R but gives "unsupported URL scheme" when knitting to HTML:
temp = tempfile()
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2Factivity.zip", temp)
data = read.csv(unz(temp, "activity.csv"), colClasses = c("numeric", "Date", "numeric"))
I tried all the solutions posted here and nothing worked. In desperation I just removed the "s" from "https" in the URL and everything worked.
Upvotes: 1
Reputation: 1524
You can use https with the download.file() function by passing "curl" as the method:
download.file(url,destination,method="curl")
Upvotes: 21
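Note that method = "curl" relies on a curl executable being available on the system path. As a rough sketch, one could pick the method only when curl is actually found. The helper name `pick_method` and its `curl_path` argument are hypothetical and not part of any package.

```r
# Hypothetical helper: use "curl" for https URLs only if a curl binary is found,
# otherwise fall back to download.file()'s "auto" method.
pick_method <- function(url, curl_path = unname(Sys.which("curl"))) {
  is_https <- grepl("^https://", url, ignore.case = TRUE)
  if (is_https && nzchar(curl_path)) "curl" else "auto"
}

fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
method <- pick_method(fileurl)

# The actual download is left commented out so the sketch runs offline:
# download.file(fileurl, destfile = "ss06hid.csv", method = method)
```

Whether the "auto" fallback can handle https still depends on the R version and platform, as discussed in the other answers.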
Reputation: 1178
I had the same issue with knitr and download.file() with an https URL, on Windows 8.
You could try setInternet2(TRUE) before using the download.file() function. However, I'm not sure that this fix works on Unix-like systems.
setInternet2(TRUE) # set the R_WIN_INTERNET2 to TRUE
fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
download.file(fileurl, destfile = "C:/Users/xxx/yyy") # now it should work
Source: R documentation (?download.file):
Note that https:// URLs are only supported if --internet2 or environment variable R_WIN_INTERNET2 was set or setInternet2(TRUE) was used (to make use of Internet Explorer internals), and then only if the certificate is considered to be valid.
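Since setInternet2() exists only in Windows builds of R (and may be absent in newer versions), a document meant to knit on several machines could guard the call. This is only a sketch; the wrapper name `enable_internet2` is hypothetical.

```r
# Hypothetical wrapper: call setInternet2(TRUE) only where it exists.
# Returns TRUE if the call was made, FALSE otherwise.
enable_internet2 <- function() {
  on_windows <- .Platform$OS.type == "windows"
  available <- exists("setInternet2", mode = "function")
  if (on_windows && available) {
    setInternet2(TRUE)
    return(TRUE)
  }
  FALSE
}

used_internet2 <- enable_internet2()
```

On non-Windows systems the function does nothing, so the same chunk can be knitted everywhere without raising an error.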
Upvotes: 4