Reputation: 11934
Consider a file on the internet (like this one (note the s in https) https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls
How can the sheet 2 of the file be read into R?
The following code is approximation of what is desired (but fails)
url1<-'https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls'
p1f <- tempfile()
download.file(url1, p1f, mode="wb")
p1<-read_excel(path = p1f, sheet = 2)
Upvotes: 32
Views: 30222
Reputation: 76700
As suggested by @IRTFM, only thing wrong with OP is lack of extension in the temporary file. Works fine with:
url1 <- 'https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls'
p1f <- tempfile(fileext=".xls")
download.file(url1, p1f, mode="wb")
p1 <- read_excel(path=p1f, sheet=2)
This is implicit in @lukeA's answer, but I dislike the extra import (httr
) and find it less readable.
Upvotes: 0
Reputation: 2755
A simpler solution is using the openxlsx package. Here is an example, which can be adapted to your needs:
library(openxlsx)
df = read.xlsx("https://archive.ics.uci.edu/ml/machine-learning-databases/00242/ENB2012_data.xlsx",sheet=1)
Upvotes: 16
Reputation: 679
use rio
R package. link. Here a reprex:
library(tidyverse)
library(rio)
url <- 'https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls'
rio::import(file = url,which = 2) %>%
glimpse()
#>
#> Rows: 30,995
#> Columns: 8
#> $ Code <chr> "C141663", "C141706", "C141707"...
#> $ `Codelist Code` <chr> NA, "C141663", "C141663", "C141...
#> $ `Codelist Extensible (Yes/No)` <chr> "No", NA, NA, NA, "No", NA, NA,...
#> $ `Codelist Name` <chr> "4 Stair Ascend Functional Test...
#> $ `CDISC Submission Value` <chr> "A4STR1TC", "A4STR101", "A4STR1...
#> $ `CDISC Synonym(s)` <chr> "4 Stair Ascend Functional Test...
#> $ `CDISC Definition` <chr> "4 Stair Ascend test code.", "4...
#> $ `NCI Preferred Term` <chr> "CDISC Functional Test 4 Stair ...
Upvotes: 8
Reputation: 12819
From this issue on Github (#278):
some functionality for supporting more general inputs will be pulled out of readr, at which point readxl can exploit that.
So we should be able to pass urls directly to read_excel()
in the (hopefully near) future.
Upvotes: 9
Reputation: 54237
This works for me on Windows:
library(readxl)
library(httr)
packageVersion("readxl")
# [1] ‘0.1.1’
GET(url1, write_disk(tf <- tempfile(fileext = ".xls")))
df <- read_excel(tf, 2L)
str(df)
# Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 20131 obs. of 8 variables:
# $ Code : chr "C115388" "C115800" "C115801" "C115802" ...
# $ Codelist Code : chr NA "C115388" "C115388" "C115388" ...
# $ Codelist Extensible (Yes/No): chr "No" NA NA NA ...
# $ Codelist Name : chr "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" ...
# $ CDISC Submission Value : chr "SIXMW1TC" "SIXMW101" "SIXMW102" "SIXMW103" ...
# $ CDISC Synonym(s) : chr "6 Minute Walk Functional Test Test Code" "SIXMW1-Distance at 1 Minute" "SIXMW1-Distance at 2 Minutes" "SIXMW1-Distance at 3 Minutes" ...
# $ CDISC Definition : chr "6 Minute Walk Test test code." "6 Minute Walk Test - Distance at 1 minute." "6 Minute Walk Test - Distance at 2 minutes." "6 Minute Walk Test - Distance at 3 minutes." ...
# $ NCI Preferred Term : chr "CDISC Functional Test 6MWT Test Code Terminology" "6MWT - Distance at 1 Minute" "6MWT - Distance at 2 Minutes" "6MWT - Distance at 3 Minutes" ...
Upvotes: 32