MikeTP
MikeTP

Reputation: 7986

Read Excel file into R with XLConnect package from URL

There are lots of good examples out there on how to read Microsoft Excel files into R with the XLConnect package, but I can't find any examples of how to read in an Excel file directly from a URL. The reproducible example below returns a "FileNotFoundException (Java)". But, I know the file exists because I can pull it up directly by pasting the URL into a browser.

fname <- "https://www.misoenergy.org/Library/Repository/Market%20Reports/20140610_sr_nd_is.xls"
sheet <- c("Sheet1")
data  <- readWorksheetFromFile(fname, sheet, header=TRUE, startRow=11, startCol=2, endCol=13)

Although, the URL is prefixed with "https:" it is a public file that does not require a username or password.

I have tried to download the file first using download.file(fname, destfile="test.xls") and got a message that says it was downloaded but when I try to open it in Excel to check to see if it was successful i get a Excel popup box that says "..found unreadable content in 'test.xls'.

Below are the specifics of my system:

Computer: 64-bit Dell running Operating System: Windows 7 Professional R version: R-3.1.0

Any assistance would be greatly appreciated.

Upvotes: 3

Views: 4165

Answers (4)

jdharrison
jdharrison

Reputation: 30425

You can use RCurl to download the file:

library(RCurl)
library(XLConnect)
appURL <- "https://www.misoenergy.org/Library/Repository/Market%20Reports/20140610_sr_nd_is.xls"
f = CFILE("exfile.xls", mode="wb")
curlPerform(url = appURL, writedata = f@ref, ssl.verifypeer = FALSE)
close(f)
out <- readWorksheetFromFile(file = "exfile.xls", sheet = "Sheet1", header = TRUE
                      , startRow = 11, startCol = 2, endCol = 15, endRow = 35)
> head(out)
Col1 EEI Col3 IESO MHEB Col6 PJM SOCO SWPP TVA WAUE Col12 Other Total
1 Hour  1 272   NA  768 1671   NA 148  200  -52 198  280    NA   700  4185
2 Hour  2 272   NA  769 1743   NA 598  200  -29 190  267    NA   706  4716
3 Hour  3 272   NA  769 1752   NA 598  200  -28 194  267    NA   710  4734
4 Hour  4 272   NA  769 1740   NA 598  200  -26 189  266    NA   714  4722
5 Hour  5 272   NA  769 1753   NA 554  200  -27 189  270    NA   713  4693
6 Hour  6 602   NA  769 1682   NA 218  200  -32 223  286    NA   714  4662

Upvotes: 2

kng229
kng229

Reputation: 473

library(relenium)
library(XML)
library(RCurl)

firefox=firefoxClass$new()
url="https://www.misoenergy.org/Library/Repository/Market%20Reports/20140610_sr_nd_is.xls"
url=sprintf(url)
firefox$get(url)

This will open a Firefox instance within R and ask you to download the file, which you could then open in the next line of code. I don't know of any R utilities that will open an excel spreadsheet from HTTPS.

You could then set a delay while you're saving the file and then read the sheet from your downloads folder:

Sys.sleep(10)
sheet <- c("Sheet1")
data  <- readWorksheetFromFile(path, sheet, header=TRUE, startRow=11, startCol=2, endCol=13)

Upvotes: 1

Martin Studer
Martin Studer

Reputation: 2321

XLConnect does not support importing directly from URLs. You have to use e.g. download.file first to download the file to your local machine:

require(XLConnect)
tmp = tempfile(fileext = ".xls")
download.file(url = "http://www.econ.yale.edu/~shiller/data/chapt26.xls", destfile = tmp)
readWorksheetFromFile(file = tmp, sheet = "Data", header = FALSE, startRow = 9, endRow = 151)

or with your originally proposed URL:

require(XLConnect)
tmp = tempfile(fileext = ".xls")
download.file(url = "https://www.misoenergy.org/Library/Repository/Market%20Reports/20140610_sr_nd_is.xls", destfile = tmp, method = "curl")
readWorksheetFromFile(file = tmp, sheet = "Sheet1", header = TRUE, startRow = 11, startCol = 2, endCol = 13)

Upvotes: 2

Steve S
Steve S

Reputation: 1053

Two things:

  1. Try using a different package--I know the gdata package's read.xls function has support for URLs

  2. Try loading in a publicly-available xls file to make sure it's not an issue with the particular website.

For instance, you can try:

library("gdata")
site <- "http://www.econ.yale.edu/~shiller/data/chapt26.xls"
data  <- read.xls(site, header=FALSE, skip=8)
head(data)

Upvotes: 2

Related Questions