user2946746

Reputation: 1780

How to download an .xlsx file in R and load the data into a dataframe?

I'm trying to download an .xlsx file from the EIA and am getting the following error.

The error is: "Error: ZipException (Java): invalid entry size (expected 2385 but got 2390 bytes)"

I have tried the following code:

library(XLConnect)
tmp = tempfile(fileext = ".xlsx")
download.file(url = "http://www.eia.gov/petroleum/drilling/xls/dpr-data.xlsx", destfile = tmp)
readWorksheetFromFile(file = tmp, sheet = "Eagle Ford Region", header = FALSE, startRow = 9, endRow = 151)

I have tried the other recommendations at: Read Excel file into R with XLConnect package from URL

Upvotes: 6

Views: 12401

Answers (2)

user2844936

Reputation: 81

I'm really late to the party, but I spent a lot of time stuck on this same error, and this didn't work for me. If you're only downloading the file in order to load it from disk with read_xlsx, a better solution is to skip the manual download step entirely:

# install.packages("rio")
library(rio)

# URL taken from the question
url = "http://www.eia.gov/petroleum/drilling/xls/dpr-data.xlsx"
data = rio::import(url)

Cheers

Upvotes: 8

m0nhawk

Reputation: 24280

You should use mode = "wb" (binary mode) when downloading files that are not plain text:

download.file(url = "http://www.eia.gov/petroleum/drilling/xls/dpr-data.xlsx", destfile = tmp, mode="wb")

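This will solve the issue. With the default mode = "w", download.file can write the file in text mode on Windows, which alters some bytes and corrupts binary formats such as .xlsx (a zip archive). That is why the reader reports a ZipException about an unexpected entry size.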
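As a minimal sketch of the binary-mode download using base R only: the helper names download_binary and is_zip are hypothetical, not part of any package. The check relies on the fact that an .xlsx file is a zip archive, so an intact file starts with the bytes "PK".

```r
# Hypothetical helper: download a file in binary mode and return its local path.
download_binary <- function(url, fileext = ".xlsx") {
  tmp <- tempfile(fileext = fileext)
  # mode = "wb" writes the bytes exactly as received
  status <- download.file(url = url, destfile = tmp, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  tmp
}

# Hypothetical helper: TRUE if the file starts with the zip signature "PK".
is_zip <- function(path) {
  identical(readBin(path, what = "raw", n = 2L), charToRaw("PK"))
}

# Usage with the URL from the question (needs network access):
# tmp <- download_binary("http://www.eia.gov/petroleum/drilling/xls/dpr-data.xlsx")
# is_zip(tmp)
```

If is_zip returns FALSE, the download is probably not a valid workbook (for example an HTML error page), and it is worth inspecting the file before passing it to readWorksheetFromFile.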

Upvotes: 22
