Reputation: 35
I am trying to bring this .xls file into R: https://www.reit.com/sites/default/files/returns/MonthlyHistoricalReturns.xls
I've tried to bring it in directly from the url on a Windows machine. I've already run across the https versus http issues as well as the perl issue for Windows. To get around this, I've tried to run on ubuntu as well as downloading the file first.
My latest two attempts with readxl and gdata both produce a data frame, though neither one has any data in it. There are no error messages.
NAREIT <- readxl::read_xls("~/Downloads/MonthlyHistoricalReturns.xls")
This produces 38 observations of one variable, all NA.
NAREIT <- gdata::read.xls("~/Downloads/MonthlyHistoricalReturns.xls")
And this one produces 0 observations of 1 variable, "No data available in table" is the text written inside the only cell.
The file is admittedly ugly, with multiple unneeded header rows, merged cells, frozen views, etc. I've tried specifying ranges, columns, rows, rows to skip, col names, etc.--everything I could think of from readxl and gdata documentation.
I can just cut the range I need, save as CSV, and work with it. But, as I am likely to have to come back to this regularly, I am looking for the'right' way to open this file. Any thoughts are much appreciated.
Upvotes: 2
Views: 144
Reputation: 475
It looks like there are several rows of header, so you would need to figure out what you would like as a header, or actually consult a few pages on stack overflow that show you how to deal with 2 line headers.
Anyways, I can import it like this, and it seems to be just fine
library(readxl)
MonthlyHistoricalReturns <- read_excel("MonthlyHistoricalReturns.xls", sheet = "Index Data", skip = 7)
I skipped to line 7 to start your header there
Upvotes: 2