Reputation: 10102
I am using read.xls()
from the gdata
package to read Excel workbooks where each workbook has one sheet. The read fails with the following error.
> read.xls(list.files[[1]])
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
no lines available in input
I can't figure out the error. sheetCount()
returns the following error.
> sheetCount(list.files[[1]])
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 161 did not have 13 elements
But this is odd, because the workbook has 27 columns. And nothing seems out of the ordinary for rows 161+/-1 or columns 13+/-1.
Throughout the workbook repeated entries are blank and you're expected to manually extend them down (which is impractical for the 750+ workbooks that I would like to read).
I tried manually setting quote=''
and quote='\''
, but these don't change the output. Is my problem that read.xls()
sees some rows as ragged but not others? Any pointers? (I tried the xlsReadWrite
package, but I am on 64bit Win 7 and it only works on 32bit systems).
Thanks!
Update
I followed @G. Grothendieck's suggestions and get the following.
> k <- count.fields(xls2csv(list.xls[[1]]), sep = ","); k
NULL
> L <- readLines(xls2csv(list.xls[[1]])); L
character(0)
The temp file that xls2csv()
generates is empty, so now I can't figure out why my countSheets()
call returns the "line 161, column 13" error.
I also followed @Joran's suggestions and converted the .xls file to a .csv file in Libre Office and it converts and reads just fine (i.e., it counts 27 fields in all 236 lines and logical readLines()
output).
Update 2
I should add that I think that these .xls files are not generated by Excel (my source is a little secretive about their origin), but I don't get any errors or warnings when I open them in Libre Office.
Upvotes: 1
Views: 4432
Reputation: 4414
Use XLConnect !
library(XLConnect)
readWorksheetFromFile(list.files[[1]], 1, useCachedValues=TRUE)
Upvotes: 1
Reputation: 10102
In my case I think the problem is that the .xls to .csv Perl script fails (this is what gdata
uses). I am still not sure why because LibreOffice converts the .xls to .csv with no warnings. I inspected the .csv with Vim and it looks normal (i.e., no crazy characters). I think the .xls is poorly formed by some proprietary script, so the Perl script fails.
Because LibreOffice works here, the easiest solution is to use command line LibreOffice (i.e., non of the Perl-based tools will work). I am on Win7, so I wrote a simple .bat file that converts every .xls in a directory.
for %%f in (*.xls) do soffice.exe -invisible -convert-to csv "%%f"
Upvotes: 1
Reputation: 269654
Try this and see if it suggests anything:
library(gdata)
k <- count.fields(xls2csv("myfile.xls"), sep = ","); k
L <- readLines(xls2csv("myfile.xls")); L
Upvotes: 1