Reputation: 55
I am currently using the openxlsx package to read a large Excel file (~70 MB and 400,000 rows). I have tried other packages (XLConnect, xlsx, readxl), but they all either give me an error or bring my computer to a standstill. However, a big problem with openxlsx::read.xlsx is that it does not import all the columns in the Excel worksheet, as detailed below:
The picture above is a preview of the Excel file I need to import. It has 15 columns. However, when I import this file into an R data frame using openxlsx::read.xlsx, it only imports 5 columns, as shown below:
It seems that openxlsx in this case only imports the columns with date and numerical values (columns 8, 9, 10, 11, and 15) and ignores the rest. Can anyone explain the reason for this behavior, and is there any way to remedy it (i.e. get openxlsx to import all columns)? Thank you very much!
Upvotes: 4
Views: 1807
Reputation: 1706
I can't explain why openxlsx behaves like that, but the readxl package seems to work in this case.
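For reference, a minimal sketch of the readxl route, assuming the data sits on the first sheet. The helper name read_all_columns and the file name in the usage comment are hypothetical; col_types = "text" is my own choice here so that no column depends on type guessing, not something the original file requires.

```r
# Hypothetical helper: read one sheet with readxl, importing every
# column as text so that no column is affected by type guessing.
read_all_columns <- function(path, sheet = 1) {
  readxl::read_excel(path, sheet = sheet, col_types = "text")
}

# Usage (hypothetical file name):
# df <- read_all_columns("large_file.xlsx")
# dim(df)  # check that the expected number of columns came through
```

Reading everything as text means dates and numbers need converting afterwards (for example with as.numeric), but it makes it easy to confirm first that all 15 columns are present.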
Upvotes: 0
Reputation: 11
I had a similar issue today; I believe the cause was the way in which the file was created (by SAS). Have you tried opening the file in Excel to get it to interpret all the formatting correctly?
My issue was solved by simply opening, saving, and closing the file.
Alternatively, if you've since solved this issue another way, I would like to hear it.
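To check whether re-saving the file actually made a difference, something like the sketch below could compare the two imports. The helper name missing_columns and the file names in the usage comment are hypothetical; it only compares column names and does not explain why the columns were dropped.

```r
# Hypothetical helper: list the columns that appear in the import of the
# re-saved file but are absent from the import of the original file.
missing_columns <- function(original_df, resaved_df) {
  setdiff(names(resaved_df), names(original_df))
}

# Usage (hypothetical file names):
# before <- openxlsx::read.xlsx("from_sas.xlsx", sheet = 1)
# after  <- openxlsx::read.xlsx("from_sas_resaved.xlsx", sheet = 1)
# missing_columns(before, after)
```

An empty result would suggest the re-save changed nothing for that file.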
Upvotes: 1