Grasshopper_NZ
Grasshopper_NZ

Reputation: 777

Failed to keep the first empty rows when importing data from Excel into R

I'd like to keep the full data, including any empty rows and columns, when importing. However, I am not sure why the read.xlsx() function from the openxlsx package does not achieve this if the first row is an empty row.

For example, if we look at this simple .xlsx file:

enter image description here

And using this code to import data:

library(openxlsx)

read.xlsx(xlsxFile      = "dummy.xlsx",
          sheet         = 1,
          skipEmptyRows = FALSE,
          skipEmptyCols = FALSE,
          colNames      = FALSE)

I'd expect an empty row to also be present in R as the first row. However, the result shows that the first row is being skipped, and the actual value from row 2 is now in the first row:

enter image description here

It's a small issue but I do need the accurate coordinates for some tasks later. I'd be greatly appreciate if you know how to fix this issue. Many thanks!

Upvotes: 1

Views: 75

Answers (2)

Michael Dewar
Michael Dewar

Reputation: 3383

It seems like openxlsx is no longer under active development, but there is a new package openxlsx2 that is. It seems to do what you want.

library(openxlsx2)
read_xlsx(myfile,
         sheet         = 1,
         skip_empty_rows = FALSE,
         skip_empty_cols = FALSE,
         col_names      = FALSE)

gives output

                         A
2 Data row 1 (Excel row 2)

where the original row number is stored as a row name in the data frame. Depending on your future needs, you may want to consider this package.

Upvotes: 0

gabagool
gabagool

Reputation: 1161

openxlsx apparently has no option to do this.

From this question, you can use read_xlsx from readxl which is part of the tidyverse. Set the range parameter using cellranger::cell_limits

library(readxl)
read_xlsx("test.xlsx", col_names = FALSE, range = cellranger::cell_limits(ul = c(1L, 1L)))
New names:                                                                             
• `` -> `...1`
# A tibble: 2 × 1
  ...1                    
  <chr>                   
1 NA                      
2 Data row 1 (Excel row 2)

Upvotes: 3

Related Questions