Read limited number of rows from excel file in R with read_excel

Question

I am reading in excel files in R using the readxl package like this:

library(readxl)
file_names <- list.files(pattern = ".xlsx")

list_collection <- list() 
for(i in 1:length(file_names)){
  frame <- read_excel(file_names[i], )
  frame_sub <- frame[1:100,]
  list_collection[i] <- list(frame_sub)  
}

Since there are a lot of excel files, and I only want the first 100 rows. Obviously this is not efficient. Is there a way to read in only 100 rows from excel initially, instead of reading in this whole file and THEN subsetting?

Rich Scriven · Accepted Answer

Try xlsx::read.xlsx() instead. It has arguments for specifying the start and end rows. Also note that I have made some improvements to your for() loop (memory allocation being the most important).

library(xlsx)
## get file names
file_names <- list.files(pattern = "\.xlsx$")
## allocate memory for our list
out <- vector("list", length(file_names)) 
## read the files and assign them to the list
for(i in seq_along(file_names)) {
    out[[i]] <- read.xlsx(file_names[i], startRow = 1, endRow = 100)  
}

Or you could do a named list by changing the for() loop to

for(file in file_names) {
    out[[file]] <- read.xlsx(file, startRow = 1, endRow = 100)  
}

Read limited number of rows from excel file in R with read_excel

Answers (2)

Related Questions