user227710
user227710

Reputation: 3194

Use regular expression in sheet names of excel

I have over 100 excel files, each with 50 sheets. I am trying to read one sheet from each file. The location and name of the sheet that I am trying to read varies in each file.

However, the name of sheet (that I am trying to read) in each file always contains the word "validate". So, I am trying to read the sheet containing the word validate from 100 files. This requires me to use the regular expression (like "grepl" to choose the sheet containing the word "validate"). I want to know whether this is possible.

To my knowledge, read_excel and xlsx packages do not allow this. I want something like following:

library(xlsx2)

files<-paste0("file",1:100)

lapply(files, function(i){
          read.xlsx2(
            paste0(i,".xlsx"), sheetName = grep("validate", ??,value=TRUE))})

Upvotes: 4

Views: 1450

Answers (1)

Stedy
Stedy

Reputation: 7469

I have found that readxl is much easier for this type of task:

First I made some fake data in an Excel file then check the names and read in the tabs of interest:

sheets <- readxl::excel_sheets("test_38757844.xlsx")

for(x in sheets[grepl("validate", sheets) == T]){
  print(x)
  validate_x <- readxl::read_excel("test_38757844.xlsx", x)
  print(head(validate_x))
}
[1] "validate1"
  1   A
1 2  AA
2 3 Sas
3 4 SAS
4 4 GDD
5 5 ASD
6 6 asd
[1] "validate2"
  1 a
1 2 s
2 3 s
3 4 d
4 5 d
5 6 f
6 7 g

Upvotes: 1

Related Questions