Kumar
Kumar

Reputation: 188

how to combine multiple excel files having multiple worksheets into a single excel file with multiple worksheets in R

I have multiple file names, for example, C1.xlsx; C2.xlsx; C3.xlsx; C4.xlsx, etc. where each file is having multiple worksheets such as C_1; C_2; C_3 and so... on i.e., worksheet names in all the files are same and the number of sheets are equal in all the files. Now, I need to combine all the worksheets having similar worksheet names from all the files. In addition, the column names in each worksheet are same.

i am using the following code to read all the excel files in a folder

library(readxl)

files <- list.files(path = "~/Dropbox/Data/multiple_files", pattern = 
"*.xlsx", full.names = T)

tbl <- sapply(files, read_excel, simplify=FALSE) %>% 
bind_rows(.id = "id")

the excel files i have are as shown below:

C1.xlsx (worksheet:C_1)   (worksheet:C_2)   (worksheet:C_3)
A  B  C  D  E               A  B  C  D  E       A  B  C  D  E
1  4  6  8  C_1             2  4  6  1  C_2     1  4  6  8  C_3
3  56 7  8  C_1             2  3  6  8  C_2     2  3  5  6  C_3
2  4  6  1  C_1             7  8  3  4  C_2     3  4  6  7  C_3

C2.xlsx (worksheet:C_1)   (worksheet:C_2)   (worksheet:C_3)
A  B  C  D  E               A  B  C  D  E      A  B  C  D  E
3  7  1  3  C_1             1  4  7  1  C_2    1  9  6  1  C_3
1  6  9  2  C_1             2  3  6  8  C_2    2  3  5  6  C_3
2  4  6  1  C_1             7  1  3  4  C_2    3  4  2  7  C_3

C3.xlsx (worksheet:C_1)   (worksheet:C_2)   (worksheet:C_3)
A  B  C  D  E               A  B  C  D  E      A  B  C  D  E
9  4  6  8  C_1             1  4  6  1  C_2    1  4  1  1  C_3
3  5  7  1  C_1             1  3  6  4  C_2    2  1  5  1  C_3
2  7  6  1  C_1             7  7  3  4  C_2    3  4  6  7  C_3

the combined worksheets in an excel file are expected to be as shown below:

Combined.xlsx (worksheet:C_1)   (worksheet:C_2)   (worksheet:C_3)
A  B  C  D  E               A  B  C  D  E        A  B  C  D  E
1  4  6  8  C_1             2  4  6  1  C_2      1  4  6  8  C_3
3  56 7  8  C_1             2  3  6  8  C_2      2  3  5  6  C_3
2  4  6  1  C_1             7  8  3  4  C_2      3  4  6  7  C_3
3  7  1  3  C_1             1  4  7  1  C_2      1  9  6  1  C_3
1  6  9  2  C_1             2  3  6  8  C_2      2  3  5  6  C_3
2  4  6  1  C_1             7  1  3  4  C_2      3  4  2  7  C_3
9  4  6  8  C_1             1  4  6  1  C_2      1  4  1  1  C_3
3  5  7  1  C_1             1  3  6  4  C_2      2  1  5  1  C_3
2  7  6  1  C_1             7  7  3  4  C_2      3  4  6  7  C_3

i am looking for a code to first read all the excel files in a folder and then combine the worksheets

thanks in advance

Upvotes: 1

Views: 742

Answers (2)

Kumar
Kumar

Reputation: 188

for ordering the list of dataframe i used the following code

sorted_bound_ss <- lapply(combined, function(C_1){
 C_1[order(C_1$A),]
})

it worked ....

Upvotes: 1

Synchronicity
Synchronicity

Reputation: 143

openxlsx or writexl will work. I think writexl is a bit faster. From the documentation:

Writes a data frame to an xlsx file. To create an xlsx with (multiple) named sheets, simply set x to a named list of data frames.

The following code 'should' work:

spreadsheets <- purrr::map(files, function(.x){
  .spreadsheet <- purrr::map(1:3, .file = .x, function(.x, .file){
    .sheet <- readxl::read_xlsx(.file, sheet = .x)
  })
})
bound_ss <- purrr::map(1:3, .spreadsheet = spreadsheets, function(.x, .spreadsheet){
    do.call("rbind", purrr::map(.spreadsheet,.ind = .x, function(.x, .ind){
      purrr::pluck(.x, .ind)
      }))
})
# Should be a list of three sheets in the long format you expect
names(bound_ss) <- paste0("Sheet", 1:3)
  writexl::write_xlsx(bound_ss, path = "~/Dropbox/Data/multiple_files/sheet.xlsx")

## Sorting based on a column C1
library(magrittr)
bound_ss %<>% purrr::map(~ dplyr::arrange(.x, C1)) # ascending
bound_ss %<>% purrr::map(~ dplyr::arrange(.x, dplyr::desc(C1))) # descending

Upvotes: 1

Related Questions