Reputation: 188
I have multiple file names, for example, C1.xlsx; C2.xlsx; C3.xlsx; C4.xlsx, etc. where each file is having multiple worksheets such as C_1; C_2; C_3 and so... on i.e., worksheet names in all the files are same and the number of sheets are equal in all the files. Now, I need to combine all the worksheets having similar worksheet names from all the files. In addition, the column names in each worksheet are same.
i am using the following code to read all the excel files in a folder
library(readxl)
files <- list.files(path = "~/Dropbox/Data/multiple_files", pattern =
"*.xlsx", full.names = T)
tbl <- sapply(files, read_excel, simplify=FALSE) %>%
bind_rows(.id = "id")
the excel files i have are as shown below:
C1.xlsx (worksheet:C_1) (worksheet:C_2) (worksheet:C_3)
A B C D E A B C D E A B C D E
1 4 6 8 C_1 2 4 6 1 C_2 1 4 6 8 C_3
3 56 7 8 C_1 2 3 6 8 C_2 2 3 5 6 C_3
2 4 6 1 C_1 7 8 3 4 C_2 3 4 6 7 C_3
C2.xlsx (worksheet:C_1) (worksheet:C_2) (worksheet:C_3)
A B C D E A B C D E A B C D E
3 7 1 3 C_1 1 4 7 1 C_2 1 9 6 1 C_3
1 6 9 2 C_1 2 3 6 8 C_2 2 3 5 6 C_3
2 4 6 1 C_1 7 1 3 4 C_2 3 4 2 7 C_3
C3.xlsx (worksheet:C_1) (worksheet:C_2) (worksheet:C_3)
A B C D E A B C D E A B C D E
9 4 6 8 C_1 1 4 6 1 C_2 1 4 1 1 C_3
3 5 7 1 C_1 1 3 6 4 C_2 2 1 5 1 C_3
2 7 6 1 C_1 7 7 3 4 C_2 3 4 6 7 C_3
the combined worksheets in an excel file are expected to be as shown below:
Combined.xlsx (worksheet:C_1) (worksheet:C_2) (worksheet:C_3)
A B C D E A B C D E A B C D E
1 4 6 8 C_1 2 4 6 1 C_2 1 4 6 8 C_3
3 56 7 8 C_1 2 3 6 8 C_2 2 3 5 6 C_3
2 4 6 1 C_1 7 8 3 4 C_2 3 4 6 7 C_3
3 7 1 3 C_1 1 4 7 1 C_2 1 9 6 1 C_3
1 6 9 2 C_1 2 3 6 8 C_2 2 3 5 6 C_3
2 4 6 1 C_1 7 1 3 4 C_2 3 4 2 7 C_3
9 4 6 8 C_1 1 4 6 1 C_2 1 4 1 1 C_3
3 5 7 1 C_1 1 3 6 4 C_2 2 1 5 1 C_3
2 7 6 1 C_1 7 7 3 4 C_2 3 4 6 7 C_3
i am looking for a code to first read all the excel files in a folder and then combine the worksheets
thanks in advance
Upvotes: 1
Views: 742
Reputation: 188
for ordering the list of dataframe i used the following code
sorted_bound_ss <- lapply(combined, function(C_1){
C_1[order(C_1$A),]
})
it worked ....
Upvotes: 1
Reputation: 143
openxlsx
or writexl
will work. I think writexl
is a bit faster.
From the documentation:
Writes a data frame to an xlsx file. To create an xlsx with (multiple) named sheets, simply set
x
to a named list of data frames.
The following code 'should' work:
spreadsheets <- purrr::map(files, function(.x){
.spreadsheet <- purrr::map(1:3, .file = .x, function(.x, .file){
.sheet <- readxl::read_xlsx(.file, sheet = .x)
})
})
bound_ss <- purrr::map(1:3, .spreadsheet = spreadsheets, function(.x, .spreadsheet){
do.call("rbind", purrr::map(.spreadsheet,.ind = .x, function(.x, .ind){
purrr::pluck(.x, .ind)
}))
})
# Should be a list of three sheets in the long format you expect
names(bound_ss) <- paste0("Sheet", 1:3)
writexl::write_xlsx(bound_ss, path = "~/Dropbox/Data/multiple_files/sheet.xlsx")
## Sorting based on a column C1
library(magrittr)
bound_ss %<>% purrr::map(~ dplyr::arrange(.x, C1)) # ascending
bound_ss %<>% purrr::map(~ dplyr::arrange(.x, dplyr::desc(C1))) # descending
Upvotes: 1