Reputation: 1498
I would like to split a data frame in order to produce multiple excel files containing multiple sheets on the basis of another variable of the original split data frame.
The data frame I used is the toy data set mtcars
. I split it by cyl
in order to create multiple files with sheets based on the variable gear
.
So, I would expect to get three excel files with the following names:
Each of them containing:
What I did it seems to overwrite the files.
This is what I did:
library(tidyverse)
library(writexl)
# I split the data frame
list_of_cars_by_cyl <- mtcars %>%
dplyr::group_split(cyl)
# I gave names to split data frame elements
names(list_of_cars_by_cyl) <- list_of_cars_by_cyl %>%
purrr::map(~pull(.,cyl)) %>% # individua le modalit? della variabile area
purrr::map(~as.character(.)) %>% # converte il fattore in character
purrr::map(~unique(.))
nomi <- names(list_of_cars_by_cyl)
# I create a function in order to save split data frames in .xlsx with several sheets based on a second variable
save_to_excel <- function(x) {
# list by new variable
list_of_cars_by_gear <- x %>%
dplyr::group_split(gear)
# name list's elements
names(list_of_cars_by_gear) <- list_of_cars_by_gear %>%
purrr::map(~pull(., gear)) %>% # individua le modalit? della variabile area
purrr::map(~as.character(.)) %>% # converte il fattore in character
purrr::map(~unique(.))
# save to .xlsx
list_of_cars_by_gear %>%
writexl::write_xlsx(path = paste(cartelle[5], paste0("Cars_by_cyl_", nomi, "_", format(Sys.time(), format = "%d%m%Y_%H%M%S"), ".xlsx"), sep = "/"))
}
# run the function iteratively
list_of_cars_by_cyl %>%
purrr::map(save_to_excel)
Upvotes: 0
Views: 1219
Reputation: 1784
In your example, you are using the same file name for each list element - nomi
is used identically in every call to save_to_excel
. There are (at least) two ways to resolve this: construct the correct file name completely inside the save_to_excel
function, or simulateneously iterate over both list_of_cars_by_cyl
and nomi
with purrr::map2
.
Both options produce the expected output (3 Excel files containing the cyl number in the name, each with the respective sheets for the gear split).
Get the cyl part of the file name from the dataframe passed to save_to_excel
:
save_to_excel <- function(x) {
# list by new variable
list_of_cars_by_gear <- x %>%
dplyr::group_split(gear)
# name list's elements
names(list_of_cars_by_gear) <- list_of_cars_by_gear %>%
purrr::map(~pull(., gear)) %>% # individua le modalit? della variabile area
purrr::map(~as.character(.)) %>% # converte il fattore in character
purrr::map(~unique(.))
# cyl part for current file name
this_cyl <- unique(x$cyl)
# save to .xlsx
list_of_cars_by_gear %>%
writexl::write_xlsx(path = paste(
#cartelle[5], #not defined example code,
paste0("Cars_by_cyl_", this_cyl, "_", format(Sys.time(), format = "%d%m%Y_%H%M%S"), ".xlsx"),
sep = "/"))
}
list_of_cars_by_cyl %>%
purrr::map(save_to_excel)
#$`4`
#[1] "...\\Cars_by_cyl_4_25032022_220454.xlsx"
#$`6`
#[1] "...\\Cars_by_cyl_6_25032022_220454.xlsx"
#$`8`
#[1] "...\\Cars_by_cyl_8_25032022_220454.xlsx"
Add second argument to save_to_excel
and iterate over dataframes and names:
save_to_excel <- function(x, name) {
# list by new variable
list_of_cars_by_gear <- x %>%
dplyr::group_split(gear)
# name list's elements
names(list_of_cars_by_gear) <- list_of_cars_by_gear %>%
purrr::map(~pull(., gear)) %>% # individua le modalit? della variabile area
purrr::map(~as.character(.)) %>% # converte il fattore in character
purrr::map(~unique(.))
# save to .xlsx
list_of_cars_by_gear %>%
writexl::write_xlsx(path = paste(
#cartelle[5], #not defined example code,
paste0("Cars_by_cyl_", name, "_", format(Sys.time(), format = "%d%m%Y_%H%M%S"), ".xlsx"),
sep = "/"))
}
# run the function iteratively
# over dataframe list and nomi
list_of_cars_by_cyl %>%
purrr::map2(., nomi, save_to_excel)
#$`4`
#[1] "...\\Cars_by_cyl_4_25032022_221107.xlsx"
#$`6`
#[1] "...\\Cars_by_cyl_6_25032022_221107.xlsx"
#$`8`
#[1] "...\\Cars_by_cyl_8_25032022_221107.xlsx"
As the question also asks about understanding the purrr logic, here are a few different ways how the last step could have been formulated instead. All produce the same result, and which is most easy to use/understand is imho a matter of personal preference.
# refer to first, second argument with .x, .y
purrr::map2(list_of_cars_by_cyl, nomi, ~save_to_excel(x = .x, name = .y))
# refer to first, second argument with ..1, ..2
purrr::map2(list_of_cars_by_cyl, nomi, ~save_to_excel(x = ..1, name = ..2))
# define anonymous function with appropriate number of arguments
purrr::map2(list_of_cars_by_cyl, nomi, function(x, y) {save_to_excel(x, y)})
Upvotes: 1