Reputation: 41
I have a question about the map() function from the purrr package.
As an example with the mtcars dataset:
library(dplyr)
library(purrr)
# Create a second data frame
mtcars2 <- mtcars
# Change one variable so the two can be told apart
mtcars2$mpg <- mtcars2$mpg / 2
# Create the list
dflist <- list(mtcars, mtcars2)
# Then, a simple example function
my_fun <- function(x) {
  x %>%
    summarise(
      `sum of mpg` = sum(mpg),
      `sum of cyl` = sum(cyl)
    )
}
# Then, using map(), this works and returns the desired results as a list
list_results <- map(dflist, my_fun)
However, I need the modified mtcars and mtcars2 saved as separate R objects (data frames), not only as elements of a list.
Thanks in advance to all of you!
Upvotes: 1
Views: 2528
Reputation: 3200
Here is a solution using purrr::walk() with get() and assign(). It is similar to the other answers, but not identical.
library(dplyr)
library(purrr)
data(mtcars)
Create the second data frame.
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2
Create the function to apply to each data frame.
sum_mpg_cyl <- function(.data) {
  .data %>%
    summarise(
      `sum of mpg` = sum(mpg),
      `sum of cyl` = sum(cyl)
    )
}
Apply sum_mpg_cyl() to mtcars and mtcars2, saving two data frames of summary stats by the same names to the global environment. A potential advantage of this method is that you do not need to create a separate list of data frames.
walk(
  .x = c("mtcars", "mtcars2"),
  .f = function(df_name) {
    # Get the data frame from the global environment
    df <- get(df_name, envir = .GlobalEnv)
    # Calculate the summary statistics
    df <- sum_mpg_cyl(df)
    # Save the data frame of summary statistics back to the global
    # environment under the same name (this overwrites the input)
    assign(df_name, df, envir = .GlobalEnv)
  }
)
I would probably also use an anonymous function and save the two data frames of summary stats with different names like this:
# Reset the data
data(mtcars)
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2
walk(
  .x = c("mtcars", "mtcars2"),
  .f = function(df_name) {
    # Get the data frame from the global environment
    df <- get(df_name, envir = .GlobalEnv)
    # Calculate the summary statistics
    df <- df %>%
      summarise(
        `sum of mpg` = sum(mpg),
        `sum of cyl` = sum(cyl)
      )
    # Rename the data frames containing summary statistics to distinguish
    # them from the input data frames
    new_df_name <- paste(df_name, "stats", sep = "_")
    # Save the data frames containing summary statistics back to the global
    # environment
    assign(new_df_name, df, envir = .GlobalEnv)
  }
)
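If writing into the global environment feels too invasive, the same pattern should also work with a dedicated environment. The following is only a sketch under that assumption; src_env and stats_env are hypothetical names that do not appear in the code above.

```r
library(dplyr)
library(purrr)

mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2

# Hypothetical names: src_env is where the inputs live,
# stats_env collects the outputs
src_env <- environment()
stats_env <- new.env()

walk(
  .x = c("mtcars", "mtcars2"),
  .f = function(df_name) {
    # Look the data frame up by name, starting from src_env
    df <- get(df_name, envir = src_env)
    stats <- df %>%
      summarise(
        `sum of mpg` = sum(mpg),
        `sum of cyl` = sum(cyl)
      )
    # Store the summary in stats_env instead of .GlobalEnv
    assign(paste(df_name, "stats", sep = "_"), stats, envir = stats_env)
  }
)

# List the objects that were created
ls(stats_env)
```

Individual results can then be retrieved with stats_env$mtcars_stats or get("mtcars2_stats", envir = stats_env), and the input data frames stay untouched.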
Upvotes: 0
Reputation: 19716
Here is an attempt:
library(purrr)
library(tidyverse)
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2
dflist <- list(mtcars,mtcars2)
To save the objects one would need to give them specific names, and use:
assign("name", object, envir = .GlobalEnv)
Here is one way to achieve that:
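my_fun <- function(x, list) {
  # Pick the x-th data frame from the list that was passed in
  listi <- list[[x]]
  # Save that element to the global environment under a unique name
  assign(paste0("object_from_function_", x), listi, envir = .GlobalEnv)
  x <- listi %>%
    summarise(
      `sum of mpg` = sum(mpg),
      `sum of cyl` = sum(cyl)
    )
  return(x)
}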
my_fun takes two arguments: an index, supplied here by seq_along(dflist) and used to generate a unique name, and the list that is to be processed. The following call saves two objects, object_from_function_1 and object_from_function_2:
list_results <- map(seq_along(dflist), my_fun, dflist)
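If the goal is to save the summarised data frames rather than the list elements themselves, a variant along these lines may be closer to what the question asks for. This is a sketch only: it assumes a named list and uses purrr::iwalk(), which passes each element together with its name; the "_summary" suffix is a made-up naming choice.

```r
library(dplyr)
library(purrr)

mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2

# A named list, so each element carries the name to build on
dflist <- list(mtcars = mtcars, mtcars2 = mtcars2)

# iwalk() calls the function with each element and its name
iwalk(dflist, function(df, df_name) {
  stats <- df %>%
    summarise(
      `sum of mpg` = sum(mpg),
      `sum of cyl` = sum(cyl)
    )
  # The "_summary" suffix is a hypothetical naming choice
  assign(paste0(df_name, "_summary"), stats, envir = .GlobalEnv)
})
```

After this runs, mtcars_summary and mtcars2_summary should exist in the global environment, while mtcars and mtcars2 keep their original contents.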
Another approach would be to use list2env() outside of the map() call, as akrun suggested:
dflist <- list(mtcars,mtcars2)
names(dflist) <- c("mtcars","mtcars2")
list2env(dflist, envir = .GlobalEnv) #this will create two objects `mtcars` and `mtcars2`
Then run map() after you have created the objects, as you have already done.
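The same list2env() idea can presumably be applied to the output of map() as well, which would save the summarised data frames directly. A minimal sketch, assuming a named input list; the "_sums" suffix is a hypothetical choice made here so that the original data frames are not overwritten:

```r
library(dplyr)
library(purrr)

mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2

my_fun <- function(x) {
  x %>%
    summarise(
      `sum of mpg` = sum(mpg),
      `sum of cyl` = sum(cyl)
    )
}

dflist <- list(mtcars = mtcars, mtcars2 = mtcars2)

# map() keeps the names of its input list
list_results <- map(dflist, my_fun)

# Hypothetical suffix so the inputs are not overwritten
names(list_results) <- paste0(names(list_results), "_sums")

# Create one object per list element in the global environment
invisible(list2env(list_results, envir = .GlobalEnv))
```

This keeps the function free of side effects: my_fun only computes the summary, and the objects are created in a single step at the end.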
Upvotes: 4