Mario Gronert

Reputation: 45

How to use the purrr::map family to apply a function to a list of data frames in place, without creating new objects

I want to apply a function to a set of data frames and have those data frames be updated directly, instead of creating new output that I use to overwrite the current data frames.

As an example, I have two data frames, df_a and df_b, and a function age_category, which adds a column stating whether each person is a child or an adult, depending on their age.

df_a <- data.frame(Region = c("North", "South"), Age = c(14, 50))
df_b <- data.frame(Staple = c("Rice", "Potato"), Age = c(35, 2))

df_a
>   Region Age
> 1  North  14
> 2  South  50

df_b
>   Staple Age
> 1   Rice  35
> 2 Potato   2

age_category <- function(x){
  x$category <- ifelse(x$Age >= 18, "adult", "child")
  return(x)
}

I create a list of the data frames and apply the function to them.

df_list <- list(df_a, df_b)

library(purrr)
exmpl_1 <- purrr::map(df_list, age_category)
exmpl_1

> [[1]]
>   Region Age category
> 1  North  14    child
> 2  South  50    adult

> [[2]]
>   Staple Age category
> 1   Rice  35    adult
> 2 Potato   2    child

Now I could use exmpl_1[[1]] to overwrite df_a (df_a <- exmpl_1[[1]]) and the same for df_b.

I am looking for a way to have the function overwrite the data frames directly as it runs. Since I would not be creating any output, I assume I need to change the function and use walk instead of map.

age_category_alt <- function(x){
  x$category <- ifelse(x$Age >= 18, "adult", "child")
  assign(deparse(substitute(x)), x)
}

walk(df_list, age_category_alt)

But this does not work. The data frames do not change, and the only result is these warnings:

Warning messages:
1: In assign(deparse(substitute(x)), x) :
  only the first element is used as variable name
2: In assign(deparse(substitute(x)), x) :
  only the first element is used as variable name

I kindly ask for assistance.

Upvotes: 1

Views: 452

Answers (1)

Ronak Shah

Reputation: 388807

There are multiple ways to handle this, although I personally prefer to keep data in lists instead of separate dataframes.
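As a minimal sketch of the list-based workflow, assuming the df_a, df_b and age_category definitions from the question (repeated here so the block runs on its own): the data frames stay in a named list, the list itself is updated, and each data frame is accessed by name, so no global objects need to be overwritten.

```r
library(purrr)

# Definitions repeated from the question so the example is self-contained
df_a <- data.frame(Region = c("North", "South"), Age = c(14, 50))
df_b <- data.frame(Staple = c("Rice", "Potato"), Age = c(35, 2))

age_category <- function(x) {
  x$category <- ifelse(x$Age >= 18, "adult", "child")
  x
}

# Keep the data frames together in a named list and update the list itself
df_list <- list(df_a = df_a, df_b = df_b)
df_list <- map(df_list, age_category)

# Access an updated data frame by name; the global df_a is left untouched
df_list$df_a
```

With this layout, any later step can be another map call over df_list rather than a series of per-object assignments.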

library(purrr)

1) Using a named list and the age_category function from the question, we can use map and then list2env to copy the results back into the global environment.

df_list <- list(df_a = df_a, df_b = df_b)

df_list <- map(df_list, age_category)
list2env(df_list, .GlobalEnv)

df_a
#  Region Age category
#1  North  14    child
#2  South  50    adult

df_b
#  Staple Age category
#1   Rice  35    adult
#2 Potato   2    child
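list2env accepts any environment, not only .GlobalEnv, so the same pattern can be tried out without touching existing objects. The sketch below assumes the data and function from the question (repeated so it runs on its own); the name scratch is purely illustrative.

```r
library(purrr)

# Definitions repeated from the question so the example is self-contained
df_a <- data.frame(Region = c("North", "South"), Age = c(14, 50))
df_b <- data.frame(Staple = c("Rice", "Potato"), Age = c(35, 2))

age_category <- function(x) {
  x$category <- ifelse(x$Age >= 18, "adult", "child")
  x
}

df_list <- map(list(df_a = df_a, df_b = df_b), age_category)

# 'scratch' is a hypothetical throwaway environment used only for this check
scratch <- new.env()
list2env(df_list, envir = scratch)

# The updated copies live in 'scratch'; the originals are unchanged
names(scratch$df_a)
names(df_a)
```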

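2) Using the same named list as above, assign each result by name with iwalk (imap also works, but it additionally returns and prints the results).

age_category_alt <- function(x, y){
  # x is the data frame, y is its name in the list
  x$category <- ifelse(x$Age >= 18, "adult", "child")
  assign(y, x, envir = .GlobalEnv)
}

iwalk(df_list, age_category_alt)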
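As for why the attempt in the question had no effect, two things can be checked in base R. First, a function called on a list element never sees the symbol df_a, only the expression (or value) it was handed. Second, assign without an envir argument writes into the function's own frame, which is discarded when the function returns. The helper names show_name and local_assign below are hypothetical and exist only for this illustration; the exact text that deparse produces inside a mapping function may differ between purrr versions.

```r
df_a <- data.frame(Region = c("North", "South"), Age = c(14, 50))
df_list <- list(df_a)

# Hypothetical helper: returns the name that assign() would have been given
show_name <- function(x) deparse(substitute(x))

show_name(df_a)          # "df_a": the symbol is visible when called directly
show_name(df_list[[1]])  # "df_list[[1]]": only the indexing expression is visible

# Hypothetical helper: assign() without `envir` targets the current function frame
local_assign <- function(x) {
  assign("tmp_copy", x)                 # tmp_copy exists only inside this call
  exists("tmp_copy", inherits = FALSE)  # TRUE inside the function
}

local_assign(df_a)
exists("tmp_copy")       # FALSE here, assuming no other object has this name
```

This is why both working approaches rely on a named list for the target names and on an explicit environment for the assignment.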

Upvotes: 1
