DSH

Reputation: 427

Tidyeval: apply function to data frames extracted from list

This is a simplified version of a problem involving a large list containing complex tables. I want to extract the tables from the list and apply a function to each one. Here we can create a simple list containing small named data frames:

library(tidyverse)

table_names <- c('dfA', 'dfB', 'dfC')

dfA <- tibble(a = 1:3, b = 4:6, c = 7:9)
dfB <- tibble(a = 10:12, b = 13:15, c = 16:18)
dfC <- tibble(a = 19:21, b = 22:24, c = 25:27)

df_list <- list(dfA, dfB, dfC) %>% setNames(table_names)

Here is a simplified example of the kind of operation I would like to apply:

dfA_mod <- df_list$dfA %>% 
  mutate(name = 'dfA') %>%
  select(name, everything()) 

In this example, I extract one of the three tables from the list (df_list$dfA), create a new column with the same value in every row (mutate(name = 'dfA')), and reorder the columns so that the new column comes first (select(name, everything())). The result is assigned to dfA_mod.

To solve the larger problem, I want to use one of the purrr::map() variants to apply the function over the character vector table_names, which was defined in the first block of code above. The elements of table_names serve two purposes: 1) naming the tables held in the list; and 2) supplying the values for the name column in each modified table.

I could write a function such as:

fun <- function(x) {
df_list$x %>% 
  mutate(name = x) %>%
  select(name, everything()) %>%
  assign(paste0(x, '_mod'), ., envir = .GlobalEnv)
}

And then use map() to create a new list of modified tables:

new_list <- df_list %>% map(table_name, fun(x))

But of course this code does not work, with the main obstacle being (for me at least) figuring out how to quote and unquote the right terms within the function. I'm a beginner at tidy evaluation, and I could use some help in specifying the function and using map properly.

Here is the desired output (for one modified table):

# A tibble: 3 x 4
  name      a     b     c
  <chr> <int> <int> <int>
1 dfA       1     4     7
2 dfA       2     5     8
3 dfA       3     6     9

Thanks in advance for any help!

Upvotes: 0

Views: 86

Answers (2)

akrun

Reputation: 887118

We can combine the list into a single data frame with map_dfr(), passing .id so that the list names are stored in a name column:

library(purrr)
map_dfr(df_list,  I, .id = 'name')

Or with bind_rows():

library(dplyr)
bind_rows(df_list, .id = 'name')
# A tibble: 9 x 4
#  name      a     b     c
#  <chr> <int> <int> <int>
#1 dfA       1     4     7
#2 dfA       2     5     8
#3 dfA       3     6     9
#4 dfB      10    13    16
#5 dfB      11    14    17
#6 dfB      12    15    18
#7 dfC      19    22    25
#8 dfC      20    23    26
#9 dfC      21    24    27
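
As a side note, map_dfr() is marked as superseded from purrr 1.0.0 onward, with list_rbind() as the suggested replacement. The following is a minimal sketch of the same row-binding step, assuming purrr >= 1.0.0 is installed; the data are rebuilt from the question so that the block runs on its own, and `combined` is just an illustrative name.

```r
library(dplyr)
library(purrr)

# Same example data as in the question, built as a named list directly.
df_list <- list(
  dfA = tibble(a = 1:3, b = 4:6, c = 7:9),
  dfB = tibble(a = 10:12, b = 13:15, c = 16:18),
  dfC = tibble(a = 19:21, b = 22:24, c = 25:27)
)

# Assumption: purrr >= 1.0.0, where list_rbind() is available.
# names_to stores each list element's name in a new "name" column.
combined <- list_rbind(df_list, names_to = "name")
combined
```

Like bind_rows(), this returns one stacked table rather than a list of tables, so it fits best when the per-table results do not need to stay separate.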

Upvotes: 0

Ronak Shah

Reputation: 388982

We can use purrr::imap(), which passes each element of the list as .x and the name of that element as .y:

library(dplyr)
library(purrr)

df_out <- imap(df_list, ~.x %>% mutate(name = .y) %>% select(name, everything()))
df_out

#$dfA
# A tibble: 3 x 4
#  name      a     b     c
#  <chr> <int> <int> <int>
#1 dfA       1     4     7
#2 dfA       2     5     8
#3 dfA       3     6     9

#$dfB
# A tibble: 3 x 4
#  name      a     b     c
#  <chr> <int> <int> <int>
#1 dfB      10    13    16
#....
#....

This gives a list of the desired data frames. If you want them as separate objects in the global environment, you can do:

names(df_out) <- paste0(names(df_out), "_mod")
list2env(df_out, .GlobalEnv)

We can also do it using base R Map():

df_out <- Map(function(x, y) transform(x, name = y)[c('name', names(x))],
              df_list, names(df_list))

Then rename the list elements and export them with list2env(), as above.
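
For completeness, the function from the question can also be made to work without any tidy evaluation. The table name arrives as an ordinary character string, so nothing needs quoting or unquoting. The real obstacle is that `$` does not evaluate its right-hand side: df_list$x looks for an element literally named "x" and returns NULL. Indexing with `[[` does the lookup by value instead. The sketch below assumes the example data from the question; `add_name` is a hypothetical name used only for illustration.

```r
library(dplyr)
library(purrr)

# Same example data as in the question.
df_list <- list(
  dfA = tibble(a = 1:3, b = 4:6, c = 7:9),
  dfB = tibble(a = 10:12, b = 13:15, c = 16:18),
  dfC = tibble(a = 19:21, b = 22:24, c = 25:27)
)
table_names <- names(df_list)

# Hypothetical helper: `nm` is a plain string, so `[[` finds the table
# and mutate() can use the same string as the column value.
add_name <- function(nm) {
  df_list[[nm]] %>%
    mutate(name = nm) %>%
    select(name, everything())
}

# set_names() names the vector by its own values, so the resulting
# list is named dfA, dfB, dfC.
new_list <- table_names %>%
  set_names() %>%
  map(add_name)

new_list$dfA
```

Note that map() takes the function itself (add_name), not a call such as fun(x). Returning the modified table and collecting the results in a list also avoids calling assign() inside the function; list2env() can still export them afterwards if separate objects are needed.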

Upvotes: 1
