bretauv

Reputation: 8506

Rename dynamically using a dataframe name with dplyr

In this example, I am using the iris dataset and I would like to rename Petal.Length as iris:

library(dplyr)

some_fun <- function(x){
  head(x) %>%
    rename(!!quo_name(x) := "Petal.Length")
}

some_fun(iris)

But this gives the following error:

Error: `expr` must quote a symbol, scalar, or call

If I use enquo instead of quo_name, I have this error:

Error: The LHS of `:=` must be a string or a symbol

I guess the problem comes from the fact that I call some_fun(iris) and not some_fun("iris"), but I have to call some_fun(iris).

How can I do that, while using some_fun(iris)?

Edit: I need this function to run through a list using purrr::map(). Updated example:

library(dplyr)
library(purrr)

list_df <- list(mtcars2 = mtcars %>% mutate(Petal.Length = 1),
                iris2 = iris)

some_fun <- function(x){
  df_name <- deparse(substitute(x))
  head(x) %>%
    rename("{df_name}" := "Petal.Length")
}

test <- map(list_df, some_fun)
list2env(test, .GlobalEnv)

mtcars2
iris2

Upvotes: 3

Views: 1202

Answers (3)

R me matey

Reputation: 685

Try getting the data set's name with deparse(substitute()), then use the glue-style curly-bracket syntax that recent versions of dplyr (via rlang) accept in names on the left-hand side of :=

library(dplyr)

some_fun <- function(x){
  df_name <- deparse(substitute(x)) # name of the data frame, as a string
  head(x) %>%
    # df_name is interpolated first, then used as the new name for Petal.Length
    rename("{df_name}" := "Petal.Length")
}

some_fun(iris)

Basically, the expression inside the curly brackets is evaluated first, and its value is inserted into the string that becomes the new column name.
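For comparison, the same rename can be sketched with the older !! unquoting syntax, which also accepts a string on the left-hand side of :=. The function names some_fun_glue and some_fun_bang are made up for this illustration, and the sketch assumes a dplyr version that supports both forms (1.0 or later).

```r
library(dplyr)

# Glue-style interpolation inside the name (hypothetical function name)
some_fun_glue <- function(x) {
  df_name <- deparse(substitute(x))
  head(x) %>%
    rename("{df_name}" := "Petal.Length")
}

# Unquoting a string with !! on the left-hand side of := (hypothetical function name)
some_fun_bang <- function(x) {
  df_name <- deparse(substitute(x))
  head(x) %>%
    rename(!!df_name := "Petal.Length")
}

# Both variants should produce the same data frame
identical(some_fun_glue(iris), some_fun_bang(iris))
```

The glue form is handy when the new name needs a prefix or suffix, for example "{df_name}_length".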

EDIT: Here's an update based on OP's update. Inside map(), deparse(substitute(x)) no longer sees the name of the data frame, so extract the names beforehand, then pass them to the (slightly updated) function along with the data.

library(dplyr)
library(purrr)

list_df <- list(mtcars2 = mtcars %>% mutate(Petal.Length = 1),
                iris2 = iris)

df_names <- names(list_df)

some_fun <- function(x, x_name){
  head(x) %>%
    rename("{x_name}" := "Petal.Length")
}

test <- map2(list_df, df_names, some_fun) 
list2env(test, .GlobalEnv)

mtcars2
#   mpg cyl disp  hp drat    wt  qsec vs am gear carb mtcars2
#1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4       1
#2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4       1
#3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1       1
#4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1       1
#5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2       1
#6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1       1

iris2
#  Sepal.Length Sepal.Width iris2 Petal.Width Species
#1          5.1         3.5   1.4         0.2  setosa
#2          4.9         3.0   1.4         0.2  setosa
#3          4.7         3.2   1.3         0.2  setosa
#4          4.6         3.1   1.5         0.2  setosa
#5          5.0         3.6   1.4         0.2  setosa
#6          5.4         3.9   1.7         0.4  setosa
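To see why the names have to be passed in explicitly, here is a small sketch of what deparse(substitute()) reports when the function is called directly versus through map(). The helper arg_text is hypothetical and exists only for this illustration; the exact text returned under map() depends on purrr's internals, so no specific output is shown for it.

```r
library(purrr)

# Hypothetical helper: reports how its argument was written by the caller
arg_text <- function(x) deparse(substitute(x))

# Called directly, the argument is the symbol typed by the caller
direct <- arg_text(iris)
direct

list_df <- list(mtcars2 = mtcars, iris2 = iris)

# Called through map(), the argument is an indexing expression built by map(),
# so the text is the same for every element and is not the element's name
via_map <- map_chr(list_df, arg_text)
via_map

# The names are only available from the list itself
names(list_df)
```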

Upvotes: 2

RyanFrost

Reputation: 1428

Here are a few other methods that I think could be useful to you, based on the information added in your comments.

Starting with a named list:

library(purrr)
library(dplyr)

countries <- c("ABC", "DEF", "GHI", "JKL", "MNO")
df1 <- data.frame(country = countries, value = 1:5)
df2 <- data.frame(country = countries, value = 6:10)

df_list <- list(df1 = df1, df2 = df2)

df_list
#> $df1
#>   country value
#> 1     ABC     1
#> 2     DEF     2
#> 3     GHI     3
#> 4     JKL     4
#> 5     MNO     5
#> 
#> $df2
#>   country value
#> 1     ABC     6
#> 2     DEF     7
#> 3     GHI     8
#> 4     JKL     9
#> 5     MNO    10

We can use purrr's imap, which passes each element's name to the function as .y, to rename that element's 'value' column:

df_list %>%
  imap(~ .x %>% rename("{.y}" := value))
#> $df1
#>   country df1
#> 1     ABC   1
#> 2     DEF   2
#> 3     GHI   3
#> 4     JKL   4
#> 5     MNO   5
#> 
#> $df2
#>   country df2
#> 1     ABC   6
#> 2     DEF   7
#> 3     GHI   8
#> 4     JKL   9
#> 5     MNO  10
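As a cross-check, imap(x, f) is shorthand for map2(x, names(x), f) when the list is named, so the call above should match the map2() approach from the other answer. A minimal sketch, with the variable names with_imap and with_map2 made up for this illustration:

```r
library(purrr)
library(dplyr)

countries <- c("ABC", "DEF", "GHI", "JKL", "MNO")
df_list <- list(
  df1 = data.frame(country = countries, value = 1:5),
  df2 = data.frame(country = countries, value = 6:10)
)

# imap() supplies each element as .x and its name as .y
with_imap <- imap(df_list, ~ rename(.x, "{.y}" := value))

# The same thing spelled out with map2() and an explicit vector of names
with_map2 <- map2(df_list, names(df_list), ~ rename(.x, "{.y}" := value))

identical(with_imap, with_map2)
```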

However, there's another way to merge these datasets that may be preferable if all of the 'value' columns are the same type.

In this case, we can use dplyr's bind_rows with the .id parameter to add an identifier column in the merged dataset. This way all of the values are in the same column, but we can still tell which source they came from.

df_list %>%
  bind_rows(.id = "df")
#>     df country value
#> 1  df1     ABC     1
#> 2  df1     DEF     2
#> 3  df1     GHI     3
#> 4  df1     JKL     4
#> 5  df1     MNO     5
#> 6  df2     ABC     6
#> 7  df2     DEF     7
#> 8  df2     GHI     8
#> 9  df2     JKL     9
#> 10 df2     MNO    10
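If a wide layout is wanted after all (one column per source data frame), the long result can be reshaped afterwards. This is only a sketch and assumes the tidyr package is available, which this answer does not otherwise use; the names long and wide are made up for the illustration.

```r
library(dplyr)
library(tidyr)

countries <- c("ABC", "DEF", "GHI", "JKL", "MNO")
df_list <- list(
  df1 = data.frame(country = countries, value = 1:5),
  df2 = data.frame(country = countries, value = 6:10)
)

# Long format: one row per country and source, with the source name in 'df'
long <- bind_rows(df_list, .id = "df")

# Wide format: one row per country, one column per source data frame
wide <- pivot_wider(long, names_from = df, values_from = value)
wide
```

This gives the same columns as renaming each 'value' column and then joining on country, without writing the rename step.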

Created on 2020-07-01 by the reprex package (v0.3.0)

Upvotes: 1

user63230

Reputation: 4636

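I think you can skip the renaming altogether if the goal is to merge the data frames: bind_rows with .id adds each data frame's name as a column in the merged result. Here lst() names the list elements after the objects automatically: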

library(tidyverse)
df1 <- data.frame(a = c(1, 2),
                  b = c(1, 2))
df2 <- data.frame(a = c(1, 2),
                  b = c(1, 2))
df_list <- lst(df1, df2)
dplyr::bind_rows(df_list, .id = "df_name")
#   df_name a b
# 1     df1 1 1
# 2     df1 2 2
# 3     df2 1 1
# 4     df2 2 2

Upvotes: 0
