Rickard

Reputation: 3670

mutate_at with named anonymous functions

We have a dataframe.

df <- data_frame(x = 1:5, y = 101:105) 

and a function that operates on a column and returns several columns

ff <- function(df, col) df %>% 
       mutate_at(col, funs(c1 = .*2,  c2 = .*3, c3 =  .*4))

How can I replace the hardcoded column names c1, c2, c3 with names constructed from the parameter col, e.g. paste0(col, 1), such that

df %>% ff("x")

returns a tibble with

# A tibble: 5 × 5
      x     y    x1    x2    x3
  <int> <int> <dbl> <dbl> <dbl>
1     1   101     2     3     4
2     2   102     4     6     8
3     3   103     6     9    12
4     4   104     8    12    16
5     5   105    10    15    20

Upvotes: 0

Views: 650

Answers (1)

Mark Peterson

Reputation: 9570

You can take advantage of rename_ and pass it names that you can readily construct. Here, I use setNames to build a vector whose values are the hardcoded names and whose names are the new names built from col, then pass that vector to rename_.

updatedFF <- function(df, col){
  colNames <-
    setNames(
      paste0("c", 1:3)
      , paste0(col, 1:3) )

  df %>% 
    mutate_at(col, funs(c1 = .*2,  c2 = .*3, c3 =  .*4)) %>%
    rename_(.dots = colNames)
}

df %>% updatedFF("x")

gives

# A tibble: 5 × 5
      x     y    x1    x2    x3
  <int> <int> <dbl> <dbl> <dbl>
1     1   101     2     3     4
2     2   102     4     6     8
3     3   103     6     9    12
4     4   104     8    12    16
5     5   105    10    15    20
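
To see why the renaming works, it can help to look at the lookup vector on its own. A small base R sketch, assuming col is "x" as in the call above:

```r
col <- "x"

# Values are the existing (hardcoded) column names;
# names are the replacements built from col.
colNames <- setNames(paste0("c", 1:3), paste0(col, 1:3))

print(colNames)
names(colNames)    # the new names
unname(colNames)   # the hardcoded names they replace
```

rename_ reads each element of .dots as new name = old name, which is why the constructed names go in the second argument of setNames.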

Note that this will fail if you pass in more than one column name. That is because when you pass in more than one column name, the function names are automatically prefixed with the column name, so the columns c1, c2 and c3 never exist to be renamed. You can see this in your original ff function if you pass both "x" and "y":

df %>% ff(c("x", "y"))

gives

# A tibble: 5 × 8
      x     y  x_c1  y_c1  x_c2  y_c2  x_c3  y_c3
  <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1   101     2   202     3   303     4   404
2     2   102     4   204     6   306     8   408
3     3   103     6   206     9   309    12   412
4     4   104     8   208    12   312    16   416
5     5   105    10   210    15   315    20   420
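
The naming pattern seen in the outputs above can be mimicked in base R. This is only a sketch of the pattern, not of how dplyr builds the names internally; expectedNames is a hypothetical helper introduced for illustration:

```r
# Hypothetical helper: predicts the new column names shown above.
# One column   -> just the function names (c1, c2, c3).
# Several cols -> column name, underscore, function name.
expectedNames <- function(col, fnNames = paste0("c", 1:3)) {
  if (length(col) == 1) {
    fnNames
  } else {
    # outer() builds a col-by-function matrix; as.vector() reads it
    # column by column, giving x_c1, y_c1, x_c2, y_c2, ...
    as.vector(outer(col, fnNames, paste, sep = "_"))
  }
}

expectedNames("x")
expectedNames(c("x", "y"))
```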

This discrepancy can occasionally cause problems, so you may want to make the naming consistent by including the column name even when only one column is used. The version below renames only when a single column is passed, and it sets the names to match the format that occurs when multiple columns are passed in.

moreComplexFF <- function(df, col){
  if(length(col) == 1){
    colNames <-
      setNames(
        paste0("c", 1:3)
        , paste0(col, "_c", 1:3) )
  } else {
    colNames <- NULL
  }

  df %>% 
    mutate_at(col, funs(c1 = .*2,  c2 = .*3, c3 =  .*4)) %>%
    rename_(.dots = colNames)
}

With one column, it works much like before (though with the "_c" included):

df %>% moreComplexFF(c("x"))

gives

      x     y  x_c1  x_c2  x_c3
  <int> <int> <dbl> <dbl> <dbl>
1     1   101     2     3     4
2     2   102     4     6     8
3     3   103     6     9    12
4     4   104     8    12    16
5     5   105    10    15    20

With two columns, it leaves the names alone (so it doesn't throw an error):

df %>% moreComplexFF(c("x", "y"))

gives

      x     y  x_c1  y_c1  x_c2  y_c2  x_c3  y_c3
  <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1   101     2   202     3   303     4   404
2     2   102     4   204     6   306     8   408
3     3   103     6   206     9   309    12   412
4     4   104     8   208    12   312    16   416
5     5   105    10   210    15   315    20   420
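
funs() and rename_() have since been deprecated, and mutate_at() superseded, in later dplyr releases. A minimal sketch of the same idea using across() and its .names argument, assuming dplyr 1.0.0 or later is installed; ffAcross is a hypothetical name, not part of the code above:

```r
library(dplyr)

df <- tibble(x = 1:5, y = 101:105)

# Hypothetical helper: the output names are built from the column name
# and the name of each function in the list, so no rename step is needed
# and the format is the same for one column or several.
ffAcross <- function(df, col) {
  df %>%
    mutate(across(
      all_of(col),
      list(c1 = ~ .x * 2, c2 = ~ .x * 3, c3 = ~ .x * 4),
      .names = "{.col}_{.fn}"
    ))
}

df %>% ffAcross("x")
df %>% ffAcross(c("x", "y"))
```

Changing .names to "{.col}{.fn}" and naming the list elements `1`, `2`, `3` would give x1, x2, x3 as in the question. With several columns, the order of the new columns may differ from the mutate_at output shown above.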

Upvotes: 2
