Canovice

Reputation: 10173

In R and dplyr, replace multiple calls of "mutate" with a single call using "mutate" and "across"

We have the following dataframe in R:

# Create example dataframe
df <- data.frame(gp = c(0, 1, 0, 1), 
                 col1A = c(1, 2, 3, 4), 
                 col1B = c(5, 6, 7, 8), 
                 col2A = c(11, 12, 13, 14), 
                 col2B = c(15, 16, 17, 18),
                 col3A = c(11, 12, 13, 14), 
                 col3B = c(15, 16, 17, 18))

We are looking to apply the following logic:

library(dplyr)  # provides the %>% pipe used below

df %>%
  dplyr::mutate(col1A = ifelse(gp == 0, col1B, col1A)) %>%
  dplyr::mutate(col2A = ifelse(gp == 0, col2B, col2A)) %>%
  dplyr::mutate(col3A = ifelse(gp == 0, col3B, col3A))

However, we would like to replace the three mutate calls with a single call that combines mutate and across (or some other approach). Assume the variable names are available as strings: aVars = c('col1A', 'col2A', 'col3A') and bVars = c('col1B', 'col2B', 'col3B').

Is this kind of consolidation possible? We've used mutate and across together before, but it seems harder with two paired sets of variables, as with the A and B variables here.

Upvotes: 2

Views: 412

Answers (2)

Maël

Reputation: 52004

With dplyover::across2:

library(dplyr)
library(dplyover)
df %>% 
  mutate(across2(ends_with("A"), ends_with("B"), ~ ifelse(gp == 0, .y, .x), 
                 .names = "{xcol}"))

  gp col1A col1B col2A col2B col3A col3B
1  0     5     5    15    15    15    15
2  1     2     6    12    16    12    16
3  0     7     7    17    17    17    17
4  1     4     8    14    18    14    18

A possibly more reliable alternative uses glue and rlang. Put the column prefixes in cols, the expressions to evaluate in exprs, and the names of the resulting columns in names(exprs):

library(glue)
library(dplyr)

cols <- paste0("col", 1:3)
exprs <- glue("ifelse(gp == 0, {cols}B, {cols}A)")
names(exprs) <- glue("{cols}A")
df %>% 
  mutate(!!!rlang::parse_exprs(exprs))

  gp col1A col1B col2A col2B col3A col3B
1  0     5     5    15    15    15    15
2  1     2     6    12    16    12    16
3  0     7     7    17    17    17    17
4  1     4     8    14    18    14    18
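
If the column names are only available as the string vectors aVars and bVars from the question, rather than following the col1 to col3 pattern, the same idea can be sketched by building the expression strings from those vectors. This is an illustrative variant under a few assumptions: the two vectors are the same length and in matching order, the column names are syntactically valid R names (they are parsed as code), and sprintf is swapped in for glue. The names code, exprs and result are placeholders.

```r
library(dplyr)

df <- data.frame(gp = c(0, 1, 0, 1),
                 col1A = c(1, 2, 3, 4), col1B = c(5, 6, 7, 8),
                 col2A = c(11, 12, 13, 14), col2B = c(15, 16, 17, 18),
                 col3A = c(11, 12, 13, 14), col3B = c(15, 16, 17, 18))

aVars <- c("col1A", "col2A", "col3A")
bVars <- c("col1B", "col2B", "col3B")

# One expression string per A/B pair, e.g. "ifelse(gp == 0, col1B, col1A)"
code <- sprintf("ifelse(gp == 0, %s, %s)", bVars, aVars)

# Parse the strings, then name each expression after the column it overwrites
exprs <- rlang::parse_exprs(code)
names(exprs) <- aVars

# Splice the named expressions into a single mutate() call
result <- df %>%
  mutate(!!!exprs)
```

Setting the names after parsing keeps the sketch independent of how the strings were built; the B columns are left untouched.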

Upvotes: 2

r2evans

Reputation: 160447

We can use cur_column() and swap its trailing "A" for "B" to look up the matching B column for each A column.

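# Note: cur_data() is deprecated as of dplyr 1.1.0;
# pick(everything()) is the suggested replacement there.
df %>%
  mutate(
    across(ends_with("A"),
           ~ if_else(gp == 0, cur_data()[[sub("A$", "B", cur_column())]], .))
  )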
#   gp col1A col1B col2A col2B col3A col3B
# 1  0     5     5    15    15    15    15
# 2  1     2     6    12    16    12    16
# 3  0     7     7    17    17    17    17
# 4  1     4     8    14    18    14    18
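
For comparison, when the pairs are given as the string vectors aVars and bVars from the question, a base R sketch without dplyr could look like the following. It assumes the two vectors are the same length and listed in matching order; rows and result are placeholder names.

```r
df <- data.frame(gp = c(0, 1, 0, 1),
                 col1A = c(1, 2, 3, 4), col1B = c(5, 6, 7, 8),
                 col2A = c(11, 12, 13, 14), col2B = c(15, 16, 17, 18),
                 col3A = c(11, 12, 13, 14), col3B = c(15, 16, 17, 18))

aVars <- c("col1A", "col2A", "col3A")
bVars <- c("col1B", "col2B", "col3B")

result <- df

# Row positions where gp is 0; which() drops any NA comparisons
rows <- which(result$gp == 0)

# Overwrite the A columns with the B columns, pair by pair, on those rows only
result[rows, aVars] <- result[rows, bVars]
```

The assignment matches columns by position, so the order of aVars and bVars is what pairs col1A with col1B, and so on.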

Upvotes: 6
