J. Doe

Reputation: 1750

dplyr - recode several columns at once

Suppose you have a dataframe with variables named X1 - X30 and Y1 - Y30. Each of these variables holds integers 1 - 5. We wish to recode some of the variables starting with X like this:

df %<>%
   mutate_at(vars(starts_with("X") & 
                  ends_with("5", "8", "16", "22", "28")), 
             recode, "1" = 5, "2" = 4, "4" = 2, "5" = 1)

This will, however, return the following error:

Error in UseMethod("recode") : 
  no applicable method for 'recode' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"

This is because recode needs to take a vector as an argument. So what is the way to bypass this?

Upvotes: 2

Views: 2583

Answers (3)

Adrian Fletcher

Reputation: 170

Adding an updated (2021) solution that uses across(), which supersedes the mutate_*() functions, with both regex and tidyselect alternatives:

library(dplyr)

set.seed(123)
(df <- data.frame("X1" = sample(1:5, 10, TRUE),
                 "X2" = sample(1:5, 10, TRUE),
                 "X3" = sample(1:5, 10, TRUE)))
#>    X1 X2 X3
#> 1   3  5  2
#> 2   3  3  1
#> 3   2  3  3
#> 4   2  1  4
#> 5   3  4  1
#> 6   5  1  3
#> 7   4  1  5
#> 8   1  5  4
#> 9   2  3  2
#> 10  3  2  5

with regex

# Group the alternation so that ^X and $ apply to every alternative
df %>%
  mutate(across(matches("^X(1|2)$"),
                recode, "1" = 5, "2" = 4, "3" = 3, "4" = 2, "5" = 1))

#>    X1 X2 X3
#> 1   3  1  2
#> 2   3  3  1
#> 3   4  3  3
#> 4   4  5  4
#> 5   3  2  1
#> 6   1  5  3
#> 7   2  5  5
#> 8   5  1  4
#> 9   4  3  2
#> 10  3  4  5

without regex

df %>%
  mutate(across((starts_with("X") & ends_with(as.character(1:2))),
                recode, "1" = 5, "2" = 4, "3" = 3,"4" = 2, "5" = 1))

#>    X1 X2 X3
#> 1   3  1  2
#> 2   3  3  1
#> 3   4  3  3
#> 4   4  5  4
#> 5   3  2  1
#> 6   1  5  3
#> 7   2  5  5
#> 8   5  1  4
#> 9   4  3  2
#> 10  3  4  5
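
On the full X1 - X30 / Y1 - Y30 layout from the question, suffix helpers can select more than intended: starts_with("X") & ends_with("2"), for example, also picks up X12 and X22. One way around this is to spell out the column names and pass them through all_of(). The following is only a sketch under assumptions: the simulated data and the names targets and recoded are illustrative and not from the question, and every value (including "3") is listed so the result does not depend on how recode treats unmatched values.

```r
library(dplyr)

# Illustrative data: 4 rows, columns X1-X30 and Y1-Y30 holding integers 1-5
set.seed(123)
df <- data.frame(matrix(sample(1:5, 60 * 4, TRUE), ncol = 60))
names(df) <- c(paste0("X", 1:30), paste0("Y", 1:30))

# Exact names of the columns to reverse-code (hypothetical helper vector)
targets <- paste0("X", c(5, 8, 16, 22, 28))

# all_of() selects exactly these names and errors if one is missing;
# the lambda passes each selected column to recode as a vector
recoded <- df %>%
  mutate(across(all_of(targets),
                ~ recode(.x, "1" = 5, "2" = 4, "3" = 3, "4" = 2, "5" = 1)))

# Compare the recoded columns with the originals
recoded[targets]
df[targets]
```

Because the columns are named explicitly, no Y column and no other X column is touched.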

Upvotes: 1

caldwellst

Reputation: 5956

mutate_at is designed precisely for functions that take a vector as an argument, like recode, so that is not the issue. The error arises because select helpers inside vars() are not meant to be combined as logical calls with &. Separate them with commas instead.

Also, to get what you were aiming for, use matches to select only the columns that start with X and end with certain numbers.

library(dplyr)

set.seed(123)
df <- data.frame("X1" = sample(1:5, 10, TRUE),
                 "X2" = sample(1:5, 10, TRUE),
                 "X3" = sample(1:5, 10, TRUE)) 
df
#>    X1 X2 X3
#> 1   3  5  2
#> 2   3  3  1
#> 3   2  3  3
#> 4   2  1  4
#> 5   3  4  1
#> 6   5  1  3
#> 7   4  1  5
#> 8   1  5  4
#> 9   2  3  2
#> 10  3  2  5

# Group the alternation so that ^X and $ apply to every alternative
df %>%
  mutate_at(vars(matches("^X(1|2)$")),
            recode, "1" = 5, "2" = 4, "3" = 3, "4" = 2, "5" = 1)
#>    X1 X2 X3
#> 1   3  1  2
#> 2   3  3  1
#> 3   4  3  3
#> 4   4  5  4
#> 5   3  2  1
#> 6   1  5  3
#> 7   2  5  5
#> 8   5  1  4
#> 9   4  3  2
#> 10  3  4  5
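
When adapting such a pattern to the full X1 - X30 / Y1 - Y30 data, keep in mind that regex alternation binds loosely: "^X.*1|2$" reads as "starts with X and contains a 1" OR "ends with 2", so it can also match Y columns. A quick check with base R's grepl (used here as a stand-in for how matches() evaluates a pattern; the cols vector is made up for illustration) shows the difference:

```r
# Hypothetical sample of column names from the X1-X30 / Y1-Y30 layout
cols <- c("X1", "X2", "X5", "X12", "X15", "X28", "Y2", "Y5", "Y22")

# Ungrouped alternation: "^X.*1" OR "2$"
loose <- grepl("^X.*1|2$", cols)
cols[loose]
#> [1] "X1"  "X2"  "X12" "X15" "Y2"  "Y22"

# Grouped alternation: both anchors apply to every alternative
strict <- grepl("^X(5|8|16|22|28)$", cols)
cols[strict]
#> [1] "X5"  "X28"
```

The grouped form, passed to matches(), would restrict the recode to exactly the columns listed in the question.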

Upvotes: 3

StupidWolf

Reputation: 47008

One option is to substring the colnames, and then do mutate_if:

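library(dplyr)

# Simulate 4 rows of 60 columns (X1-X30, Y1-Y30) holding values 1-5
set.seed(111)
df = data.frame(matrix(round(runif(60*4,min=1,max=5)),ncol=60))
colnames(df) = c(paste0("X",1:30),paste0("Y",1:30))

# Logical vectors: name starts with "X", and the rest is one of the target numbers
start_X = substr(colnames(df),1,1) == "X"
ends_w = substr(colnames(df),2,nchar(colnames(df))) %in% c("5", "8", "16", "22", "28")

# Recode only the flagged columns, then display them
df %>% 
  mutate_if(start_X & ends_w,
            recode, "1" = 5, "2" = 4, "4" = 2, "5" = 1) %>%
  select(c("X5","X8","X16","X22","X28"))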

  X5 X8 X16 X22 X28
1  4  2   5   5   3
2  1  3   3   4   1
3  4  5   4   2   4
4  3  3   4   2   2

df %>% select(c("X5","X8","X16","X22","X28"))
  X5 X8 X16 X22 X28
1  2  4   1   1   3
2  5  3   3   2   5
3  2  1   2   4   2
4  3  3   2   4   4

Upvotes: 0
