Reap409
Reap409

Reputation: 39

Rename multiple variables at once using dplyr

I am trying to rename multiple variables at once using Dplyr. Here is what the data looks like:

ID   Satisfaction_Baseline   Satisfaction_FollowUp   Satisfaction_Exit
1               1                      3                     4
2               5                      5                     5
3               5                      3                     4
4               5                      3                     2
5               5                      3                     5           

There is a lot of variables that have '_Baseline', '_FollowUp' and '_Exit'. I really would like a shorthand way to rename variables at once so the data looks more like this:

    ID   Satisfaction_1         Satisfaction_2        Satisfaction_3
1               1                      3                     4
2               5                      5                     5
3               5                      3                     4
4               5                      3                     2
5               5                      3                     5  

I tried this:

rename(D_wide,  Satisfaction_1 = Satisfaction_Baseline)

This works... but I wanted to know if there was a succinct way to write this piece of code so every variable that ends with '_Baseline', '_FollowUp' and '_Exit' end with '_1', '_2' and '_3'. Thank you!

Upvotes: 1

Views: 132

Answers (3)

G. Grothendieck
G. Grothendieck

Reputation: 270045

1) dplyr Since there are only a small number of keywords (Baseline, FollowUp and Exit) we can simply pipe the second argument of rename_with through three sub calls. If the keywords do not appear in other column names as is the situation here we could optionally omit the $ signs.

library(dplyr)

df %>%
  rename_with(. %>%
    sub("Baseline$", 1, .) %>%
    sub("FollowUp$", 2, .) %>%
    sub("Exit$", 3, .)
)

giving

  ID Satisfaction_1 Satisfaction_2 Satisfaction_3
1  1              1              3              4
2  2              5              5              5
3  3              5              3              4
4  4              5              3              2
5  5              5              3              5

2) purrr Alternately the same code works with set_names in place of rename_with . In this case dplyr is not used.

library(purrr)

df %>%
  set_names(. %>%
    sub("Baseline$", 1, .) %>%
    sub("FollowUp$", 2, .) %>%
    sub("Exit$", 3, .)
)

3) Base R To ensure that we only need to write df once at the beginning so as to maintain the left to right nature of the pipe create a list with a single component data. Then we can use data multiple times in the following line.

df |>
  list(data = _) |>
  with(setNames(data, names(data) |>
    sub("Baseline$", 1, x = _) |> 
    sub("FollowUp$", 2, x = _) |>
    sub("Exit$", 3, x = _)
  ))

4) Reduce This Base R variation uses Reduce with a list L whose names are the keywords and whose values are to replace them. Then use Reduce to repeatedly apply sub.

L <- list(Baseline = 1, FollowUp = 2, Exit = 3)

df |>
  list(data = _) |>
  with(setNames(data, 
    Reduce(\(x, nm) sub(nm, L[[nm]], x), names(L), init = names(data))))

5) gsubfn Create a list L as in (4) whose names are the keywords and whose values are to replace them. Then use setNames with new names computed using gsubfn. Pass gsubfn a pattern that matches substrings not containing underscore and ending at the end of the string. It will look for matches to that pattern and any that equal a name in L will be replaced by the corresponding value in L.

library(gsubfn)

L <- list(Baseline = 1, FollowUp = 2, Exit = 3)
df |>
  list(data = _) |>
  with(setNames(data, gsubfn("[^_]+$", L, names(data))))

Note

The input in reproducible form:

df <- data.frame(
  ID = 1:5,
  Satisfaction_Baseline = rep(c(1L, 5L), c(1L, 4L)),
  Satisfaction_FollowUp = c(3L, 5L, 3L, 3L, 3L),
  Satisfaction_Exit = c(4L, 5L, 4L, 2L, 5L)
)

Upvotes: 5

TarJae
TarJae

Reputation: 79184

If Satisfaction is the first part for all names we can use rename_with like see here:

library(dplyr)

df <- tibble(
  Satisfaction_Baseline = 1:3,
  Satisfaction_Followup = 1:3, 
  Satisfaction_Exit = 1:3
)


df |> 
  rename_with(~paste0("Satisfaction_", 1:ncol(df)))
  Satisfaction_1 Satisfaction_2 Satisfaction_3
           <int>          <int>          <int>
1              1              1              1
2              2              2              2
3              3              3              3

Upvotes: 2

Eonema
Eonema

Reputation: 1320

rename_with() lets you apply a function to the column names, and str_replace_all() lets you substitute multiple pattern-replacement pairs:

D_wide |> rename_with(~ stringr::str_replace_all(.x, c(
  "_Baseline" = "_1",
  "_FollowUp" = "_2",
  "_Exit"     = "_3"
)))

Upvotes: 3

Related Questions