Change the name of multiple columns

Question

I have a data set with column names like below:

colnames(samp) 

[1] "RESPID"             "SAMPLE"             "Weight"             "Q1"                 "Q19A_1"            
 [6] "Q19B_1"             "Q19C_1"             "Q19E_1"             "Q19F_1"             "RECORDERLOOP_Q20_1"
[11] "RECORDERLOOP_Q20_2" "RECORDERLOOP_Q20_3" "RECORDERLOOP_Q20_4" "Q20_1_1"            "Q20_2_1"           
[16] "Q20_3_1"

For the column names that start with "Q19" or "Q20" (i.e. a certain pattern), I want to remove the trailing _1 (i.e. the final underscore and the number after it).

I know how it works for one column (e.g. Q19). It would be something like this:

library(dplyr)

samp_subset = samp %>%
  select(dplyr::contains("Q19")) 

colnames(samp_subset) = sub('.{2}$', '', colnames(samp_subset))

However, I don't know how to restrict the renaming to certain columns (e.g. Q19 and Q20, but not RESPID, SAMPLE, etc.).

Ronak Shah · Accepted Answer

Using dplyr, you can try rename_at

library(dplyr)
df %>% rename_at(vars(matches("^Q19|^Q20")), ~sub("_\\d+$", "", .))
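
In dplyr 1.0.0 and later, rename_at is superseded by rename_with, so the same idea can be sketched as below. The toy data frame df and its values are made up for illustration; only the column names follow the pattern in the question.

```r
library(dplyr)

# Hypothetical toy data; only the column names matter here
df <- data.frame(RESPID = 1, Q19A_1 = 2, RECORDERLOOP_Q20_1 = 3, Q20_1_1 = 4)

# Strip a trailing "_<digits>" only from columns whose names start with Q19 or Q20
renamed <- df %>%
  rename_with(~ sub("_\\d+$", "", .x), .cols = matches("^Q19|^Q20"))

names(renamed)
```

Because the pattern is anchored with ^, a column such as RECORDERLOOP_Q20_1 is left alone even though it contains "Q20".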

Using base R, I think this would involve two steps: identify the columns, then replace their names.

vals <- grep("^Q19|^Q20", names(df))
names(df)[vals] <- sub("_\\d+$", "", names(df)[vals])
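
As a quick check, here is a minimal sketch that applies the two base R steps to the column names from the question. The data frame contents (a single row of zeros) are an assumption made only so the example runs on its own.

```r
# Column names copied from the question; the values are placeholders
cols <- c("RESPID", "SAMPLE", "Weight", "Q1", "Q19A_1", "Q19B_1", "Q19C_1",
          "Q19E_1", "Q19F_1", "RECORDERLOOP_Q20_1", "RECORDERLOOP_Q20_2",
          "RECORDERLOOP_Q20_3", "RECORDERLOOP_Q20_4", "Q20_1_1", "Q20_2_1",
          "Q20_3_1")
samp <- as.data.frame(matrix(0, nrow = 1, ncol = length(cols)))
names(samp) <- cols

# Step 1: positions of the columns whose names start with Q19 or Q20
vals <- grep("^Q19|^Q20", names(samp))

# Step 2: drop the trailing "_<digits>" from those names only
names(samp)[vals] <- sub("_\\d+$", "", names(samp)[vals])

names(samp)
```

Note that the backslash has to be doubled inside an R string ("\\d"), and that sub only removes the last underscore-number group, so Q20_1_1 becomes Q20_1 rather than Q20.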
