Reputation: 1251
I have a data set with column names like below:
colnames(samp)
[1] "RESPID" "SAMPLE" "Weight" "Q1" "Q19A_1"
[6] "Q19B_1" "Q19C_1" "Q19E_1" "Q19F_1" "RECORDERLOOP_Q20_1"
[11] "RECORDERLOOP_Q20_2" "RECORDERLOOP_Q20_3" "RECORDERLOOP_Q20_4" "Q20_1_1" "Q20_2_1"
[16] "Q20_3_1"
For the column names that start with "Q19" or "Q20" (i.e. a certain pattern), I want to remove _1 (i.e. _ and the number).
I know how it works for one column (e.g. Q19). It would be something like this:
library(dplyr)
samp_subset = samp %>%
select(dplyr::contains("Q19"))
colnames(samp_subset) = sub('.{2}$', '', colnames(samp_subset))
However, I don't know how to restrict the renaming to certain columns in the full data set (e.g. those starting with Q19 or Q20, but not RESPID or SAMPLE).
Upvotes: 0
Views: 32
Reputation: 887118
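We can use rename_at, selecting the columns with matches() and stripping the trailing underscore and digits with str_remove():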
library(dplyr)
library(stringr)
df %>%
rename_at(vars(matches("^Q(19|20)")), ~ str_remove(., "_\\d+$"))
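In dplyr 1.0.0 and later, rename_at is superseded by rename_with, so the same idea can be sketched as below. The small data frame is made up for illustration (its values are placeholders and only the column names matter). Note that the ^ anchor keeps RECORDERLOOP_Q20_1 untouched, which contains("Q20") would not.

```r
library(dplyr)

# Hypothetical sample data: only the column names matter here
samp <- data.frame(RESPID = 1, Q19A_1 = 2, RECORDERLOOP_Q20_1 = 3, Q20_1_1 = 4)

# Rename only columns whose names start with Q19 or Q20,
# dropping the final underscore and the digits after it
renamed <- samp %>%
  rename_with(~ sub("_\\d+$", "", .x), .cols = matches("^Q(19|20)"))

names(renamed)
# "RESPID" "Q19A" "RECORDERLOOP_Q20_1" "Q20_1"
```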
Upvotes: 1
Reputation: 388982
Using dplyr, you can try rename_at:
library(dplyr)
df %>% rename_at(vars(matches("^Q19|^Q20")), ~sub("_\\d+$", "", .))
In base R, I think this involves two steps: identify the columns, then replace their names.
vals <- grep("^Q19|^Q20", names(df))
names(df)[vals] <- sub("_\\d+$", "", names(df)[vals])
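As a quick check, here is the same base R approach applied to a subset of the column names from the question. The data frame is a stand-in filled with zeros and named samp, as in the question, and only its names are of interest.

```r
# Column names taken from the question (a subset)
cols <- c("RESPID", "SAMPLE", "Weight", "Q1", "Q19A_1", "Q19B_1",
          "RECORDERLOOP_Q20_1", "Q20_1_1", "Q20_2_1")

# Placeholder one-row data frame carrying those names
samp <- as.data.frame(matrix(0, nrow = 1, ncol = length(cols),
                             dimnames = list(NULL, cols)))

# Step 1: positions of the columns that start with Q19 or Q20
vals <- grep("^Q19|^Q20", names(samp))

# Step 2: strip the trailing underscore and digits from those names only
names(samp)[vals] <- sub("_\\d+$", "", names(samp)[vals])

names(samp)
# "RESPID" "SAMPLE" "Weight" "Q1" "Q19A" "Q19B"
# "RECORDERLOOP_Q20_1" "Q20_1" "Q20_2"
```

Only the last suffix is removed, so Q20_1_1 becomes Q20_1, while Q1 and the RECORDERLOOP columns are left alone because they do not match the anchored pattern.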
Upvotes: 2