Dennis

Reputation: 107

Adding prefix to specific R columns

I can select specific columns of an R data frame by using "names" and "grepl". For example, to select columns whose names contain "EXP" followed by 3 characters, I use the following code:

df[ , grepl( "EXP...", names( df) ) ]

(Each "." matches any single character, so "..." indicates there should be 3 characters after EXP.)

I can store the names of these specific columns by simply wrapping the expression in the "colnames" function.

colnames1 <- colnames(df[ , grepl( "EXP...", names( df) ) ])

So far, I have no problems getting these specific column names. I use the following code to add a prefix, "New", to these column names:

colnames(df[ , grepl( "EXP...", names( df) ) ])
<- paste("New",df(data1[ , grepl( "EXP...", names( df) ) ] ), sep="_")

But it does not work. Shouldn't the names of these columns appear with the prefix ("New_EXP...") after running this code?

Thanks in advance.

Upvotes: 0

Views: 1294

Answers (3)

LMc

Reputation: 18642

A base R solution:

names(df) <- ifelse(grepl("EXP...", names(df)), paste0("New_", names(df)), names(df))

You can also do this easily with the dplyr package, using the tidy-select helper matches():

library(dplyr)

df %>% 
  dplyr::rename_with(~ paste0("New_", .x), matches("EXP.{3}"))
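One thing worth checking with either version is how strict the pattern is. "EXP..." is unanchored, so it also matches longer names and names where EXP is not at the start. Here is a minimal base R sketch of the difference, assuming a made-up data frame (the column names below are hypothetical and only for illustration):

```r
# Hypothetical toy data; the column names are chosen only for illustration
df <- data.frame(
  ID        = 1:2,
  EXPABC    = c(0.1, 0.2),
  EXPLONGER = c("a", "b"),
  XEXPXYZ   = c(TRUE, FALSE)
)

# Unanchored pattern: matches "EXP" plus three characters anywhere in the name
loose <- grepl("EXP...", names(df))

# Anchored pattern: the whole name must be "EXP" plus exactly three characters
strict <- grepl("^EXP.{3}$", names(df))

# Same ifelse() idea as above, applied with the stricter pattern
names(df) <- ifelse(strict, paste0("New_", names(df)), names(df))
names(df)
```

With the anchored pattern only EXPABC is renamed here. Whether you want the loose or the strict match depends on your actual column names.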

Upvotes: 1

utubun

Reputation: 4520

You aren't obliged to use grepl(). It's possible to use gsub() instead. Since gsub() returns its input unchanged when no match is found, you can change your column names on the fly, without an intermediate indexing step:

Example data set

set.seed(66248086)

(
  dat <- data.frame(
    ID         = 1:3,
    EXPFRQ     = runif(3),
    EXPMETADAT = sample(c(T, F), 3, T),
    .EXPGRP    = sample(LETTERS[1:3], 3, T),
    EXPOUT     = sample(c('*', '**', '***'), 3, T),
    NOTES      = c("Bad project...", "What is in this tube?", "Blot, blot western baby")
    )
  )

#   ID       EXPFRQ EXPMETADAT .EXPGRP EXPOUT                   NOTES
# 1  1 0.5483680151      FALSE       C      *          Bad project...
# 2  2 0.0628816469      FALSE       B     **   What is in this tube?
# 3  3 0.0001267055       TRUE       B    *** Blot, blot western baby

Where the renaming actually takes place

Just give gsub() your pattern, and refer to the chunk it captured with a back-reference (\\1):

colnames(dat) <- gsub('^(EXP[A-Z]{3})$', 'New_\\1', colnames(dat))

Result

dat

#   ID   New_EXPFRQ EXPMETADAT .EXPGRP New_EXPOUT                   NOTES
# 1  1 0.5483680151      FALSE       C          *          Bad project...
# 2  2 0.0628816469      FALSE       B         **   What is in this tube?
# 3  3 0.0001267055       TRUE       B        *** Blot, blot western baby

Note: Please read the note written by @akrun under his answer!

Upvotes: 1

akrun

Reputation: 887138

Instead of doing the assignment on a subset of the data, the assignment can be done directly on the names or colnames:

# // get the index of the names that contain EXP followed by three characters
i1 <- grep("EXP...", names(df))
# // subset the names with the index, use paste to create new names
nm2 <- paste0(names(df)[i1], "_New")
# // change the column names with the newly created names
names(df)[i1] <- nm2

NOTE: NO external packages are needed
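To see why the approach in the question has no visible effect, here is a small sketch on a made-up data frame (the data and the name broken are hypothetical, only for illustration). As far as I understand it, colnames(df[, idx]) <- ... renames a temporary copy of the subset and then writes that copy back into the existing columns, which keep their original names. Assigning to the names vector itself avoids this:

```r
# Hypothetical toy data frame, used only for illustration
df <- data.frame(ID = 1:2, EXPABC = c(0.1, 0.2), EXPXYZ = c(3, 4))
idx <- grepl("EXP...", names(df))

# Pattern from the question, tried on a copy: the renamed subset is written
# back into the existing columns, and those columns keep their old names
broken <- df
colnames(broken[, idx]) <- paste("New", colnames(broken[, idx]), sep = "_")
names(broken)

# Assigning to the names vector itself does change the column names
names(df)[idx] <- paste("New", names(df)[idx], sep = "_")
names(df)
```

This uses the "New_" prefix from the question; swap the argument order in paste() if you want a suffix instead.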

Upvotes: 1
