user10577351

Reputation: 67

Rename multiple columns with a vector

I have gene expression data where the column names are Ensembl gene names, including the version (.xx). I want to replace the column names for a few columns (ignoring the .xx) with the gene symbols. My genes of interest are in a "genes" vector, and the corresponding Ensembl names are in an "ensembl" vector.

library(tibble)
# tribble() (not tibble()) takes ~column headers followed by row values
df <- tribble(~sample, ~Ensembl1111.13, ~Ensembl2222.1,
              "sample_A", 12, 20)
genes <- c("ACTB", "GAPDH")
ensembl <- c("Ensembl1111", "Ensembl2222")

I tried to use rename_at(), with no success. Where is the mistake in my syntax? Is there a simpler/better way?

Thanks!!

df_ren <- df %>%
    rename_at(vars(starts_with(ensembl)), funs(str_replace(., "/.", genes)))

Upvotes: 1

Views: 141

Answers (2)

Ronak Shah

Reputation: 388797

You can strip everything from the first "." onward in the column names, match the result against ensembl, and replace the matching names with the corresponding genes values.

names(df)[match(ensembl, sub('\\..*', '', names(df)))] <- genes
#  sample    ACTB GAPDH
#  <chr>    <dbl> <dbl>
#1 sample_A    12    20
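
One caveat with the match() approach: if an ID in ensembl has no corresponding column, match() returns NA for it, and the subscripted assignment will typically fail. A minimal base R sketch that looks up each column instead, assuming the same genes and ensembl vectors as in the question (the names lookup, stripped, and hit are only illustrative, and a plain data.frame stands in for the tibble):

```r
# Toy data mirroring the question (a plain data.frame keeps this base R only)
df <- data.frame(sample = "sample_A", Ensembl1111.13 = 12, Ensembl2222.1 = 20)
genes <- c("ACTB", "GAPDH")
ensembl <- c("Ensembl1111", "Ensembl2222")

# Named lookup vector: Ensembl ID (without version) -> gene symbol
lookup <- setNames(genes, ensembl)

# Strip the version suffix (".xx") from every column name
stripped <- sub("\\..*$", "", names(df))

# Rename only the columns whose stripped name appears in the lookup
hit <- stripped %in% names(lookup)
names(df)[hit] <- lookup[stripped[hit]]

names(df)
```

Because each column is looked up by its own stripped name, IDs in ensembl that are absent from the data are simply ignored, and the result does not depend on the column order.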

Upvotes: 0

akrun

Reputation: 886938

Instead of starts_with, we could use matches: starts_with expects a single input, while matches takes a regular expression, so multiple values can be combined into a single string concatenated with | (OR).

library(dplyr)
library(stringr)
df %>%
   rename_at(vars(matches(str_c(ensembl, collapse="|"))),
          ~genes) 

Output:

# A tibble: 1 x 3
#  sample    ACTB GAPDH
#  <chr>    <dbl> <dbl>
#1 sample_A    12    20

Another option is to remove the . and everything after it from the column names that start with 'Ensembl', and then use a named vector in rename:

df %>% 
    rename_at(vars(starts_with("Ensembl")), ~ str_remove(., "\\..*")) %>%
    rename(!!! setNames(ensembl, genes))
# A tibble: 1 x 3
#  sample    ACTB GAPDH
#  <chr>    <dbl> <dbl>
#1 sample_A    12    20
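
Since dplyr 1.0.0, rename_at() is superseded by rename_with(), and funs() has been deprecated since dplyr 0.8.0. A minimal sketch of the same idea in the newer style, assuming dplyr >= 1.0.0 (with the tidyselect version it depends on, where starts_with() accepts a character vector); lookup is an illustrative helper name, not part of the question:

```r
library(dplyr)

df <- tribble(
  ~sample,    ~Ensembl1111.13, ~Ensembl2222.1,
  "sample_A", 12,              20
)
genes <- c("ACTB", "GAPDH")
ensembl <- c("Ensembl1111", "Ensembl2222")

# Named lookup vector: Ensembl ID (without version) -> gene symbol
lookup <- setNames(genes, ensembl)

df_ren <- df %>%
  rename_with(
    # Drop the ".xx" version suffix, then translate the ID via the lookup
    ~ unname(lookup[sub("\\..*$", "", .x)]),
    .cols = starts_with(ensembl)
  )

names(df_ren)
```

Looking each column up by name means the result does not rely on the selected columns appearing in the same order as genes, which passing ~genes positionally does. Note that starts_with() matches prefixes, so a column such as a hypothetical Ensembl11112.3 would also be selected and would have no entry in lookup.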

Upvotes: 1
