Reputation: 67
I have gene expression data where the column names are Ensembl gene names, including the version (.xx). I want to replace the column names for a few columns (ignoring the .xx) with the gene symbols. My genes of interest are in a "genes" vector, and the corresponding Ensembl names are in an "ensembl" vector.
df <- tribble(~sample, ~Ensembl1111.13, ~Ensembl2222.1,
              "sample_A", 12, 20)
genes <- c("ACTB", "GAPDH")
ensembl <- c("Ensembl1111", "Ensembl2222")
I tried to use rename_at(), with no success. Where is the mistake in my syntax? Is there a simpler/better way?
Thanks!!
df_ren <- df %>%
rename_at(vars(starts_with(ensembl)), funs(str_replace(., "/.", genes)))
Upvotes: 1
Views: 141
Reputation: 388797
You can remove everything after the "." from the names, match the result against ensembl, and replace the matched names with the corresponding genes values.
names(df)[match(ensembl, sub('\\..*', '', names(df)))] <- genes
# sample ACTB GAPDH
# <chr> <dbl> <dbl>
#1 sample_A 12 20
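If some IDs in ensembl might be absent from the data, the match() call above would return NA for them and the assignment would fail. A minimal sketch of a guarded variant follows, matching in the other direction (column names against ensembl). The third gene/ID pair ("TP53", "Ensembl3333") is a made-up example added only to show the missing case, and a plain data.frame is used so that only base R is needed.

```r
# Example data mirroring the question
df <- data.frame(sample = "sample_A", Ensembl1111.13 = 12, Ensembl2222.1 = 20)

# "TP53" / "Ensembl3333" are hypothetical and not present in df
genes   <- c("ACTB", "GAPDH", "TP53")
ensembl <- c("Ensembl1111", "Ensembl2222", "Ensembl3333")

# Strip the version suffix: the first "." and everything after it
stripped <- sub("\\..*", "", names(df))

# For each column, its position in `ensembl` (NA if it is not a gene of interest)
idx <- match(stripped, ensembl)

# Rename only the columns that matched
hit <- !is.na(idx)
names(df)[hit] <- genes[idx[hit]]

names(df)
```

Because the lookup goes from each column to its own ID, the result does not depend on the order of the columns or of the two vectors.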
Upvotes: 0
Reputation: 886938
Instead of starts_with, we could use matches, as starts_with expects a single input, while matches can handle multiple values if we create a single string concatenated with | (OR).
library(dplyr)
library(stringr)
df %>%
rename_at(vars(matches(str_c(ensembl, collapse="|"))),
~genes)
-output
# A tibble: 1 x 3
# sample ACTB GAPDH
# <chr> <dbl> <dbl>
#1 sample_A 12 20
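Note that ~genes assigns the new names by position, so it relies on the selected columns appearing in the same order as genes. As a sketch of an order-independent alternative, assuming dplyr 1.0.0 or later (where rename_with() is available and rename_at() is superseded), each selected column can be looked up by its own unversioned ID. The names lookup and pattern are introduced here for illustration only.

```r
library(dplyr)
library(stringr)

df <- tribble(~sample, ~Ensembl1111.13, ~Ensembl2222.1,
              "sample_A", 12, 20)
genes   <- c("ACTB", "GAPDH")
ensembl <- c("Ensembl1111", "Ensembl2222")

# Named lookup: unversioned Ensembl ID -> gene symbol
lookup <- setNames(genes, ensembl)

# Regex: any ID of interest at the start of the name, followed by a literal "."
pattern <- str_c("^(", str_c(ensembl, collapse = "|"), ")\\.")

df_ren <- df %>%
  rename_with(
    # Drop the version suffix, then translate the ID via the lookup
    ~ unname(lookup[str_remove(.x, "\\..*")]),
    .cols = matches(pattern)
  )

names(df_ren)
```

Anchoring the pattern with ^ and requiring the trailing "." is a design choice to avoid accidentally selecting columns whose ID merely begins with one of the IDs of interest.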
Another option is to remove the . and the characters that follow it from the column names that start with 'Ensembl', and then use a named vector in rename.
df %>%
rename_at(vars(starts_with("Ensembl")), ~ str_remove(., "\\..*")) %>%
rename(!!! setNames(ensembl, genes))
# A tibble: 1 x 3
# sample ACTB GAPDH
# <chr> <dbl> <dbl>
#1 sample_A 12 20
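As for the pattern itself, a quick check (only a sketch, using the two column names from the question) suggests why "/." in the original attempt has no effect: it looks for a forward slash followed by any character, and the column names contain no slash. A literal dot has to be escaped with a double backslash in an R string.

```r
library(stringr)

x <- c("Ensembl1111.13", "Ensembl2222.1")

# "/." means: a "/" followed by any one character; nothing matches,
# so the names come back unchanged
unchanged <- str_replace(x, "/.", "ACTB")

# "\\..*" means: a literal "." followed by anything; this strips the version
stripped <- str_remove(x, "\\..*")

unchanged
stripped
```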
Upvotes: 1