nad7wf

Reputation: 97

Replace certain columns in dataframe with corresponding names from another dataframe

I have a dataframe with SRR names as column headers, and I would like to replace those with their corresponding PI names from another dataframe, using dplyr.

SRR dataframe:

CHR  POS  ALLELE  SRR6  SRR8  SRR9  SRR10
01   10   A,T     A     T     T     A
01   20   C,G     G     C     C     C
02   15   T       T     T     T     T

PI dataframe:

PI_NAME  SRR_NAME
PI1      SRR6
PI2      SRR7
PI3      SRR8
PI4      SRR9
PI5      SRR10

Desired Output:

CHR  POS  ALLELE  PI1   PI3   PI4   PI5
01   10   A,T     A     T     T     A
01   20   C,G     G     C     C     C
02   15   T       T     T     T     T

So far, I've tried something like this:

SRR %>%
   rename_at(vars(matches("SRR")), funs(str_replace(., ., PI$PI_NAME[PI$SRR == .])))

but have not been successful.

Thanks in advance for any help.

Upvotes: 1

Views: 37

Answers (1)

akrun

Reputation: 886928

We can build a named key/value vector (SRR names as the names, PI names as the values) and index it with the matching column names to get the replacements:

library(dplyr)
SRR %>% 
   rename_at(vars(matches("SRR")), ~ setNames(PI$PI_NAME, PI$SRR_NAME)[.])
#  CHR POS ALLELE PI1 PI3 PI4 PI5
#1   1  10    A,T   A   T   T   A
#2   1  20    C,G   G   C   C   C
#3   2  15      T   T   T   T   T
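As a side note, `rename_at()` is superseded in dplyr 1.0.0 and later in favour of `rename_with()`. The following is a minimal sketch of the same lookup in that style, assuming dplyr >= 1.0.0 is installed; the names `lookup` and `renamed` are only illustrative, and the data is copied from the question so the block runs on its own.

```r
library(dplyr)

# Example data, copied from the question
SRR <- data.frame(
  CHR = c(1L, 1L, 2L), POS = c(10L, 20L, 15L), ALLELE = c("A,T", "C,G", "T"),
  SRR6 = c("A", "G", "T"), SRR8 = c("T", "C", "T"),
  SRR9 = c("T", "C", "T"), SRR10 = c("A", "C", "T")
)
PI <- data.frame(
  PI_NAME = c("PI1", "PI2", "PI3", "PI4", "PI5"),
  SRR_NAME = c("SRR6", "SRR7", "SRR8", "SRR9", "SRR10")
)

# Named lookup vector: names are SRR names, values are PI names
# ('lookup' is an illustrative name, not part of the original answer)
lookup <- setNames(PI$PI_NAME, PI$SRR_NAME)

# rename_with() passes the selected column names to the function
# and uses the returned character vector as the new names
renamed <- SRR %>%
  rename_with(~ unname(lookup[.x]), .cols = starts_with("SRR"))

names(renamed)
# [1] "CHR"    "POS"    "ALLELE" "PI1"    "PI3"    "PI4"    "PI5"
```

This assumes every column starting with "SRR" has an entry in `PI$SRR_NAME`; a column without one would look up as `NA` and the rename would fail.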

The same approach works in base R:

i1 <- grep("SRR", names(SRR))
names(SRR)[i1] <- setNames(PI$PI_NAME, PI$SRR_NAME)[names(SRR)[i1]]
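If some SRR columns might have no entry in the PI table, the named-vector lookup would return `NA` for them. A possible variation, sketched here with `match()`, renames only the columns that are actually found and leaves the rest untouched; `out`, `idx` and `hit` are illustrative names, and the data is copied from the question so the block runs on its own.

```r
# Example data, copied from the question
SRR <- data.frame(
  CHR = c(1L, 1L, 2L), POS = c(10L, 20L, 15L), ALLELE = c("A,T", "C,G", "T"),
  SRR6 = c("A", "G", "T"), SRR8 = c("T", "C", "T"),
  SRR9 = c("T", "C", "T"), SRR10 = c("A", "C", "T")
)
PI <- data.frame(
  PI_NAME = c("PI1", "PI2", "PI3", "PI4", "PI5"),
  SRR_NAME = c("SRR6", "SRR7", "SRR8", "SRR9", "SRR10")
)

out <- SRR  # work on a copy so the original stays unchanged

# Position of each column name in PI$SRR_NAME; NA when there is no match
idx <- match(names(out), PI$SRR_NAME)
hit <- !is.na(idx)

# Replace only the matched names; CHR, POS and ALLELE are left as they are
names(out)[hit] <- PI$PI_NAME[idx[hit]]

names(out)
# [1] "CHR"    "POS"    "ALLELE" "PI1"    "PI3"    "PI4"    "PI5"
```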

data

SRR <- structure(list(CHR = c(1L, 1L, 2L), POS = c(10L, 20L, 15L),
    ALLELE = c("A,T", "C,G", "T"), SRR6 = c("A", "G", "T"),
    SRR8 = c("T", "C", "T"), SRR9 = c("T", "C", "T"),
    SRR10 = c("A", "C", "T")),
    class = "data.frame", row.names = c(NA, -3L))

PI <- structure(list(PI_NAME = c("PI1", "PI2", "PI3", "PI4", "PI5"),
    SRR_NAME = c("SRR6", "SRR7", "SRR8", "SRR9", "SRR10")),
    class = "data.frame", row.names = c(NA, -5L))

Upvotes: 1
