Reputation: 59
a newbie here. I have a dataframe genes
which contains two columns: Index
and Name
such as:
Index Name
1 A
2 B
3 C
4 D
Another dataframe similarity
contains 6-7 columns one of which is Members
which includes different Index
values separated by a space such as:
Members
1 3 5 7
3 7
6 9 2
What I am trying to do is to replace the indices to Names
by matching the index
column from the dataframe genes
. If the index is not found on genes
, I simply want to put NA
in its position.
So, based the example, my desired output is:
Members
A C NA NA
C NA
NA NA B
.
Upvotes: 1
Views: 367
Reputation: 887851
We may do this with chartr
and gsub
df2$Members <- gsub("\\d+", "NA", chartr(paste(df1$Index, collapse=""),
paste(df1$Name, collapse=""), df2$Members))
df2
# Members
#1 A C NA NA
#2 C NA
#3 NA NA B
Or another approach is to split
the 'Members' column and then do the match based on the key/value pair from the first dataset ('df1')
df2$Members <- sapply(strsplit(df2$Members, "\\s+"), function(x)
paste(setNames(df1$Name, df1$Index)[x], collapse=" "))
df1 <- structure(list(Index = 1:4, Name = c("A", "B", "C", "D")), .Names = c("Index",
"Name"), class = "data.frame", row.names = c(NA, -4L))
df2 <-structure(list(Members = c("1 3 5 7", "3 7", "6 9 2")),
.Names = "Members", class = "data.frame", row.names = c(NA, -3L))
Upvotes: 1