Reputation: 545
We need to compare one column with another in a data frame and identify whether the entry in column "a" matches any of the entries in column "b" of the same row. The result should be a new column with T = match or F = no match.
# task df
df <- data.frame(
  a = c("ABC", "ABB", "ACC", "AAG"),
  b = c("XXC TTZ", "XCT ABB", "TTG WHO ACC", "AAG")
)
# expected result
df <- data.frame(
  a = c("ABC", "ABB", "ACC", "AAG"),
  b = c("XXC TTZ", "XCT ABB", "TTG WHO ACC", "AAG"),
  match = c("F", "T", "T", "T")
)
I just came out of a one-year clinical rotation, so my coding has gotten a bit rusty. I could not find an answer here; excuse the hassle if this has been asked before. I guess the solution is rather straightforward, but I can't wrap my head around it. Thanks a lot for helping (dplyr solutions much appreciated).
Upvotes: 1
Views: 40
Reputation: 101403
A base R option
transform(
  df,
  match = mapply(grepl, a, b, USE.NAMES = FALSE)
)
gives
    a           b match
1 ABC     XXC TTZ FALSE
2 ABB     XCT ABB  TRUE
3 ACC TTG WHO ACC  TRUE
4 AAG         AAG  TRUE
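Note that grepl treats each value of a as a regular expression. If a could ever contain characters such as . or +, passing fixed = TRUE makes the comparison literal. A minimal sketch, assuming literal substring matching is what is wanted (the MoreArgs argument is an addition here, not part of the call above):

```r
# Same example data as in the question
df <- data.frame(
  a = c("ABC", "ABB", "ACC", "AAG"),
  b = c("XXC TTZ", "XCT ABB", "TTG WHO ACC", "AAG"),
  stringsAsFactors = FALSE
)

# mapply() calls grepl(a[i], b[i], fixed = TRUE) once per row;
# fixed = TRUE (passed via MoreArgs) turns off regex interpretation of a[i]
df$match <- mapply(grepl, df$a, df$b,
                   MoreArgs = list(fixed = TRUE),
                   USE.NAMES = FALSE)
```

For the example data this should give the same matches as the regex version, since none of the values in a contain special characters.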
Upvotes: 1
Reputation: 887138
Use str_detect from stringr, which is vectorized over both string and pattern:
library(stringr)
library(dplyr)
df %>%
  mutate(match = str_detect(b, a))
    a           b match
1 ABC         XXC FALSE
2 ABB     XCT ABB  TRUE
3 ACC TTG WHO ACC  TRUE
4 AAG         AAG  TRUE
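Both grepl and str_detect test for a substring, so a value like "AB" would also match the entry "ABB". If "matches any of the entries" is meant as an exact match against one of the space-separated entries in b, one possible sketch in base R follows. It assumes entries are separated by whitespace, and the helper name is_entry is hypothetical.

```r
df <- data.frame(
  a = c("ABC", "ABB", "ACC", "AAG"),
  b = c("XXC TTZ", "XCT ABB", "TTG WHO ACC", "AAG"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when x[i] equals one of the
# whitespace-separated entries of y[i]
is_entry <- function(x, y) {
  mapply(function(xi, entries) xi %in% entries,
         x, strsplit(y, "\\s+"),
         USE.NAMES = FALSE)
}

df$match <- is_entry(df$a, df$b)

# Substring vs. whole-entry matching on a single value
sub_hit   <- grepl("AB", "XCT ABB", fixed = TRUE)  # TRUE: "AB" occurs inside "ABB"
entry_hit <- is_entry("AB", "XCT ABB")             # FALSE: no entry equals "AB"
```

Since is_entry takes two vectors and returns one logical value per row, it should also work inside a dplyr pipeline as mutate(match = is_entry(a, b)).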
Upvotes: 1