Reputation: 63
I'm trying to figure out how to use grepl() with a single partial pattern across multiple columns inside mutate(). I want a new column that is TRUE if ANY of a set of columns contains a certain string, and FALSE otherwise.
df <- structure(list(ID = c("A1.1234567_10", "A1.1234567_20"),
var1 = c("NORMAL", "NORMAL"),
var2 = c("NORMAL", "NORMAL"),
var3 = c("NORMAL", "NORMAL"),
var4 = c("NORMAL", "NORMAL"),
var5 = c("NORMAL", "NORMAL"),
var6 = c("NORMAL", "NORMAL"),
var7 = c("NORMAL", "ABNORMAL"),
var8 = c("NORMAL", "NORMAL")),
.Names = c("ID", "var1", "var2", "var3", "var4", "var5", "var6", "var7", "var8"),
class = "data.frame", row.names = c(NA, -2L))
ID var1 var2 var3 var4 var5 var6 var7 var8
1 A1.1234567_10 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL
2 A1.1234567_20 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL ABNORMAL NORMAL
I tried
df$abnormal %>% mutate( abnormal = ifelse(grepl("abnormal",df[,119:131]) , TRUE, FALSE)))
and about 100 other things. I want the final format to be
ID var1 var2 var3 var4 var5 var6 var7 var8 abnormal
1 A1.1234567_10 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL FALSE
2 A1.1234567_20 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL ABNORMAL NORMAL TRUE
Whenever I try, I get FALSE every time.
Upvotes: 1
Views: 3349
Reputation: 146164
I'd probably do this:
temp = sapply(your_data[columns_you_want_to_check],
              function(x) grepl("abnormal", x, ignore.case = TRUE))
your_data$abnormal = rowSums(temp) > 0
I just used your_data since your question switches between df and test.file.
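For a concrete run, here is a minimal sketch of the same idea on the sample data from the question. It assumes the columns to check are var1 through var8 (the `cols` vector is my own name, not something from your code). It also shows why passing the whole data frame to grepl() gives FALSE everywhere: grepl() coerces each column to a single string, so you get one result per column rather than per row, and a lowercase pattern does not match "ABNORMAL" without ignore.case = TRUE.

```r
# Rebuild the two-row sample from the question.
df <- data.frame(ID = c("A1.1234567_10", "A1.1234567_20"),
                 stringsAsFactors = FALSE)
cols <- paste0("var", 1:8)          # assumed set of columns to check
for (v in cols) df[[v]] <- rep("NORMAL", 2)
df$var7[2] <- "ABNORMAL"

# Passing the data frame directly: one result per COLUMN, and the
# case-sensitive pattern never matches the uppercase values.
per_column <- grepl("abnormal", df[cols])
length(per_column)   # 8, not the number of rows

# Column-by-column matching gives a logical matrix (rows x columns);
# a row is flagged when at least one of its cells matched.
hits <- sapply(df[cols], function(x) grepl("abnormal", x, ignore.case = TRUE))
df$abnormal <- rowSums(hits) > 0
df$abnormal          # FALSE for row 1, TRUE for row 2
```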
If you really want to use mutate, you could do
df %>%
  mutate(abnormal = rowSums(
    sapply(select(., starts_with("var")),
           function(x) grepl("abnormal", x, ignore.case = TRUE)
    )) > 0
  )
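As an alternative to the sapply/rowSums version, newer dplyr releases have if_any(), which does the "any of these columns" test directly inside mutate(). This is a different technique from the one above, and the sketch assumes dplyr 1.0.4 or later and columns whose names start with "var"; the trimmed-down data frame is just for illustration.

```r
library(dplyr)

# Cut-down version of the question's data (three of the var columns).
df <- data.frame(
  ID   = c("A1.1234567_10", "A1.1234567_20"),
  var6 = c("NORMAL", "NORMAL"),
  var7 = c("NORMAL", "ABNORMAL"),
  var8 = c("NORMAL", "NORMAL"),
  stringsAsFactors = FALSE
)

# if_any() applies the test to each selected column and returns TRUE
# for a row when at least one column matched.
result <- df %>%
  mutate(abnormal = if_any(starts_with("var"),
                           ~ grepl("abnormal", .x, ignore.case = TRUE)))

result$abnormal   # FALSE for row 1, TRUE for row 2
```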
If you need more efficiency, you can use fixed = TRUE instead of ignore.case = TRUE if you can count on the case being consistent. (Maybe convert everything with tolower() first.)
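A rough sketch of the fixed = TRUE variant, assuming mixed-case values that get lowercased before matching. The sample values and the `cols` name are made up for illustration. Note that "normal" is a substring of "abnormal", so the pattern has to be the longer word.

```r
# Hypothetical data with inconsistent case.
df <- data.frame(
  var1 = c("Normal", "normal"),
  var2 = c("NORMAL", "Abnormal"),
  stringsAsFactors = FALSE
)
cols <- c("var1", "var2")

# Lowercase each column first, then do a literal (non-regex) match.
hits <- sapply(df[cols],
               function(x) grepl("abnormal", tolower(x), fixed = TRUE))
df$abnormal <- rowSums(hits) > 0
df$abnormal   # FALSE for row 1, TRUE for row 2
```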
Leave off the > 0 to get the count for each row.
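For example, on a small made-up data frame (the column name n_abnormal is my own choice), the per-row counts look like this:

```r
# Hypothetical three-row example with 0, 1, and 2 matching cells.
df <- data.frame(
  var1 = c("NORMAL", "ABNORMAL", "ABNORMAL"),
  var2 = c("NORMAL", "NORMAL",   "ABNORMAL"),
  stringsAsFactors = FALSE
)

hits <- sapply(df, function(x) grepl("abnormal", x, ignore.case = TRUE))

# Without the "> 0" comparison, rowSums() returns how many columns matched.
df$n_abnormal <- rowSums(hits)
df$n_abnormal   # 0, 1, 2
```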
Upvotes: 3