Reputation: 59
Apologies if this has already been asked.
Let's say I have the following data frame:
sample = c("A", "B", "C", "D", "E")
bla_1 = c("CTX-M", NA, "CTX-M", NA, NA)
bla_2 = c(NA, "CTX-M", "OXA-1", NA, NA)
bla_3 = c(NA, "OXA-1", NA, "CTX-M", "OXA-1")
MIC = c(2, 4, 8, 16, 32)
df = data.frame(sample, bla_1, bla_2, bla_3, MIC)
I want to subset "df" so that I am left with only the samples that contain "CTX-M". How do I achieve this when "CTX-M" can appear in any of the three "bla_" columns?
Upvotes: 1
Views: 49
Reputation: 102625
A base R option using which with the argument arr.ind = TRUE:
> df[which(df == "CTX-M", arr.ind = TRUE)[, "row"], ]
sample bla_1 bla_2 bla_3 MIC
1 A CTX-M <NA> <NA> 2
3 C CTX-M OXA-1 <NA> 8
2 B <NA> CTX-M OXA-1 4
4 D <NA> <NA> CTX-M 16
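One thing to watch with the which(..., arr.ind = TRUE) approach: the row index is returned once per matching cell, so a sample carrying "CTX-M" in more than one bla_ column would show up more than once, and the rows come back in column-major order (1, 3, 2, 4 above). A minimal sketch that sidesteps both, assuming only the three bla_ columns should be searched (the names bla_cols, hits and result are illustrative):

```r
# Rebuild the example data from the question
df <- data.frame(
  sample = c("A", "B", "C", "D", "E"),
  bla_1  = c("CTX-M", NA, "CTX-M", NA, NA),
  bla_2  = c(NA, "CTX-M", "OXA-1", NA, NA),
  bla_3  = c(NA, "OXA-1", NA, "CTX-M", "OXA-1"),
  MIC    = c(2, 4, 8, 16, 32)
)

# Columns to search (assumption: only the bla_ columns matter)
bla_cols <- c("bla_1", "bla_2", "bla_3")

# Count exact matches per row, treating NA as "no match"
hits <- rowSums(df[bla_cols] == "CTX-M", na.rm = TRUE) > 0

# One row per sample, in the original row order
result <- df[hits, ]
result
```

Because hits is a logical vector with one entry per row, each sample is kept at most once regardless of how many columns match.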
Upvotes: 1
Reputation: 21442
A base R solution:
df[which(apply(df, 1, function(x) any(x == "CTX-M"))), ]
sample bla_1 bla_2 bla_3 MIC
1 A CTX-M <NA> <NA> 2
2 B <NA> CTX-M OXA-1 4
3 C CTX-M OXA-1 <NA> 8
4 D <NA> <NA> CTX-M 16
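Note that apply() first converts the whole data frame to a character matrix, so the sample and MIC columns are compared as text too, and which() is needed because any() returns NA for rows with no match. A possible variant, sketched here under the assumption that only the bla_ columns are relevant, uses %in%, which returns FALSE rather than NA for missing values (bla_cols, keep and result are illustrative names):

```r
# Rebuild the example data from the question
df <- data.frame(
  sample = c("A", "B", "C", "D", "E"),
  bla_1  = c("CTX-M", NA, "CTX-M", NA, NA),
  bla_2  = c(NA, "CTX-M", "OXA-1", NA, NA),
  bla_3  = c(NA, "OXA-1", NA, "CTX-M", "OXA-1"),
  MIC    = c(2, 4, 8, 16, 32)
)

# Pick up every column whose name starts with "bla_"
bla_cols <- grep("^bla_", names(df), value = TRUE)

# For each row, test whether "CTX-M" is among its bla_ values;
# %in% never yields NA, so no which() wrapper is needed
keep <- apply(df[bla_cols], 1, function(x) "CTX-M" %in% x)

result <- df[keep, ]
result
```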
Upvotes: 2
Reputation: 887851
We can use filter with if_any:
library(dplyr)
library(stringr)
df %>%
filter(if_any(everything(), ~ str_detect(., 'CTX-M')))
Output:
# sample bla_1 bla_2 bla_3 MIC
#1 A CTX-M <NA> <NA> 2
#2 B <NA> CTX-M OXA-1 4
#3 C CTX-M OXA-1 <NA> 8
#4 D <NA> <NA> CTX-M 16
Or for specific columns:
df %>%
filter(if_any(bla_1:bla_3, ~ str_detect(., 'CTX-M')))
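Keep in mind that str_detect() treats 'CTX-M' as a pattern, so it would also match longer values that merely contain it (a hypothetical "CTX-M-15", for instance). If an exact match is what's wanted, one possible sketch, assuming dplyr 1.0.4 or later for if_any() and that all gene columns share the bla_ prefix, is:

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(
  sample = c("A", "B", "C", "D", "E"),
  bla_1  = c("CTX-M", NA, "CTX-M", NA, NA),
  bla_2  = c(NA, "CTX-M", "OXA-1", NA, NA),
  bla_3  = c(NA, "OXA-1", NA, "CTX-M", "OXA-1"),
  MIC    = c(2, 4, 8, 16, 32)
)

# Keep rows where any bla_ column equals "CTX-M" exactly;
# %in% returns FALSE for NA, so missing values never match
result <- df %>%
  filter(if_any(starts_with("bla_"), ~ .x %in% "CTX-M"))

result
```

Selecting with starts_with("bla_") also means the filter keeps working if more bla_ columns are added later.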
Upvotes: 1
Reputation: 4708
Is this what you are looking for?
library(tidyverse)
df %>%
filter_all(any_vars(str_detect(., "CTX-M")))
# sample bla_1 bla_2 bla_3 MIC
# 1 A CTX-M <NA> <NA> 2
# 2 B <NA> CTX-M OXA-1 4
# 3 C CTX-M OXA-1 <NA> 8
# 4 D <NA> <NA> CTX-M 16
Or, looking only at specific columns:
df %>%
filter_at(vars(bla_1, bla_2, bla_3), any_vars(str_detect(., "CTX-M")))
# sample bla_1 bla_2 bla_3 MIC
# 1 A CTX-M <NA> <NA> 2
# 2 B <NA> CTX-M OXA-1 4
# 3 C CTX-M OXA-1 <NA> 8
# 4 D <NA> <NA> CTX-M 16
Upvotes: 2