Reputation: 89
Suppose I have a data.frame in the following format:
Site CowId Result
FarmA 1000 c("Aerococcus viridans", "Staphylococcus chromogenes")
FarmA 1001 Staphylococcus aureus
FarmA 1002 Contaminated
How can I check if Staphylococcus chromogenes is a member within any of the sets without unnesting any potential vectors within the Result column?
df <- structure(list(Site = structure(c(1L, 1L, 1L), .Label = "FarmA", class = "factor"), CowId = 1000:1002, Result = list(c("Aerococcus viridans", "Staphylococcus chromogenes"), "Staphylococcus aureus", "Contaminated")), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L), groups = structure(list( Site = structure(c(1L, 1L, 1L), .Label = "FarmA", class = "factor"), CowId = 1000:1002, .rows = structure(list(1L, 2L, 3L), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame" ), row.names = c(NA, -3L), .drop = TRUE))
Upvotes: 5
Views: 143
Reputation: 25323
Another possible solution, based on dplyr:
library(dplyr)
df %>%
rowwise() %>%
mutate(Presence = "Staphylococcus chromogenes" %in% Result)
#> # A tibble: 3 × 4
#> # Rowwise: Site, CowId
#> Site CowId Result Presence
#> <fct> <int> <list> <lgl>
#> 1 FarmA 1000 <chr [2]> TRUE
#> 2 FarmA 1001 <chr [1]> FALSE
#> 3 FarmA 1002 <chr [1]> FALSE
Upvotes: 4
Reputation: 16836
Another option is to use tidyverse. Here, I mutate a new column so that you can see where the taxon occurs. I use str_detect within map_lgl to check whether the string occurs in each element of the list column, then return TRUE if it occurs anywhere in that element (i.e., using any).
library(tidyverse)
df %>%
mutate(taxa_present = map_lgl(Result, function(v)
str_detect(v, "Staphylococcus chromogenes") %>% any()))
Output
# A tibble: 3 × 4
# Groups: Site, CowId [3]
Site CowId Result taxa_present
<fct> <int> <list> <lgl>
1 FarmA 1000 <chr [2]> TRUE
2 FarmA 1001 <chr [1]> FALSE
3 FarmA 1002 <chr [1]> FALSE
Or if you just want a simple logical vector, then you could just do:
map_lgl(df$Result, function(v)
str_detect(v, "Staphylococcus chromogenes") %>% any())
#[1] TRUE FALSE FALSE
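One thing worth keeping in mind: str_detect (like grepl) matches substrings, while %in% tests exact membership, so the two can disagree when the search string is only part of a name. Below is a minimal base R sketch of that difference. The `results` list just mirrors the Result column from the question, and `target`, `exact` and `partial` are names made up for this illustration.

```r
# Stand-in for df$Result from the question (same three entries)
results <- list(
  c("Aerococcus viridans", "Staphylococcus chromogenes"),
  "Staphylococcus aureus",
  "Contaminated"
)

# Hypothetical search term: a genus name only, not a full taxon
target <- "Staphylococcus"

# Exact set membership: TRUE only if an element equals target exactly
exact <- vapply(results, function(v) target %in% v, logical(1))

# Substring match: TRUE if target appears anywhere inside any element
partial <- vapply(results, function(v) any(grepl(target, v, fixed = TRUE)),
                  logical(1))

exact
#> [1] FALSE FALSE FALSE
partial
#> [1]  TRUE  TRUE FALSE
```

If you only ever search for complete names, both approaches give the same answer; the substring version is mainly useful when partial matches are what you want.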
Upvotes: 3
Reputation: 101064
Try grepl + toString:
> grepl("Staphylococcus chromogenes", sapply(df$Result, toString), fixed = TRUE)
[1] TRUE FALSE FALSE
Upvotes: 4
Reputation: 24069
You could use lapply/sapply to test the string on all of the members of df$Result.
testString <- "Staphylococcus chromogenes"
sapply(df$Result, function(results) testString %in% results)
#[1] TRUE FALSE FALSE
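A possible variant of the same idea, sketched with vapply so the result is guaranteed to be a logical vector, plus any() to get a single yes/no for the whole column. The `results` list stands in for df$Result from the question, and `per_row` and `any_row` are names chosen only for this example.

```r
# Stand-in for df$Result from the question (same three entries)
results <- list(
  c("Aerococcus viridans", "Staphylococcus chromogenes"),
  "Staphylococcus aureus",
  "Contaminated"
)

test_string <- "Staphylococcus chromogenes"

# vapply checks that every call returns exactly one logical value
per_row <- vapply(results, function(x) test_string %in% x, logical(1))

# Collapse to a single answer: is the taxon present in any row?
any_row <- any(per_row)

per_row
#> [1]  TRUE FALSE FALSE
any_row
#> [1] TRUE
```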
Upvotes: 6