Reputation: 371
I have a dataframe that I'd like to subset so that one column only keeps values that are paired with every one of several strings in a different column. Here's some mock data:
df1 <- data.frame(species = c("Rufl","Rufl","Soca","Assp","Assp","Elre"),
state = c("warmed","ambient","warmed","warmed","ambient","ambient"))
I'd like to have a dataframe with only the species that match both the "warmed" and "ambient" states, removing species that match only one of them. The final dataframe would contain "Rufl" and "Assp" with their given states, as shown below:
species state
Rufl warmed
Rufl ambient
Assp warmed
Assp ambient
I've made a few attempts at this, both with the subset function and with dplyr, but can't figure out the right way to get it to work. Here are my failed attempts:
df2 <- subset(df1$species, state == "warmed" & state == "ambient")
# or this?
df2 <- df1 %>%
group_by(species) %>%
filter(state == "warmed",
state == "ambient")
Thanks for the help!
Using R version 4.0.2, Mac OS X 10.13.6
Upvotes: 2
Views: 752
Reputation: 101343
Another base R option using ave
subset(
df1,
ave(state, species, FUN = function(x) sum(c("warmed", "ambient") %in% x)) == 2
)
gives
species state
1 Rufl warmed
2 Rufl ambient
4 Assp warmed
5 Assp ambient
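To see what the condition inside subset is doing, here is a small sketch that pulls the ave call out on its own. The name n_matched is only illustrative and is not part of the answer; the sketch also assumes state is a character column, which is the data.frame default in R 4.0 and later.

```r
df1 <- data.frame(species = c("Rufl", "Rufl", "Soca", "Assp", "Assp", "Elre"),
                  state = c("warmed", "ambient", "warmed", "warmed", "ambient", "ambient"))

# For every row, count how many of the two target states occur
# anywhere within that row's species group.
n_matched <- ave(df1$state, df1$species,
                 FUN = function(x) sum(c("warmed", "ambient") %in% x))

# ave() keeps the type of its first argument, so the counts come back
# as character strings; convert them before inspecting them as numbers.
n_matched <- as.integer(n_matched)

# Rows whose species has both states get a count of 2.
keep_rows <- n_matched == 2
```

Rows where keep_rows is TRUE are the ones subset retains.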
Upvotes: 0
Reputation: 887118
We need to group by species and use all() to check that both states occur within each group
library(dplyr)
df1 %>%
group_by(species) %>%
filter(all(c('warmed', 'ambient') %in% state)) %>%
ungroup
-output
# A tibble: 4 x 2
# species state
# <chr> <chr>
#1 Rufl warmed
#2 Rufl ambient
#3 Assp warmed
#4 Assp ambient
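The & operation doesn't work because the condition is evaluated row by row, and the two values never occur in the same row: no single element of state can equal both "warmed" and "ambient".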
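A minimal sketch of the difference, using a standalone vector for one hypothetical species (the names state, both_at_once and has_both are only for illustration):

```r
# States recorded for one hypothetical species
state <- c("warmed", "ambient", "warmed")

# Element-wise test: each element is compared against both strings,
# and no single element can equal "warmed" and "ambient" at once.
both_at_once <- state == "warmed" & state == "ambient"

# Group-level test: do both strings occur somewhere in the vector?
has_both <- all(c("warmed", "ambient") %in% state)
```

Here both_at_once is FALSE for every element, while has_both is TRUE, which is why the filter needs to test the whole group rather than each row.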
Or using subset
subset(df1, species %in% names(which(rowSums(table(df1) > 0) == 2)))
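The one-liner above can be unpacked step by step, roughly as follows. The intermediate names (counts, n_states, keep) are made up for readability. Note that this approach counts distinct states per species, so it assumes state contains only "warmed" and "ambient"; with a third state present, the == 2 test would no longer mean "has both".

```r
df1 <- data.frame(species = c("Rufl", "Rufl", "Soca", "Assp", "Assp", "Elre"),
                  state = c("warmed", "ambient", "warmed", "warmed", "ambient", "ambient"))

# Contingency table: one row per species, one column per state
counts <- table(df1)

# Number of distinct states observed for each species
n_states <- rowSums(counts > 0)

# Species that appear under both states
keep <- names(which(n_states == 2))

# Keep only the rows belonging to those species
result <- subset(df1, species %in% keep)
```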
Upvotes: 1