Reputation: 763
I have the following data frame:
fileN<-c("510-1","510-1","510-2","510-2","510-3","510-3","510-3")
disp<-c("account","fail","fail","fail","account","account","fail")
df<-data.frame(fileN,disp)
df
fileN disp
1 510-1 account
2 510-1 fail
3 510-2 fail
4 510-2 fail
5 510-3 account
6 510-3 account
7 510-3 fail
I want to obtain the unique 'fileN' values where 'disp' has both 'fail' and 'account'.
I know I can do the following to get only those where every 'disp' is 'fail':
df %>%
group_by(fileN) %>%
filter( all(disp == 'fail')) %>%
distinct
But how do I get the 'fileN' of those that have both 'fail' and 'account', so that the desired result is:
510-1
510-3
Upvotes: 1
Views: 57
Reputation: 887711
One option: after grouping by 'fileN', filter the groups where all the elements of the desired vector are present (%in%) in the 'disp' column, then ungroup and get the distinct elements of 'fileN':
library(dplyr)
df %>%
group_by(fileN) %>%
filter(all(c('fail', 'account') %in% disp)) %>%
ungroup %>%
distinct(fileN)
# A tibble: 2 x 1
# fileN
# <fct>
#1 510-1
#2 510-3
If these are the only two possible values, another option is to drop duplicate rows first, so that any 'fileN' left with two rows must have both values:
distinct(df) %>%
group_by(fileN) %>%
filter(n() == 2) %>%
distinct(fileN)
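If other 'disp' values could turn up, a variation on the same counting idea might look like the sketch below. It assumes dplyr is installed, rebuilds the question's data so it runs on its own, and uses `both` and `k` purely as illustrative names. Rows with other values are dropped first, then the distinct wanted values are counted per 'fileN'.

```r
library(dplyr)

# Sample data from the question
fileN <- c("510-1", "510-1", "510-2", "510-2", "510-3", "510-3", "510-3")
disp  <- c("account", "fail", "fail", "fail", "account", "account", "fail")
df <- data.frame(fileN, disp)

both <- df %>%
  filter(disp %in% c("fail", "account")) %>%  # ignore any other disp values
  group_by(fileN) %>%
  summarise(k = n_distinct(disp)) %>%          # distinct wanted values per fileN
  filter(k == 2) %>%                           # keep fileN that have both
  pull(fileN)
both
```

Unlike `n() == 2`, this does not depend on 'fail' and 'account' being the only values in the column.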
Upvotes: 4
Reputation: 32558
A couple of base R options:
v = tapply(df$disp, df$fileN, function(x){
all(c("account", "fail") %in% x)
})
v[v]
#510-1 510-3
# TRUE TRUE
#OR
with(aggregate(disp ~ fileN, df, function(x){
all(c("account", "fail") %in% x)
}), fileN[disp])
#[1] 510-1 510-3
#Levels: 510-1 510-2 510-3
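If a plain character vector is wanted rather than a named logical or a factor, the same question can also be read as a set intersection: the 'fileN' values that have a 'fail' row, intersected with those that have an 'account' row. This is a different technique from the two above, shown as a minimal sketch; `ids`, `with_fail`, `with_account` and `both` are illustrative names only.

```r
# Sample data from the question
fileN <- c("510-1", "510-1", "510-2", "510-2", "510-3", "510-3", "510-3")
disp  <- c("account", "fail", "fail", "fail", "account", "account", "fail")
df <- data.frame(fileN, disp)

# as.character() so this behaves the same whether or not fileN is a factor
ids <- as.character(df$fileN)

with_fail    <- ids[df$disp == "fail"]     # fileN values having a 'fail' row
with_account <- ids[df$disp == "account"]  # fileN values having an 'account' row

# intersect() keeps the values found in both and drops duplicates
both <- intersect(with_fail, with_account)
both
```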
Upvotes: 2