Reputation: 763
I have the following data frame
filen<-c('510-1','510-2','510-2','510-2','510-3','510-3','510-4')
disp<-c('g','ng','ng','ng','g','ng','ng')
df<-data.frame(filen,disp)
filen disp
1 510-1 g
2 510-2 ng
3 510-2 ng
4 510-2 ng
5 510-3 g
6 510-3 ng
7 510-4 ng
Basically, I want to isolate the file numbers where ng is the only type of disp associated with that filen, so that I get a dataset like the one below. How do I do this using dplyr?
filen disp
510-2 ng
510-4 ng
Upvotes: 1
Views: 490
Reputation: 887501
We can group by 'filen', filter the groups where all the 'disp' values are 'ng', and get the distinct rows
library(dplyr)
df %>%
  group_by(filen) %>%
  filter(all(disp == 'ng')) %>%
  distinct()
# A tibble: 2 x 2
# Groups: filen [2]
# filen disp
# <fct> <fct>
#1 510-2 ng
#2 510-4 ng
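As a rough sketch of why the grouped filter works, the per-group condition can be computed on its own with summarise. The names flags, only_ng and only_ng_files are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Same data as in the question
filen <- c('510-1','510-2','510-2','510-2','510-3','510-3','510-4')
disp  <- c('g','ng','ng','ng','g','ng','ng')
df <- data.frame(filen, disp)

# One row per filen: only_ng is TRUE only when every disp in that group is 'ng'
flags <- df %>%
  group_by(filen) %>%
  summarise(only_ng = all(disp == 'ng'))

# File numbers whose flag is TRUE (hypothetical helper vector)
only_ng_files <- as.character(flags$filen[flags$only_ng])
```

filter() applies the same TRUE/FALSE value to every row of a group, so a group is either kept whole or dropped whole.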
Or
df %>%
  distinct() %>%
  group_by(filen) %>%
  filter(n_distinct(disp) == 1, disp == 'ng')
Or we can use data.table
library(data.table)
setDT(unique(df))[, .SD[uniqueN(disp)==1 & disp == "ng"], filen]
Upvotes: 3
Reputation: 389135
Using base R, one option is to calculate the frequency table of df using table, find the filen values whose ng count is greater than 0 and whose g count is equal to 0, and keep only the unique rows.
df1 <- as.data.frame.matrix(table(df))
unique(df[df$filen %in% rownames(df1)[df1$ng > 0 & df1$g == 0], ])
# filen disp
#2 510-2 ng
#7 510-4 ng
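To make the intermediate step visible, here is a small sketch that builds the count table and the row selection separately. The name keep_files is only for illustration.

```r
# Same data as in the question
filen <- c('510-1','510-2','510-2','510-2','510-3','510-3','510-4')
disp  <- c('g','ng','ng','ng','g','ng','ng')
df <- data.frame(filen, disp)

# Rows are filen values, columns are the counts of 'g' and 'ng'
df1 <- as.data.frame.matrix(table(df))

# filen values that have at least one 'ng' and no 'g' (hypothetical helper vector)
keep_files <- rownames(df1)[df1$ng > 0 & df1$g == 0]

result <- unique(df[df$filen %in% keep_files, ])
```

Note that this check is written for exactly two disp levels; with more levels, every other column would have to be tested for 0 as well.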
Or with ave
unique(df[ave(df$disp == "ng", df$filen, FUN = all), ])
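A short sketch of what ave contributes here, assuming the same df as in the question: it applies all within each filen group and writes the result back to every row of that group, which gives a row-level logical index. The name keep is only for illustration.

```r
# Same data as in the question
filen <- c('510-1','510-2','510-2','510-2','510-3','510-3','510-4')
disp  <- c('g','ng','ng','ng','g','ng','ng')
df <- data.frame(filen, disp)

# One logical per row: TRUE when all rows sharing that filen have disp == "ng"
keep <- ave(df$disp == "ng", df$filen, FUN = all)

# Subset with the index, then drop the duplicated rows
result <- unique(df[keep, ])
```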
Upvotes: 1