Reputation: 2690
I have the following sample dataframe in R:
Test <- data.frame("Individual"=c("John", "John", "Alice", "Alice", "Alice", "Eve", "Eve","Eve","Jack"), "ExamNumber"=c("Test1", "Test2", "Test1", "Test2", "Test3", "Test1", "Test2", "Test3", "Test3"))
Which gives:
Individual ExamNumber
1 John Test1
2 John Test2
3 Alice Test1
4 Alice Test2
5 Alice Test3
6 Eve Test1
7 Eve Test2
8 Eve Test3
9 Jack Test3
However, I want to remove any Individual who does not have all three tests, resulting in:
Individual ExamNumber
1 Alice Test1
2 Alice Test2
3 Alice Test3
4 Eve Test1
5 Eve Test2
6 Eve Test3
Upvotes: 0
Views: 101
Reputation: 12703
Using base R
# Names of individuals with exactly three distinct exams
ind_eq3 <- names(which(with(Test, by(Test,
                                     INDICES = list(Individual),
                                     FUN = function(x) length(unique(x$ExamNumber)) == 3))))
# Keep only the rows for those individuals
with(Test, Test[Individual %in% ind_eq3, ])
# Individual ExamNumber
# 3 Alice Test1
# 4 Alice Test2
# 5 Alice Test3
# 6 Eve Test1
# 7 Eve Test2
# 8 Eve Test3
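The same per-individual check can probably be written a little more compactly with tapply, which returns a named logical vector directly. This is only a sketch: it rebuilds the question's data frame so that it runs on its own, and has_all and result are names introduced here for illustration.

```r
# Rebuild the sample data from the question
Test <- data.frame(
  Individual = c("John", "John", "Alice", "Alice", "Alice", "Eve", "Eve", "Eve", "Jack"),
  ExamNumber = c("Test1", "Test2", "Test1", "Test2", "Test3", "Test1", "Test2", "Test3", "Test3")
)

# TRUE for each individual who has exactly three distinct exams
has_all <- tapply(Test$ExamNumber, Test$Individual,
                  function(x) length(unique(x)) == 3)

# Keep only the rows belonging to those individuals
result <- Test[Test$Individual %in% names(has_all)[has_all], ]
result
```

Like the by() version, this counts distinct exams rather than rows, so a repeated exam for the same individual would not inflate the count.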
Using data.table
library('data.table')
setDT(Test)[ ,
j = .SD[length( unique(ExamNumber) ) == 3, ],
by = 'Individual']
Upvotes: 2
Reputation: 2434
Here is another way, using dplyr to check whether all three tests exist within each group:
library(dplyr)
Test %>%
group_by(Individual) %>%
filter(all(c("Test1", "Test2", "Test3") %in% ExamNumber)) %>%
ungroup()
# A tibble: 6 × 2
Individual ExamNumber
<fctr> <fctr>
1 Alice Test1
2 Alice Test2
3 Alice Test3
4 Eve Test1
5 Eve Test2
6 Eve Test3
Upvotes: 3
Reputation: 32538
You can use ave to group by Individual and check, using NROW, whether each group has exactly 3 rows:
Test[ave(1:nrow(Test), Test$Individual, FUN = NROW)==3,]
# Individual ExamNumber
#3 Alice Test1
#4 Alice Test2
#5 Alice Test3
#6 Eve Test1
#7 Eve Test2
#8 Eve Test3
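One caveat with counting rows: if an individual has the same exam recorded twice, the row count can reach 3 without all three exams being present. The sketch below illustrates this with a hypothetical duplicate row for John (not part of the original data) and compares the row count with a distinct-exam count. TestDup, exam_id, and the other variable names are made up for this example.

```r
# Sample data from the question plus one hypothetical duplicate row (John, Test1)
TestDup <- data.frame(
  Individual = c("John", "John", "John", "Alice", "Alice", "Alice", "Eve", "Eve", "Eve", "Jack"),
  ExamNumber = c("Test1", "Test1", "Test2", "Test1", "Test2", "Test3", "Test1", "Test2", "Test3", "Test3")
)

# Encode exams as integers so that ave() returns a numeric result
exam_id <- as.integer(factor(TestDup$ExamNumber))

# Rows per individual: John now has 3 rows but only 2 distinct exams
n_rows <- ave(exam_id, TestDup$Individual, FUN = length)

# Distinct exams per individual, repeated on every row of that individual
n_exams <- ave(exam_id, TestDup$Individual,
               FUN = function(x) length(unique(x)))

by_rows     <- TestDup[n_rows == 3, ]   # also keeps John
by_distinct <- TestDup[n_exams == 3, ]  # keeps only Alice and Eve
```

If duplicates cannot occur in your data, the simpler NROW version should behave the same.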
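And here is a slightly more robust approach using the same idea but with split, which checks that each individual has every exam that appears in the data instead of counting rows: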
Test[order(Test$Individual),][unlist(lapply(split(Test, Test$Individual), function(a)
rep(all(unique(Test$ExamNumber) %in% a$ExamNumber), NROW(a)))),]
Upvotes: 2