Reputation: 65
I have a dataframe similar to this:
x <- data.frame("A" = c(11:24),
"B" = c(25,25,25,25,25,37,37,16,16,16,16,16,42,42),
"C" = c(1:3,1:2,1:2,1:3,1:2,1:2))
A B C
11 25 1
12 25 2
13 25 3
14 25 1
15 25 2
16 37 1
17 37 2
18 16 1
19 16 2
20 16 3
21 16 1
22 16 2
23 42 1
24 42 2
I want to keep only the rows for values of B whose group contains every value of C (1 to 3) at least once. So my result would look like:
A B C
11 25 1
12 25 2
13 25 3
14 25 1
15 25 2
18 16 1
19 16 2
20 16 3
21 16 1
22 16 2
I can't seem to get the right keywords in my search for answers.
Upvotes: 2
Views: 881
Reputation: 1784
Another option is to use data.table to count the unique values of C for each B, and then filter your data to keep only the B values that have 3 distinct values of C.
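library(data.table)
setDT(x)  # convert x to a data.table by reference
# inner query: B values with exactly 3 distinct C values; outer query: keep their rows
x[B %in% x[,length(unique(C)),by=B][V1==3,B]]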
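Note that comparing the distinct count against 3 relies on C never taking a value outside 1 to 3. If that is not guaranteed, a membership test may be safer. Here is a minimal sketch of that variant, assuming the required set is 1:3 as in the question (the names `required`, `complete_B` and `res` are introduced here only for illustration):

```r
library(data.table)

# Rebuild the example data from the question as a data.table
x <- data.table(A = 11:24,
                B = c(25,25,25,25,25,37,37,16,16,16,16,16,42,42),
                C = c(1:3,1:2,1:2,1:3,1:2,1:2))

required <- 1:3  # assumed set of C values that every B group must contain

# B values whose group contains every required value of C
complete_B <- x[, .(ok = all(required %in% C)), by = B][ok == TRUE, B]

# Keep only the rows belonging to those groups (original row order is preserved)
res <- x[B %in% complete_B]
```

On the example data this should keep the same ten rows (A = 11 to 15 and 18 to 22) as the count-based version.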
Upvotes: 1
Reputation: 887951
We can use all() after grouping by 'B'
library(dplyr)
x %>%
group_by(B) %>%
filter(all(1:3 %in% C))
# A tibble: 10 x 3
# Groups: B [2]
# A B C
# <int> <dbl> <int>
# 1 11 25 1
# 2 12 25 2
# 3 13 25 3
# 4 14 25 1
# 5 15 25 2
# 6 18 16 1
# 7 19 16 2
# 8 20 16 3
# 9 21 16 1
#10 22 16 2
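If you would rather not load a package, the same per-group check can probably be done in base R. This is only a sketch, assuming the required values are 1:3 and that B is numeric as in the question (`complete`, `keep_B` and `res` are illustrative names):

```r
# Example data from the question
x <- data.frame(A = 11:24,
                B = c(25,25,25,25,25,37,37,16,16,16,16,16,42,42),
                C = c(1:3,1:2,1:2,1:3,1:2,1:2))

# For each value of B, check whether its C values cover 1, 2 and 3
complete <- tapply(x$C, x$B, function(v) all(1:3 %in% v))

# names(complete) holds the B values as strings; convert the flagged ones back to numbers
keep_B <- as.numeric(names(complete)[complete])

# Keep only the rows whose B is in a complete group
res <- x[x$B %in% keep_B, ]
```

The result is a plain data.frame with the original row names, so there is no grouping to remove afterwards.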
Upvotes: 1