Reputation: 7517
In my data.frame below, how can I subset every whole cluster of study that has any outcome larger than 1 in it?
My desired output is shown below. I tried subset(h, outcome > 1), but that keeps only the individual rows with outcome > 1, not the whole study, so it doesn't give my desired output.
h = "
study outcome
a 1
a 2
a 1
b 1
b 1
c 3
c 3"
h = read.table(text = h, header = TRUE)
DESIRED OUTPUT:
"
study outcome
a 1
a 2
a 1
c 3
c 3"
Upvotes: 1
Views: 35
Reputation: 887158
Modify the subset: extract the 'study' values where outcome > 1, then use %in% on the 'study' column to create the final logical expression in subset
subset(h, study %in% study[outcome > 1])
-output
study outcome
1 a 1
2 a 2
3 a 1
6 c 3
7 c 3
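To see why this works, the one-liner can be broken into its intermediate steps. The following is only an illustrative sketch in base R; the names hits, keep and res are made up for the example and are not part of the answer's code.

```r
# Rebuild the example data from the question
h <- read.table(text = "
study outcome
a 1
a 2
a 1
b 1
b 1
c 3
c 3", header = TRUE)

# Step 1: study labels of the rows whose outcome exceeds 1 (a, c, c)
hits <- h$study[h$outcome > 1]

# Step 2: flag every row whose study appears among those labels
# (TRUE for all rows of a and c, FALSE for the rows of b)
keep <- h$study %in% hits

# Step 3: keep the flagged rows; these are the same rows subset() returns
res <- h[keep, ]
res
```

Because %in% compares each row's 'study' against the set of qualifying labels, all rows of a study are kept as soon as one of its rows satisfies outcome > 1.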
If we want to limit the number of 'study' elements having an 'outcome' value greater than 1, i.e. keep only the first 'n' such 'study', then get the unique 'study' from the first expression of subset, use head to get the first 'n' 'study' values, and use %in% to create the logical expression
n <- 3
subset(h, study %in% head(unique(study[outcome > 1]), n))
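With the example data only two studies (a and c) have an outcome above 1, so n <- 3 returns the same rows as before. A smaller n makes the effect visible. This is a small sketch, assuming n = 1 purely for illustration; first_n and res_n are hypothetical names.

```r
# Rebuild the example data from the question
h <- read.table(text = "
study outcome
a 1
a 2
a 1
b 1
b 1
c 3
c 3", header = TRUE)

n <- 1  # illustrative value: keep only the first qualifying study

# Unique studies with any outcome > 1, in order of appearance, cut to n
first_n <- head(unique(h$study[h$outcome > 1]), n)

# Keep all rows belonging to those studies (here only study a)
res_n <- subset(h, study %in% first_n)
res_n
```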
Or this can be done with a group-by approach using any
library(dplyr)
h %>%
group_by(study) %>%
filter(any(outcome > 1)) %>%
ungroup
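For readers without dplyr, the same group-wise idea can be sketched in base R with ave(), which applies a function within each group and returns a vector aligned with the original rows. This is an alternative formulation, not the answer's own code; keep_grp and res_grp are hypothetical names.

```r
# Rebuild the example data from the question
h <- read.table(text = "
study outcome
a 1
a 2
a 1
b 1
b 1
c 3
c 3", header = TRUE)

# For each study, check whether any of its outcomes exceeds 1;
# ave() repeats the group result for every row of that group
keep_grp <- as.logical(ave(h$outcome > 1, h$study, FUN = any))

# Keep the rows of the qualifying studies
res_grp <- h[keep_grp, ]
res_grp
```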
Upvotes: 1