Reputation: 57
Supopose I have a dataframe df
> df
ID Year Weight
1 Brown 1960 5.0
2 Green 1990 3.0
3 Yellow 1961 4.8
4 Green 1994 7.0
5 Green 1993 6.0
6 Brown 1964 8.0
7 Yellow 1960 4.6
If I want to subset all IDs with a weight greater than or equal to 5, I will simply say:
> df[df$Weight >= 5, ]
ID Year Weight
1 Brown 1960 5
4 Green 1994 7
5 Green 1993 6
6 Brown 1964 8
Unfortunately, the Green under year 1990 has been left out because the weight is less than 5. Is there a way that can retain all IDs as long as one of their weights is greater than or equal to 5?
Desired output
> output
ID Year Weight
1 Green 1990 3
2 Green 1993 6
3 Green 1994 7
4 Brown 1960 5
5 Brown 1964 8
Many thanks!
Upvotes: 2
Views: 95
Reputation: 2330
Converting to a data.table
:
> library(data.table)
> setDT(df)
> df[ID %in% df[Weight>5, ID]]
ID Year Weight
1: Brown 1960 5
2: Green 1990 3
3: Green 1994 7
4: Green 1993 6
5: Brown 1964 8
Upvotes: 2
Reputation: 389275
Using dplyr
, we can group_by
ID
and use filter
library(dplyr)
df %>% group_by(ID) %>% filter(any(Weight > 5))
# ID Year Weight
# <chr> <dbl> <dbl>
#1 Brown 1960 5
#2 Green 1990 3
#3 Green 1994 7
#4 Green 1993 6
#5 Brown 1964 8
Or with data.table
library(data.table)
setDT(df)
df[, .SD[any(Weight > 5)], ID]
Upvotes: 2
Reputation: 102710
Here is a base R solution with ave()
and subset()
dfout <- subset(df, as.logical(with(df,ave(Weight, ID, FUN = function(x) any(x>=5)))))
such that
> dfout
ID Year Weight
1 Brown 1960 5
2 Green 1990 3
4 Green 1994 7
5 Green 1993 6
6 Brown 1964 8
Upvotes: 1
Reputation: 522712
We can use dplyr
here, and retain only rows per ID
such that one member of the group has a weight of 5 or higher:
temp <- df %>%
group_by(ID) %>%
mutate(Min_Weight = max(Weight))
output <- temp[temp$Min_Weight >= 5, ]
output[order(output$ID), ]
ID Year Weight Min_Weight
<chr> <dbl> <dbl> <dbl>
1 Brown 1960 5 8
2 Brown 1964 8 8
3 Green 1990 3 7
4 Green 1994 7 7
5 Green 1993 6 7
Data:
df <- data.frame(ID=c("Brown", "Green", "Yellow", "Green", "Green", "Brown", "Yellow"),
Year=c(1960, 1990, 1961, 1994, 1993, 1964, 1960),
Weight=c(5.0, 3.0, 4.8, 7.0, 6.0, 8.0, 4.6), stringsAsFactors=FALSE)
Upvotes: 2