Reputation: 3
I have a data set in R that has 4 columns: size of turtle hatchlings, nest number, year, and beach.
I want to create a new data frame that excludes the nests for which I measured fewer than 10 hatchlings. So I need to exclude rows based on the number of values in the column Size within each unique combination of "Year", "Beach" and "Nest". Thank you.
Upvotes: 0
Views: 153
Reputation: 887951
We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1); then, grouped by 'Year', 'Beach', 'Nest', we subset the groups where the number of unique elements of "Hatchling_Number" is greater than or equal to 10:
library(data.table)
setDT(df1)[, if(uniqueN(Hatchling_Number)>=10) .SD, by = .(Year, Beach, Nest)]
Or, if there are no duplicate "Hatchling_Number" values within a group, we can use .N >= 10 for subsetting:
setDT(df1)[, if(.N >=10) .SD, by = .(Year, Beach, Nest)]
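As a quick check of the .N version, here is a minimal sketch on made-up data. The values in df1 are hypothetical (one nest with 12 hatchlings, one with 3), and the column names follow the ones assumed above rather than the asker's actual data.

```r
library(data.table)

set.seed(1)  # only so the made-up Size values are reproducible

# Hypothetical toy data: nest 1 has 12 measured hatchlings, nest 2 has 3
df1 <- data.frame(
  Year  = 2015,
  Beach = "A",
  Nest  = rep(c(1, 2), times = c(12, 3)),
  Hatchling_Number = c(1:12, 1:3),
  Size  = round(rnorm(15, mean = 45, sd = 2), 1)
)

# setDT() converts df1 to a data.table in place;
# .N is the row count of each Year/Beach/Nest group,
# and groups with fewer than 10 rows return NULL and are dropped
res <- setDT(df1)[, if (.N >= 10) .SD, by = .(Year, Beach, Nest)]

nrow(res)         # 12: only the rows of nest 1 remain
unique(res$Nest)  # 1
```

Because nothing is returned for groups that fail the condition, the result contains only complete rows from the nests that pass, with the grouping columns first.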
Upvotes: 2