Reputation: 2102
I have a big data frame where I need to remove rows according to a condition applied within each level of a factor (Country). I have data for a variable across different years, but where a year is duplicated, I need to keep just one of the rows. Here is a minimal data frame:
datos <- data.frame(Country = c(rep("Australia", 4), rep("Belgium", 4)),
                    Year = c(2010, 2011, 2012, 2012, 2010, 2011, 2011, 2012),
                    method = c("Method1", "Method1", "Method1", "Method2", "Method1",
                               "Method1", "Method2", "Method1"))
Now I want R to do the following:
"For each country, if a Year is repeated, erase the row where method is equal to Method1."
Upvotes: 0
Views: 57
Reputation: 389355
Using dplyr, we can group_by Country and Year, then filter out the rows where the group has more than one row and method == "Method1".
library(dplyr)
datos %>%
  group_by(Country, Year) %>%
  filter(!(n() > 1 & method == "Method1"))
# Country Year method
# <fct> <dbl> <fct>
#1 Australia 2010 Method1
#2 Australia 2011 Method1
#3 Australia 2012 Method2
#4 Belgium 2010 Method1
#5 Belgium 2011 Method2
#6 Belgium 2012 Method1
Using the same logic with base R ave:
datos[!with(datos, ave(method == "Method1", Country, Year,
                       FUN = function(x) length(x) > 1 & x)), ]
# Country Year method
#1 Australia 2010 Method1
#2 Australia 2011 Method1
#4 Australia 2012 Method2
#5 Belgium 2010 Method1
#7 Belgium 2011 Method2
#8 Belgium 2012 Method1
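To make the logic easier to follow, here is a rough step-by-step sketch of the same idea in base R, assuming the datos data frame from the question. The names group_size, drop_row and result are introduced here only for illustration.

```r
# Rebuild the example data from the question
datos <- data.frame(
  Country = c(rep("Australia", 4), rep("Belgium", 4)),
  Year    = c(2010, 2011, 2012, 2012, 2010, 2011, 2011, 2012),
  method  = c("Method1", "Method1", "Method1", "Method2",
              "Method1", "Method1", "Method2", "Method1")
)

# Number of rows sharing each Country/Year combination
# (group_size is an illustrative name, not part of the original answer)
group_size <- with(datos, ave(seq_along(Year), Country, Year, FUN = length))

# Flag a row when its Country/Year is duplicated AND it uses Method1
drop_row <- group_size > 1 & datos$method == "Method1"

# Keep everything that was not flagged
result <- datos[!drop_row, ]
print(result)
```

Note that with this condition, if a duplicated Country/Year pair contained only Method1 rows, all of them would be dropped. That does not happen in the example data, but it may be worth checking on the full data frame.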
Upvotes: 4