David Jorquera

Reputation: 2102

Drop rows according to condition on different columns

I have a big dataframe where I need to erase rows according to a condition applied within each level of a factor (Country). I have data for a variable across several years, but where a year is duplicated, I need to keep just one of the rows. Here is a minimal dataframe:

datos <- data.frame(Country = c(rep("Australia", 4), rep("Belgium", 4)), 
         Year = c(2010, 2011, 2012, 2012, 2010, 2011, 2011, 2012), 
         method = c("Method1", "Method1", "Method1", "Method2", "Method1", 
                    "Method1", "Method2", "Method1"))

Now I want R to do the following:

"For each country, if a Year is repeated, erase the row where method is equal to Method1".

Upvotes: 0

Views: 57

Answers (1)

Ronak Shah

Reputation: 389355

Using dplyr, we can group_by Country and Year, then filter out the rows where the group has more than one row and method == "Method1".

library(dplyr)
datos %>%
  group_by(Country, Year) %>%
  # n() is the number of rows in the current Country/Year group
  filter(!(n() > 1 & method == "Method1"))

#  Country    Year method 
#  <fct>     <dbl> <fct>  
#1 Australia  2010 Method1
#2 Australia  2011 Method1
#3 Australia  2012 Method2
#4 Belgium    2010 Method1
#5 Belgium    2011 Method2
#6 Belgium    2012 Method1

The same logic in base R, using ave:

datos[!with(datos, ave(method == "Method1", Country, Year, 
                   FUN = function(x)  length(x) > 1 & x)), ]

#    Country Year  method
#1 Australia 2010 Method1
#2 Australia 2011 Method1
#4 Australia 2012 Method2
#5   Belgium 2010 Method1
#7   Belgium 2011 Method2
#8   Belgium 2012 Method1
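
As a cross-check, here is a rough sketch of the same rule written with duplicated() instead of group sizes. This is a swapped-in technique, not the approach used above, and the names key, repeated and result are only illustrative. It assumes a Country/Year pair can be identified by pasting the two columns together.

```r
# Same sample data as in the question
datos <- data.frame(
  Country = c(rep("Australia", 4), rep("Belgium", 4)),
  Year    = c(2010, 2011, 2012, 2012, 2010, 2011, 2011, 2012),
  method  = c("Method1", "Method1", "Method1", "Method2",
              "Method1", "Method1", "Method2", "Method1")
)

# One key per Country/Year pair (illustrative helper, not from the original answer)
key <- paste(datos$Country, datos$Year)

# TRUE for every row whose Country/Year pair occurs more than once
repeated <- key %in% key[duplicated(key)]

# Drop the Method1 rows inside repeated Country/Year pairs
result <- datos[!(repeated & datos$method == "Method1"), ]
result
```

On the sample data this should keep rows 1, 2, 4, 5, 7 and 8, the same rows as the ave version. Note that, like the two versions above, it removes every Method1 row in a repeated year, so a year that is duplicated with only Method1 rows would disappear entirely.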

Upvotes: 4
