Deleting multiple rows based on another column's value in R

Question

I hope someone can point me in the right direction with my problem. Let's say we have a data frame like this:

year	Plant
2009	Monstera
2010	Monstera
2011	Monstera
2012	Monstera
2014	Monstera
2009	Pilea
2010	Pilea
2011	Pilea
2011	Philodendron
2012	Philodendron
2013	Philodendron

I want to remove all rows of a plant if its years start at 2009, but stop removing as soon as a year is skipped. The final data frame should look like this:

year	Plant
2014	Monstera
2011	Philodendron
2012	Philodendron
2013	Philodendron

In the forum I found some information on this problem in Excel; however, I can't get it to work since I'm an absolute beginner at programming and R.

Here are my code ideas, which currently don't work:

list1<-list(unique(plants))

For (i in list1){
     if (dataset$year==2009){
     while i 
     -[c(year==2009)]
     ....
 break
  } else {
    ....

I know it's not much, but I really tried, and I hope someone can help.

Thank you!

Ben · Accepted Answer

If I understand the logic correctly, you could try this approach.

Using the dplyr package, group your dataset by Plant and by runs of consecutive years (where each row is exactly 1 year after the previous one, such as 2009, 2010, 2011...). This relies on the rows already being sorted by Plant and year, as they are in your example.
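To see how the consecutive-year grouping behaves in isolation, here is a small base R sketch on Monstera's years only. The names `years`, `gaps`, and `run_id` are made up for illustration and are not part of the answer's code.

```r
# Monstera's years from the question (variable names are illustrative)
years <- c(2009, 2010, 2011, 2012, 2014)

# diff() gives the gap between each year and the one before it
gaps <- diff(years)

# every gap other than 1 bumps the counter, which starts a new run
run_id <- c(0, cumsum(gaps != 1))

# years that share a run id are consecutive
split(years, run_id)
```

Here 2009 to 2012 end up in one run and 2014 in a run of its own, which is why only the 2014 row survives for Monstera.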

Then, filter to keep only the groups whose first year is not 2009.

The final ungroup and select will remove the made-up Group column so your results only include year and Plant.

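library(dplyr)

dataset %>%
  # start a new Group whenever the year is not exactly 1 after the previous row
  group_by(Plant, Group = c(0, cumsum(diff(year) != 1))) %>%
  # keep only groups that do not begin in 2009
  filter(first(year) != 2009) %>%
  ungroup() %>%
  # drop the helper column
  select(-Group)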

Output

   year Plant       
1  2014 Monstera    
2  2011 Philodendron
3  2012 Philodendron
4  2013 Philodendron
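If you want to run this end to end, the sketch below rebuilds the example data from the question and computes the run id within each plant rather than across the whole column. Sorting first and grouping by Plant before calling diff() are additions of mine, on the assumption that real data may not arrive ordered by plant and year; `run` and `result` are made-up names.

```r
library(dplyr)

# Example data copied from the question
dataset <- data.frame(
  year  = c(2009, 2010, 2011, 2012, 2014, 2009, 2010, 2011, 2011, 2012, 2013),
  Plant = c(rep("Monstera", 5), rep("Pilea", 3), rep("Philodendron", 3))
)

result <- dataset %>%
  # assumption: the input may be unsorted, so order it first
  arrange(Plant, year) %>%
  group_by(Plant) %>%
  # a new run starts at the first row of each plant and after every skipped year
  mutate(run = cumsum(c(TRUE, diff(year) != 1))) %>%
  group_by(Plant, run) %>%
  # drop every run that begins in 2009
  filter(first(year) != 2009) %>%
  ungroup() %>%
  select(-run)

result
```

On the example data this should give the same four rows as above. The difference only matters when the rows are not already sorted by plant and year, which the one-step version takes for granted.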
