Reputation: 29
I hope someone can point me in the right direction with my problem. Let's say we have a data frame like this:
year | Plant |
---|---|
2009 | Monstera |
2010 | Monstera |
2011 | Monstera |
2012 | Monstera |
2014 | Monstera |
2009 | Pilea |
2010 | Pilea |
2011 | Pilea |
2011 | Philodendron |
2012 | Philodendron |
2013 | Philodendron |
For every plant whose years start in 2009, I want to remove its rows, but stop removing as soon as a year is skipped. The final data frame should look like this:
year | Plant |
---|---|
2014 | Monstera |
2011 | Philodendron |
2012 | Philodendron |
2013 | Philodendron |
In the forum I found some information on this problem for Excel; however, I can't get it to work since I'm an absolute beginner in programming and R.
Here are my code ideas, which currently don't work:
list1<-list(unique(plants))
For (i in list1){
if (dataset$year==2009){
while i
-[c(year==2009)]
....
break
} else {
....
I know it's not much, but I really tried, and I hope someone can help.
Thank you!
Upvotes: 0
Views: 37
Reputation: 30474
If I understand the logic correctly, you could try this approach.
Using the `dplyr` package, put your `dataset` into groups based on the `Plant` as well as consecutive years (where there is a difference of 1 year between rows, such as 2009, 2010, 2011...). Then keep, or `filter`, the rows of data where the `first` year of each group is not 2009. The final `ungroup` and `select` will remove the made-up `Group` column so your results only include `year` and `Plant`.
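To see how the consecutive-year grouping is built, here is a minimal sketch in base R using only the Monstera years from the question. The names `years`, `gaps`, and `run_id` are made up for illustration.

```r
# Years for one plant, taken from the question's example data
years <- c(2009, 2010, 2011, 2012, 2014)

# diff() gives the gap between each year and the one before it
gaps <- diff(years)

# Any gap other than 1 starts a new run; cumsum() turns those breaks
# into a running group number, and the leading 0 labels the first row
run_id <- c(0, cumsum(gaps != 1))

print(run_id)
```

The first four years share one run and 2014 falls into a new one, which is why only the 2014 row survives for Monstera.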
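library(dplyr)

dataset %>%
  # Group by plant and by run of consecutive years; a gap other than 1 starts a new run
  # (assumes rows are sorted by Plant, then year)
  group_by(Plant, Group = c(0, cumsum(diff(year) != 1))) %>%
  # Drop every run whose first year is 2009
  filter(first(year) != 2009) %>%
  ungroup() %>%
  # Remove the helper column
  select(-Group)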
Output
year Plant
<int> <chr>
1 2014 Monstera
2 2011 Philodendron
3 2012 Philodendron
4 2013 Philodendron
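If you would rather not load a package, the same idea can probably be written in base R as well. This is only a sketch, assuming the rows are already sorted by `Plant` and then `year`. The data frame is rebuilt from the table in the question, and `run_id`, `first_year`, and `result` are names I made up.

```r
# Example data rebuilt from the question
dataset <- data.frame(
  year = c(2009L, 2010L, 2011L, 2012L, 2014L,
           2009L, 2010L, 2011L,
           2011L, 2012L, 2013L),
  Plant = c(rep("Monstera", 5), rep("Pilea", 3), rep("Philodendron", 3))
)

# Number the runs of consecutive years within each plant
run_id <- ave(dataset$year, dataset$Plant,
              FUN = function(y) c(0, cumsum(diff(y) != 1)))

# Look up the first year of each (Plant, run) combination
first_year <- ave(dataset$year, paste(dataset$Plant, run_id),
                  FUN = function(y) y[1])

# Keep only the runs that do not start in 2009
result <- dataset[first_year != 2009, ]
print(result)
```

Note that `result` keeps the original row names, so the rows will not be renumbered from 1 as in the `dplyr` output.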
Upvotes: 0