Reputation: 311
I have the following table and would like to filter based on the criteria below
Replicating the data first:
dt1 <- data.frame(ID = c("a", "a", "a", "a", "a","a","a","a",
"b","b","b","b","b","b","b","b",
"c","c","c","c","c","c","c","c",
"d","d","d","d","d","d","d","d"), value = c(0,0,1,1,2,0,0,1,
1,1,1,2,2,2,2,2,
1,1,1,1,1,3,3,3,
0,2,2,2,2,2,2,3))
Now, I would like to create a condition by ID such that it meets the following condition:
1) There is an increase in value (>=1)
2) The increase in value stays the same
3) Min starting of the increase should be within the last 3 consecutive rows (basically ID:"D" does not qualify)
According to the above table, only B and C qualify
I have done the following so far but it not working for me properly especially the 3rd criteria.
dt1 %>% group_by(ID) %>% mutate(change = value -lag(value))
%>% filter(all(change %in% c(2,1,0,NA), na.rm = T))
Upvotes: 1
Views: 244
Reputation: 887951
An option would be to group by 'ID', filter
the groups that have only increasing adjacent elements and not any value decreasing, then filter the groups with frequency of the 'value' greater than or equal to 3 for all
the elements
library(tidyverse)
library(data.table)
dt1 %>%
group_by(ID) %>%
filter(n_distinct(cumsum(c(1, diff(value) < 0))) == 1) %>%
filter(all(table(rleid(value))>=3))
# A tibble: 16 x 2
# Groups: ID [4]
# ID value
# <fct> <dbl>
# 1 b 1
# 2 b 1
# 3 b 1
# 4 b 2
# 5 b 2
# 6 b 2
# 7 b 2
# 8 b 2
# 9 c 1
#10 c 1
#11 c 1
#12 c 1
#13 c 1
#14 c 3
#15 c 3
#16 c 3
Upvotes: 2