Reputation: 144
I have a panel data with the following structure:
ID Month Action
1 1 0
1 2 0
1 3 1
1 4 1
2 1 0
2 2 1
2 3 0
2 4 1
3 1 0
3 2 0
3 3 0
4 1 0
4 2 1
4 3 1
4 4 0
where each ID has one row for each month, action indicates if this ID did this action in this month or not, 0 is no, 1 is yes.
I need to find the ID that has continuously had action=1 once they started the action (it does not matter in which month they started, but once started, in the following months the action should always be 1). I also wish to record all the rows that belong to these IDs in a new data frame.
How can I do this in R?
In my example, ID=1 consistently had action=1 since Month 3, so the final data frame I'm looking for should only have the rows belong to ID=1.
ID Month Action
1 1 0
1 2 0
1 3 1
1 4 1
Upvotes: 1
Views: 49
Reputation: 19005
You could do something like:
library(dplyr)
df %>%
group_by(ID) %>%
filter(all(diff(Action)>=0) & max(Action)>0) -> newDF
This newDF
includes only the ID
s where (a) the Action is never decreasing (i.e., no 1=>0) and (b) there is at least one Action==1
).
ID Month Action
<int> <int> <int>
1 1 1 0
2 1 2 0
3 1 3 1
4 1 4 1
Upvotes: 3
Reputation: 389135
A base R approach using ave
where we check if all the numbers after first occurrence of 1 are all 1. The addition of any
condition is to remove enteries with all 0's.
df[with(df, as.logical(ave(Action, ID, FUN = function(x) {
inds = cumsum(x)
any(inds > 0) & all(x[inds > 0] == 1)
}))), ]
# ID Month Action
#1 1 1 0
#2 1 2 0
#3 1 3 1
#4 1 4 1
Or another option with same logic but in a little concise way would be
df[with(df, ave(Action == 1, ID, FUN = function(x)
all(x[which.max(x):length(x)] == 1)
)), ]
# ID Month Action
#1 1 1 0
#2 1 2 0
#3 1 3 1
#4 1 4 1
Upvotes: 2