Wendy
Wendy

Reputation: 144

Select all the rows belong to the groups that meet several conditions

I have a panel data with the following structure:

ID  Month  Action
1    1       0
1    2       0
1    3       1
1    4       1
2    1       0
2    2       1
2    3       0
2    4       1
3    1       0
3    2       0
3    3       0
4    1       0
4    2       1
4    3       1
4    4       0

where each ID has one row for each month, action indicates if this ID did this action in this month or not, 0 is no, 1 is yes.

I need to find the ID that has continuously had action=1 once they started the action (it does not matter in which month they started, but once started, in the following months the action should always be 1). I also wish to record all the rows that belong to these IDs in a new data frame.

How can I do this in R?

In my example, ID=1 consistently had action=1 since Month 3, so the final data frame I'm looking for should only have the rows belong to ID=1.

ID  Month  Action  
 1      1       0  
 1      2       0  
 1      3       1  
 1      4       1  

Upvotes: 1

Views: 49

Answers (2)

C8H10N4O2
C8H10N4O2

Reputation: 19005

You could do something like:

library(dplyr)

df %>% 
  group_by(ID) %>%
  filter(all(diff(Action)>=0) & max(Action)>0) -> newDF

This newDF includes only the IDs where (a) the Action is never decreasing (i.e., no 1=>0) and (b) there is at least one Action==1).

     ID Month Action
  <int> <int>  <int>
1     1     1      0
2     1     2      0
3     1     3      1
4     1     4      1

Upvotes: 3

Ronak Shah
Ronak Shah

Reputation: 389135

A base R approach using ave where we check if all the numbers after first occurrence of 1 are all 1. The addition of any condition is to remove enteries with all 0's.

df[with(df, as.logical(ave(Action, ID, FUN = function(x) {
       inds = cumsum(x)
       any(inds > 0)  & all(x[inds > 0] == 1) 
}))), ]

#  ID Month Action
#1  1     1      0
#2  1     2      0
#3  1     3      1
#4  1     4      1

Or another option with same logic but in a little concise way would be

df[with(df, ave(Action == 1, ID, FUN = function(x) 
   all(x[which.max(x):length(x)] == 1)
)), ]


#  ID Month Action
#1  1     1      0
#2  1     2      0
#3  1     3      1
#4  1     4      1

Upvotes: 2

Related Questions