bvowe
bvowe

Reputation: 3394

Check if value meets condition and is between two dates

DATA = data.frame(STUDENT = c(1,1,1,2,3,3,3,3),
                  PASS = c(1, 0, 1, 0, 0, 0, 0, 1),
                  DATE = c('2019-12-01', '2020-03-11', '2020-06-17', '2020-02-09', 
                           '2020-10-12', '2020-11-01', '2020-12-07', '2020-06-05'),
                  WANT = c(1,1,1,0,0,0,0,0))

I have DATA which has STUDENT PASS DATE and I wish to create the VARIABLE WANT such that:

For every STUDENT, find the MAXIMUM(DATE) and then check if that student has any values PASS = 1 at any time between MAXIMUM(DATE) and when 'DATE' is 6 months before the value of MAXIMUM(DATE), if so then WANT = 1 otherwise WANT equals to zero.

I try this with no success

DATA$WANT = with(WANT, ifelse(DATE[PASS == 1] < max(DATE) & DATE[PASS == 1] >= max(DATE) %m-% months(6),1,0)

Upvotes: 0

Views: 86

Answers (1)

Jamie
Jamie

Reputation: 1965

This checks to see if DATE is less than max(DATE) minus 6 months. If so, then WANT is 0. Otherwise check to see if PASS == 1. If it is, then WANT is 1, otherwise it's 0 grouped by STUDENT.

Then I group by STUDENT to see if, WANT == 1 exists. If so then assign 1 to all rows for each STUDENT.

library(data.table)
library(lubridate)

data = setDT(data)[, DATE := as_date(DATE)][order(STUDENT, DATE)]

data[, WANT := ifelse(DATE < (max(DATE) %m-% months(6)), 0, 
                  ifelse(PASS == 1, 1, 0)), by = STUDENT]

data[, WANT := ifelse(any(WANT == 1), 1, 0), by = STUDENT]
   STUDENT PASS       DATE WANT
1:       1    1 2019-12-01    1
2:       1    0 2020-03-11    1
3:       1    1 2020-06-17    1
4:       2    0 2020-02-09    0
5:       3    1 2020-06-05    0
6:       3    0 2020-10-12    0
7:       3    0 2020-11-01    0
8:       3    0 2020-12-07    0

Upvotes: 2

Related Questions