GNMO11
GNMO11

Reputation: 2259

R 12 month lookback column from separate dataframe

I have a dataframe of ID's and what months they were active that looks like:

ID     MONTH     LOCATION    ACTIVE
1    2019-01-01     A          1
1    2019-02-01     A          1
1    2019-03-01     A          1
1    2019-04-01     A          1
2    2020-02-01     B          1                 
2    2020-03-01     B          1                 

I also have a dataframe of logins that are floored to the first of the month that looks like:

ID     LOGIN    LOCATION   
1    2018-01-01     A     
1    2018-02-01     A     
2    2019-12-01     B

What I would like to do is add a column to the first dataframe which indicates if the user had a login within 12 months of that month. Which would look like:

ID     MONTH     LOCATION    ACTIVE    LOGIN_IN_PAST_12_MONTHS
1    2019-01-01     A          1                 1
1    2019-02-01     A          1                 1
1    2019-03-01     A          1
1    2019-04-01     A          1
2    2020-02-01     B          1                 1
2    2020-03-01     B          1                 1
2    2020-04-01     B          1                 1

Upvotes: 2

Views: 28

Answers (1)

akrun
akrun

Reputation: 887601

We may use a join on the 'ID', 'LOCATION', get the difftime between the 'MONTH' and 'LOGIN' columns (assuming they are Date class), create a logical condition that if the difference is less than a year

library(data.table)
setDT(df1)[df2, LOGIN_IN_PAST_12_MONTHS := +(difftime(MONTH, LOGIN, 
       units = "weeks") <= 53), on = .(ID, LOCATION), by = .EACHI]

-output

df1
   ID      MONTH LOCATION ACTIVE LOGIN_IN_PAST_12_MONTHS
1:  1 2019-01-01        A      1                       1
2:  1 2019-02-01        A      1                       1
3:  1 2019-03-01        A      1                       0
4:  1 2019-04-01        A      1                       0
5:  2 2020-02-01        B      1                       1
6:  2 2020-03-01        B      1                       1

data

df1 <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L), MONTH = structure(c(17897, 
17928, 17956, 17987, 18293, 18322), class = "Date"), LOCATION = c("A", 
"A", "A", "A", "B", "B"), ACTIVE = c(1L, 1L, 1L, 1L, 1L, 1L)), row.names = c(NA, 
-6L), class = "data.frame")

df2 <- structure(list(ID = c(1L, 1L, 2L), LOGIN = structure(c(17532, 
17563, 18231), class = "Date"), LOCATION = c("A", "A", "B")), row.names = c(NA, 
-3L), class = "data.frame")

Upvotes: 3

Related Questions