NewToPython
NewToPython

Reputation: 37

Find the previous date that meets a condition in r

I am trying to calculate the days since meeting using the lag function. I want the days since meeting to reset to 0 when there is meeting and calculate the number of days since the meeting from that point. Any help on this is greatly appreciated.

Group_c                ActivityDate     meeting    days since meeting
003U000000Q6LlIIAV  1/1/2019    1   0
003U000000Q6LlIIAV  1/2/2019    0   1
003U000000Q6LlIIAV  1/3/2019    0   2
003U000000Q6LlIIAV  1/4/2019    0   3
003U000000Q6LlIIAV  1/5/2019    0   4
003U000000Q6LlIIAV  1/6/2019    0   5
003U000000Q6LlIIAV  1/7/2019    0   6
003U000000Q6LlIIAV  1/8/2019    0   7
003U000000Q6LlIIAV  1/9/2019    1   0
003U000000Q6LlIIAV  1/10/2019   0   1
003U000000Q6LlIIAV  1/11/2019   0   2
003U000000Q6LlIIAV  1/12/2019   0   3
003U000000Q6LlIIAV  1/13/2019   0   4


DSM <- DSM %>% 
  arrange(Group__c,ActivityDate) %>% 
  group_by(Group__c) %>% 
  mutate(DaysSinceMeeting=   ActivityDate-lag(ActivityDate[Meeting>0]))

Upvotes: 2

Views: 193

Answers (1)

Maurits Evers
Maurits Evers

Reputation: 50668

I don't think lag is necessary here, since you're interested in a date difference relative to an initial date (rather than between successive dates). Here is an option using difftime

df %>%
    group_by(Group_c) %>%
    mutate(MeetingGrp = cumsum(meeting)) %>%
    group_by(Group_c, MeetingGrp) %>%
    mutate(
        ActivityDate = as.Date(ActivityDate, format = "%m/%d/%Y"),
        DaysSinceMeeting = difftime(ActivityDate, first(ActivityDate), units = "days"))
## A tibble: 13 x 6
## Groups:   Group_c, MeetingGrp [2]
#   Group_c     ActivityDate meeting days.since.meet… MeetingGrp DaysSinceMeeting
#   <fct>       <date>         <int>            <int>      <int> <time>
# 1 003U000000… 2019-01-01         1                0          1 0 days
# 2 003U000000… 2019-01-02         0                1          1 1 days
# 3 003U000000… 2019-01-03         0                2          1 2 days
# 4 003U000000… 2019-01-04         0                3          1 3 days
# 5 003U000000… 2019-01-05         0                4          1 4 days
# 6 003U000000… 2019-01-06         0                5          1 5 days
# 7 003U000000… 2019-01-07         0                6          1 6 days
# 8 003U000000… 2019-01-08         0                7          1 7 days
# 9 003U000000… 2019-01-09         1                0          2 0 days
#10 003U000000… 2019-01-10         0                1          2 1 days
#11 003U000000… 2019-01-11         0                2          2 2 days
#12 003U000000… 2019-01-12         0                3          2 3 days
#13 003U000000… 2019-01-13         0                4          2 4 days

Explanation: We create a group label MeetingGrp for every block of dates that start with meeting == 1; then group by Group_c and MeetingGrp and calculate the days since the first meeting using difftime.


Sample data

df <- read.table(text =
    "Group_c                ActivityDate     meeting    'days since meeting'
003U000000Q6LlIIAV  1/1/2019    1   0
003U000000Q6LlIIAV  1/2/2019    0   1
003U000000Q6LlIIAV  1/3/2019    0   2
003U000000Q6LlIIAV  1/4/2019    0   3
003U000000Q6LlIIAV  1/5/2019    0   4
003U000000Q6LlIIAV  1/6/2019    0   5
003U000000Q6LlIIAV  1/7/2019    0   6
003U000000Q6LlIIAV  1/8/2019    0   7
003U000000Q6LlIIAV  1/9/2019    1   0
003U000000Q6LlIIAV  1/10/2019   0   1
003U000000Q6LlIIAV  1/11/2019   0   2
003U000000Q6LlIIAV  1/12/2019   0   3
003U000000Q6LlIIAV  1/13/2019   0   4", header = T)

Upvotes: 2

Related Questions