Reputation: 801
I have data like this:
Incident.ID.. = c(rep("INCFI0000029582",4), rep("INCFI0000029587",4))
date = c("2014-09-25 08:39:45", "2014-09-25 08:39:48", "2014-09-25 08:40:44", "2014-10-10 23:04:00", "2014-09-25 08:33:32", "2014-09-25 08:34:41", "2014-09-25 08:35:24", "2014-10-10 23:04:00")
status = c("assigned", "in.progress", "resolved", "closed", "assigned", "resolved", "resolved", "closed")
date.diff=c (3, 56, 1347796,0 ,69 ,43, 1348116, 0)
df = data.frame(Incident.ID..,date, status, date.diff, stringsAsFactors = FALSE)
df
Incident.ID.. date status date.diff
1 INCFI0000029582 2014-09-25 08:39:45 assigned 3
2 INCFI0000029582 2014-09-25 08:39:48 in.progress 56
3 INCFI0000029582 2014-09-25 08:40:44 resolved 1347796
4 INCFI0000029582 2014-10-10 23:04:00 closed 0
5 INCFI0000029587 2014-09-25 08:33:32 assigned 69
6 INCFI0000029587 2014-09-25 08:34:41 resolved 43
7 INCFI0000029587 2014-09-25 08:35:24 resolved 1348116
8 INCFI0000029587 2014-10-10 23:04:00 closed 0
And I would like to pick only the rows with status "resolved" for a certain Incident.ID.. when it's not followed by the status of the same Incident.ID.. "closed" (there might be rows with only "resolved" or only "closed" -rows, so that's why the Incident.ID.. has to be the same when making the comparison).
For example here in this example data, only this row would be picked:
6 INCFI0000029587 2014-09-25 08:34:41 resolved 43
So how could I please accomplish this?
Upvotes: 3
Views: 69
Reputation: 3194
library(data.table) #using the development version of data.table
setDT(df)[, .SD[status == "resolved" & shift(status, type = "lead") != "closed"], by = Incident.ID..]
Incident.ID.. date status date.diff
1: INCFI0000029587 2014-09-25 08:34:41 resolved 43
P.S. updated as per comment of @David
Upvotes: 3
Reputation: 70256
Here's a simple approach using dplyr to group the data by Incident ID and then filter (select rows) using the "lead" function to look into the next row:
library(dplyr)
df %>%
group_by(Incident.ID..) %>%
filter(status == "resolved" & lead(status) != "closed") # you can add %>% ungroup() if required
#Source: local data frame [1 x 4]
#Groups: Incident.ID..
#
# Incident.ID.. date status date.diff
#1 INCFI0000029587 2014-09-25 08:34:41 resolved 43
Upvotes: 3