Reputation: 301
I have a dataframe [df] like this:
id device date speed incident
1 B3 2020-04-15 08:00 23 0
2 B3 2020-04-15 09:00 21 0
3 B3 2020-04-15 10:00 54 1
4 B3 2020-04-15 11:00 52 1
5 B3 2020-04-15 12:00 24 0
6 B3 2020-04-15 13:00 12 0
7 B3 2020-04-16 09:00 51 1
8 B3 2020-04-16 10:00 16 0
9 B3 2020-04-16 11:00 20 0
10 B3 2020-04-16 12:00 21 0
11 B3 2020-04-16 13:00 19 0
I would like to know whether there is a way of subset the data so that only rows where incident = 1 remain with the one before and after the incident and assign an id for each incident group.
The preferred result would look like this:
id device date pressure warning group
2 B3 2020-04-15 09:00 21 0 1
3 B3 2020-04-15 10:00 54 1 1
4 B3 2020-04-15 11:00 52 1 1
5 B3 2020-04-15 12:00 24 0 1
6 B3 2020-04-15 13:00 12 0 2
7 B3 2020-04-16 09:00 51 1 2
8 B3 2020-04-16 10:00 16 0 2
Thank you very much for any suggestions.
Upvotes: 0
Views: 42
Reputation: 389275
Here is one base R approach :
#Get row numbers where incident = 1
ones <- which(df$incident == 1)
#Create groups of consecutive ones
inds <- split(ones,cumsum(c(TRUE, diff(ones) > 1)))
#subset the dataframe by taking -1, +1 of inds in each list
#Create a group column and combine the data in one dataframe
do.call(rbind, Map(function(x, y)
transform(df[c(min(x) - 1, x, max(x) + 1), ], group = y),
inds, names(inds)))
# id device date speed incident group
#1.2 2 B3 2020-04-1509:00 21 0 1
#1.3 3 B3 2020-04-1510:00 54 1 1
#1.4 4 B3 2020-04-1511:00 52 1 1
#1.5 5 B3 2020-04-1512:00 24 0 1
#2.6 6 B3 2020-04-1513:00 12 0 2
#2.7 7 B3 2020-04-1609:00 51 1 2
#2.8 8 B3 2020-04-1610:00 16 0 2
Upvotes: 1